Speech Dispatcher roadmap discussion.


From: kendell clark
Subject: Speech Dispatcher roadmap discussion.
Date: Wed, 08 Oct 2014 03:37:43 -0500


I'll plus one this. Just as an aside, in both MATE and GNOME, sound
settings shows multiple speech-dispatcher entries, with no indication
of which synth each volume slider belongs to. If I want to adjust the
volume, I usually have to guess which entry is the synth that is
currently speaking, and the assignment changes at runtime. To put that
less geekily: there are multiple entries in sound settings all calling
themselves speech-dispatcher, and if I want to adjust the volume of
espeak, I have to guess which speech-dispatcher entry controls espeak
and not, say, an inactive dummy, festival, or ibmtts module. This
changes every time I close and reopen speech-dispatcher; it might be
the middle entry one time, the last another, and so on. I'll also add
that maybe we should think about loading only the modules that are
actually in use, leaving the rest inactive. The dummy module might
need to be left running in case something happens to the active one,
but in practice that fallback never works. If ibmtts crashes, I'm left
with no speech at all, instead of the dummy kicking in as, I think,
it's supposed to. My guess is that speech-dispatcher has no way of
telling that the synth it's talking to has crashed, so it keeps
sending to it. I have no idea how that would be fixed, though. Loading
fewer modules might also reduce memory usage slightly, although
speech-dispatcher doesn't usually use much memory. What it does
sometimes use a lot of is CPU, and I have no idea how or why that
happens.
Thanks
Kendell clark



On 10/08/2014 02:32 AM, Luke Yelavich wrote:
> Hey folks. This has been a long time coming. I originally promised
> a roadmap shortly after taking up Speech Dispatcher maintainership.
> Unfortunately, as is often the case, real life and other work
> related tasks got in the way, however I am now able to give some
> attention to thinking about where to take the project from here. It
> should be noted that a lot of what is here is based on roadmap
> discussions back in 2010(1) and roadmap documents on the project
> website.(2) Since then, much has changed in the wider *nix
> ecosystem: underlying system services have changed, and there are
> now additional requirements that need to be considered.
> 
> I haven't given any thought to version numbering at this point;
> I'd say all of the below is 0.9. If we find any critical bugs that
> need fixing, we can always put out another 0.8 bugfix release in
> the meantime.
> 
> The roadmap items, as well as my thoughts are below.
> 
> * Implement event-based main loops in the server and modules
> 
> I don't think this requires much explanation. IMO this is one of
> the first things to be done, as it lays some important groundwork
> for other improvements as mentioned below. Since we already use
> GLib, my proposal is to use the GLib main loop system. It is very
> flexible and easy to work with.
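> 
> To make that concrete, here is a rough sketch, not actual Speech
> Dispatcher code, of what an event-driven server loop built on the
> GLib main loop could look like. The socket setup is simplified; only
> the GLib calls matter:
> 
>   /* Watch a listening socket from the GLib main loop instead of
>    * polling file descriptors by hand. */
>   #include <glib.h>
>   #include <glib-unix.h>
>   #include <sys/socket.h>
>   #include <sys/un.h>
> 
>   static gboolean
>   on_listen_ready (gint fd, GIOCondition condition, gpointer user_data)
>   {
>       /* accept () the client here and add another g_unix_fd_add ()
>        * watch for its SSIP commands */
>       return G_SOURCE_CONTINUE;   /* keep watching this descriptor */
>   }
> 
>   int
>   main (void)
>   {
>       GMainLoop *loop = g_main_loop_new (NULL, FALSE);
>       int listen_fd = socket (AF_UNIX, SOCK_STREAM, 0);
>       /* bind () and listen () omitted for brevity */
> 
>       /* the loop wakes us only when a client is waiting */
>       g_unix_fd_add (listen_fd, G_IO_IN, on_listen_ready, NULL);
> 
>       g_main_loop_run (loop);
>       g_main_loop_unref (loop);
>       return 0;
>   }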
> 
> * Assess whether the SSIP protocol needs to be extended to better
> support available synthesizer features
> 
> Two questions that often get asked in the wider community are:
> 
> 1. Can I get Speech Dispatcher to write audio to a wav file?
> 2. How can I use eSpeak's extra voices for various languages?
> 
> We should have a look at the SSIP protocol, as well as the features
> offered by the synthesizers we support today, and determine whether
> we need to extend SSIP to support everything that the synthesizers
> have to offer. This may require changes or additions to the client
> API, particularly for the wav file audio output that prospective
> clients may wish to use.
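> 
> To illustrate, a client session today looks roughly like the
> following (server replies omitted), and a wav file option could be
> just another SET parameter in the same style. The AUDIO_OUTPUT_FILE
> name below is purely hypothetical and not part of SSIP today:
> 
>   SET self CLIENT_NAME joe:orca:main
>   SET self OUTPUT_MODULE espeak
>   SET self RATE 20
>   SPEAK
>   Hello from Speech Dispatcher.
>   .
> 
>   SET self AUDIO_OUTPUT_FILE /tmp/message.wav
> 
> Whatever syntax we settle on, the client libraries would need a
> matching call so applications don't have to speak raw SSIP.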
> 
> * Assess DBus use for IPC between client and server
> 
> Brailcom raised this back in 2010, and the website mentions
> analysis being required, however I have no idea what they had in
> mind. Nevertheless, using DBus as the client-server IPC is worth
> considering, particularly with regards to application confinement,
> and client API, see below. Work is ongoing to put the core part of
> DBus into the kernel, so once that is done, performance should be
> much improved.
> 
> It's worth noting that DBus doesn't necessarily have to be used for
> everything. DBus could be used only to spawn the server daemon and
> nothing else, or the client API library could use DBus just to
> initiate a connection, setting up a unix socket per client. I
> haven't thought this through, so I may be missing the mark on some
> of these ideas, but we should look at all options.
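> 
> As one possible shape of this, a client library could make a single
> DBus call to obtain a private connection and then carry on over a
> unix socket. The bus name, object path, interface and method below
> are invented for the example and are not an existing interface:
> 
>   #include <gio/gio.h>
> 
>   int
>   main (void)
>   {
>       GError *error = NULL;
>       GDBusConnection *bus = g_bus_get_sync (G_BUS_TYPE_SESSION,
>                                              NULL, &error);
>       if (bus == NULL)
>           return 1;
> 
>       /* DBus activation would start speech-dispatcher on first use;
>        * the reply could carry the address of a per-client socket. */
>       GVariant *reply = g_dbus_connection_call_sync (bus,
>           "org.freedesktop.SpeechDispatcher",   /* hypothetical */
>           "/org/freedesktop/SpeechDispatcher",
>           "org.freedesktop.SpeechDispatcher",
>           "OpenConnection",
>           NULL, G_VARIANT_TYPE ("(s)"),
>           G_DBUS_CALL_FLAGS_NONE, -1, NULL, &error);
> 
>       if (reply != NULL)
>           g_variant_unref (reply);
>       g_object_unref (bus);
>       return 0;
>   }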
> 
> * SystemD/LoginD integration
> 
> In many Linux distros today, SystemD is used for system boot and
> service management. Part of this is the use of LoginD for user
> session/login management, which replaces ConsoleKit. The roadmap
> documentation on the project website goes into some detail as to
> why this is required, but an email from Hynek goes into even more
> detail.(3) Even though he talks about ConsoleKit, it is the same
> with LoginD.
> 
> I am aware that some distros still do not use LoginD, so we may
> need to implement things such that other systems can also be
> supported, e.g. if ConsoleKit is still being used despite its
> deprecation, then we should support it as well. I don't think Gentoo
> uses SystemD, so if someone could enlighten me what Gentoo uses for
> session management, I would appreciate it.
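> 
> For reference, the sort of information we would get from LoginD via
> libsystemd's sd-login API is along these lines; this is only a
> sketch of the query, and the policy around it is the real work:
> 
>   #include <stdio.h>
>   #include <stdlib.h>
>   #include <unistd.h>
>   #include <systemd/sd-login.h>
> 
>   int
>   main (void)
>   {
>       char *state = NULL;
> 
>       /* state is e.g. "active", "online" or "offline"; the server
>        * could pause or release audio when a user goes inactive */
>       if (sd_uid_get_state (getuid (), &state) >= 0) {
>           printf ("session state: %s\n", state);
>           free (state);
>       }
>       return 0;
>   }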
> 
> * Support confined application environments
> 
> Like it or not, ensuring applications have access to only what they
> need is becoming more important, and even open source desktop
> environments are looking into implementing confinement for
> applications. Unfortunately there is no single standard confinement
> framework, so this will likely need to be modular, supporting
> apparmor as well as whatever GNOME ends up using. Apparmor is what
> Ubuntu is using for application confinement going forward.
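> 
> One concrete piece of this: a confined client would need a rule
> granting access to the Speech Dispatcher socket, roughly like the
> apparmor fragment below. The path is illustrative and depends on
> how the connection ends up being made:
> 
>   # fragment of a client's apparmor profile (illustrative path)
>   owner /run/user/*/speech-dispatcher/speechd.sock rw,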
> 
> * Rework of the settings mechanism to use DConf/GSettings
> 
> There was another good discussion about this back in 2010, in the
> same thread I linked to above with regards to ConsoleKit/LoginD.
> GSettings has seen many improvements since then, which will help in
> creating some sort of configuration application or interface for
> users to configure Speech Dispatcher, should they need to configure
> it at all. With GSettings, a settings change can be acted on
> immediately, without a server or module restart. GSettings also
> solves the system/user configuration problem: as long as the user
> has not changed a setting, the system-wide setting is used as the
> default. We could also extend the client API to allow clients more
> control over the Speech Dispatcher settings that affect them, and
> have those settings applied on a client by client basis. I think we
> already have something like this now, but clients cannot change
> those settings via an API.
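> 
> As a small sketch of the "acted on immediately" part, the server
> could simply listen for key changes on its main loop. The schema id
> and key below are invented for illustration and would need a
> matching schema to be installed:
> 
>   #include <gio/gio.h>
> 
>   static void
>   on_rate_changed (GSettings *settings, gchar *key, gpointer user_data)
>   {
>       gint rate = g_settings_get_int (settings, key);
>       /* push the new default rate to the affected modules here */
>       g_print ("default rate is now %d\n", rate);
>   }
> 
>   int
>   main (void)
>   {
>       GMainLoop *loop = g_main_loop_new (NULL, FALSE);
>       GSettings *settings =
>           g_settings_new ("org.freedesktop.SpeechDispatcher");
> 
>       g_signal_connect (settings, "changed::default-rate",
>                         G_CALLBACK (on_rate_changed), NULL);
> 
>       /* changes made in dconf-editor or a future settings UI are
>        * delivered without restarting the server */
>       g_main_loop_run (loop);
>       return 0;
>   }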
> 
> * Separate compilation and distribution of modules
> 
> As much as many of us prefer open source synthesizers, there are
> instances where users would prefer to use proprietary synthesizers.
> We cannot always hope to be able to provide a driver for all
> synthesizers, so Speech Dispatcher needs an interface to allow
> synthesizer driver developers to write support for Speech
> Dispatcher, and build it, outside the Speech Dispatcher source
> tree.
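> 
> I imagine something loosely like the following, purely as a
> strawman; none of these names exist today, it is just the kind of
> entry-point table an out-of-tree driver could export and the server
> could load with dlopen():
> 
>   /* Strawman interface for an externally built synthesizer driver;
>    * every name here is hypothetical. */
>   typedef struct {
>       int         api_version;   /* so the ABI can evolve */
>       const char *name;          /* e.g. "acmetts" */
>       int  (*init)      (void);
>       int  (*speak)     (const char *ssml_text, void *audio_sink);
>       int  (*stop)      (void);
>       int  (*set_voice) (const char *voice_name);
>       void (*shutdown)  (void);
>   } SpdModuleEntryPoints;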
> 
> * Consider refactoring client API code such that we only have one
> client API codebase to maintain, i.e python bindings wrapping the C
> library etc
> 
> This is one that was not raised previously, but it is something I
> have been thinking about recently. At the moment, we have multiple
> implementations of the API for different languages; python and C
> come to mind. There are others, e.g. guile, java, etc, but this may
> not be applicable to them.
> 
> I have been pondering whether it would save us work in maintenance
> if we only had one client API codebase to maintain, that being the
> C library. There are 2 ways to provide python bindings from a C
> library, and there may be more. Should we decide to go down this
> path, all should be considered. The two that come to mind are
> outlined below. I've also included some pros and cons, but there
> are likely more that I haven't thought of.
> 
> Using cython:
> 
> Pros:
> * Provides both python 2 and 3 support
> * Produces a compiled module that works with the version of python
>   it was built against, and should only require python itself as
>   well as the Speech Dispatcher client library at runtime
> 
> Cons:
> * Requires knowledge of cython and its syntax that mixes python and C
> * Requires extra code
> 
> Using GObject introspection:
> 
> Pros:
> * Provides support for any language that has GObject introspection
>   support, which immediately broadens the API's usefulness beyond
>   python
> * Has good python 2 and 3 support
> * Little to no extra code needs to be written, but it does require
>   that the C library be refactored, see below
> 
> Cons:
> * Introduces more dependencies that need to be present at runtime
> * Requires the C library to be refactored into a GObject based
>   library, and annotation is required to provide introspection
>   support
> 
> My understanding of both options may be lacking, so I have likely
> missed something; please feel free to add to the above.
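> 
> To give a flavour of the GObject introspection route: the bindings
> essentially fall out of gtk-doc style annotations on a GObject based
> C API. The SpdClient type and function below are invented for the
> example:
> 
>   /**
>    * spd_client_say:
>    * @self: a #SpdClient (hypothetical type)
>    * @priority: the message priority
>    * @text: (transfer none): UTF-8 text to speak
>    * @error: return location for a #GError, or %NULL
>    *
>    * Queues @text for synthesis.
>    *
>    * Returns: %TRUE on success, %FALSE on error
>    */
>   gboolean spd_client_say (SpdClient    *self,
>                            SpdPriority   priority,
>                            const gchar  *text,
>                            GError      **error);
> 
> g-ir-scanner reads those annotations and generates the typelib, and
> from python the API would then be imported through gi.repository
> like any other introspected library.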
> 
> * Moving audio drivers from the modules to the server
> 
> Another one that was not raised previously, but needs to be
> considered. I thought about this after considering various use
> cases for Speech Dispatcher and its clients, particularly Orca.
> This is one that is likely going to benefit pulse users more than
> other audio driver users, but I am sure people can think of other
> reasons.
> 
> At the moment, when using pulseaudio, Speech Dispatcher connects to
> pulseaudio per synthesizer, and not per client. This means that if
> a user has Orca configured to use different synthesizers for, say,
> the system and hyperlink voices, then these synthesizers have
> individual connections to PulseAudio. When viewing a list of
> currently connected PulseAudio clients, you see names like
> sd_espeak, or sd_ibmtts, and not Orca, as you would expect.
> Furthermore, if you adjust the volume of one of these pulse
> clients, the change will only affect that particular speech
> synthesizer, and not the entire audio output of Orca. What is more,
> multiple Speech Dispatcher clients may be using that same
> synthesizer, so a volume change at the PulseAudio level affects an
> unknown number of clients. In addition, if the user wishes to send
> Orca output to another audio device, they have to change the output
> device for multiple Pulse clients, and in doing so they may also
> move the output of another Speech Dispatcher client to a device
> where they don't want it.
> 
> Actually, the choice of what sound device to use per Speech
> Dispatcher client can be applied to all audio output drivers. In
> other words, moving management of output audio to the server would
> allow us to offer clients the ability to choose the sound device
> that their audio is sent to.
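> 
> On the PulseAudio side, the server could then open one stream per
> SSIP client rather than per synthesizer, carrying the client's own
> name and requested device. A rough sketch, with made-up values:
> 
>   #include <pulse/simple.h>
> 
>   pa_simple *
>   open_stream_for_client (const char *client_name, const char *device)
>   {
>       pa_sample_spec spec = {
>           .format   = PA_SAMPLE_S16LE,
>           .rate     = 22050,
>           .channels = 1,
>       };
> 
>       /* "Orca" would then show up in the pulse client list instead
>        * of sd_espeak, and volume/device changes follow the client,
>        * not the synthesizer. */
>       return pa_simple_new (NULL,           /* default server */
>                             client_name,    /* e.g. "Orca" */
>                             PA_STREAM_PLAYBACK,
>                             device,         /* NULL means default sink */
>                             "speech",       /* stream description */
>                             &spec, NULL, NULL, NULL);
>   }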
> 
> Please feel free to respond with further discussion points about
> anything I have raised here, or if you have another suggestion for
> roadmap inclusion, I'd also love to hear it.
> 
> Luke
> 
> (1) http://lists.freebsoft.org/pipermail/speechd/2010q3/002360.html
> (2) http://devel.freebsoft.org/speechd-roadmap
> (3) http://lists.freebsoft.org/pipermail/speechd/2010q3/002406.html
> 


