speechd-discuss

Speech Dispatcher roadmap discussion.


From: Trevor Saunders
Subject: Speech Dispatcher roadmap discussion.
Date: Thu, 9 Oct 2014 15:55:08 -0400

On Wed, Oct 08, 2014 at 06:32:09PM +1100, Luke Yelavich wrote:
> Hey folks.
> This has been a long time coming. I originally promised a roadmap shortly 
> after taking up Speech Dispatcher maintainership. Unfortunately, as is often 
> the case, real life and other work-related tasks got in the way; however, I am 
> now able to give some attention to thinking about where to take the project 
> from here. It should be noted that a lot of what is here is based on roadmap 
> discussions back in 2010(1) and roadmap documents on the project website.(2) 
> Since then, much has changed in the wider *nix ecosystem, there have been 
> some changes in underlying system services, and there are now additional 
> requirements that need to be considered.
> 
> I haven't given any thought as to version numbering at this point, I'd say 
> all of the below is 0.9. If we find any critical bugs that need fixing, we 
> can always put out another 0.8 bugfix release in the meantime.
> 
> The roadmap items, as well as my thoughts are below.
> 
> * Implement event-based main loops in the server and modules
> 
> I don't think this requires much explanation. IMO this is one of the first 
> things to be done, as it lays some important groundwork for other 
> improvements as mentioned below. Since we use Glib, my proposal is to use the 
> Glib main loop system. It is very flexible, and easy to work with.

I'm not seeing how any of the items below actually depend on changing
this, or how you're distinguishing select(2) from "event based".
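
For reference, the shape of what's being proposed is roughly the sketch
below; the callback and fd names are made up, it just shows a GLib main
loop with an fd watch in place of a hand-rolled select(2) loop:

    #include <glib.h>

    /* Hypothetical callback: called whenever the watched fd is readable. */
    static gboolean on_client_readable(GIOChannel *chan, GIOCondition cond,
                                       gpointer user_data)
    {
        /* read and dispatch one SSIP command here */
        return TRUE;  /* keep the watch installed */
    }

    int main(void)
    {
        GMainLoop *loop = g_main_loop_new(NULL, FALSE);
        int listen_fd = 0;  /* stand-in for the real listening socket */
        GIOChannel *chan = g_io_channel_unix_new(listen_fd);

        g_io_add_watch(chan, G_IO_IN, on_client_readable, NULL);
        g_main_loop_run(loop);

        g_io_channel_unref(chan);
        g_main_loop_unref(loop);
        return 0;
    }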

> * Assess DBus use for IPC between client and server
> 
> Brailcom raised this back in 2010, and the website mentions analysis being 
> required, however I have no idea what they had in mind. Nevertheless, using 
> DBus as the client-server IPC is worth considering, particularly with regards 
> to application confinement, and client API, see below. Work is ongoing to put 
> the core part of DBus into the kernel, so once that is done, performance 
> should be much improved.
> 
> It's worth noting that DBus doesn't necessarily have to be used for 
> everything. DBus could be used only to spawn the server daemon and nothing 
> else, or the client API library could use DBus just to initiate a connection, 
> setting up a unix socket per client. I haven't thought this through, so 
> I may be missing the mark on some of these ideas, but we should look at all 
> options.

I'm not really sure what the point would be, especially since we'd want
to keep unix sockets / tcp for backwards compat with things that don't
use libspeechd.  In theory using an existing IPC framework seems nice,
but given the DBus code I've read I'm not convinced it's actually any
better.
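
To make the "DBus only for connection setup" variant concrete, it would
be something along the lines of the sketch below; every name in it (bus
name, object path, interface, method) is made up, nothing like this
exists today:

    #include <gio/gio.h>

    int main(void)
    {
        GError *error = NULL;
        GDBusConnection *bus = g_bus_get_sync(G_BUS_TYPE_SESSION, NULL, &error);
        if (!bus) {
            g_printerr("no session bus: %s\n", error->message);
            g_error_free(error);
            return 1;
        }

        /* Hypothetical: ask the server for a per-client unix socket, then
         * speak SSIP over that socket exactly as we do now. */
        GVariant *reply = g_dbus_connection_call_sync(bus,
                "org.freedesktop.SpeechDispatcher",     /* made-up bus name */
                "/org/freedesktop/SpeechDispatcher",    /* made-up object path */
                "org.freedesktop.SpeechDispatcher",     /* made-up interface */
                "OpenConnection",                       /* made-up method */
                NULL, G_VARIANT_TYPE("(s)"),
                G_DBUS_CALL_FLAGS_NONE, -1, NULL, &error);
        if (reply) {
            const char *socket_path;
            g_variant_get(reply, "(&s)", &socket_path);
            /* connect to socket_path and continue with SSIP from here */
            g_variant_unref(reply);
        } else {
            g_printerr("call failed: %s\n", error->message);
            g_error_free(error);
        }
        g_object_unref(bus);
        return 0;
    }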

> * Support confined application environments
> 
> Like it or not, ensuring applications have access to only what they need is 
> becoming more important, and even open source desktop environments are 
> looking into implementing confinement for applications. Unfortunately no 
> standard confinement framework is being used, so this will likely need to be 
> modular to support apparmor/whatever GNOME is using. Apparmor is what Ubuntu 
> is using for application confinement going forward.

Well, in principle it makes sense, although getting that right on unix
within a single user id is pretty footgun prone.  Anyway, presumably
people will use policies that don't restrict applications that don't
specify what they need; if you don't do that, things will of course
break, and I'll probably say that's not my fault.

> * Rework of the settings mechanism to use DConf/GSettings
> 
> There was another good discussion about this back in 2010. You will find this 
> discussion in the same link I linked to above with regards to 
> Consolekit/LoginD. GSettings has seen many improvements since then, which 
> will help in creating some sort of configuration application/interface for 
> users to use to configure Speech Dispatcher, should they need to configure it 
> at all. Using GSettings, a user can make a settings change, and it can be 
> acted on immediately without a server or module restart. GSettings also 
> solves the system/user configuration problem, in that if the user has not 
> changed a setting, the system-wide setting is used as the default until the 
> user changes that setting. We could also extend the client API to allow 
> clients to have more control over Speech Dispatcher settings that affect 
> them, and have those settings be applied on a client by client basis. I think 
> we already have something like this now, but the client cannot change those 
> settings via an API.

So, I think we can classify the config options into 3 categories.

* server config (socket to listen on, log file etc)

 I think if you want to change this sort of thing then you don't really
 care about a nice UI, and text files are fine.

* audio

I think this is somewhat the same as the previous, though maybe we need
to get better at automatically doing the right thing first.

* module stuff.

I think we should allow clients to control that and then rip out the
configuration options.  I think in practice the only time we see people
change this is when they want to control things they
can't do from Orca.
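
For the module/client settings, the concrete win from GSettings is the
change notification plus the system default applying until the user
overrides it.  A minimal sketch, with a schema id and key that are
purely hypothetical (no such schema exists today):

    #include <gio/gio.h>

    /* Hypothetical key: react to a settings change without restarting
     * the server or the modules. */
    static void on_rate_changed(GSettings *settings, const gchar *key,
                                gpointer user_data)
    {
        gint rate = g_settings_get_int(settings, key);
        g_print("default rate is now %d; reconfigure modules on the fly\n", rate);
    }

    int main(void)
    {
        /* Until the user writes a key, reads return the system-wide
         * default compiled into the schema. */
        GSettings *settings = g_settings_new("org.freedesktop.SpeechDispatcher");
        g_signal_connect(settings, "changed::default-rate",
                         G_CALLBACK(on_rate_changed), NULL);

        GMainLoop *loop = g_main_loop_new(NULL, FALSE);
        g_main_loop_run(loop);
        return 0;
    }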

> * Separate compilation and distribution of modules
> 
> As much as many of us prefer open source synthesizers, there are instances 
> where users would prefer to use proprietary synthesizers. We cannot always 
> hope to be able to provide a driver for all synthesizers, so Speech 
> Dispatcher needs an interface to allow synthesizer driver developers to write 
> support for Speech Dispatcher, and build it, outside the Speech Dispatcher 
> source tree.

How is this not possible today? I expect that if you drop an executable in
/usr/lib/speech-dispatcher-modules/ and ask to use it, Speech Dispatcher
will use it, and the module protocol is at least sort of documented.
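
As far as I can tell the existing AddModule directive in speechd.conf
already covers the out-of-tree case; the module and file names below are
just an example:

    # speechd.conf: register a module binary built outside the tree.
    # Speech Dispatcher spawns it and talks the module protocol over its
    # stdin/stdout.
    AddModule "acmetts" "sd_acmetts" "acmetts.conf"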

> * Consider refactoring client API code such that we only have one client API 
> codebase to maintain, i.e python bindings wrapping the C library etc

I'm not sure there's much to be gained here, but if someone cares they're
free to prove me wrong ;)

> * Moving audio drivers from the modules to the server
> 
> Another one that was not raised previously, but needs to be considered. I 
> thought about this after considering various use cases for Speech Dispatcher 
> and its clients, particularly Orca. This is one that is likely going to 
> benefit pulse users more than other audio driver users, but I am sure people 
> can think of other reasons.
> 
> At the moment, when using pulseaudio, Speech Dispatcher connects to 
> pulseaudio per synthesizer, and not per client. This means that if a user has 
> Orca configured to use different synthesizers for say the system and 
> hyperlink voices, then these synthesizers have individual connections to 
> PulseAudio. When viewing a list of currently connected PulseAudio clients, 
> you see names like sd_espeak, or sd_ibmtts, and not Orca, as you would 
> expect. Furthermore, if you adjust the volume of one of these pulse clients, 
> the change will only affect that particular speech synthesizer, and not the 
> entire audio output of Orca. What is more, multiple Speech Dispatcher clients 
> may be using that same synthesizer, so if volume is changed at the PulseAudio 
> level, then an unknown number of Speech Dispatcher clients using that 
> synthesizer are affected. In addition, if the user wishes to send Orca output 
> to another audio device, then they have to change the output device for 
> multiple Pulse clients, and as a result they may also be moving the output of 
> another Speech Dispatcher client to a different audio device where they don't 
> want it.

The first part of this seems like a shortcoming of using the PulseAudio
volume control instead of the one in Orca, but anyway.

Couldn't we accomplish the same thing, with less data moving around, by
changing when we connect modules to the audio output?

> Actually, the choice of what sound device to use per Speech Dispatcher client 
> can be applied to all audio output drivers. In other words, moving management 
> of output audio to the server would allow us to offer clients the ability to 
> choose the sound device that their audio is sent to.

I think allowing clients to choose where audio goes is valuable, and it's
the way to implement audio file retrieval, but it seems to me we can
manage the client -> audio output mapping in the server and just
reconfigure the module when the client it is synthesizing for changes.
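
The plumbing for that is mostly there in libpulse already; whoever owns
the connection (the server, or a module reconfigured per client) can
label the stream with the client's own name.  A rough sketch, using
"Orca" as an example client name:

    #include <pulse/simple.h>
    #include <pulse/error.h>
    #include <stdio.h>

    int main(void)
    {
        pa_sample_spec ss = { .format = PA_SAMPLE_S16LE,
                              .rate = 22050,
                              .channels = 1 };
        int err = 0;

        /* One playback stream per SSIP client, named after that client, so
         * PulseAudio volume and sink changes affect exactly that client
         * rather than sd_espeak/sd_ibmtts. */
        pa_simple *s = pa_simple_new(NULL,      /* default server */
                                     "Orca",    /* client name shown to PA */
                                     PA_STREAM_PLAYBACK,
                                     NULL,      /* default sink, or a per-client device */
                                     "speech",  /* stream description */
                                     &ss, NULL, NULL, &err);
        if (!s) {
            fprintf(stderr, "pa_simple_new: %s\n", pa_strerror(err));
            return 1;
        }
        /* pa_simple_write(s, buf, nbytes, &err) with synthesized audio here */
        pa_simple_free(s);
        return 0;
    }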

> Please feel free to respond with further discussion points about anything I 
> have raised here, or if you have another suggestion for roadmap inclusion, 
> I'd also love to hear it.

Well, I'm not actually sure whether I think it's a good idea or not, but I
know Chris has wished we used C++, and at this point I may agree with
him.

Trev

> 
> Luke
> 
> (1) http://lists.freebsoft.org/pipermail/speechd/2010q3/002360.html
> (2) http://devel.freebsoft.org/speechd-roadmap
> (3) http://lists.freebsoft.org/pipermail/speechd/2010q3/002406.html
> 
> _______________________________________________
> Speechd mailing list
> Speechd at lists.freebsoft.org
> http://lists.freebsoft.org/mailman/listinfo/speechd


