Speech Dispatcher roadmap discussion.


From: Trevor Saunders
Subject: Speech Dispatcher roadmap discussion.
Date: Sat, 11 Oct 2014 18:24:51 -0400

On Fri, Oct 10, 2014 at 12:37:24PM +1100, Luke Yelavich wrote:
> On Fri, Oct 10, 2014 at 06:55:08AM AEDT, Trevor Saunders wrote:
> > On Wed, Oct 08, 2014 at 06:32:09PM +1100, Luke Yelavich wrote:
> > > Hey folks.
> > > This has been a long time coming. I originally promised a roadmap shortly
> > > after taking up Speech Dispatcher maintainership. Unfortunately, as is
> > > often the case, real life and other work-related tasks got in the way;
> > > however, I am now able to give some attention to thinking about where to
> > > take the project from here. It should be noted that a lot of what is here
> > > is based on roadmap discussions back in 2010(1) and roadmap documents on
> > > the project website.(2) Since then, much has changed in the wider *nix
> > > ecosystem, there have been some changes in underlying system services,
> > > and there are now additional requirements that need to be considered.
> > > 
> > > I haven't given any thought to version numbering at this point; I'd say
> > > all of the below is 0.9. If we find any critical bugs that need fixing,
> > > we can always put out another 0.8 bugfix release in the meantime.
> > > 
> > > The roadmap items, as well as my thoughts are below.
> > > 
> > > * Implement event-based main loops in the server and modules
> > > 
> > > I don't think this requires much explanation. IMO this is one of the 
> > > first things to be done, as it lays some important groundwork for other 
> > > improvements as mentioned below. Since we use GLib, my proposal is to use
> > > the GLib main loop system. It is very flexible, and easy to work with.
> > 
> > I'm not seeing how any of the below things actually depend on changing
> > this, or how you're distinguishing select(2) from "event based".
> 
> Ok, currently we use select with no timeout, so the main server loop waits
> for select to return activity on any of the file descriptors. We would have
> to change the main loop implementation so that we can receive events when
> the active session changes with LoginD/ConsoleKit, as well as events when a
> setting is changed. Even if we were to still use a file-based config system,
> we could use file monitoring via GLib to watch for activity on the config
> files, and act on those events.

Ok, the session management part at least makes sense.  I'm not
particularly sold on the idea of automatically picking up config
changes.
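
For concreteness, here is a minimal sketch of what a GLib-based loop
could look like: the listening socket is watched instead of blocking in
select(), and the config file is (optionally) monitored.  The helper and
callback names are made up for illustration; none of this is existing
Speech Dispatcher code, and the config path may differ.

/* Sketch only: a GLib main loop in place of the select() loop. */
#include <glib.h>
#include <glib-unix.h>
#include <gio/gio.h>

/* Hypothetical helper that creates and binds the unix/tcp socket. */
extern int create_server_socket (void);

static gboolean
on_socket_ready (gint fd, GIOCondition condition, gpointer user_data)
{
  /* accept()/read() client traffic here, as the old select() branch did. */
  return G_SOURCE_CONTINUE;
}

static void
on_config_changed (GFileMonitor *monitor, GFile *file, GFile *other,
                   GFileMonitorEvent event, gpointer user_data)
{
  if (event == G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT)
    g_message ("config file changed");   /* re-read settings here, if wanted */
}

int
main (void)
{
  GMainLoop *loop = g_main_loop_new (NULL, FALSE);

  /* The loop wakes us whenever the listening socket is readable. */
  g_unix_fd_add (create_server_socket (), G_IO_IN, on_socket_ready, NULL);

  /* Optional: pick up config edits without a restart. */
  GFile *conf = g_file_new_for_path ("/etc/speech-dispatcher/speechd.conf");
  GFileMonitor *mon = g_file_monitor_file (conf, G_FILE_MONITOR_NONE,
                                           NULL, NULL);
  g_signal_connect (mon, "changed", G_CALLBACK (on_config_changed), NULL);

  g_main_loop_run (loop);
  return 0;
}

Session-change notifications from logind/ConsoleKit would plug into the
same loop through their D-Bus signal subscriptions.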

> I am of the opinion that it is easier to use code that is already written as
> part of one of the supporting libraries we use, rather than re-implement a
> main loop ourselves, so that we can spend more time improving Speech
> Dispatcher itself.

So long as it's actually easier to rework everything and doesn't result
in less understandable code, sure.

> > > * Assess DBus use for IPC between client and server
> > > 
> > > Brailcom raised this back in 2010, and the website mentions analysis
> > > being required; however, I have no idea what they had in mind.
> > > Nevertheless, using DBus as the client-server IPC is worth considering, 
> > > particularly with regards to application confinement, and client API, see 
> > > below. Work is ongoing to put the core part of DBus into the kernel, so 
> > > once that is done, performance should be much improved.
> > > 
> > > It's worth noting that DBus doesn't necessarily have to be used for
> > > everything. DBus could be used only to spawn the server daemon and
> > > nothing else, or the client API library could use DBus just to initiate
> > > a connection, setting up a unix socket per client. I haven't thought
> > > this through, so I may be missing the mark on some of these ideas, but
> > > we should look at all options.
> > 
> > I'm not really sure what the point would be, especially since we'd want
> > to keep unix sockets / tcp for backwards compat with things that don't
> > use libspeechd.  In theory using an existing IPC framework seems nice,
> > but given the dbus code I've read I'm not convinced it's actually any
> > better.
> 
> Yeah as I said in my reply to Halim, I don't personally agree with this, but 
> I added it since it was on the original roadmap.

yeah, sounds like it won't happen because nobody wants it or cares.

> > > * Support confined application environments
> > > 
> > > Like it or not, ensuring applications have access to only what they need
> > > is becoming more important, and even open source desktop environments are
> > > looking into implementing confinement for applications. Unfortunately no
> > > standard confinement framework is being used, so this will likely need to
> > > be modular to support AppArmor/whatever GNOME is using. AppArmor is what
> > > Ubuntu is using for application confinement going forward.
> > 
> > Well, in principle it makes sense, although getting that right on unix
> > within user ids is pretty footgun prone.  Anyway, presumably people will
> > use policies that don't restrict programs that don't specify what they
> > need; if you don't do that, of course things will break, and I'll
> > probably say that's not my fault.
> 
> From my understanding so far, confinement is something that also has support 
> at the kernel level, and I think it goes much beyond just user/group ID 
> restrictions, it even goes so far as preventing an application from using 
> particular services unless it absolutely and clearly defines that they are 
> needed for operation.

yes, some machinery for enforcing sandboxing lives in the kernel.
People certainly do try to restrict programs within a uid; my point is
just that it's something that's very hard to get right because it
changes a lot of assumptions.

Now some distro certainly can ship a default policy that requires all
programs to specify what resources they need.  I'm just saying it's their
responsibility to make sure everything works in that configuration.  I
think it's a valuable thing to get working eventually, but I don't see it
as my job to get it working.  That said, I suspect most of what needs to
be done is just writing config files that specify what speechd uses; we
could maybe think about making the daemon chroot itself, but between
logging and audio I'm not sure how we'd make that work.

> I am still researching and trying to come to an understanding about 
> confinement, I just know that it is a thing for multiple desktop environments 
> going forward.
> 
> > > * Rework of the settings mechanism to use DConf/GSettings
> > > 
> > > There was another good discussion about this back in 2010; you will find
> > > it at the same link I mentioned above with regard to ConsoleKit/LoginD.
> > > GSettings has seen many improvements since then, which will help in
> > > creating some sort of configuration application/interface for users to
> > > configure Speech Dispatcher, should they need to configure it at all.
> > > Using GSettings, a user can make a settings change, and it can be acted
> > > on immediately without a server or module restart. GSettings also solves
> > > the system/user configuration problem, in that the system-wide setting is
> > > used as the default until the user changes that setting. We could also
> > > extend the client API to allow clients to have more control over Speech
> > > Dispatcher settings that affect them, and have those settings applied on
> > > a client-by-client basis. I think we already have something like this
> > > now, but the client cannot change those settings via an API.
> > 
> > So, I think we can classify the config options into 3 categories.
> > 
> > * server config (socket to listen on, log file, etc.)
> > 
> >  I think if you want to change this sort of thing then you don't really
> >  care about a nice UI, and text files are fine.
> 
> Or maybe even command-line only, and have a reasonable set of defaults set at 
> build time.

yeah, I don't actually see why having more than command-line options
would be valuable.

> 
> Having said that, there may be use cases where an admin is deploying systems
> where tight control of logging content is required. With the right backend,
> GSettings values can be locked down such that users cannot change them;
> dconf certainly supports this. A text file stored only in a system location
> for these values also works, but GSettings additionally allows for vendor
> override files that can be put in place to set the defaults. A text file
> would likely mean an admin has to edit it every time the system or the
> Speech Dispatcher package is updated.

When speechd runs as a user process I don't see how it makes sense for a
system configuration to block logging to a file it has permission to
write to; after all, presumably the client could just log there itself,
ignoring the configuration.
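
For what it's worth, the "acted on immediately" part of the GSettings
idea boils down to something like the sketch below.  The schema id and
key are hypothetical; a real schema would have to be designed and
shipped with Speech Dispatcher.

/* Sketch: react to a settings change without a restart. */
#include <gio/gio.h>

static void
on_rate_changed (GSettings *settings, const gchar *key, gpointer user_data)
{
  gint rate = g_settings_get_int (settings, key);
  g_message ("default rate changed to %d, reconfiguring modules", rate);
}

int
main (void)
{
  /* "org.freedesktop.speech-dispatcher" and "default-rate" are made up. */
  GSettings *settings = g_settings_new ("org.freedesktop.speech-dispatcher");

  g_message ("default rate at startup: %d",
             g_settings_get_int (settings, "default-rate"));

  /* Fires whenever the user (or an admin tool) writes the key. */
  g_signal_connect (settings, "changed::default-rate",
                    G_CALLBACK (on_rate_changed), NULL);

  GMainLoop *loop = g_main_loop_new (NULL, FALSE);
  g_main_loop_run (loop);
  return 0;
}

System-wide defaults and dconf lockdown apply to the same keys, so the
locked-down deployment case wouldn't need extra code in speechd itself.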

> > * audio
> > 
> > I think this is somewhat the same as the previous, though maybe we need
> > to get better at automatically doing the right thing first.
> 
> True, but I also think the above can be applied to this as well.

Well, I don't think it makes much sense for an admin to block speechd
using an audio device, but allow the user to use it directly.

> > > * Separate compilation and distribution of modules
> > > 
> > > As much as many of us prefer open source synthesizers, there are
> > > instances where users would prefer to use proprietary synthesizers. We
> > > cannot hope to provide a driver for every synthesizer, so Speech
> > > Dispatcher needs an interface that allows synthesizer driver developers
> > > to write support for Speech Dispatcher, and build it, outside the Speech
> > > Dispatcher source tree.
> > 
> > How is this not possible today? I expect if you drop an executable in
> > /usr/lib/speech-dispatcher-modules/ and ask to use it, Speech Dispatcher
> > will use it, and the protocol is at least sort of documented.
> 
> Sure, but this would allow for scenarios like Debian being able to ship a
> module for Speech Dispatcher that works with Pico, given Pico is non-free
> according to Debian's guidelines. At the moment Debian users have to use a
> generic config file via sd_generic, or rebuild Speech Dispatcher themselves
> with Pico installed.

I don't understand why Debian can't ship a Pico module in a separate
package today; can you explain?

> > > * Moving audio drivers from the modules to the server
> > > 
> > > Another one that was not raised previously, but needs to be considered. I 
> > > thought about this after considering various use cases for Speech 
> > > Dispatcher and its clients, particularly Orca. This is one that is likely 
> > > going to benefit pulse users more than other audio driver users, but I am 
> > > sure people can think of other reasons.
> > > 
> > > At the moment, when using pulseaudio, Speech Dispatcher connects to 
> > > pulseaudio per synthesizer, and not per client. This means that if a user 
> > > has Orca configured to use different synthesizers for say the system and 
> > > hyperlink voices, then these synthesizers have individual connections to 
> > > PulseAudio. When viewing a list of currently connected PulseAudio 
> > > clients, you see names like sd_espeak, or sd_ibmtts, and not Orca, as you 
> > > would expect. Furthermore, if you adjust the volume of one of these pulse 
> > > clients, the change will only affect that particular speech synthesizer, 
> > > and not the entire audio output of Orca. What is more, multiple Speech 
> > > Dispatcher clients may be using that same synthesizer, so if volume is 
> > > changed at the PulseAudio level, then an unknown number of Speech 
> > > Dispatcher clients using that synthesizer are affected. In addition, if 
> > > the user wishes to send Orca output to another audio device, then they 
> > > have to change the output device for multiple Pulse clients, and as a
> > > result they may also be moving the output of another Speech Dispatcher
> > > client to a different audio device where they don't want it.
> > 
> > The first part of this seems like a shortcoming of using the pulse
> > volume control instead of the one in Orca, but anyway.
> 
> I'd argue that it has to do with the way audio output support is implemented
> on the Speech Dispatcher side. This has nothing to do with PulseAudio or any
> other audio output driver that is being used.
> > 
> > couldn't we accomplish the same thing, while moving a lot less data
> > around, by changing when we connect modules to the audio output?
> 
> We certainly could, but we would need to extend the audio framework so that
> we handle audio connections per client, and probably add a separate thread
> per client in the modules so that multiple lines of text can be synthesized
> simultaneously if required. This is probably a better way to go, as it is
> less disruptive.

I'm not really sure how synthesizing multiple things at once will go,
but I guess we can figure it out when someone tries to implement some of
this.
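
To make the threading side concrete, one way a module could queue work
per client is with a GThreadPool, roughly as below.  synthesize_message()
and the other names are placeholders, not anything that exists in the
modules today.

/* Sketch: per-client synthesis requests handled by a small thread pool,
 * so two clients' messages can be synthesized concurrently. */
#include <glib.h>

typedef struct {
  guint  client_id;
  gchar *text;
} SynthRequest;

/* Hypothetical stand-in for the actual synthesizer call. */
extern void synthesize_message (guint client_id, const gchar *text);

static GThreadPool *pool;

static void
synth_worker (gpointer data, gpointer user_data)
{
  SynthRequest *req = data;
  synthesize_message (req->client_id, req->text);
  g_free (req->text);
  g_free (req);
}

void
module_init_threads (void)
{
  /* Up to four concurrent syntheses; the pool keeps its threads alive. */
  pool = g_thread_pool_new (synth_worker, NULL, 4, TRUE, NULL);
}

void
module_queue_speak (guint client_id, const gchar *text)
{
  SynthRequest *req = g_new0 (SynthRequest, 1);
  req->client_id = client_id;
  req->text = g_strdup (text);
  g_thread_pool_push (pool, req, NULL);
}

Cancellation, pause and message ordering would still need real design
work on top of something like this.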

> > 
> > > Actually, the choice of what sound device to use per Speech Dispatcher 
> > > client can be applied to all audio output drivers. In other words, moving 
> > > management of output audio to the server would allow us to offer clients 
> > > the ability to choose the sound device that their audio is sent to.
> > 
> > I think allowing clients to choose where audio goes is valuable, and the
> > way to implement audio file retrieval, but it seems to me we can manage
> > the client -> audio output mapping in the server and just reconfigure
> > the module when the client it is synthesizing for changes.
> 
> Yep agreed.
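
A rough sketch of what that server-side bookkeeping might look like,
assuming a hypothetical per-client record and a hypothetical
module_set_audio_output() hook (neither exists today):

/* Sketch: map each client connection to its requested output device and
 * volume, and point the module's audio output at the right client before
 * it synthesizes, so PulseAudio sees one stream per client ("Orca")
 * rather than one per synthesizer (sd_espeak, sd_ibmtts, ...). */
#include <glib.h>

typedef struct {
  gchar  *client_name;    /* e.g. "Orca", as reported by the client */
  gchar  *audio_device;   /* e.g. a PulseAudio sink chosen by the client */
  gdouble volume;
} ClientAudio;

/* Hypothetical hook that retargets the module's audio output. */
extern void module_set_audio_output (const gchar *stream_name,
                                     const gchar *device, gdouble volume);

static GHashTable *clients;   /* client id (guint) -> ClientAudio* */

static void
client_audio_free (gpointer data)
{
  ClientAudio *ca = data;
  g_free (ca->client_name);
  g_free (ca->audio_device);
  g_free (ca);
}

void
clients_init (void)
{
  clients = g_hash_table_new_full (g_direct_hash, g_direct_equal,
                                   NULL, client_audio_free);
}

void
client_register (guint client_id, const gchar *name, const gchar *device)
{
  ClientAudio *ca = g_new0 (ClientAudio, 1);
  ca->client_name = g_strdup (name);
  ca->audio_device = g_strdup (device);
  ca->volume = 1.0;
  g_hash_table_insert (clients, GUINT_TO_POINTER (client_id), ca);
}

/* Called before a module synthesizes a message for a (possibly different)
 * client than the previous one. */
void
prepare_audio_for_client (guint client_id)
{
  ClientAudio *ca = g_hash_table_lookup (clients,
                                         GUINT_TO_POINTER (client_id));
  if (ca != NULL)
    module_set_audio_output (ca->client_name, ca->audio_device, ca->volume);
}

With something like that in place, moving or adjusting a client's stream
in PulseAudio would affect only that client, which is the behaviour the
original point above is after.
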
> > 
> > > Please feel free to respond with further discussion points about anything 
> > > I have raised here, or if you have another suggestion for roadmap 
> > > inclusion, I'd also love to hear it.
> > 
> > Well, I'm not actually sure if I think it's a good idea or not, but I
> > know Chris has wished we used C++, and at this point I may agree with
> > him.
> 
> If you can come up with good reasons why we should spend time rewriting
> Speech Dispatcher in another language, dealing again with problems that we
> have previously faced and solved, etc., all whilst delivering improvements
> in a reasonable amount of time, it would be worth considering. I was
> thinking the same myself for a while, but I am no longer convinced it is
> worth the time spent. Rather, I think it would be time wasted, particularly
> since Speech Dispatcher in its current form works reasonably well; it just
> needs some improving.

I'm not really sure how much work it would be; I guess I can try a build
with -Wc++-compat and see what happens.

Trev

> 
> Luke
> 
> _______________________________________________
> Speechd mailing list
> Speechd at lists.freebsoft.org
> http://lists.freebsoft.org/mailman/listinfo/speechd


