
Audio server(s) support in Speech Dispatcher (Re: No output!)


From: Ricky Buchanan
Subject: Audio server(s) support in Speech Dispatcher (Re: No output!)
Date: Mon Sep 4 09:59:45 2006

[NOTE:  From Hynek!  Forwarded by Ricky because he sent it to the wrong
address.  Can we all point and laugh now?  *wink* ]

----- Begin forwarded message -----

On Wed, Dec 24, 2003 at 09:22:06AM +1100, Ricky Buchanan wrote:
> Hynek Hanke wrote:
> > now I understand the problem. It seems everything works correctly
> > except the resulting wave isn't played because /dev/dsp isn't
> > accessible as it's blocked by esd.

I spent some time looking for a decent sound system and
sound library for GNU/Linux. The situation as it is today
is quite sad. On the other hand, Speech Dispatcher can be made
to work in an environment where multiple sound sources access
the sound card simultaneously, without any modifications to the
code. Here is a more detailed description:

1) OSS (Open Sound System)

Currently Speech Dispatcher uses OSS output through a Flite
sound library that writes directly to /dev/dsp. Any other applications
with sound output are blocked, because OSS allows only one client
to access the sound card at a time. So it's better to make OSS
accessible only to one of the sound daemons listed below.
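
A quick way to check whether something (esd, for example) is already
holding the device could look like the sketch below. The function name
dsp_status is made up for this illustration, and it assumes that fuser
(from the psmisc package) is installed. Unless it is run as root, fuser
may not see processes owned by other users, so "free" is only a hint.

```shell
# Hypothetical helper: report the state of an OSS device node.
# Prints one of: missing, unknown, busy, free.
dsp_status() {
    dev="${1:-/dev/dsp}"
    if [ ! -e "$dev" ]; then
        # No such device node on this system.
        echo "missing"
    elif ! command -v fuser >/dev/null 2>&1; then
        # Assumption: fuser is our only way to look; without it we can't tell.
        echo "unknown"
    elif fuser "$dev" >/dev/null 2>&1; then
        # fuser exits with 0 when at least one process has the file open.
        echo "busy"
    else
        echo "free"
    fi
}
```

For example, `dsp_status` checks /dev/dsp and `dsp_status /dev/dsp1`
checks a second card.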

2) ALSA

I don't know anything about it. But there are people on this mailing
list who do. Please could someone post more information about it?
Does it support synchronization and immediate stopping? Does it
have some decent interface library?

2) ESD (Enlightened Sound Daemon)

ESD is a daemon that tries to solve this issue by doing for sound
something similar to what Speech Dispatcher does for software synthesis.
It reads the streams from various clients, mixes them and sends them
to /dev/dsp (which esd itself keeps blocked). Old-style applications
that access /dev/dsp directly don't work while ESD is running.

However, there is a program called esddsp that links
old-style programs with libesddsp at runtime and replaces
all calls to /dev/dsp with calls to ESD. This way, the application
thinks it accesses /dev/dsp, but in reality it's talking to ESD
and everything works just fine. Speech Dispatcher can work with
this (tested), but with severe limitations...
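
To illustrate the runtime-linking trick: as far as I know, wrappers of
this kind work by putting a shim library into LD_PRELOAD before starting
the program, so that the shim's versions of the file calls are found
first. The sketch below shows only that general mechanism. The function
name run_with_preload is made up, and the library name in the usage
line is an assumption, not something taken from the ESD sources.

```shell
# Hypothetical helper, not part of ESD or NAS: run a command with an
# extra shared library preloaded by the dynamic linker.
run_with_preload() {
    lib="$1"
    shift
    # Prepend our library, keeping anything already listed in LD_PRELOAD.
    # The assignment applies only to the command being started.
    LD_PRELOAD="${lib}${LD_PRELOAD:+ $LD_PRELOAD}" "$@"
}
```

Usage would be something like `run_with_preload libesddsp.so.0 speechd -s`
(library name assumed). If the library can't be found, the dynamic
linker normally just prints a warning and the program runs unwrapped.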

Unfortunately, if I understand things correctly (which is hard to
say, because there is virtually *no* documentation for ESD), ESD
in its current state can't fully work with Speech Dispatcher because
it doesn't provide any means of synchronization and the stop command
is *very* inaccurate (off by several seconds, while in Speech Dispatcher
we care about *fractions* of a second in many situations).

In general, ESD doesn't look like a very good choice to me. I don't
really understand why it's so widely used.

3) NAS (Network Audio System)

It does a similar thing to ESD, but NAS is much more powerful.
It supports synchronization and real-time stopping. It has a nice
C library for communication with it and easy support for volume
setting.

Old-style applications that open /dev/dsp directly can be made
to work with it through audiooss (a separate package) that does
exactly what esddsp does for esd -- it maps calls to /dev/dsp
to communication with NAS. Speech Dispatcher also works with NAS
through this method and I didn't notice any problems while testing
it.

You have to install NAS and audiooss, start NAS, then execute
        audiooss speechd -s
or
        audiooss speechd
or you can modify your init.d/speech-dispatcher script to do it for you.
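
For the init.d route, a fragment along these lines might do. It is only
a sketch: the function names pick_audio_wrapper and start_speechd are
made up, and preferring audiooss over esddsp simply reflects my
preference for NAS described above. It does not check that the NAS or
ESD daemon itself is actually running.

```shell
# Hypothetical init-script fragment: pick an OSS redirection wrapper.
# Prints "audiooss", "esddsp", or nothing if neither is installed.
pick_audio_wrapper() {
    if command -v audiooss >/dev/null 2>&1; then
        echo audiooss
    elif command -v esddsp >/dev/null 2>&1; then
        echo esddsp
    fi
}

# Start speechd through the chosen wrapper, passing any options along.
start_speechd() {
    wrapper=$(pick_audio_wrapper)
    # $wrapper is left unquoted on purpose: when it is empty it
    # disappears and speechd is started directly.
    $wrapper speechd "$@"
}
```

Calling `start_speechd -s` then behaves like `audiooss speechd -s` when
audiooss is installed.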

In XMMS, ESD support is included by default, while NAS must be installed
additionally. I don't understand this either, but NAS is much less used.

--

4) libao interface library

libao is an interface library designed to talk to various sound systems,
but its capabilities are highly limited. No synchronization, no
stopping. Without writing some middleware code, this is useless for
real-time sound output.

5) libaio interface library

This is a new project, in its early stage of development, that aims to
provide an interface to the different sound systems and the
functions necessary for real-time sound output, like synchronization.

Unfortunately, it currently only has a driver for ALSA, so it doesn't
solve our problem very much. But the development seems to be active,
so maybe something will come out of it.

Summary
=======

Ricky, would you consider NAS for your work? I think
it's a better alternative (there are also a lot of complaints
about ESD on the internet). If not, you might want to try running
speechd as
        esddsp speechd -s
and see if its limited functionality is enough for you. I don't
think we could achieve anything better with ESD :(

As for replacing the Flite sound library in Speech Dispatcher
with something better, I think the NAS library looks good,
but it would require users to install yet another application.
If this is not what we want, I'd recommend waiting
for something suitable (liba(i)o currently is not).

Looking forward to your comments!

Regards,
Hynek Hanke

----- End forwarded message -----

