
WIP audio in server


From: Tim Cross
Subject: WIP audio in server
Date: Sat, 13 Feb 2016 19:40:20 +1100

Just a couple of comments from a lurker, if that is OK?

I was a little concerned when I first read the suggestion that audio be
passed back to the client for the client to do something with. While I
think this could be a very useful option for some situations, having played
with writing a few speechd clients, I'm very glad I don't have to worry
about how to actually use/play the audio. This is definitely something
which I want speech-dispatcher to handle. Not only that: as a client, I
don't want to care about whether the user prefers pulse, alsa or whatever.
I want to just assume they have a working speech-dispatcher, which includes
having it configured to send the generated audio to whatever the default
device is.
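
For what it's worth, the appeal from the client side is that the whole
interaction can stay about this small. A minimal sketch using the libspeechd
C API (the header path and the client/connection names are just what I use
here, and error handling is pared down):

    #include <stdio.h>
    #include <speech-dispatcher/libspeechd.h>

    int main(void)
    {
        /* Open a connection to speech-dispatcher; no audio setup anywhere. */
        SPDConnection *conn = spd_open("demo", "main", NULL, SPD_MODE_SINGLE);
        if (conn == NULL) {
            fprintf(stderr, "could not connect to speech-dispatcher\n");
            return 1;
        }

        /* The server and its output module decide how the audio is played. */
        spd_say(conn, SPD_TEXT, "Hello from a client that never touches audio.");

        spd_close(conn);
        return 0;
    }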

With respect to the mixing of metadata and audio in one stream - separate
sockets or pipes may be a solution, but the challenge could then be
synchronizing and ensuring the right metadata and audio data are linked.
Having mixed content may not be an issue if you have a well enough defined
protocol which supports mixed content. Provided your data can be parsed
easily and you can avoid situations where either it is too easy to get out
of sync or where lost/missed data can result in a client ending up in a
"where am I, what is this?" situation, mixed content doesn't have to be a
big issue. However, if we are talking about returning audio as an optional
data stream (for those clients which want to manage playing it directly
rather than relying on speechd to process it), a separate stream would
probably make sense, especially if there may be contexts in which large
amounts of audio data are returned but you want things to continue at a
metadata level.
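
To make the "well enough defined protocol" point a bit more concrete, here is
a rough sketch (entirely hypothetical, not anything the current speechd
protocol does) of length-prefixed frames with a type tag, which is usually
enough to keep a mixed metadata/audio stream from getting out of sync:

    #include <stdint.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Hypothetical frame types for a mixed metadata/audio stream. */
    enum frame_type { FRAME_METADATA = 1, FRAME_AUDIO = 2 };

    /* Read exactly 'len' bytes or fail; short reads must never desync us. */
    static int read_full(int fd, void *buf, size_t len)
    {
        uint8_t *p = buf;
        while (len > 0) {
            ssize_t n = read(fd, p, len);
            if (n <= 0)
                return -1;
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    /* Read one frame: 1 type byte, 4 length bytes (big endian), then payload.
     * Because every frame carries its own length, an unknown or unwanted
     * frame can simply be skipped instead of leaving the reader guessing. */
    static int read_frame(int fd, uint8_t *type, uint8_t **payload, uint32_t *length)
    {
        uint8_t hdr[5];
        if (read_full(fd, hdr, sizeof hdr) < 0)
            return -1;
        *type = hdr[0];
        *length = ((uint32_t)hdr[1] << 24) | ((uint32_t)hdr[2] << 16) |
                  ((uint32_t)hdr[3] << 8)  |  (uint32_t)hdr[4];
        *payload = malloc(*length);
        if (*payload == NULL)
            return -1;
        if (read_full(fd, *payload, *length) < 0) {
            free(*payload);
            return -1;
        }
        return 0;
    }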

The sockets v pipes question is an interesting one and I guess depends on
what contexts we want speechd to support. Something I have done in the past
which has been quite useful (though I will admit I have not needed it in
recent years) was the ability to run a client on one system and have it
communicate with a speech server on another system. In this context, only
sockets will suffice. However, I don't know how important/common such a
requirement is and whether it even needs to be considered. From a security
perspective, pipes are probably a better choice.
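
On the remote case: from the client's point of view a unix domain socket and
a TCP socket look almost identical, so supporting both is largely a question
of where the server listens. A plain POSIX sketch (the path and port here are
invented and have nothing to do with speechd's actual defaults):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    /* Connect to a local server over a unix domain socket. */
    static int connect_unix(const char *path)
    {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, path, sizeof addr.sun_path - 1);
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }

    /* Connect to a server on another machine over TCP. */
    static int connect_inet(const char *ip, uint16_t port)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(port) };
        if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
            connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }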

Apologies if this is just off the mark or I have completely got hold of the
wrong end of the stick. I'm very new to using speech-dispatcher or writing a
client which uses it and have really just been lurking on the list.
However, I figure if I'm off topic or have missed the point, it isn't too
hard to just put my email in the rubbish, and I'd rather that than not
mention something which is actually useful.

tim

On 13 February 2016 at 05:03, Jeremy Whiting <jpwhiting at kde.org> wrote:

> Luke,
>
> On Thu, Feb 11, 2016 at 11:16 PM, Luke Yelavich
> <luke.yelavich at canonical.com> wrote:
> > On Fri, Feb 12, 2016 at 02:45:55PM AEDT, Jeremy Whiting wrote:
> >> Hi Andrei,
> >>
> >> On Thu, Feb 11, 2016 at 1:51 PM, Andrei Kholodnyi
> >> <andrei.kholodnyi at gmail.com> wrote:
> >> > Hi Jeremy,
> >> >
> >> > I'm glad to see that we have common understanding on this topic.
> >> > server shall handle client connections, client shall handle data.
> >> >
> >> > Currently it is not like this, and I think we need to put effort
> into fixing it.
> >> > I really like your idea to get audio back from the modules, but it
> shall go
> >> > directly to the client.
> >>
> >> Yeah, sending the audio data back to each client makes sense.
> >> Especially as most libspeechd users likely have some sound output
> >> mechanism of their own. Recently a client evaluated speech-dispatcher
> >> and decided to write their own library that does most/some of what it
> >> does but gives them the audio back rather than playing it itself.
> >> There were other reasons they decided to write their own rather than
> >> use speech-dispatcher (proprietary speech synthesizer, etc.) but
> >> that's one of the reasons.
> >
> > Ok, so what about clients like Orca? Orca is getting support for playing
> audio for progress bar beeps, but that uses gstreamer, and likely is being
> developed such that latency is not a concern. I am pretty sure that it
> doesn't make sense for Orca to manage the audio for its speech.
>
> I don't think we should stop playing audio in speech-dispatcher
> completely. I do think it would be handy to send the audio to the
> client for those clients that already use pulse/alsa/oss/whatever and
> want to handle it themselves though. Short term I think having the
> server play all audio from all output modules is a good first step.
> Next would be making the server only start output modules as they are
> requested by the clients. Then after that we could also have some api
> to get the audio back in libspeechd c (and python) api for clients
> that want to handle it themselves.
>
> >> > Also I'm not sure we need to mix metadata and audio in one stream.
> >>
> >> Yeah, I don't like it mixing them either, but wasn't sure how to
> >> separate them. I guess we could have two sockets, one for metadata and
> >> the other for the raw data or something. Do you have something in
> >> mind?
> >
> > Yeah, sounds reasonable.
>
> Ok, I'll take a stab at that next then.
> >
> > Luke
>
> _______________________________________________
> Speechd mailing list
> Speechd at lists.freebsoft.org
> http://lists.freebsoft.org/mailman/listinfo/speechd
>



-- 
regards,

Tim Cross

