speechd-discuss

Design suggestion: The server just for synthesis


From: Andrei Kholodnyi
Subject: Design suggestion: The server just for synthesis
Date: Sat, 13 Nov 2010 16:11:18 +0000

Hi William,

> I'm not sure how this would be done since the way we process this is
> synthesizer specific. What I mean is, espeak, for example, does all of
> the TTS and audio generating and allows us to play the audio. But,
> festival and cicero on the other hand do not give us any access to the
> tts processing or audio data. I'm not sure whether Ivona gives us
> access to the audio or not; honestly I haven't looked at that code in a
> while.

What I meant here is that instead of calling spd_audio_play inside each synth
module, we should return the "AudioTrack track" data via a callback to the
logic outside the module, and then, based on the client settings, deliver the
data either to the audio output or to the client.
Of course, e.g. the generic module and HW synths will only support audio
playback, but this can be discovered via capabilities.
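Roughly what I have in mind (just a sketch; apart from AudioTrack and
spd_audio_play everything below is a made-up name, and the AudioTrack fields
are only approximate):

    /* The module no longer plays audio itself; it hands the synthesized
     * track to a callback installed by the server. */
    typedef struct {
        int bits;
        int num_channels;
        int sample_rate;
        int num_samples;
        signed short *samples;
    } AudioTrack;

    typedef void (*ModuleAudioCallback)(const AudioTrack *track, void *client_ctx);

    typedef enum { DELIVER_TO_AUDIO, DELIVER_TO_CLIENT } DeliveryMode;

    typedef struct {
        DeliveryMode mode;   /* taken from the client's settings */
        int client_fd;       /* connection to the client, if it wants the raw audio */
    } ClientSettings;

    /* Hypothetical helpers: local playback would end up in spd_audio_play(),
     * client delivery would stream the samples over the client connection. */
    extern void play_track_locally(const AudioTrack *track);
    extern void send_track_to_client(int fd, const AudioTrack *track);

    /* Server-side callback that the module calls instead of spd_audio_play(). */
    static void deliver_audio(const AudioTrack *track, void *client_ctx)
    {
        ClientSettings *cs = client_ctx;

        if (cs->mode == DELIVER_TO_AUDIO)
            play_track_locally(track);
        else
            send_track_to_client(cs->client_fd, track);
    }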

> > besides that at the moment one instance of particular TTS engine is
> > used per multiple clients
> > which makes impossible to produce a separate audio stream per client.

> Being able to switch synthesizers on the fly is definitely something we
> should look into.

It is not only that. At the moment one module, e.g. espeak, mixes speech from
different clients into one stream and sends it to audio playback.
The proper way would be to create one instance of espeak per client and let
them produce speech streams independently of each other;
the audio server would then mix those streams.
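For illustration, a per-client instance table could look roughly like this
(all names are made up, this is not real Speech Dispatcher code):

    #include <stddef.h>
    #include <sys/types.h>

    #define MAX_CLIENTS 64

    typedef struct {
        int   client_id;    /* Speech Dispatcher client connection id */
        pid_t module_pid;   /* dedicated module process for this client, e.g. sd_espeak */
        int   module_fd;    /* pipe/socket to that instance */
    } ClientModule;

    static ClientModule modules[MAX_CLIENTS];   /* zeroed => slot unused */

    /* Hypothetical helper that forks/execs a fresh module process; each
     * instance opens its own playback stream, so the sound server does the
     * mixing instead of the module interleaving several clients itself. */
    extern ClientModule spawn_module_instance(const char *module_name, int client_id);

    ClientModule *get_module_for_client(int client_id, const char *module_name)
    {
        int i, free_slot = -1;

        for (i = 0; i < MAX_CLIENTS; i++) {
            if (modules[i].module_pid != 0 && modules[i].client_id == client_id)
                return &modules[i];              /* reuse this client's instance */
            if (modules[i].module_pid == 0 && free_slot < 0)
                free_slot = i;
        }
        if (free_slot < 0)
            return NULL;                         /* no free slot */

        modules[free_slot] = spawn_module_instance(module_name, client_id);
        return &modules[free_slot];
    }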

> > And finally there was a TTS API developed some time ago and we could
> > try to use it instead of libspeechd API
> > and also between server and modules.
> I'm not sure I follow what you mean here. That code is written in
> python, so are we talking about rewriting everything in python?

No, we are talking about compliance with
http://cvs.freebsoft.org/doc/tts-api/tts-api.html
AFAIU, Brailcom's goal was/is to establish this as a standard API.

I.e. what could be done is to replace the libspeechd API with the TTS API, see
http://cvs.freebsoft.org/doc/tts-api/tts-api.html#Index-of-Functions

Personally I doubt it is a good approach, since it forces applications to deal
with individual drivers and their capabilities, whereas I expect that to be
done by the TTS Provider.
This is what I recently explained to Hynek during our discussion about
system-wide versus module-wide voice discovery.
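To make the difference concrete, compare the two shapes of the interface
(none of these functions exist, they only show where the capability handling
would sit):

    /* (a) Driver-level API: the application picks a driver and has to cope
     *     with that driver's capabilities itself. */
    int driver_list_voices(const char *driver_name, char ***voices_out);
    int driver_say(const char *driver_name, const char *voice, const char *text);

    /* (b) Provider-level API: the application only states what it wants; the
     *     TTS Provider matches this against the installed drivers'
     *     capabilities (system-wide voice discovery) and chooses the driver
     *     itself. */
    int provider_say(const char *language, const char *preferred_voice,
                     const char *text);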

Andrei.