speechd-discuss

Problems with "end" callback in Python


From: Lukas Loehrer
Subject: Problems with "end" callback in Python
Date: Sun, 20 Apr 2008 19:18:31 +0200

Hi James,

I attached an example that shows how to use SSML mark elements with
speech-dispatcher and the espeak output module. Unfortunately, the
presence of marks influences the way the speech sounds. This seems to
be a speech-dispatcher issue, because when using espeak directly, the
marks do not influence the speech. Also, I had to turn on the
SSML_MODE setting manually.
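
The attached script is not preserved in the archive, so here is a minimal sketch of the text-preparation step only, not the original attachment. It wraps plain text in a <speak> element and puts a <mark/> in front of every whitespace-separated word. The function name add_word_marks and the mark names w0, w1, ... are hypothetical, and the word splitting is a naive guess at word boundaries.

```python
import re
from xml.sax.saxutils import escape


def add_word_marks(text):
    """Return (ssml, offsets) for a plain-text string.

    ssml    -- the text wrapped in <speak>, with a <mark/> before each word
    offsets -- dict mapping each mark name to the (start, end) offsets of
               that word in the original text

    Assumption: a "word" is any run of non-whitespace characters.
    """
    parts = []
    offsets = {}
    pos = 0
    for i, match in enumerate(re.finditer(r"\S+", text)):
        name = "w%d" % i  # hypothetical naming scheme for the marks
        # Copy the whitespace between words unchanged.
        parts.append(escape(text[pos:match.start()]))
        parts.append('<mark name="%s"/>' % name)
        # Escape &, < and > so the result stays well-formed XML.
        parts.append(escape(match.group()))
        offsets[name] = (match.start(), match.end())
        pos = match.end()
    # Keep any trailing whitespace after the last word.
    parts.append(escape(text[pos:]))
    return "<speak>" + "".join(parts) + "</speak>", offsets


if __name__ == "__main__":
    ssml, offsets = add_word_marks("Hello big world")
    print(ssml)
    print(offsets)
```

The offsets dictionary lets a client translate a reported mark name back into a position in the original text. How the SSML string is sent and how SSML mode is switched on depends on the speech-dispatcher version in use, so that part is left out here.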

Actually, espeak itself supports reporting of word boundaries, but I do
not know if there are Python bindings for espeak. The NVDA
screen reader for Windows uses espeak from Python, so it might be
possible to benefit from their work when creating Python bindings for
espeak.

Best regards, Lukas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: word_index.py
Type: application/octet-stream
Size: 1149 bytes
Desc: not available
Url : http://lists.freebsoft.org/pipermail/speechd/attachments/20080420/588d9000/attachment.obj
 
-------------- next part --------------


James Simmons writes ("Re: Problems with "end" callback in Python"):
> Hynek,
> 
> I looked at the specification for SSML and I'm puzzled.  Would it be 
> enough to put <mark> tags around each word, or would I need to make a 
> properly formed SSML document?  Also, I'm not seeing any reference to 
> SSML in the page that explains how to use speech-dispatcher with 
> Python.  A simple Python example would help lots.
> 
> The new implementation of SD sounds good.
> 
> James Simmons
> 
> 
> Hynek Hanke wrote:
> 
> > James Simmons wrote:
> >
> >> Unfortunately the words have long pauses between them.  It sounds 
> >> like the voice of Colossus having a really bad day.
> >
> > Yes, because this way, each time you send something to the 
> > synthesizer, you must wait until the synthesizer has synthesized it. If 
> > you send a longer text, however, the synthesizer can work ahead of 
> > the audio output, so it is much faster. There is also much 
> > less network overhead. And as you correctly pointed out, the 
> > resulting sound will have much better quality because it will contain 
> > intonation etc.
> >
> > So a better solution would be to insert index marks (see SSML 
> > specifications) into the text after each word and send it to Speech 
> > Dispatcher.
> >
> > Not even this is optimal because you have to guess where the word 
> > boundaries are. A new implementation of Speech Dispatcher (which will 
> > however still take some time to finish) has a full solution to the 
> > problem. You only send the original text and the synthesizer itself 
> > will find word and sentence boundaries and will notify the client 
> > application when they are reached (together with the exact position in 
> > the original text).
> >
> > The proposed insertion of custom SSML index marks into your text 
> > should, however, be a fairly good solution in your case. I hope you 
> > don't run into any further threading problems with pygtk. It is quite a 
> > problem that it is not thread safe.
> >
> > With regards,
> > Hynek
> 
> 
> 
> 

