gnuspeech-contact

Re: [gnuspeech-contact] Quickstart for the latest Gnuspeech?


From: Marcelo Y. Matuda
Subject: Re: [gnuspeech-contact] Quickstart for the latest Gnuspeech?
Date: Sun, 8 Nov 2015 00:31:56 -0200

Hi,

On 11/01/2015 09:53 PM, Advrk Aplmrkt wrote:
Thanks Marcelo for the explanation. So *that's* why Siri sounds so good!

I can see how articulatory synthesis, when fully developed, can be
more powerful because you don't need to pre-record everything!

And the user can (more or less) easily change the voices. Articulatory synthesis will (hopefully) allow users to change accent / intonation / emotion. Another application is singing synthesis (see Pavarobotti, http://www.cs.princeton.edu/~prc/SingingSynth.html and VocalTractLab).

Gnuspeech already allows changing the voices and testing custom intonation curves.

Articulatory synthesis can also be used to study the phonatory system and to simulate speech problems.

Also, as a non-programmer and complete non-expert on the subject, how
can a user support and expedite development of Gnuspeech?

Users can tell other people about the advantages of Gnuspeech (while not hiding its disadvantages). For example, with Gnuspeech you can easily change the voices (vocal tract length, breathiness, etc.), and Gnuspeech is still the only _articulatory_ text-to-speech system (it converts English text to speech).

Finally, other than Gnuspeech, is there other Free Software
text-to-speech software that can produce equal or better quality
synthesis? Thanks!!

The perceived quality depends on the listener. I know of these:

eSpeak
Festival
Flite (Festival lite)
MaryTTS
RHVoice

Regards,
Marcelo


On 01/11/2015, Marcelo Y. Matuda <address@hidden> wrote:
Hi,

On 11/01/2015 03:45 PM, Advrk Aplmrkt wrote:
Thanks for the links, and I agree a proper man page or quickstart
guide would be super useful for end users! (and not just speech
synthesis researchers)

I checked out the YouTube videos, and I confess it was hard for me to
understand what Gnuspeech was saying... Is there a reason why it
doesn't sound nearly as natural as, say, Siri yet???

Siri uses a method called Unit Selection (AFAIK), which joins segments
of recorded speech. That is why the quality can be so good.
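
To make the idea of joining recorded segments concrete, here is a toy sketch of the selection step. It is not Siri's or any real system's code: the inventory, the unit ids and the pitch values are made up, and the only cost used is the pitch jump at each join (real systems also score how well each unit matches the target).

```python
# Toy unit selection: pick one recorded unit per target phone so that the
# pitch jumps at the joins are as small as possible (Viterbi-style search).
# The inventory and the cost function are hypothetical, for illustration only.

INVENTORY = {
    "h": [{"id": "h_1", "start_f0": 120.0, "end_f0": 120.0},
          {"id": "h_2", "start_f0": 180.0, "end_f0": 180.0}],
    "e": [{"id": "e_1", "start_f0": 125.0, "end_f0": 130.0},
          {"id": "e_2", "start_f0": 200.0, "end_f0": 210.0}],
    "l": [{"id": "l_1", "start_f0": 128.0, "end_f0": 126.0},
          {"id": "l_2", "start_f0": 90.0, "end_f0": 95.0}],
}


def select_units(targets, inventory):
    """Return the lowest-cost list of unit ids, one per target phone."""
    # best[unit id] = (total join cost, path of unit ids ending in that unit)
    prev_units = inventory[targets[0]]
    best = {u["id"]: (0.0, [u["id"]]) for u in prev_units}
    for phone in targets[1:]:
        new_best = {}
        for u in inventory[phone]:
            # Join cost: pitch mismatch between the end of the previous
            # unit and the start of this one.
            cost, path = min(
                (best[p["id"]][0] + abs(p["end_f0"] - u["start_f0"]),
                 best[p["id"]][1])
                for p in prev_units
            )
            new_best[u["id"]] = (cost, path + [u["id"]])
        best, prev_units = new_best, inventory[phone]
    return min(best.values())[1]


print(select_units(["h", "e", "l"], INVENTORY))
```

The selected recordings would then be concatenated, which is why the output can sound as natural as the original recordings.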

Gnuspeech uses articulatory synthesis, which uses a mathematical model
of the human vocal tract to synthesize the speech from scratch. It is
very difficult to adjust the many parameters. Also, GnuspeechSA is a C++
port of the original TTS_Server (for NeXTSTEP), developed a long time
ago. It doesn't yet incorporate the research done in the years since.
Hopefully articulatory synthesis will reach the quality of unit
selection, but there is much work to do.
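
As a rough illustration of what "a mathematical model of the vocal tract" can mean, tube-based models commonly treat the tract as a chain of short cylindrical sections and compute how much of a pressure wave is reflected at each junction from the cross-sectional areas. The sketch below is not taken from the Gnuspeech source; the function name, the area values and the sign convention (pressure waves) are assumptions.

```python
# Hypothetical sketch: reflection coefficients for a vocal tract modeled as
# a chain of cylindrical tube sections. Not Gnuspeech code.

def reflection_coefficients(areas):
    """Pressure-wave reflection coefficient at each junction.

    areas: cross-sectional areas of adjacent sections, glottis to lips.
    Convention assumed here: r = (A_k - A_next) / (A_k + A_next).
    """
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


# Made-up area function (cm^2) with one widening and one uniform stretch.
print(reflection_coefficients([1.0, 3.0, 3.0]))
```

Changing the areas over time is what corresponds to moving the articulators, and each extra section adds parameters that have to be tuned.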

Regards,
Marcelo
