Re: [gnuspeech-contact] Early GNUSpeech observations

From:	David Hill
Subject:	Re: [gnuspeech-contact] Early GNUSpeech observations
Date:	Wed, 8 Apr 2009 15:37:09 -0700

Hi Jason,

On Apr 8, 2009, at 1:24 AM, Jason White wrote:

Having run the (very early) GNU/Linux version, I wish to congratulate the
authors of GNUSpeech for having advanced the porting effort this far.


Thank you.

I notice that the version which I built doesn't pronounce names and relatively uncommon words - perhaps it is restricted to pronouncing words that are in its
dictionary. I hear a "zzz" sound in place of each omitted word.

This sound was put in there deliberately by Steve Nygard to make sure it was clearly understood that the system was not dealing with parts of the input, because (as you guess) the parser (which does all kinds of things including dictionary derivatives, arranging numbers and dates to be spoken in the way people speak them, and so on) is by no means completely ported. It is probably the very next job because it makes a big difference to the overall quality of the spoken output.

There is a letter-to-sound component in there that should be functioning. I don't think that's what causes the funny "zzzzzt" noises, though they are not very good rules and are not normally used much because, with a 70,000 word dictionary of hand-crafted pronunciations and facilities for a lot of derivative words, the letter-to-sound rules are rarely called in the complete system. They are based on work by McIlroy at Bell Labs.
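
As a rough illustration of the lookup order described above (dictionary first, then derivative words, then letter-to-sound rules, with an audible placeholder when nothing applies), here is a minimal Python sketch. It is not taken from the Gnuspeech source: the function names, the toy dictionary, the suffix table and the phoneme strings are all hypothetical, and the real parser does considerably more.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy data; the dictionary discussed above has about 70,000 entries.
DICTIONARY: Dict[str, str] = {"speech": "s p ee ch", "port": "p aw t"}
# Illustrative suffix pronunciations only; real derivative handling is richer.
SUFFIXES: Dict[str, str] = {"es": "i z", "ed": "i d", "s": "s"}
PLACEHOLDER = "zzz"  # stands in for the audible marker for unhandled words


def lookup_derivative(word: str) -> Optional[str]:
    """Try to build a pronunciation from a dictionary stem plus a known suffix."""
    for suffix, suffix_pron in SUFFIXES.items():
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in DICTIONARY:
            return DICTIONARY[stem] + " " + suffix_pron
    return None


def pronounce(word: str,
              letter_to_sound: Optional[Callable[[str], str]] = None) -> str:
    """Dictionary first, then derivatives, then rules, else the placeholder."""
    word = word.lower()
    if word in DICTIONARY:
        return DICTIONARY[word]
    derived = lookup_derivative(word)
    if derived is not None:
        return derived
    if letter_to_sound is not None:
        return letter_to_sound(word)
    return PLACEHOLDER
```

With this toy data, pronounce("ports") is resolved through the derivative step, while a name such as "Calgary" falls through to the placeholder unless a letter_to_sound function is supplied.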

Have the letter to sound rules not been ported yet, or is it just a bug? I think it is important for any synthesizer to have good letter to sound rules,
since there will inevitably be words in the text that aren't in the
dictionary.

Also, the dictionary should be expanded -- a project that got put on hold when the NeXT & NeXT software disappeared. All sorts of proper names/nouns need to be added, including city and country names, people's names, and so on. It has been more important recently to get the basic software up on GNU/Linux and the Mac.

I also find the intonation pattern interesting, and quite different from the original samples, but I'm sure that improving it is on the list of tasks to be completed. It also seems to me that the tonal quality of the voice is better than that of the sound files that David generously supplied on his Web site,
but this might be entirely my imagination.

Again, the intonation rules, based on M.A.K. Halliday's intonation scheme for British English, were being refined. Craig [Taube-]Schock wrote his thesis on the topic under my supervision ("Intonation for Computer Speech Output" -- University of Calgary Dept. of Computer Science 1993) and received the Governor General's Gold Medal for it, but the method had already been greatly improved when we released the new articulatory synthesis software in 1994-5.

The "Lumberjack" and "The Chaos" speech samples on my university web site under "Gnuspeech material" are the untouched results of putting punctuated text into the original NeXTStep version of Gnuspeech (known then as the Trillium TextToSpeech kit). The "Pat-a-pan" sample was a Christmas teaser composed by our PhD musician Leonard Manzara for Christmas 1994. There are no instruments in the piece, which is a simulation of singing an old Burgundian carol in four parts, with 16 voices, and set in an auditorium 30 feet square, with reverberations supplied by Leonard's acoustic imaging software (part of his PhD work). I attach a short write-up on that piece for convenience.


Hope this helps.


Thank you, again, for the excellent work so far.


Your encouragement is much appreciated.

Warm regards.

david

---------

Pat-a-pan (only the first verse of this old Burgundian carol is synthesised)

Note that there is no instrumental accompaniment in this synthesis, only voice harmony.

[The sound files are on my university web site: http://pages.cpsc.ucalgary.ca/~hill under "Gnuspeech material"]


God and man this day are one,
Even more than fife and drum;
So these instruments we play,
Tu-re-lu-re-lu, pat-a-pat-a-pan,
So these instruments we play
For a joyful Christmas day!

This synthesis was produced as a pre-Christmas teaser for advertising purposes for Trillium Sound Research Inc in 1994. There are 16 unaccompanied male voices in four parts, arranged by Leonard Manzara, and located in a virtual hall 20 metres by 30 metres using acoustic imaging software developed by Leonard for the technical part of his doctoral thesis in music from SUNY at Buffalo (Manzara 1990). Because it is a carol, the rhythm and intonation for the four parts are musically determined and not composed by the rhythm and intonation rules used for the other examples. Some variation was introduced between voices singing the same parts. Only the sopranos sing the lyrics above; the other parts sing “pat-a-pan” in various ways. The composition was completed before the system was finalised, so there are some deficiencies, notably in the balance between voiced and unvoiced sound. The sixteen different voices and acoustic imaging required significant effort, which has not been repeated since the system achieved release status.

References

Manzara LC (1990) The simulation of acoustical space by means of physical modeling. PhD Dissertation, Faculty of the Graduate School of the State University of New York at Buffalo
