
Re: [gnuspeech-contact] source code organisation


From: David Hill
Subject: Re: [gnuspeech-contact] source code organisation
Date: Fri, 28 Apr 2006 18:06:16 -0700

Hi Eric,

I was away yesterday, as usual.

Several of your questions I shall only be able to answer properly and fully when I've finished the Synthesizer port and moved on to the extraction & implementation of "real-time-monet".

On Apr 27, 2006, at 2:25 AM, Eric Zoerner wrote:

Where in the source for gnuspeech is the TTS logic, including where it selects an intonation contour based on the punctuation?

You can probably "read" the source code even better than I can. Don't forget that the input text is parsed and converted to Monet input format. At that point the tone-group and foot boundaries are added, the tone groups are selected, and the tonic is chosen; this information is used to set the input format elements. It is done on the basis of dictionary look-up, to get the stresses, which provide the foot boundaries, and the punctuation, which selects the tone-group boundaries and tone groups. If there is no other information (e.g. bad punctuation) then defaults are used: the tone group spans the whole sentence, the tonic is the last foot of the sentence, and the default tone group is tone group 1. The short answer to your question is "during the parsing of the input to Monet". The conversion to parameters is then carried out by methods within Monet, but there's no provision for changing the intonation/rhythm rules and data (unlike everything else), though they can be varied manually in particular syntheses.
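
As a rough illustration of the mark-up step described above, here is a minimal Python sketch. It is not taken from the gnuspeech source: the names STRESSED, TONE_GROUP_FOR and mark_up are hypothetical, the tiny stress table stands in for the real dictionary look-up, and the punctuation-to-tone-group numbers are placeholders. Only the defaults (one tone group for the whole sentence, tonic on the last foot, tone group 1) come from the description above.

```python
import re

# Hypothetical stand-in for the dictionary look-up: words treated as stressed.
STRESSED = {"where", "source", "code", "kept", "monet", "parses", "input", "text"}

# Placeholder mapping from punctuation to tone group (assumed for illustration).
# Only the default, tone group 1, is stated in the message.
TONE_GROUP_FOR = {".": 1, "?": 2, ",": 3}
DEFAULT_TONE_GROUP = 1


def mark_up(text):
    """Split text into tone groups at punctuation and into feet at stressed words."""
    groups = []
    # Each match is a run of non-punctuation text plus an optional closing mark.
    for body, mark in re.findall(r"([^.,?!;:]+)([.,?!;:]?)", text):
        words = body.split()
        if not words:
            continue
        feet = []
        for word in words:
            # A stressed word opens a new foot. As a simplification, the first
            # word of a tone group also opens one even if it is unstressed.
            if word.lower() in STRESSED or not feet:
                feet.append([word])
            else:
                feet[-1].append(word)
        groups.append({
            # Unknown or missing punctuation falls back to the default.
            "tone_group": TONE_GROUP_FOR.get(mark, DEFAULT_TONE_GROUP),
            "feet": feet,
            # Default tonic: the last foot of the tone group.
            "tonic": len(feet) - 1,
        })
    return groups


if __name__ == "__main__":
    for group in mark_up("Where is the source code kept?"):
        print(group)
```

With no punctuation at all, the whole input becomes a single tone group with the default tone group and the tonic on its last foot, which mirrors the fallback behaviour described above.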

Am I correct in saying that there are no tools that directly access the TTS functionality?

I am not sure what this question really means. Accessing the functionality of the TTS system is pretty well what Monet is all about. The bit that is missing is Synthesizer, which allows the tube itself to be manipulated directly. But Monet allows all the other elements of the TTS system to be created/deleted/edited (well, the deletion for some things like rules is pretty kludgy, since they are only renamed to a dead name). Even the intonation can be varied by using the intonation window. Perhaps you can be more explicit.

Is this part of the real-time Monet subsystem, and where is the source for that as well?

The original real-time monet is under "trillium/ObjectiveC/Monet.realtime" in the archive:

        http://cvs.savannah.gnu.org/viewcvs/gnuspeech/?root=gnuspeech

Is there information somewhere that describes the organisation of the source code in the project?

Unfortunately, no, other than the self-documenting properties of the development environment and language -- unless Craig has kept whatever notes he made when he was writing the system.


At this point I have only checked out the "current" repository, which is just the OS X port, correct?

Correct


What is the complete list of CVS repositories, and what part of the project is contained in each?

You can find this out by visiting the repository at the URL above.

The names are pretty well self-documenting, except where deliberate obscurantism was used (to hide password files, for example), and the directory names were not changed when the stuff was dumped on the Savannah site.

If you wanted to modify and recompile on the NeXT, you'd do well to consult with Craig &/or Len to see what extra paths need to be set up so ProjectBuilder can find everything. The MusicKit is one obvious component that needs to be accessible. Try compiling, see what the system complains it can't find, then figure out where the stuff is and add the paths to your search path. If you are only dealing with Monet, you'd be better off dealing with the Mac version, and that shouldn't cause any problem. If it does, Steve is your best source of info. The NeXT is quite slow, of course. You should find everything needed is on your NeXTStation if you really want to go that route.

Hope this helps.

All good wishes.

david




