From: David Hill
Subject: Re: [gnuspeech-contact] system lexicon
Date: Tue, 30 May 2006 13:59:39 -0700

Hi Eric,

Sorry for the delay -- there's more than just too many alligators in the swamp ;-)

On May 25, 2006, at 9:47 AM, Eric Zoerner wrote:

> I found the dictionary files in gnuspeech/trillium/src/SpeechObject/Dictionary/. However, the dictionary files seem to be text files, are missing part of speech info, and they cannot be opened by PrEditor. Is there a system dictionary file somewhere in PrEditor format?

Michael Forbes was working on an improved format for PrEditor dictionaries to include tempo information, AFAIR, and I am trying to reconstruct how far he'd got and what he did from old memos. I don't think I have enough of a handle to start sending any yet, but I do attach one memo >at the very end< that indicates some of what was going on. Fortunately I am a bit of a pack rat at heart!

------

The parts of speech information is in the Main Dictionary (2.0e is indeed the latest -- I hesitate to call it "best" ;-). Identifying letters follow the % sign at the end of each entry. The parts-of-speech key is as follows:

    NOUN                 'a'
    VERB                 'b'
    ADJECTIVE            'c'
    ADVERB               'd'
    PRONOUN              'e'
    ARTICLE              'f'
    PREPOSITION          'g'
    CONJUNCTION          'h'
    INTERJECTION         'i'
    UNKNOWN              'j'
    PROPERNAME (NOUN)    'k'
    LOCATIONNAME (NOUN)  'l'
    CONCEPTNAME (NOUN)   'm'

-----------

Whilst on the subject, I found Craig's original email to me concerning the Monet syntax. Here it is. Note that it all refers to the original NeXT TTS system, which is what Steve Nygard used as a model (but he doped it out from the code!)

-----------

From: Craig-Richard Taube-Schock <address@hidden>
Date: Sat, 20 Jul 96 18:46:53 0600
To: david r hill <uudavid!david>
Subject: Re: MONET Syntax

Hi David,

I did actually get your earlier email, but there has been so much to get done... I haven't been able to reply :-(

The syntax is fairly straightforward. The main problem is that it is slightly optimised for MONET, which makes it a little difficult to decipher sometimes. I've bolded some of the more important items in the text below.
[drh: no bolding in this plain text version. It's just the "/c" items]

To highlight the syntax, I will send the following utterances to the TTS-Server and point out the reply:

    hello there, this is a test.
    I would like to buy some cheese.

This, of course, comprises two of my favorite utterances for synthesis. I hope you aren't too bored with them, yet! I wanted to send two utterances to point out how "chunking" works. The reply from the server is as follows:

/c // /3 # /w h_e./_l_uh_uu /w /1 /*dher # ^ // /0 # /w /_dh_i_s /w i_z /w uh /w /1 /*test # // /c // /0 # /w /_ah_i /w /_w_u_d /w /lahik /w t_uu /w /_b_ah_i /w /_s_a_m /w /1 /*cheez # // /c

[drh: note that there should be no new-lines in the above]

The embedded symbols are:

    /w   - word boundary
    /c   - chunk boundary
    /*   - tonic placement
    /1   - last word in tone-group
    //   - tone group boundaries
    /_   - foot boundary. Also implied by syllable boundary
    .    - syllable boundary

    /<number> (<number> = 0-4)
        /0 = Statement
        /1 = Exclamation
        /2 = Question
        /3 = Continuation (i.e. comma or colon)
        /4 = Statement/Continuation hybrid. Used with semicolons.

The important thing to note about chunks is that these are the units which are most closely associated with utterances. A chunk can have several tone groups (even several sentences) within it which will be synthesized as one unit. I've bolded the chunk markers in the above sentence to show their locations. [drh -- no bolding in this plain text version.]

The general template I use is as follows:

    /c // /3 # %s # // /c

where %s is replaced by all of the stuff I want to synthesize.

Please note that you must include a tonic placement or you will crash the server. This is a bug, but since this is an internal standard, this is generally not a problem as all of our stuff conforms to this standard. It is also wise to put in the "#" characters; this gets rid of any popping which may occur due to extreme initial and final changes in the synthesizer parameters.
You will not crash MONET or TTS_Server if you do not put them in, but you may get some pops.

It is also important to point out that NO TESTING for validity is performed by MONET or TTS_Server. If you get the syntax wrong, you will most likely crash the server (although MONET is usually a little more robust). This is not a big problem, as the server will be restarted transparently, but you won't get any synthesis of the offending utterance. No testing is done for reasons of optimisation, and because this interface is (for the most part) internal to Trillium.

Hope this is enough information for you. I suggest playing around just a bit to see if there is anything I've forgotten.

Craig

--------

And here's a memo about PrEditor dictionary formats. Not wonderfully complete, but gives a little flavour. I am working on getting more info.

------

Date: Wed, 9 Aug 1995 01:21:52 -0400
From: Michael Forbes <address@hidden m. ab. ca>
To: uudavid!trilljum.ab.ca!address@hidden
Subject: Re: PrEditor
In-Reply-To: address@hidden>

The version of preditor that I gave you should be able to open both .preditor and .ded files, then save them in either format (using the Save As menu command); this should allow you to convert between file types. Also, you should be able to insert (I mean merge, but insert on the version you have) either type of file into any open file, no matter what the file type.

As far as working with the user dictionary, the server uses an interface to the old PrDict dictionary object, which still works with the new format, but any tempo information in the new format will be turned into the pronunciation string "u_u_u_p_s" by the old conversion routines because they do not recognize numbers (which the old version of PrEditor strips out of the dictionary entries). To use the words without tempo information, you must save them as a .preditor file first.
I think that we should prevent the server from working with .ded files right now because there are no checks to ensure that pronunciations will not cause invalid strings to be sent to the server. I am just testing the new interface functions to the new PrDict object and will probably give those to Craig tomorrow so that he can compile a new server which will be able to understand preditor files with tempo information.

From your message, it sounded like it was not possible to convert between file types. I am not sure what other features would be useful to work with the two file types "interchangeably". If the conversion between the two types is not working, please call me tomorrow and describe the problem, as I am pretty sure it was working on the IBM PC. I did not test it with long dictionaries though.

Michael.

---------
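
For anyone experimenting with the conventions in the two memos above, here is a minimal Python sketch. It is not part of gnuspeech, and every function and constant name in it is hypothetical. It assumes three things that the memos suggest but do not spell out: that the part-of-speech letters are whatever follows the last '%' in a Main Dictionary entry, that the tonic marker is the "/*" seen in Craig's sample reply, and that the markers in a server string are separated by whitespace. The dictionary entry used in the usage note is made up for illustration.

```python
# Hypothetical helpers for the conventions described in the memos above.
# None of these names exist in gnuspeech; they are illustrative only.

# Parts-of-speech key as given in the message (letters follow '%').
POS_KEY = {
    "a": "NOUN", "b": "VERB", "c": "ADJECTIVE", "d": "ADVERB",
    "e": "PRONOUN", "f": "ARTICLE", "g": "PREPOSITION",
    "h": "CONJUNCTION", "i": "INTERJECTION", "j": "UNKNOWN",
    "k": "PROPERNAME", "l": "LOCATIONNAME", "m": "CONCEPTNAME",
}

TEMPLATE = "/c // /3 # %s # // /c"  # Craig's general template
TONIC = "/*"                        # assumption: marker seen in the sample reply


def pos_labels(entry):
    """Return the parts of speech named by the letters after the last '%'."""
    _, sep, codes = entry.rpartition("%")
    if not sep:
        return []  # no '%' in the entry, so no part-of-speech letters
    return [POS_KEY.get(ch, "UNKNOWN") for ch in codes.strip()]


def wrap_utterance(phones):
    """Wrap a phone string in the template, refusing input with no tonic."""
    if TONIC not in phones:
        # The memo warns that a missing tonic placement crashes the server.
        raise ValueError("no tonic placement (/*) in utterance")
    return TEMPLATE % phones


def split_chunks(reply):
    """Split a server-style string into chunks, each a list of tone groups."""
    chunks, groups, current = [], [], []
    for tok in reply.split():
        if tok in ("//", "/c"):
            if current:  # close the tone group collected so far
                groups.append(" ".join(current))
                current = []
            if tok == "/c" and groups:  # close the chunk collected so far
                chunks.append(groups)
                groups = []
        else:
            current.append(tok)
    if current:
        groups.append(" ".join(current))
    if groups:
        chunks.append(groups)
    return chunks
```

With these definitions, pos_labels("hello h_uh.l_uh_uu %i") gives ["INTERJECTION"], and split_chunks applied to Craig's sample reply gives two chunks, the first holding two tone groups. The tonic check in wrap_utterance is only a guard on the client side; the memo is clear that MONET and TTS_Server themselves do no validity testing.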