Language settings for LSR (was: Re: SpeechDispatcher for LSR)


From: Hynek Hanke
Subject: Language settings for LSR (was: Re: SpeechDispatcher for LSR)
Date: Mon Sep 4 09:59:52 2006

Peter Parente writes on Sat 08/19/2006:
> > I would add [language switching] to the Speech Dispatcher driver
> > and send you a patch, but I'm not sure about the exact format. [...]
>
> We didn't know what to do here either, and that's why the doc string
> is very vague. Basically, the design for supporting language is
> incomplete right now. We were looking at the languages supported by
> IBM TTS as a guide to get us started. For instance, there are two
> separate runtimes for UK English and US English. How would those two
> be exposed as separate options in your scheme? The language is clearly
> "en", but what's the distinguishing dialect? This is where we were
> planning on using the country code "us" versus "uk." Like you point
> out, this doesn't help with Czech where the country and language are
> the same, but there are still three dialects.

Perhaps an even more important and obvious case is Chinese. There is
only one ISO language code for Chinese, because all its varieties use
the same script, but that one code covers both Mandarin and Cantonese,
two mutually unintelligible languages, both (together with some ten
more languages and dialects) spoken in China.

> So what would you recommend? Is (country, language, dialect) too much?
> Can we treat "us" versus "uk" as different dialects and just have
> (language, dialect)?

Back to the example of Chinese, this is a FAQ about ISO639-2 and the
suggested answer is here:
http://www.loc.gov/standards/iso639-2/faq.html#24

It suggests either using country codes (China for Mandarin and Taiwan
for Cantonese), which however seems quite inappropriate to me (AFAIK
there is no such clean geographical split in reality), or using RFC
3066 subtags registered by IANA:
        zh-mandarin
        zh-cantonese
(the register is here:
        http://www.iana.org/assignments/language-subtag-registry )

Quite often the dialects of a language, or the languages grouped under
some broader name (like Spanish for Castilian, Catalan, Basque etc.),
are differentiated by territory. But I think this is not a rule, and
the territory is not necessarily a country, so it is not always
representable by an ISO country code.

So I'd suggest using (language, dialect), where language is an ISO
639-1 or ISO 639-2 code (thus two or three letters) and dialect is a
non-standardized string, except for those already registered by IANA
for this purpose (which includes all country codes). If both fields
have to be represented in one string, then I think we should use the
format specified in RFC 3066; I'd however prefer to keep them in
separate fields. So we could have
        (zh, mandarin)
        (zh, cantonese)
        (cs, standard)
        (cs, moravian)
        (en, uk)
        (en, us)
etc.
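To make the convention concrete, the (language, dialect) pair could be
joined into and split from an RFC 3066-style string with a trivial
helper. This is only a sketch of the proposal above; the function names
are made up for illustration:

```python
def tag_join(language, dialect=None):
    """Join a (language, dialect) pair into an RFC 3066-style tag.

    `language` is an ISO 639-1/639-2 code (two or three letters);
    `dialect` is an IANA-registered subtag, a country code, or a
    free-form string.  If no dialect is given, the bare language
    code is returned.
    """
    if dialect:
        return "%s-%s" % (language.lower(), dialect.lower())
    return language.lower()

def tag_split(tag):
    """Split an RFC 3066-style tag back into (language, dialect).

    Only the first hyphen is significant; everything after it is
    treated as the dialect field.  A bare language code yields
    (language, None).
    """
    language, _sep, dialect = tag.partition("-")
    return (language.lower(), dialect.lower() or None)
```

With this, tag_join("zh", "mandarin") gives "zh-mandarin" and
tag_split("en-uk") gives back ("en", "uk"), so either representation
can be derived from the other.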

> We're implementing the settings dialog for LSR 0.3.0. About half of it
> is complete in CVS. The intention is to let the user select the
> language (and country and dialect if that's what the final design ends
> up being) from a list of choices supported by the currently configured
> speech library. This assumes the speech engine is able to report what
> languages are available. If it cannot (which I believe is the case for
> many engines), we'll have to list all the known possibilities and
> simply ignore choices that aren't supported by the engine.

Yes, I think the ultimate goal is that the engines report the supported
languages themselves. Where that is not possible, providing a language
list for selection is easy. I think it will, however, be impossible to
provide a dialect selection list, because most entries would be
non-standardized (and even the standardized ones can appear under a
different name in the synthesizer -- e.g. en_british for en_uk), so the
dialect field must allow the user to type a value in, based on the
documentation of the synthesizer.
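Since synthesizers may spell the same dialect differently (the
en_british vs. en_uk case above), the client side could run a small
normalization table before matching the user's typed-in value against
what the engine reports. A minimal sketch; the alias table here is
invented purely for illustration:

```python
# Hypothetical alias table: maps synthesizer-specific dialect names
# onto the (language, dialect) convention used in the settings dialog.
DIALECT_ALIASES = {
    ("en", "british"): ("en", "uk"),
    ("en", "american"): ("en", "us"),
}

def normalize_dialect(language, dialect):
    """Return the canonical (language, dialect) pair for a
    synthesizer-reported name, or the pair unchanged if no alias
    is known."""
    key = (language.lower(), dialect.lower())
    return DIALECT_ALIASES.get(key, key)
```

A real table would have to be built per synthesizer from its
documentation, since these names are exactly the non-standardized part.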

On a related topic, we must also be prepared for users wanting to use
different synthesizers for different languages (perhaps for reasons of
quality, but mainly because different synthesizers support different
languages). If the output in use is Speech Dispatcher, this is hidden
from the application: the application is supposed to just set the
language, and Speech Dispatcher chooses the correct synthesizer based
on user configuration, unless overridden by the SET OUTPUT_MODULE
command. SSIP is, however, still missing a LIST LANGUAGES command for
user selection, which we hope to get included in the next release, once
we can use the capabilities provided by the TTS API.

With regards,
Hynek Hanke

_______________________________________________
Speechd mailing list
address@hidden
http://lists.freebsoft.org/mailman/listinfo/speechd
