[Aspell-user] small bug soundslike and non-ascii


From: Pablo Saratxaga
Subject: [Aspell-user] small bug soundslike and non-ascii
Date: Fri, 21 Oct 2005 13:29:40 +0200
User-agent: Mutt/1.5.6i

Kaixo!

I discovered that soundslike handles ASCII only, and converts
any non-ASCII character to some ASCII value.
For most of the existing *_phonet.dat files it doesn't matter, but
in some cases it does.

French and Walloon are an example of that.

For example, "c" and "ç" are very different,
"ca" sounds "KA", but "ça" sounds "SA";
however, current phonet code handles "c" and "ç" just the same;
as a result, "ça" is viewed as sounding "KA" too...
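
To illustrate the collapse, here is a minimal Python sketch. It assumes
the conversion behaves like plain diacritic stripping; that is only a
guess at the effect, not the actual phonet code, and strip_to_ascii is
a made-up name.

```python
import unicodedata

def strip_to_ascii(word: str) -> str:
    """Hypothetical stand-in for the non-ASCII -> ASCII conversion.

    Decomposes each character (NFD) and keeps only the ASCII code points,
    so combining marks such as the cedilla are dropped.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if ord(ch) < 128)

# Once stripped, the two words are indistinguishable, so any rule
# written for "c" necessarily applies to "ç" as well.
print(strip_to_ascii("ça") == strip_to_ascii("ca"))
```

Whatever the real conversion is, the point is the same: the rules never
get to see the cedilla.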

Another example is "e" vs "ê", "é", "è".
At the end of a word, "e" (without accent) is always mute,
e.g. "livre" => "LIVR",
but not if it is accented, e.g. "livré" => "LIVRE".
As a result, it is impossible to define useful soundslike
rules if they involve non-ASCII chars in the language.
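
As a rough sketch of what accent-aware rules would allow, the toy
Python function below applies just the rules mentioned in this mail
directly to the accented characters. It is not the phonet rule syntax
and not real French phonetics (for instance, it ignores "c" before "e"
or "i"); toy_soundslike is a hypothetical name.

```python
import unicodedata

def toy_soundslike(word: str) -> str:
    """Toy rules only: mute final unaccented e, c -> K, ç -> S, é/è/ê -> E."""
    # NFC makes sure "é" is a single code point and not "e" + combining accent.
    w = unicodedata.normalize("NFC", word.lower())
    # A final "e" without accent is mute; accented forms do not match here.
    if w.endswith("e"):
        w = w[:-1]
    out = []
    for ch in w:
        if ch == "ç":
            out.append("S")
        elif ch == "c":
            out.append("K")
        elif ch in "éèê":
            out.append("E")
        else:
            out.append(ch.upper())
    return "".join(out)

for w in ("ca", "ça", "livre", "livré"):
    print(w, "=>", toy_soundslike(w))
```

These distinctions can only be made if the rules see the original
characters; after conversion to ASCII, "livré" and "livre" are the same
input.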

(I think it also makes it impossible to define soundslike rules
for languages in which non-ASCII letters are even more prominent,
or even exclusively used, like Czech, Esperanto, Russian, ...)

The idea of matching fully accented chars with "ASCII only" versions
is a good one, however, but the match could involve several chars
(e.g. "ö" -> "oe" in German, and not "ö" -> "o").
The possibility to define an "asciification" table could help
find better suggestions when spell checking an unaccented,
ASCII-only text. That is particularly true for those languages
that, for lack of proper computer support, were written in
ASCII for a long time, such as Esperanto and Romanian.
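
A per-language "asciification" table of that kind could look roughly
like the Python sketch below. The table layout, the language keys and
the names ASCIIFY, asciify and candidates are all assumptions made for
illustration; nothing like this exists in aspell as far as this mail
goes. The German entries follow the usual "oe"-style transcription and
the Esperanto ones the common x-convention.

```python
import unicodedata

# Hypothetical per-language tables; one character may map to several.
ASCIIFY = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"},
    "eo": {"ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux"},
}

def asciify(word: str, lang: str) -> str:
    """Return an ASCII-only spelling of word using the table for lang."""
    table = ASCIIFY.get(lang, {})
    out = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch in table:
            out.append(table[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Fallback for characters not in the table: drop the diacritic.
            base = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in base if ord(c) < 128))
    return "".join(out)

def candidates(typed: str, dictionary: list[str], lang: str) -> list[str]:
    """Dictionary words whose ASCII spelling equals the typed ASCII word."""
    return [w for w in dictionary if asciify(w, lang) == typed.lower()]

print(candidates("schoen", ["schön", "schon"], "de"))
```

With a plain "ö" -> "o" mapping, "schoen" would match neither word
exactly, which is the case the multi-character mapping is meant to fix.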

thanks

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://chanae.walon.org/pablo/          PGP Key available, key ID: 0xD9B85466
[you can write me in Walloon, Spanish, French, English, Catalan or Esperanto]
[min povas skribi en valona, esperanta, angla aux latinidaj lingvoj]
