
Re: [m17n-list] ewts problems


From: Élie Roux
Subject: Re: [m17n-list] ewts problems
Date: Tue, 8 Jul 2014 17:17:04 +0200

The current version has a control variable "precomposed".
It is, by default, 1.  So typing "oM" produces a precomposed
character U+0F00 (ༀ).  But if a user customizes that variable
to 0, typing "oM" should produce the character sequence
U+0F68 U+0F7C U+0F7E (ཨོཾ).  As your change deletes that
variable, such a customization stops working.

Indeed. In my experience, some Tibetan fonts (like the very expensive TCC fonts) have problems with U+0F68 U+0F7C U+0F7E while U+0F00 always works, and U+0FB2 U+0F80 works the same as U+0F76. Also, I was advised by some Tibetan font designers not to use precomposed glyphs, as the risk of failure is higher. Finally, some precomposed code points are deprecated in Unicode, and my guess (but it's only a guess) is that others will follow.
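To make the code points in this exchange concrete, here is a minimal Python sketch using the standard `unicodedata` module. The `emit_om` function and its `precomposed` argument are hypothetical stand-ins for the input method's control variable, not code from bo-ewts.mim, and the decomposed spelling uses U+0F68 for the base letter ཨ shown in the glyphs above.

```python
import unicodedata

# Code points as discussed in the thread.
OM_PRECOMPOSED = "\u0f00"              # single code point for the syllable OM
OM_SEQUENCE = "\u0f68\u0f7c\u0f7e"     # base letter, vowel sign, final sign
VOCALIC_R = "\u0f76"                   # precomposed vowel sign
VOCALIC_R_SEQUENCE = "\u0fb2\u0f80"    # subjoined letter plus vowel sign


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def emit_om(precomposed=True):
    """Hypothetical stand-in for the 'precomposed' switch of the input method."""
    return OM_PRECOMPOSED if precomposed else OM_SEQUENCE


if __name__ == "__main__":
    print(code_points(emit_om(True)))
    print(code_points(emit_om(False)))
    # U+0F76 carries a canonical decomposition, so normalization rewrites it.
    print(code_points(unicodedata.normalize("NFD", VOCALIC_R)))
    # U+0F00 carries none, so the two spellings of OM are not canonically equivalent.
    print(repr(unicodedata.decomposition(OM_PRECOMPOSED)))
```

One consequence worth noting: since normalization does not map U+0F00 to the three-character sequence, text search and font shaping may treat the two spellings of OM differently, whereas U+0F76 and its two-character spelling are canonically equivalent.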
 
> If you talk about gh which used to produce the same as g+h, this is, in
> my opinion quite wrong, as there is no such thing in the EWTS
> description. This is explained in the commits I believe.

Where in your code is it explained?

This one is explained in the commit message of the file 0001-removing-gh-shortcut-not-in-ewts.patch sent in my first mail (the commit message is at the very beginning of the file, which is in git patch format).
 
I agree that it is important to respect the standard.  But
keeping backward compatibility is also important.  Are you
sure that there's no need of inputting such characters as
U+0F76?

Yes, I don't see why people would need it.
 
Here's a reply from Hugues.

> I've downloaded the new .mim file. As for the 'brlabs' problem, perhaps
> there was a mistake in the file, as a typing error is always possible.

> But the most important point about the new file is that it adds numerous
> possibilities in 'ambiguous cases with xxx as prefix' that are forbidden
> in Tibetan spelling, such as 'grga', 'grnga' and so on. There is NO
> Tibetan word beginning with such prefixed syllables.
> I haven't checked other differences between the last version I gave and
> this new one.

Absolutely, and, again, this fully respects the standard: the standard contains no list of valid Tibetan syllables, so the new file follows its rules. There is a list of valid syllable forms here: http://www.tibet.columbia.edu/iats/it/IATS-X_Chilton_slides.pdf (last page), and I can remove the cases that are not in it. It won't change much, but I admit I'd prefer keeping them.
 
> As for the 'gh' case, which is a Sanskrit letter, the problem with
> adding such possibilities (which are not in EWTS, if I remember well) is
> that it adds a lot of different cases, e.g. buddha -> bud+dha ? buddha ?
> budd+ha ? bud+d+ha ? and so on with many Sanskrit transcriptions. I think
> that's why there is no such possibility in the EWTS chart, because it makes
> things simpler, but that's just my opinion.

There is a misunderstanding here: the previous .mim file allowed gh as a shortcut for g+h, while the new one forbids it, so I think we both agree on this.
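The difference between the two behaviours can be sketched with a small longest-match tokenizer in Python. This is only an illustration under stated assumptions: the consonant and vowel tables are a partial, hand-picked subset rather than the EWTS chart, and `tokenize` and its `gh_shortcut` flag are hypothetical names, not anything taken from either .mim file.

```python
# Partial, illustrative tables (an assumption, not the full EWTS chart).
CONSONANTS = ["kh", "k", "g", "ng", "dz", "d", "b", "h", "n"]
VOWELS = ["a", "i", "u", "e", "o"]


def tokenize(text, gh_shortcut=False):
    """Split a transliteration into units, trying the longest unit first.

    With gh_shortcut=True, "gh" is read as the stack g+h (old behaviour);
    with gh_shortcut=False, a stack must be written explicitly with "+".
    """
    units = CONSONANTS + VOWELS + ["+"]
    if gh_shortcut:
        units = units + ["gh"]
    units = sorted(units, key=len, reverse=True)  # longest match first

    tokens = []
    i = 0
    while i < len(text):
        for unit in units:
            if text.startswith(unit, i):
                # The shortcut expands to the explicit stack spelling.
                tokens.append("g+h" if unit == "gh" else unit)
                i += len(unit)
                break
        else:
            raise ValueError(f"unrecognised input at position {i}: {text[i:]!r}")
    return tokens


if __name__ == "__main__":
    print(tokenize("gha", gh_shortcut=True))   # shortcut: one stacked unit
    print(tokenize("gha", gh_shortcut=False))  # no shortcut: g and h stay separate
    print(tokenize("g+ha"))                    # explicit stack marker
```

Without the shortcut, the only way to request a stack is the explicit "+", which is what keeps inputs such as "buddha" from having several competing readings.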
 
As I have no knowledge about Tibetan, I can't decide what is
good as Tibetan input method.

Élie and Hugues, could you two please discuss what is best
for bo-ewts.mim and let me know the conclusion?  If the
current one has a bug, it should be fixed.  If the behavior
of the input method should be customized according to a
user's preference, such a facility should be implemented.
For such customizations, if you explain the detailed spec to
me, I'll implement them.

Thank you!
--
Elie
