Re: utf-8 input under X11
From: Dave Love
Subject: Re: utf-8 input under X11
Date: 04 Nov 2001 15:56:42 +0000
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.0.107

>>>>> "EZ" == Eli Zaretskii <eliz@is.elta.co.il> writes:
EZ> Since using UTF-8 input automatically means all non-ASCII
EZ> characters in your buffers belong to the mule-unicode-* character
EZ> sets, we effectively limit users to files in either UTF-8 or
EZ> Latin-1. They will not be able, for example, to produce KOI8-R or
EZ> Latin-3,
I can't believe I'm seeing this. It's no more true for utf-8 encoding
than it is for latin-2, say. That should be clear even without
implementing and testing such a language environment, as I did.
Experimentally, it didn't stop me editing arbitrarily-encoded files.
Emacs is multilingual.
EZ> unless they either install add-on packages such as Mule-UCS or
EZ> otherwise modify the code which encodes and decodes characters.
Or Emacs could just include the file which does that job.
EZ> And if they mix characters from files encoded in anything but
EZ> UTF-8 or Latin-1 with what they type,
mac-roman is a counter-example currently in Emacs. It's essentially
trivial to define other 8-bit coding systems in terms of Unicode.
I've posted 30+.
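
To illustrate what "defining an 8-bit coding system in terms of Unicode" amounts to, here is a minimal sketch in Python (not Emacs Lisp, and not the code posted to the list). It assumes the whole coding system can be described by a 256-entry table mapping each byte to one Unicode character. The helper `make_8bit_codec` is a hypothetical name, and the KOI8-R table is borrowed from Python's built-in `koi8_r` codec rather than written out by hand.

```python
def make_8bit_codec(table):
    """Build (decode, encode) functions from a 256-character table.

    table[b] is the Unicode character that byte value b stands for.
    Hypothetical helper, for illustration only.
    """
    if len(table) != 256:
        raise ValueError("an 8-bit coding system needs exactly 256 entries")

    # Invert the table once so encoding is a plain dictionary lookup.
    encode_map = {ch: b for b, ch in enumerate(table)}

    def decode(data):
        # Each byte indexes directly into the table.
        return "".join(table[b] for b in data)

    def encode(text):
        # Raises KeyError for characters the coding system cannot represent.
        return bytes(encode_map[ch] for ch in text)

    return decode, encode


# Assumption: take the byte-to-Unicode table from Python's own koi8_r codec.
KOI8_R_TABLE = bytes(range(256)).decode("koi8_r")
decode_koi8r, encode_koi8r = make_8bit_codec(KOI8_R_TABLE)

# Text decoded this way is ordinary Unicode, so it can be saved as UTF-8
# or re-encoded through any other table built the same way.
raw = encode_koi8r("Привет")
as_utf8 = decode_koi8r(raw).encode("utf-8")
```

Once the table exists, nothing about the coding system is special: decoding and encoding are both single lookups against Unicode.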
EZ> Emacs will confuse them by refusing to save the result in UTF-8.
You also have the contribution to extend what mule-utf-8 encodes.
EZ> This is because currently, mule-unicode-* characters are treated
EZ> as disjoint from the other character sets supported by Emacs.
Just like all the other internal charsets. They're not special --
it's simply misleading to suggest otherwise.
EZ> If UTF-8 input is the only reasonable input mode in such locales,
EZ> then using it would be a lesser evil than any other alternative.
EZ> But if users can reasonably use other input encoding, we might be
EZ> preventing them from having a more useful Emacs.
Indeed. I've worked on a number of Unicode input methods and I can't
input utf-8 directly.
>> I don't see why the user should have to do
>> something special (like set-keyboard-coding-system) if using utf-8,
>> but not if using koi8-r !
EZ> See above: the reason is that Unicode support is not yet complete
EZ> enough,
[Complete enough for what?]
EZ> so perhaps we shouldn't yet force it on the user.
I hope that isn't the reason, any more than for other locales.
It isn't forced on the user, anyhow -- they request it by specifying
the locale. koi8-r support is hardly complete either, but it's
invoked automatically at startup. (The coding system should be
completed using Unicode characters, or made completely Unicode-based.)
People should understand that the utf-8 coding system is essentially
the same as any other CCL-based one. Assuming anything else, e.g. in
Gnus, just causes lossage. There is too much FUD flying around.