gnu-emacs-sources

iso-8859/mule-unicode unification for Emacs 21


From: Dave Love
Subject: iso-8859/mule-unicode unification for Emacs 21
Date: 26 Oct 2001 17:41:00 +0100
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.0.107

This package provides tables and a little code to `unify' equivalent
characters from Emacs's internal charsets.

For example, ?\xf69 and ?\x8e9 are both 'Latin small letter e with acute',
which you might type with Latin-9 and Latin-1 input methods
respectively.  They are distinct because of the unfortunate 8-bit
European character set standards (ISO 8859) and the use of the
appropriate international standard (ISO 2022) that Emacs follows to
`multiplex' them together.  [Mule follows the relevant
European-originated standards, and predates a useful definition of
Unicode.]
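
To illustrate the equivalence at the level of the 8859 standards
themselves, here is a minimal sketch in Python (not Emacs Lisp, and not
the code in the attachment).  Python strings are already Unicode, so it
cannot reproduce Emacs's two distinct internal characters; it only shows
that byte 0xE9 names the same letter in Latin-1 and Latin-9, while a
byte such as 0xA4 does not.  The helper name `decode_byte' is
hypothetical.

```python
import unicodedata


def decode_byte(byte_value, charset):
    """Decode one byte under the given 8-bit charset (illustrative helper)."""
    return bytes([byte_value]).decode(charset)


# The same byte, read under two different ISO 8859 parts.
e_latin1 = decode_byte(0xE9, "iso-8859-1")
e_latin9 = decode_byte(0xE9, "iso-8859-15")

# Both parts assign 0xE9 to the same abstract character, so a
# unification table can treat the two as equivalent.
print(unicodedata.name(e_latin1))
print(e_latin1 == e_latin9)

# Not every position agrees: 0xA4 is the currency sign in Latin-1
# but the euro sign in Latin-9, so these must stay distinct.
print(unicodedata.name(decode_byte(0xA4, "iso-8859-1")))
print(unicodedata.name(decode_byte(0xA4, "iso-8859-15")))
```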

Normally a buffer containing both of those Emacs characters can only
be encoded (saved) in a general -- more-or-less Emacs-specific --
encoding: iso-2022-{7,8}bit or emacs-mule.  With unification enabled,
and, say, preferred coding system Latin-9, a buffer containing only
those two non-ASCII characters will be saved as Latin-9.  [This sort
of situation is probably most relevant when responding to mail in a
different encoding to what you normally use for input.]  If the buffer
contains characters which aren't common to a single supported 8859
set, it should probably be saved as utf-8 (see below).
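
The choice described above can be modelled roughly as follows.  This is
a sketch in Python under stated assumptions, not the mechanism Emacs
uses: the function name `pick_encoding', the candidate list and its
order are all hypothetical, and Python's Unicode strings stand in for a
buffer whose characters have already been unified.

```python
# Hypothetical candidate list; a real setup would derive this from the
# user's coding-system priorities.
CANDIDATES = ["iso-8859-15", "iso-8859-1", "iso-8859-2", "iso-8859-7"]


def pick_encoding(text, preferred="iso-8859-15", candidates=CANDIDATES):
    """Return the first single 8859 charset that can encode all of TEXT.

    The preferred charset is tried first.  If no single candidate
    covers every character, fall back to utf-8.
    """
    ordered = [preferred] + [c for c in candidates if c != preferred]
    for charset in ordered:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # some character is missing from this charset
        return charset
    return "utf-8"


print(pick_encoding("caf\u00e9"))             # e-acute only
print(pick_encoding("\u00e9\u0142"))          # e-acute plus l-with-stroke
print(pick_encoding("\u00e9\u03b1\u0434"))    # Latin, Greek and Cyrillic mixed
```

With the candidate list above, the first call settles on the preferred
Latin-9, the second on Latin-2 (the only listed set containing both
letters), and the third falls back to utf-8.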

This package directly supports unification of ISO 8859 on encoding or
decoding, as ISO 2022 suggests.  It disproves the mythinformation that
it can't be done by Mule.  [Emacs 20 had the unification
(`translation') hooks.]

Companion changes to utf-8.el mentioned in the commentary enable the
utf-8 coding system to encode ISO-8859-N characters for N>1.  I expect
to post them after checking the released code.  Similarly for
latin1-disp.el.

More unification could be done, e.g. of the European characters in the
Far Eastern character sets Emacs supports, but that's probably of
little interest.

Especially if you edit multilingual code for Emacs, note the warning
in the commentary about munging multilingual files, such as this one!

Attachment: ucs-tables.el
Description: application/emacs-lisp

