emacs-devel

Re: [mew-int 01596] Re: windows 1252


From: 山本和彦
Subject: Re: [mew-int 01596] Re: windows 1252
Date: Mon, 10 Nov 2003 16:11:23 +0900 (JST)

Hello Handa-san,

Thank you for your explanation.

> (2) ctext (alias of compound-text)
> 
> On conversion, it is not fully compatible with the
> specification of X Compound Text because it encodes any
> Emacs character, using a designation sequence for
> private character sets (please note that all Emacs charsets
> have an iso-final-char).  So, Big5 characters are preceded by
> ESC $ ( 0 or 1, mule-unicode-0100-24ff characters are
> preceded by ESC - 1.
              ^^^^^^^

Let me clarify. 

Q1) It seems to me that Emacs encodes mule-unicode-0100-24ff with ESC
$ - 1. But the explanation above says ESC - 1. Which one is correct
according to Emacs's spec?

Q2) I don't think it's a good idea to expose the internal
representation "mule-unicode-0100-24ff" in a file. According to the
spec of ctext provided with XFree86, ctext has an extension for UTF-8:

---
7.  The UTF-8 encoding

Unicode characters that are not contained in one of the
approved standard encodings can be encoded using the UTF-8
encoding. The following escape sequences are used:

     01/11 02/05 04/07   switch into UTF-8 mode
     01/11 02/05 04/00   return from UTF-8 mode

The first is the ISO registered sequence for UTF-8
(ISO-IR-196), the second is the ISO-2022 ``standard return''
sequence. While in UTF-8 mode, the UTF-8 encoding replaces
the currently designated GL and GR encodings. After return
from UTF-8 mode, the previously designated GL and GR
encodings are reactivated.
---

How about using this to encode mule-unicode-0100-24ff?
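
To make the proposal concrete, here is a minimal sketch in Python (chosen
only for illustration; it is not Emacs code). It converts the column/row
notation quoted above into bytes and wraps a string between the two
escape sequences. The helper names colrow and wrap_utf8 are hypothetical,
and the sketch ignores everything else a real ctext encoder must do.

```python
def colrow(notation: str) -> int:
    """Convert ISO 2022 column/row notation such as '01/11' to a byte value."""
    col, row = notation.split("/")
    return int(col) * 16 + int(row)


# Sequences from section 7 of the XFree86 ctext spec quoted above.
UTF8_ENTER = bytes(colrow(x) for x in "01/11 02/05 04/07".split())  # ESC % G
UTF8_LEAVE = bytes(colrow(x) for x in "01/11 02/05 04/00".split())  # ESC % @


def wrap_utf8(text: str) -> bytes:
    """Hypothetical helper: emit text as a UTF-8 segment inside ctext."""
    return UTF8_ENTER + text.encode("utf-8") + UTF8_LEAVE


if __name__ == "__main__":
    # U+0101 lies in the range covered by mule-unicode-0100-24ff.
    print(wrap_utf8("\u0101"))
```

With this approach the file would contain only the registered UTF-8
sequence and plain UTF-8 bytes, with no Emacs-private designation.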

> When it runs under emacs-unicode version, on writing the
> file, if all the characters can be encoded by ctext, keep
> using it.  If not (because, in emacs-unicode, some character
> doesn't belong to any charset that has iso-final-char), use
> utf-8.  And in both cases, add a coding tag.  On reading,
> check the coding tag at first.  If no coding tag, read by
> ctext, otherwise, read by the coding system specified in the
> tag.
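
If I read this correctly, the procedure can be sketched roughly as
below, in Python purely for illustration. The predicate ctext_can_encode
is a hypothetical stand-in for "the character belongs to a charset that
has an iso-final-char", and the "-*- coding: ... -*-" form of the tag is
my assumption about what "coding tag" means here.

```python
import re
from typing import Callable

# Assumed tag form: "-*- coding: NAME -*-" on the first line.
CODING_TAG = re.compile(r"-\*-.*?coding:\s*([\w-]+).*?-\*-")


def choose_coding(text: str, ctext_can_encode: Callable[[str], bool]) -> str:
    """Keep ctext if every character is encodable by it, else use utf-8."""
    return "ctext" if all(ctext_can_encode(ch) for ch in text) else "utf-8"


def add_tag(text: str, coding: str) -> str:
    """Prepend a coding tag line; this is done in both cases."""
    return f"-*- coding: {coding} -*-\n{text}"


def detect_coding(first_line: str) -> str:
    """On reading, trust the tag if present; with no tag, read as ctext."""
    match = CODING_TAG.search(first_line)
    return match.group(1) if match else "ctext"
```

A file written before the change carries no tag, so detect_coding falls
back to ctext for it, which seems to be the point of the scheme.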

I remember that, some years ago, Handa-san said to me, "The current
Emacs is using mule-unicode but will migrate to Unicode".  But I don't
know what exactly emacs-unicode refers to. Which versions? Or
a different source tree?

--Kazu



