Re: term/encoding problem
From: Andreas Politz
Subject: Re: term/encoding problem
Date: Fri, 19 Sep 2008 21:30:46 +0200
User-agent: Mozilla-Thunderbird 2.0.0.16 (X11/20080724)
Peter Dyballa wrote:
> On 18.09.2008 at 20:16, Andreas Politz wrote:
>> Note that I get an 'Invalid character' message when I try to
>> insert it via quoted-insert and its octal value
>> ( C-q 22622 ).
> Ahh! So you're with GNU Emacs 22.x? I can reproduce it in 22.2. When I
> check this character in Kermit's utf8.txt file it's described as:
Yes, Emacs 22.2.1.
> character: ▒ (299218, #o1110322, #x490d2, U+2592)
> charset: mule-unicode-2500-33ff
> (Unicode characters of the range U+2500..U+33FF.)
> In UTF-8 presentation this character is encoded with these three bytes:
> E2 96 92. These are in "ASCII" (rather an 8-bit "ASCII"): ‚ ñ í. Using
> C-q 1 1 1 0 3 2 2 <some disturbance> I can insert HALF SHADE. Could be
> this non-Unicode Emacs has to use some extras to handle this ...
> If no-one on this list has an explanation I'd write a bug report (see
> Help menu), also mentioning the 'Invalid character' message. Although it
> looks as if GNU Emacs 22.x seems to recommend to use 1110322 instead of
> 22622 ...
--
From what I have learned since my first mail, Emacs 22 uses its own distinct
encoding for its buffers (mule), which explains the different byte codes.
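The relation between the numbers quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: it uses the values from the character description in this thread and does not reproduce how Emacs 22 computes its internal code. The variable names are made up for illustration.

```python
# Values taken from the character description quoted in this thread.
unicode_cp = 0x2592   # Unicode code point, U+2592
mule_code = 0x490D2   # internal (mule) character code reported by Emacs 22

# The Unicode code point in octal is the 22622 that C-q rejected.
unicode_octal = oct(unicode_cp)

# The internal code in octal is the 1110322 that C-q accepted.
mule_octal = oct(mule_code)

# The UTF-8 encoding of U+2592 is the three-byte sequence E2 96 92.
utf8_hex = chr(unicode_cp).encode("utf-8").hex()

print(unicode_octal, mule_octal, mule_code, utf8_hex)
```

So 22622 and 1110322 are two different numbers for the same character: the first is its Unicode code point, the second its Emacs 22 internal code.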
But I think I found the problem. term uses `binary' as its input coding.
After it has examined the input, it inserts the relevant/visible parts
of it into the buffer. Only at this point does it decode the bytes with
the appropriate coding (variable: locale-coding-system).
At some point it splits the input string to make it fit the
width of the `terminal'. The problem is that it measures bytes, not
characters. So the 3-byte character in question, which in aptitude is mostly
in the last column, gets split into two strings of 1 and 2 bytes. These two
strings, when decoded and inserted independently, result in
what was described as the problem.
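The effect can be reproduced outside Emacs. The following Python sketch is not the code from term.el; the function names, the sample line and the width of 3 are assumptions chosen so that the 3-byte character lands on the split boundary. It compares splitting the undecoded bytes with splitting after decoding.

```python
from typing import List


def split_by_bytes(data: bytes, width: int) -> List[bytes]:
    """Cut the raw input every `width` bytes, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def split_by_chars(data: bytes, width: int, coding: str = "utf-8") -> List[str]:
    """Decode first, then cut every `width` characters."""
    text = data.decode(coding)
    return [text[i:i + width] for i in range(0, len(text), width)]


# Two ASCII characters followed by U+2592: 5 bytes, but only 3 characters.
line = ("ab" + "\u2592").encode("utf-8")
width = 3

# Byte-based split: the 3-byte character is cut into a 1-byte and a 2-byte piece.
byte_chunks = split_by_bytes(line, width)

# Decoding each piece on its own cannot recover the character.
broken = [chunk.decode("utf-8", errors="replace") for chunk in byte_chunks]

# Character-based split keeps the character intact.
intact = split_by_chars(line, width)

print(byte_chunks, broken, intact)
```

Here the width counts characters, which is a simplification: a real terminal has to count display columns. The point is only that the cut has to happen after decoding, or at least on a character boundary.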
I filed a bug report.
Thanks anyway.
-ap