emacs-devel

Re: ucs-normalize and diacritics


From: Robert Pluim
Subject: Re: ucs-normalize and diacritics
Date: Wed, 25 Jul 2018 16:45:03 +0200

Robert Pluim <address@hidden> writes:

> Eli Zaretskii <address@hidden> writes:
>
>>> From: Robert Pluim <address@hidden>
>>> Cc: address@hidden
>>> Date: Tue, 24 Jul 2018 22:48:50 +0200
>>> 
>>> auto-composition-mode seems not to be documented anywhere other than
>>> its doc string.
>>
>> Character composition is one of the few areas in Emacs that are
>> notoriously under-documented.  Patches to document that are most
>> welcome (let me know if I can help by pointing out to its other
>> interesting aspects).
>
> I think I'll start by putting pointers to auto-composition-mode in the
> manual and lispref.

(emacs)International Chars says:

       As a special case, if the character lies in the range 128 (0200
    octal) through 159 (0237 octal), it stands for a raw byte that does not
    correspond to any specific displayable character.  Such a character lies
    within the eight-bit-control character set, and is displayed as an
    escaped octal character code.  In this case, C-x = shows part of
    display ... instead of file.

I can't get that to happen, whatever I try. Here is what I do:

emacs -Q
C-x C-f /tmp/bin.txt
C-x 8 RET 80
C-b
C-x =

which gives

Char: € (128, #o200, #x80, file ...) point=1 of 1 (0%) column=0

If I save that buffer, and re-read it using 'raw-text, the display
looks like

\301\200

and C-u C-x =  on the \200 gives:

             position: 2 of 3 (33%), column: 4
            character: € (displayed as €) (codepoint 4194176, #o17777600, #x3fff80)
    preferred charset: tis620-2533 (TIS620.2533)
code point in charset: 0x80
               syntax: w        which means: word
             category: L:Left-to-right (strong)
             to input: type "C-x 8 RET 3fff80"
          buffer code: #x80
            file code: #x80 (encoded by coding system raw-text-unix)
              display: no font available

Character code properties: customize what to show
  general-category: Cn (Other, Not Assigned)
  decomposition: (4194176) ('€')
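
A side note on the numbers in that output: codepoint 4194176 (#x3fff80)
alongside buffer code #x80 is consistent with a raw byte being stored at
a fixed offset of #x3fff00 above its byte value. Below is a minimal
Python sketch of that arithmetic. The offset constant and the helper
names are assumptions made for illustration, inferred from the output
above and not taken from the Emacs sources.

```python
# Assumption (inferred from the describe-char output, not from Emacs code):
# a raw byte B in 0x80..0xFF is represented internally as 0x3FFF00 + B.
RAW_BYTE_OFFSET = 0x3FFF00  # hypothetical name for the assumed offset


def raw_byte_to_char(byte: int) -> int:
    """Return the assumed internal character code for a raw byte."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("raw bytes are in the range 0x80..0xFF")
    return RAW_BYTE_OFFSET + byte


def char_to_raw_byte(code: int) -> int:
    """Inverse of raw_byte_to_char: recover the byte from the character code."""
    byte = code - RAW_BYTE_OFFSET
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("not a raw-byte character code")
    return byte


code = raw_byte_to_char(0x80)
print(code, oct(code), hex(code))  # 4194176 0o17777600 0x3fff80
```

Under that assumption the three forms printed match the "codepoint
4194176, #o17777600, #x3fff80" reported for the \200 byte, and
char_to_raw_byte(0x3FFF80) gives back the buffer code #x80.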

I've not been able to find any code that puts a 'display property on
the range 128 - 159 anywhere, so I wonder if the manual is out of
date?

Robert



