emacs-devel

Re: More Cyrillic vs UTF-8


From: Simon Josefsson
Subject: Re: More Cyrillic vs UTF-8
Date: Sat, 26 Apr 2003 23:47:21 +0200
User-agent: Gnus/5.090019 (Oort Gnus v0.19) Emacs/21.3.50 (gnu/linux)

address@hidden (Kai Großjohann) writes:

> Simon Josefsson <address@hidden> writes:
>
>> address@hidden (Kai Großjohann) writes:
>>
>>> Simon Josefsson <address@hidden> writes:
>>>
>>>> Richard Stallman <address@hidden> writes:
>>>>
>>>>> Mentioning this in PROBLEMS seems like a good idea to me, but a useful
>>>>> entry needs to be stated in terms of what behavior the user sees.
>>>>> This text doesn't explain the practical consequences; a user would say
>>>>> "so what does that mean for me?"
>>>>
>>>> Is this better?
>>>
>>> Can you say what characters you're talking about, instead of just the
>>> code points?  I guess that most people haven't memorized the Unicode
>>> table (yours truly included ;-).
>>
>> I agree, but I don't know which they are, and maybe the range includes
>> many different kinds of characters.  And as new characters are
>> added all the time, I fear that both the list of supported characters
>> and the list of unsupported characters would be too long to be useful.
>> Hm.
>
> Well, isn't Unicode divided into blocks so that one can list the
> blocks?  Hm.  Oh!  See http://www.unicode.org/charts/ -- looks quite
> promising.  Searching for the code blocks there and then giving the
> names ought to be useful.  WDYT?

The compiled list is below.  Does it really help anyone to list all of
them?

Supported:

Basic Latin
Latin-1 Supplement
Latin Extended-A
Latin Extended-B
IPA Extensions
Spacing Modifier Letters
Combining Diacritical Marks
Greek
Cyrillic
Cyrillic Supplement
Armenian
Hebrew
Arabic
Syriac
Thaana
Devanagari
Bengali
Gurmukhi
Gujarati
Oriya
Tamil
Telugu
Kannada
Malayalam
Sinhala
Thai
Lao
Tibetan
Myanmar
Georgian
Hangul Jamo
Ethiopic
Cherokee
Unified Canadian Aboriginal Syllabic
Ogham
Runic
Tagalog
Hanunoo
Buhid
Tagbanwa
Khmer
Mongolian
Latin Extended Additional
Greek Extended
General Punctuation
Superscripts and Subscripts
Currency Symbols
Combining Marks for Symbols
Letterlike Symbols
Number Forms
Arrows
Mathematical Operators
Miscellaneous Technical
Control Pictures
Optical Character Recognition
Enclosed Alphanumerics
Box Drawing
Block Elements
Geometric Shapes
Miscellaneous Symbols
Dingbats
Miscellaneous Mathematical Symbols-A
Supplemental Arrows-A
Braille Patterns
Supplemental Arrows-B
Miscellaneous Mathematical Symbols-B
Supplemental Mathematical Operators
CJK Radicals Supplement
Kangxi Radicals
Ideographic Description Characters
CJK Symbols and Punctuation
Hiragana
Katakana
Bopomofo
Hangul Compatibility Jamo
Kanbun
Bopomofo Extended
Enclosed CJK Letters and Months
CJK Compatibility
Private Use Area
CJK Compatibility Ideographs
Alphabetic Presentation Forms
Arabic Presentation Forms-A
Variation Selectors
Combining Half Marks
CJK Compatibility Forms
Small Form Variants
Arabic Presentation Forms-B
Halfwidth and Fullwidth Forms
Specials

Unsupported:

CJK Unified Ideographs Extension A (1.5MB)
CJK Unified Ideographs (5MB)
Yi Syllables
Yi Radicals
Hangul Syllables (7MB)
High Surrogates
Low Surrogates
Old Italic
Gothic
Deseret
Byzantine Musical Symbols
Musical Symbols
Mathematical Alphanumeric Symbols
CJK Unified Ideographs Extension B (13MB)
CJK Compatibility Ideographs Supplement
Tags
Supplementary Private Use Area-A
Supplementary Private Use Area-B
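
To illustrate how a code point maps onto one of the block names in the
two lists above, here is a minimal Python sketch.  It is only an
illustration, not anything Emacs does: the table holds a small sample
of blocks, the code point ranges are taken from the Unicode block
charts rather than from this message, and the names BLOCKS, block_of
and describe are made up for the example.  The supported flag simply
copies which list each block appears in above.

```python
# Illustrative sample of Unicode blocks: (first, last, name, supported).
# Ranges are assumed from the Unicode block charts, not from this message;
# the supported flag copies the Supported/Unsupported lists above.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin", True),
    (0x0080, 0x00FF, "Latin-1 Supplement", True),
    (0x0400, 0x04FF, "Cyrillic", True),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs", False),
    (0xAC00, 0xD7AF, "Hangul Syllables", False),
    (0xE000, 0xF8FF, "Private Use Area", True),
]


def block_of(code_point):
    """Return (name, supported) for code_point, or None if it is not in BLOCKS."""
    for first, last, name, supported in BLOCKS:
        if first <= code_point <= last:
            return name, supported
    return None


def describe(char):
    """Return a one-line, human-readable summary for a single character."""
    hit = block_of(ord(char))
    if hit is None:
        return "U+%04X: block not in this sample table" % ord(char)
    name, supported = hit
    status = "supported" if supported else "unsupported"
    return "U+%04X: %s (%s)" % (ord(char), name, status)
```

Calling describe on a Cyrillic letter and on a Hangul syllable shows
the kind of user-visible statement a PROBLEMS entry could make in terms
of block names instead of bare code points.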




