Re: [Gnu-arch-users] [semi-OT] Unicode / han unification (was Re: Spaces
From: Tom Lord
Subject: Re: [Gnu-arch-users] [semi-OT] Unicode / han unification (was Re: Spaces ...)
Date: Wed, 21 Jan 2004 17:56:25 -0800 (PST)
> From: David Brown <address@hidden>
> Korean is a bit more "annoying", since Unicode provides several
> different ways to encode a single glyph. There are two encodings in
> Unicode that take several Unicode code points and map to a single glyph.
> So, for example, 'Han' could be three code points, representing 'H',
> then 'A', then 'N'. There are two different encodings just for this.
> Then, given a complex set of rules, this can be spilled down to a single
> glyph for the syllable 'Han'. There is also a codepoint just for the
> symbol 'Han'.
> All this means is that, especially for Korean, determining whether two
> strings are equal is quite complex.
You are talking about "canonical combining forms" and other
canonicalization issues, yes?
Those issues are not at all unique to CJK. And yes, they are complex
enough to require considerable care to make them manageable --- but I
don't see any way to make things simpler than Unicode already does, do
you? The Unicode consortium has at least provided some strong hints
about how to do it.
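
To make the point concrete, here is a minimal sketch of comparing strings
under canonical equivalence. It assumes Python and its standard
`unicodedata` module, neither of which comes from this thread; the helper
name `canonically_equal` is made up for illustration. It shows the
Korean 'Han' case from the quoted message next to a Latin case, since
the issue is not specific to CJK.

```python
import unicodedata

# The syllable 'Han' as one precomposed code point (U+D55C).
precomposed = "\uD55C"
# The same syllable as three conjoining jamo:
# HIEUH (U+1112), A (U+1161), final NIEUN (U+11AB).
conjoining = "\u1112\u1161\u11AB"

# A non-CJK case: e-acute as one code point, and as 'e' plus a
# combining acute accent (U+0301).
e_single = "\u00E9"
e_combined = "e\u0301"


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain code point comparison treats the two spellings as different...
print(precomposed == conjoining)
# ...while comparison after normalization treats them as the same text.
print(canonically_equal(precomposed, conjoining))
print(canonically_equal(e_single, e_combined))
```

Normalizing both sides to the same form (NFC here; NFD would serve
equally well) is the usual way to apply those hints before comparing.
This sketch covers canonical equivalence only, not compatibility
equivalence or locale-specific collation.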
-t