bug-bash

Re: accents


From: Andreas Schwab
Subject: Re: accents
Date: Mon, 16 May 2011 10:44:23 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)

Chet Ramey <chet.ramey@case.edu> writes:

> That's a non sequitur.  My point is that, as I read it, UTF-8 requires the
> use of the shortest sequence that can represent a particular character.

The character is U+0301, and the shortest sequence is 0xcc 0x81.
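
To illustrate, here is a minimal Python sketch (the variable names are
made up for this example) that encodes U+0301 and checks that a strict
decoder refuses a longer, "overlong" three-byte form of the same code
point.  It assumes Python 3's built-in UTF-8 codec, which follows the
shortest-form rule.

```python
# U+0301 COMBINING ACUTE ACCENT, encoded on its own.
combining_acute = "\u0301"
encoded = combining_acute.encode("utf-8")
print(encoded.hex())  # cc81

# An overlong three-byte form of the same code point
# (1110_0000 10_001100 10_000001).  The shortest-form rule forbids it,
# so a strict decoder is expected to raise instead of returning U+0301.
overlong = bytes([0xE0, 0x8C, 0x81])
try:
    overlong.decode("utf-8")
    overlong_rejected = False
except UnicodeDecodeError:
    overlong_rejected = True
print(overlong_rejected)
```

The shortest-form rule applies per code point: it rules out the overlong
bytes above, but says nothing about which code points appear in the text.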

> In this case, that means that U+00E9 must be used to represent e with
> acute instead of e plus U+0301.

Those are two different Unicode character sequences.

> The point is that the utf-8 encodings of precomposed and decomposed
> unicode are different

Of course, they are different characters.  Encoding and normalisation
are different concepts, the former is about physical representation of
Unicode, the latter is about conversion between abstract Unicode values.
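
As a rough illustration of the difference, the following Python sketch
uses the standard unicodedata module (the variable names are only for
this example).  Encoding turns each code point sequence into bytes;
normalisation converts one code point sequence into another before any
encoding happens.

```python
import unicodedata

precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# Encoding: both sequences are valid, shortest-form UTF-8, but the bytes differ
# because the underlying code point sequences differ.
precomposed_bytes = precomposed.encode("utf-8")   # c3 a9
decomposed_bytes = decomposed.encode("utf-8")     # 65 cc 81
print(precomposed_bytes.hex(), decomposed_bytes.hex())

# Normalisation: a mapping between code point sequences, independent of UTF-8.
nfc = unicodedata.normalize("NFC", decomposed)    # composes to U+00E9
nfd = unicodedata.normalize("NFD", precomposed)   # decomposes to e + U+0301
print(nfc == precomposed, nfd == decomposed)
```

Neither normalisation form is implied by UTF-8 itself; a program that
wants the two spellings to compare equal has to normalise explicitly.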

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


