bug-bash

Re: Improper UTF-8 combining character handling


From: Andreas Schwab
Subject: Re: Improper UTF-8 combining character handling
Date: Tue, 12 Jun 2007 21:40:52 +0200
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.0.97 (gnu/linux)

Sean Burke <leftmostcat@gmail.com> writes:

> I've retried with 3.2-17 with the same results. Notably, the issue isn't
> (and has not been) that all multibyte characters are handled improperly.
> Instead, sequences which contain combining characters seem to treat the
> sequence inconsistently. For example, the character that represents D
> WITH DOT ABOVE, U+1E0A, is handled properly. However, the equivalent
> sequence U+0044 + U+0307, consisting of D and COMBINING DOT ABOVE, is
> not handled properly. Backspacing through the sequence removes both
> characters with one backspace, but only the COMBINING DOT ABOVE glyph is
> removed.

That looks like a bug in your terminal emulator.  The sequence U+0044
U+0307 should occupy exactly one screen column, because the second
character combines with the first one, i.e. it should render
identically to U+1E0A.  This works correctly with current versions of
xterm or konsole.
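
The one-column claim can be checked outside the terminal with a short
Python sketch. It is only an approximation of what a terminal or
wcwidth() computes: the helper name column_width is hypothetical, and
it assumes that any character with a nonzero canonical combining class
takes no column and that only East Asian wide or fullwidth characters
take two.

```python
import unicodedata


def column_width(text: str) -> int:
    """Rough estimate of the screen columns a string occupies.

    Assumption: combining marks occupy no column of their own, and
    East Asian wide/fullwidth characters occupy two columns.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn over the preceding base character.
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


decomposed = "D\u0307"   # U+0044 followed by U+0307 COMBINING DOT ABOVE
precomposed = "\u1e0a"   # U+1E0A LATIN CAPITAL LETTER D WITH DOT ABOVE

print(column_width(decomposed), column_width(precomposed))
# NFC normalization composes the two-character sequence into U+1E0A.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Both strings come out at one column under these assumptions, and they
are canonically equivalent, so a terminal that advances the cursor by
two columns for the decomposed form will disagree with the line editor
about where the cursor is.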

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



