Re: Improper UTF-8 combining character handling
From: Andreas Schwab
Subject: Re: Improper UTF-8 combining character handling
Date: Tue, 12 Jun 2007 21:40:52 +0200
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.0.97 (gnu/linux)
Sean Burke <leftmostcat@gmail.com> writes:
> I've retried with 3.2-17 with the same results. Notably, the issue isn't
> (and has not been) that all multibyte characters are handled improperly.
> Instead, sequences which contain combining characters seem to treat the
> sequence inconsistently. For example, the character that represents D
> WITH DOT ABOVE, U+1E0A, is handled properly. However, the equivalent
> sequence U+0044 + U+0307, consisting of D and COMBINING DOT ABOVE, is
> not handled properly. Backspacing through the sequence removes both
> characters with one backspace, but only the COMBINING DOT ABOVE glyph is
> removed.
That looks like a bug in your terminal emulator. The sequence U+0044
U+0307 should occupy exactly one screen column, because the second
character combines with the first one, i.e. it should render
identically to U+1E0A. This works correctly with current versions of
xterm or konsole.
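
To illustrate the column count, here is a minimal sketch in Python (not
the code bash, readline, or any terminal actually uses). The helper
display_width is a hypothetical name, and its rule is a simplifying
assumption: combining marks (categories Mn and Me) take zero columns,
East Asian wide and fullwidth characters take two, everything else one.

```python
import unicodedata

def display_width(text: str) -> int:
    """Approximate the number of terminal columns occupied by text.

    Simplified assumption: combining marks are zero-width, East Asian
    wide/fullwidth characters are two columns, all others are one.
    """
    width = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue  # combining mark: drawn over the preceding base character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

decomposed = "D\u0307"   # U+0044 followed by U+0307 COMBINING DOT ABOVE
precomposed = "\u1e0a"   # U+1E0A LATIN CAPITAL LETTER D WITH DOT ABOVE

print(display_width(decomposed), display_width(precomposed))
# NFC normalization composes the two-character sequence into U+1E0A.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Under these assumptions both forms come out one column wide, so a
single backspace should erase the whole cell in either case.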
Andreas.
--
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."