Re: severe problems with composite characters

From: Kenichi Handa
Subject: Re: severe problems with composite characters
Date: Wed, 17 Sep 2003 15:49:00 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Werner LEMBERG <address@hidden> writes:
> ======================================================================

> string-width() returns a wrong number if its argument string
> has composite characters.

> Consider two bytes strings 0xcd 0xeb, whose width is one since they
> are composed.

> On Emacs 20.7 string-width() returns 1.
> On Emacs 21.3.50 string-width() returns 2.

??? I've just checked this with 21.3.50, and string-width returns 1:

(string-width (decode-coding-string "\xcd\xeb" 'thai-tis620)) => 1

Please note that Emacs 21 no longer has composite
characters.  For instance, compose-region doesn't change the
characters in a region into a single composite character;
instead it just puts a `composition' text property on them.
The display routine checks this text property and displays
the sequence correctly.
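
As a rough sketch of the point above, the following composes the
same two Thai characters in a temporary buffer and reports what
changed.  The helper name my-compose-demo is made up for
illustration; only decode-coding-string, compose-region and
get-text-property are real Emacs functions.

```elisp
;; Hypothetical helper, not part of Emacs: compose two characters in a
;; temporary buffer and report what actually changed.
(defun my-compose-demo ()
  (with-temp-buffer
    (insert (decode-coding-string "\xcd\xeb" 'thai-tis620))
    (compose-region (point-min) (point-max))
    (list
     ;; Number of characters in the buffer after composing.
     (- (point-max) (point-min))
     ;; t if the first character carries a `composition' property.
     (and (get-text-property (point-min) 'composition) t))))
```

The buffer should still hold two characters after the call; the
composition is recorded only as a text property on them.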

I suspect that you evaluated something like this:

        (string-width "__some_composed_text__")

in the *scratch* buffer.  As the Lisp reader ignores any text
properties when reading a string expression in the *scratch*
buffer, the string given to string-width doesn't have a
`composition' property.
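
One way to see the difference, as a sketch only: measure the same
composed text once with its text properties and once without.  The
helper name below is invented for this example, and the exact widths
may depend on the Emacs version, so no particular values are claimed
here.

```elisp
;; Hypothetical helper, not part of Emacs.  Returns a list
;; (WIDTH-WITH-PROPERTIES WIDTH-WITHOUT-PROPERTIES).
(defun my-width-with-and-without-props ()
  (with-temp-buffer
    (insert (decode-coding-string "\xcd\xeb" 'thai-tis620))
    (compose-region (point-min) (point-max))
    (list
     ;; buffer-substring keeps the `composition' property.
     (string-width (buffer-substring (point-min) (point-max)))
     ;; This string has no text properties, much like a string
     ;; literal read from *scratch*.
     (string-width
      (buffer-substring-no-properties (point-min) (point-max))))))
```

If the second number is larger than the first, the missing
`composition' property accounts for the difference.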

> ======================================================================

> Suppose that composite characters are stored to a file with a
> multi-lingual coding-system. An example is TIS-620 characters with
> UTF-8 (or ctext).

> When Emacs reads the file, the composite characters are not composed
> since there is no post-conv function associated to the multi-lingual
> coding-system.

> Is this a bug?

As such a post-conv function is rather heavy, it is turned
off by default.  If you customize the variable
utf-8-compose-scripts to t, Thai characters should be
composed on decoding.
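
For example, something like the following in an init file should
have the same effect as customizing the variable interactively.
This is a sketch that assumes an Emacs build which still defines
utf-8-compose-scripts; other versions may not have this variable
at all.

```elisp
;; Assumption: this Emacs defines the user option
;; utf-8-compose-scripts (it is specific to certain versions).
;; customize-set-variable is used instead of setq so that any
;; setter attached to the option is run.
(customize-set-variable 'utf-8-compose-scripts t)
```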

But I've just found a bug in this facility, and installed a
fix.  Please update your working directory and try again.
Don't forget to run "make autoloads" in the "lisp" subdirectory.

Ken'ichi HANDA
