Re: utf-8 cjk translation bug?

From: Miles Bader
Subject: Re: utf-8 cjk translation bug?
Date: 06 Oct 2003 11:29:25 +0900

Jason Rumney <address@hidden> writes:
> > I would have expected them to have iso10646 fonts if they are using
> > utf-8 (for the sake of applications other than Emacs) but maybe that
> > isn't the case.
> I think the problem is not that they don't have iso10646 fonts, it is
> that the iso10646 fonts they do have do not contain any of the double
> width characters, including double width roman that is in the
> 2500-33ff range.

Yeah, that's definitely the case, and it's not just a problem with
double-width characters -- the coverage of many iso10646 fonts seems
completely crap.

E.g., see a post by `Danilo Segan' on this list.  It apparently contains
Cyrillic characters encoded in UTF-8, which Emacs dutifully tries to
render using an iso10646 font, but which show up as square boxes on my
display.

Here's the output of `C-u C-x =', in case anyone is interested:

     character: с (01212141, 332897, 0x51461, U+0441)
       charset: mule-unicode-0100-24ff
                (Unicode characters of the range U+0100..U+24FF.)
    code point: 40 97
        syntax: w       which means: word
      category: y:Cyrillic  
   buffer code: 0x9C 0xF4 0xA8 0xE1
     file code: 0x9C 0xF4 0xA8 0xE1 (encoded by coding system raw-text-unix)
       display: by this font (glyph code)
        -bitstream-bitstream vera sans mono-medium-r-normal--16-122-95-95-c-100-iso10646-1 (0x441)
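For anyone wondering how the `code point: 40 97' line relates to U+0441, here is a minimal Python sketch. It assumes that mule-unicode-0100-24ff is laid out as a 96x96 grid starting at U+0100, with position codes offset by 32, and that the last two bytes of the buffer code are those position codes with the high bit set. That assumption is consistent with the output above but is not taken from the Emacs sources, and the function name is made up for illustration.

```python
# Sketch only: assumes mule-unicode-0100-24ff is a 96x96 charset whose
# first cell is U+0100 and whose position codes start at 32.

def mule_unicode_0100_24ff_position(codepoint):
    """Return the (row, column) position codes for a code point in U+0100..U+24FF."""
    if not 0x0100 <= codepoint <= 0x24FF:
        raise ValueError("code point is outside U+0100..U+24FF")
    offset = codepoint - 0x0100
    return offset // 96 + 32, offset % 96 + 32

row, col = mule_unicode_0100_24ff_position(0x0441)
print(row, col)                            # 40 97, as in "code point: 40 97"
print(hex(row | 0x80), hex(col | 0x80))    # 0xa8 0xe1, the last two buffer-code bytes
print("\u0441".encode("utf-8"))            # b'\xd1\x81', the UTF-8 bytes of U+0441
```

The point is only that the character was decoded correctly: the numbers all agree with U+0441 (CYRILLIC SMALL LETTER ES), and the glyph code handed to the font is 0x441.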

`Suppose Korea goes to the World Cup final against Japan and wins,' Moon said.
`All the past could be forgiven.'   [NYT]

