freetype

Aw: Re: Re: Re: Native TTF name sometimes contains crap


From: virtual_worlds
Subject: Aw: Re: Re: Re: Native TTF name sometimes contains crap
Date: Sat, 4 Sep 2021 19:41:24 +0200

Yes, for sure, hex values are more accurate.

So ftdump returns "\U+009E\U+004F", which is the correct name, so ftdump
must be doing something I do not know about.

When I call the get-name function as shown, the returned value is
0x7e 0xd1 0x4f 0x53. So if this is a Mojibake problem: is ftdump working
around it? If yes, how?
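For reference, the four bytes above can be written in '0xXX' notation and
decoded as big-endian UTF-16, which is how the OpenType specification stores
Microsoft-platform name strings. This is only a sketch in Python, not what
ftdump does internally; the helper names hex_bytes and decode_name_utf16be
are made up for illustration.

```python
def hex_bytes(data: bytes) -> str:
    """Render raw bytes in '0xXX' notation so they survive e-mail transport."""
    return " ".join(f"0x{b:02X}" for b in data)


def decode_name_utf16be(data: bytes) -> str:
    """Decode a name-table string, assuming it is stored as UTF-16BE."""
    return data.decode("utf-16-be")


# The bytes reported in this message (assumed to be the complete string,
# since string_len was given as 4).
raw = bytes([0x7E, 0xD1, 0x4F, 0x53])

name = decode_name_utf16be(raw)
code_points = [f"U+{ord(ch):04X}" for ch in name]

print(hex_bytes(raw))
print(code_points)
```

Read this way, the four bytes form two 16-bit code units rather than four
single-byte characters.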


> Sent: Friday, 03 September 2021 at 15:16
> From: "Werner LEMBERG" <wl@gnu.org>
> To: virtual_worlds@gmx.de
> Cc: freetype@nongnu.org
> Subject: Re: Aw: Re: Re: Native TTF name sometimes contains crap
>
> > OK, so let's go through the font: when I decode it with ftdump, I
> > get the following entries for name and family:
> >
> >   font family (ID 1) [Microsoft] (language=0x0804):
> >     "\U+009E\U+004F"
> >   full name (ID 4) [Microsoft] (language=0x0804):
> >     "\U+009E\U+004F"
> >
> > When I read the related data via the freetype-functions, I get back
> >
> > string=žÑOSýýýýØ8hW
>
> Uh, oh, please tell us the byte values (in '0xXX' notation)!
> Everything else won't survive e-mail encoding/decoding without
> distortions.
>
> > string_len=4
> >
> > ...means the žÑOS-part of the string is valid. But this in no case
> > decodes to 0x9E 0x00 0x4F 0x00!
>
> Welcome to Mojibake hell. The following possibilities come to my
> mind; there are certainly even more possibilities to screw up.
>
> (1) Wrong byte order.
>
> (2) Wrong encoding, for example interpreting GB2312 characters as
>     UTF-8.
>
> (3) Ditto, but mixing up with UCS4 – or vice versa.
>
> It can also be a combination of (1) to (3).
>
> My advice: Forget it. Either suppress invalid data, or simply follow
> the 'garbage in, garbage out' principle.
>
>     Werner
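
One way to narrow down possibilities (1) to (3) from the quoted reply is to
run the raw bytes through several candidate decodings and compare the
resulting code points. The sketch below is a hypothetical diagnostic in
Python, not part of FreeType or ftdump; the function name try_decodings and
the list of candidate codecs are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional

# Candidate interpretations: both byte orders, plus the encodings
# mentioned in the quoted reply (GB2312, UTF-8, UCS4/UTF-32).
CANDIDATES = ["utf-16-be", "utf-16-le", "gb2312", "utf-8", "utf-32-be"]


def try_decodings(data: bytes) -> Dict[str, Optional[List[str]]]:
    """Return the code points for each candidate codec, or None on failure."""
    results: Dict[str, Optional[List[str]]] = {}
    for codec in CANDIDATES:
        try:
            text = data.decode(codec)
            results[codec] = [f"U+{ord(ch):04X}" for ch in text]
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding.
            results[codec] = None
    return results


# The bytes reported in the message above.
raw = bytes([0x7E, 0xD1, 0x4F, 0x53])
results = try_decodings(raw)

for codec, code_points in results.items():
    print(f"{codec:10s} {code_points}")
```

A decoding that succeeds is not necessarily the right one; the output only
shows which interpretations are possible at all for a given byte sequence.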


