Internationalising strings [was: Re: new string library]

From: John Darrington
Subject: Internationalising strings [was: Re: new string library]
Date: Sun, 11 Jun 2006 15:18:18 +0800
User-agent: Mutt/1.5.4i

On Sat, Jun 10, 2006 at 02:42:36PM -0700, Ben Pfaff wrote:

     > 2.  Obviously macros like CC_ALNUM are only correct for the C locale.
     >     Not a problem so long as everyone's aware of it, but naive
     >     programmers might make some mistakes ...
     I'm aware of the problem and trying to think of a good solution.

I've been thinking a bit about it too.  In the case of parsing input
syntax, I think the only solution is to convert the syntax to
(wchar_t *) using mbstowcs before doing anything with it.
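
As a rough sketch of that conversion step: the helper below is hypothetical (the name syntax_to_wide is not from the existing code), and it assumes the program has already selected the user's locale, for example with setlocale (LC_ALL, ""), so that mbstowcs interprets the bytes in the right encoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, for illustration only.
   Converts the multibyte string S, encoded according to the current
   locale, into a newly allocated wide string.
   Returns NULL if S contains an invalid multibyte sequence or if
   memory runs out.  The caller must free the result. */
wchar_t *
syntax_to_wide (const char *s)
{
  assert (s != NULL);

  /* Each wide character consumes at least one byte of S, so
     strlen (s) + 1 elements are always enough, including the
     terminating null wide character. */
  size_t max = strlen (s) + 1;
  wchar_t *w = malloc (max * sizeof *w);
  if (w == NULL)
    return NULL;

  /* mbstowcs returns (size_t) -1 on an invalid sequence. */
  if (mbstowcs (w, s, max) == (size_t) -1)
    {
      free (w);
      return NULL;
    }
  return w;
}
```

A caller would convert a syntax line once, hand the wide string to the lexer, and free it afterwards.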

Thus, functions like

  bool lex_is_id1(char c);  (from data/identifier.c)

become:

  bool lex_is_id1(wchar_t c);

Testing for alphanumeric characters is then a matter of calling
iswalnum from wctype.h.
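
A minimal sketch of what the wide-character versions might look like.  The exact set of extra characters allowed in identifiers (@, #, $, ., _) is an assumption here and should be checked against data/identifier.c; lex_is_idn and lex_is_id are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Returns true if C may be the first character of an identifier.
   Assumption: a letter in the current locale, or one of @ # $. */
bool
lex_is_id1 (wchar_t c)
{
  return iswalpha ((wint_t) c) || c == L'@' || c == L'#' || c == L'$';
}

/* Hypothetical companion: true if C may appear after the first
   character.  Assumption: alphanumerics plus @ # $ . _ */
bool
lex_is_idn (wchar_t c)
{
  return (iswalnum ((wint_t) c) || c == L'@' || c == L'#'
          || c == L'$' || c == L'.' || c == L'_');
}

/* Hypothetical convenience: true if the whole wide string S is a
   valid identifier under the two rules above. */
bool
lex_is_id (const wchar_t *s)
{
  assert (s != NULL);
  if (!lex_is_id1 (s[0]))
    return false;
  for (size_t i = 1; s[i] != L'\0'; i++)
    if (!lex_is_idn (s[i]))
      return false;
  return true;
}
```

Which non-ASCII letters iswalpha and iswalnum accept depends on the locale in effect, which is the point of the change: the classification follows the user's locale rather than being fixed to the C locale.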

At a pinch, we could convert only strings to wchar_t * (i.e. things
inside "" or ''), but it might be easier and just as efficient to
convert the entire syntax file or line.  Some strings will end up
being converted back again (e.g. variable names), but I don't think this
is too great a price to pay.

     Unfortunately, i18n seems to be difficult no matter what you do.

Yes, it's hard, largely because so many existing libraries don't
follow the rules. I was talking to a guy from Germany recently who had
a problem with a special-purpose compiler.  It turned out that under
the de_DE locale, one layer of the compiler was producing decimal
commas when the next layer was expecting decimal points, and crashing
when it got a comma it didn't expect.


PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See or any PGP keyserver for public key.
