[bug-gnulib] ISSLASH on Woe32

From: Bruno Haible
Subject: [bug-gnulib] ISSLASH on Woe32
Date: Wed, 27 Apr 2005 16:56:34 +0200
User-agent: KMail/1.5

Tor Lillqvist <address@hidden> brought this up:

On Woe32, the technique of searching for directory separators in strings
through the ISSLASH macro does not support non-ASCII pathnames in most CJK
locale encodings. Why? ISSLASH looks for a _byte_ with value 0x5C. However,
in these locale encodings

  Japanese: CP932 SHIFT-JIS
  Chinese:  GBK GB18030 BIG5 BIG5-HKSCS CP950
  Korean:   JOHAB

the byte 0x5C occurs as second byte of some multibyte characters. If such a
character is used inside a directory name, code that uses ISSLASH does not
work correctly. All gnulib modules that use ISSLASH are affected.
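
To illustrate, here is a minimal sketch, not actual gnulib code. The
ISSLASH definition is assumed to be the usual byte-wise one for Woe32, and
last_slash_bytewise() is a hypothetical helper. In CP932, the bytes
0x95 0x5C encode a single character (U+8868), so a byte-wise scan finds a
"separator" in the middle of it:

```c
#include <stddef.h>

/* Byte-wise separator test, assumed to match the usual Woe32 definition. */
#define ISSLASH(c) ((c) == '/' || (c) == '\\')

/* Hypothetical helper: return a pointer to the last byte accepted by
   ISSLASH, or NULL if there is none.  It looks at bytes only and knows
   nothing about multibyte characters.  */
static const char *
last_slash_bytewise (const char *path)
{
  const char *last = NULL;
  for (; *path != '\0'; path++)
    if (ISSLASH (*path))
      last = path;
  return last;
}
```

For the byte string "C:\\dir\\" "\x95\x5C" ".txt", the real last separator
is at offset 6, but this function returns a pointer to offset 8, the trail
byte of the two-byte character.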

What can we do?

  1) On Woe32, use 'wchar_t*' instead of 'char*' to denote pathnames.
     Use conditional macros like _TCHAR, _TEXT(), _tcslen() etc. that
     accommodate these platform differences without too many #ifs.

  2) On Woe32, expect UTF-8 encoded 'char*' strings to denote pathnames.

  3) Use mbtowc() to step through pathnames while looking for a backslash.

  4) Document this as a limitation. The workaround for the user is to
     switch to a UTF-8 locale.
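
As a rough sketch of what approach 3 could look like, assuming the program
has called setlocale (LC_ALL, "") so that the C library knows the locale
encoding: the loop below advances one multibyte character at a time and
only tests single-byte characters. It uses mblen() rather than mbtowc()
because only the character length is needed here. last_slash_mb() is a
hypothetical name, not an existing gnulib function.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the last directory separator
   in PATH, stepping by multibyte characters of the current locale, or
   NULL if there is none.  */
static const char *
last_slash_mb (const char *path)
{
  const char *end = path + strlen (path);
  const char *last = NULL;

  mblen (NULL, 0);              /* reset any shift state */
  while (path < end)
    {
      int n = mblen (path, (size_t) (end - path));
      if (n <= 0)
        n = 1;                  /* invalid sequence: skip a single byte */
      /* A separator is always a single-byte character, so a 0x5C that
         is part of a longer character is never tested.  */
      if (n == 1 && (*path == '/' || *path == '\\'))
        last = path;
      path += n;
    }
  return last;
}
```

Whether this gives the right answer depends on the C library decoding the
locale encoding correctly. In the "C" locale it behaves like the byte-wise
scan.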

The drawbacks are:

  1) Tons of code that deals with pathnames has to be changed to use
     typedef'ed types. Also, support for WindowsME and older is dropped.

  2) Extra code must be added for every system call to convert pathname
     arguments from UTF-8 to UTF-16, and pathname results from UTF-16
     to UTF-8. Also, the user of the gnulib modules must be aware of the
     semantic difference. Also, support for WindowsME and older is dropped.

  3) Tons of code that deals with pathnames has to be changed to use
     mbtowc(), _mbschr(), _mbsrchr() etc.

  4) For users in CJK locales on Woe32, the contents of directories with
     some non-ASCII pathnames are inaccessible to GNU tools.

Microsoft recommends approach 1. GNOME has chosen approach 2. I would
favour approach 4.

What do you think?
