Re: locale encodings
From: Bruno Haible
Subject: Re: locale encodings
Date: Tue, 5 Jul 2011 23:11:22 +0200
User-agent: KMail/1.9.9
Eric Blake wrote:
> The next version of POSIX will be enforcing that '/' and '.' are
> unambiguous across all POSIX encodings supported by all locales on a
> system
We already make use of it in lib/mbschr.c and lib/mbsrchr.c.
> There are, however, some non-POSIX encodings where '/' can appear as the
> second byte in a shift-state sequence encoder (ISO-2022-JP-2), although
> they are rare in practice these days.
They have been nonexistent for many years already. In 1999, Stephen Turnbull
had a web page describing some of the weird effects that he got with
non-ASCII file names in an ISO-2022-JP-2 locale. I think this was enough
to convince everyone that locales with stateful encodings are not practical.
> Also, if you worry about systems where backslash is a directory
> separator, there are encodings such as Shift_JIS where '\\' can appear
> as a second byte within a multi-byte character (hence, '\\' is
> ambiguous, even though '/' is not).
Yes, such locales exist, even on glibc systems, although such locales are
not ISO C 99 compliant.
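
To illustrate the Shift_JIS ambiguity, here is a minimal sketch in C. It is not the gnulib code from lib/mbsrchr.c; the helpers sjis_is_lead_byte and sjis_strrchr are hypothetical names made up for this example, and the lead-byte ranges (0x81..0x9F and 0xE0..0xFC) are an assumption that matches common Shift_JIS variants such as CP932. The point is only that a bytewise strrchr can stop on the 0x5C that is the second byte of a two-byte character, while a scan that steps over two-byte characters as units does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if BYTE is the first byte of a two-byte
   Shift_JIS character. The ranges are an assumption (CP932-style). */
static int sjis_is_lead_byte(unsigned char byte)
{
    return (byte >= 0x81 && byte <= 0x9F) || (byte >= 0xE0 && byte <= 0xFC);
}

/* Hypothetical helper: return a pointer to the last occurrence of the
   single-byte character C in the Shift_JIS string S, or NULL if there is
   none. Two-byte characters are skipped as a unit, so a trailing byte that
   happens to equal C is never reported as a match. */
static const char *sjis_strrchr(const char *s, char c)
{
    const char *last = NULL;

    assert(s != NULL);
    while (*s != '\0') {
        if (sjis_is_lead_byte((unsigned char) *s) && s[1] != '\0') {
            s += 2;             /* skip lead byte and trailing byte together */
        } else {
            if (*s == c)
                last = s;       /* a genuine single-byte match */
            s++;
        }
    }
    return last;
}
```

For example, with the byte string "dir\\" "\x83\x5c" ".txt" (a backslash separator followed by the two-byte character 0x83 0x5C), strrchr(name, '\\') points at the trailing byte of the two-byte character, whereas sjis_strrchr(name, '\\') points at the real separator. No such care is needed when searching for '/', since that byte never appears inside a multibyte character in these encodings.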
Bruno