[Bug-gnulib] Re: strtok_r
From: Simon Josefsson
Subject: [Bug-gnulib] Re: strtok_r
Date: Fri, 12 Nov 2004 17:21:16 +0100
User-agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.3.50 (gnu/linux)
Bruno Haible <address@hidden> writes:
> Simon Josefsson wrote:
>> considering
>> that, e.g., UCS-4 is a widely used multibyte encoding that is not
>> compatible with ASCII for any character.
>
> UCS-4 is not in the game here. A sequence of UCS-4 code points is not a
> char*, because
> 1) uint32_t[] and char[] have different alignment restrictions,
> 2) Even if you were to cast a uint32_t* to char*, strlen() of it is
> always <= 3, so it makes no sense to use the str* functions on them.
OK.
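
To make Bruno's second point concrete, here is a minimal sketch in C. The helper name ucs4_strlen_as_bytes is hypothetical and used only for illustration. It assumes the buffer holds valid code points (at most 0x10FFFF), so the most significant byte of every element is zero and strlen() must stop inside the first element, whatever the byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of gnulib or libc): report what strlen()
   sees when a UCS-4 buffer is viewed as a char string.  The buffer must
   contain at least one element holding a valid code point. */
static size_t ucs4_strlen_as_bytes(const uint32_t *ucs4)
{
    assert(ucs4 != NULL);
    /* Inspecting an object through a character pointer is permitted.
       Since code points never exceed 0x10FFFF, at least one of the
       first four bytes is zero, so the result is at most 3. */
    return strlen((const char *) ucs4);
}
```

On a little-endian machine the text "Hi" stored as { 0x48, 0x69, 0 } would yield 1, and on a big-endian machine 0, which is why the str* functions are of no use on such data.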
>> Can't we say:
>>
>> Caveat: It only supports one-octet delimiters. With many character
>> sets, non-ASCII characters cannot be used as delimiters.
>
> No. The point I'm making is: ONLY the ASCII characters from 0x00..0x2F are
> usable as delimiters in a locale-independent way. Even ASCII delimiters
> such as '@', '\' or '_' are not usable with strtok_r, strsep etc. !
Ah, I get your point now. That is a rather serious problem. Sigh.
Fortunately, I wasn't using delimiters >= 0x30, but that might have
been just luck.
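
A minimal sketch of the failure mode, assuming Shift_JIS input: there, KATAKANA LETTER SO is encoded as the two bytes 0x83 0x5C, and 0x5C is also the ASCII code for '\'. Other encodings reuse lower bytes still (GB18030 uses 0x30..0x39 inside four-byte sequences), which is where the 0x2F limit comes from. The helper names count_tokens and demo_shift_jis are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r under strict ISO C modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the tokens strtok_r() finds in buf.
   Note that strtok_r() modifies buf in place. */
static size_t count_tokens(char *buf, const char *delims)
{
    assert(buf != NULL && delims != NULL);
    size_t n = 0;
    char *save = NULL;
    for (char *tok = strtok_r(buf, delims, &save);
         tok != NULL;
         tok = strtok_r(NULL, delims, &save))
        n++;
    return n;
}

/* Shift_JIS bytes for "a", KATAKANA LETTER SO (0x83 0x5C), "b".
   The text contains no backslash character, only a trail byte that
   happens to have the same value as '\'. */
static size_t demo_shift_jis(void)
{
    char field[] = { 'a', (char) 0x83, 0x5C, 'b', '\0' };
    return count_tokens(field, "\\");
}
```

Here strtok_r() works byte by byte, so it cuts the two-byte character in half and reports two tokens for what is really a single field.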
I reverted the doc, and filed an updated glibc bugzilla request.
Thanks.