
[Bug-gnulib] Re: strtok_r


From: Simon Josefsson
Subject: [Bug-gnulib] Re: strtok_r
Date: Fri, 12 Nov 2004 17:21:16 +0100
User-agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.3.50 (gnu/linux)

Bruno Haible <address@hidden> writes:

> Simon Josefsson wrote:
>> considering
>> that, e.g., UCS-4 is a widely used multibyte encoding that is not
>> compatible with ASCII for any character.
>
> UCS-4 is not in the game here. A sequence of UCS-4 code points is not a
> char*, because
>   1) uint32_t[] and char[] have different alignment restrictions,
>   2) Even if you were to cast a uint32_t* to char*, strlen() of it is
>      always <= 3, so it makes no sense to use the str* functions on them.

OK.
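
A minimal sketch of point 2, for illustration only. The names
ucs4_strlen_as_bytes and ucs4_ab are hypothetical and not part of gnulib.
Every UCS-4 code point is at most 0x10FFFF, so at least one of its four
bytes is zero, and strlen() stops within the first element:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Length that strlen() reports when a UCS-4 buffer is viewed as bytes.
   Hypothetical helper, for illustration only. */
static size_t ucs4_strlen_as_bytes(const uint32_t *s)
{
    assert(s != NULL);
    /* Reading any object through a char pointer is permitted in C. */
    return strlen((const char *)s);
}

/* "AB" as zero-terminated UCS-4.  In memory this is 41 00 00 00 ... on a
   little-endian host and 00 00 00 41 ... on a big-endian one. */
static const uint32_t ucs4_ab[] = { 0x41, 0x42, 0 };
```

Calling ucs4_strlen_as_bytes(ucs4_ab) gives a value that depends on byte
order but never exceeds 3, although the string holds two characters.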

>> Can't we say:
>>
>>     Caveat: It only supports one-octet delimiters.  With many character
>>             sets, non-ASCII characters cannot be used as delimiters.
>
> No. The point I'm making is: ONLY the ASCII characters from 0x00..0x2F are
> usable as delimiters in a locale-independent way. Even ASCII delimiters
> such as '@', '\' or '_' are not usable with strtok_r, strsep etc. !

Ah, I get your point now.  That is a rather serious problem.  Sigh.
Fortunately, I wasn't using delimiters >= 0x30, but that might have
been just luck.
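
To make the problem concrete, here is a small sketch, assuming Shift_JIS
input, where the two-byte character 0x95 0x5C has a trail byte equal to
ASCII '\'.  The helper count_tokens and the sample sjis_sample are
hypothetical names used only for this illustration:

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count the tokens strtok_r finds in INPUT when splitting on the bytes in
   DELIMS.  Works on a private copy because strtok_r modifies its argument.
   Returns 0 if memory allocation fails. */
static size_t count_tokens(const char *input, const char *delims)
{
    assert(input != NULL && delims != NULL);
    size_t len = strlen(input);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, input, len + 1);

    size_t n = 0;
    char *save = NULL;
    for (char *tok = strtok_r(copy, delims, &save); tok != NULL;
         tok = strtok_r(NULL, delims, &save))
        n++;

    free(copy);
    return n;
}

/* 'a', then the Shift_JIS character 0x95 0x5C, then 'b'.  Read as
   Shift_JIS, this text contains no backslash character at all.  The
   literal is split so that 'b' is not parsed as part of the hex escape. */
static const char sjis_sample[] = "a\x95\x5c" "b";
```

Here count_tokens(sjis_sample, "\\") returns 2, because strtok_r compares
bytes and cuts the two-byte character in half, while
count_tokens(sjis_sample, "/") returns 1.  A delimiter below 0x30 never
collides with a trail byte in this way.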

I reverted the doc, and filed an updated glibc bugzilla request.

Thanks.




