ctype.h functions on bytes 0x80..0xFF

bug-bash

From:	Grisha Levit
Subject:	ctype.h functions on bytes 0x80..0xFF
Date:	Fri, 26 May 2023 05:55:23 -0400

On Mon, May 1, 2023 at 11:48 AM Chet Ramey <chet.ramey@case.edu> wrote:
>
> (And once we get these issues straightened out, if you look back to your
> original example, 0x240 is a blank in my locale, en_US.UTF-8, and will be
> removed from the input stream by the parser unless it's quoted.)

On recent macOS versions at least, the ctype.h functions seem to treat
bytes in [0x80..0xFF] the same way the wctype.h functions treat the
code points U+0080..U+00FF.  So while U+00A0 is a space character in
the en_US.UTF-8 locale, and iswspace(L'\u00A0') returns 1, it is also
the case that isspace(0xA0) returns 1.  But I don't think it's correct
to rely on the latter, since the single byte 0xA0 doesn't represent
_any_ character in the locale, much less a space.
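
The comparison above can be reproduced with a small C probe.  This is
only a sketch: `classify' and `struct byte_class' are names made up for
illustration, and it assumes wchar_t holds Unicode code points (true on
macOS and glibc, not guaranteed by the C standard).  The results depend
on the platform and on the locale selected with setlocale().

```c
#include <ctype.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical result record, not part of any library. */
struct byte_class
{
  int ctype_space;   /* 1 if isspace() accepts the byte */
  int is_char;       /* 1 if the byte alone is a complete character */
  int wctype_space;  /* 1 if iswspace() accepts the code point */
};

/* Classify a single byte and a wide character in the current locale.
   Assumption: WC is a Unicode code point stored in a wchar_t. */
static struct byte_class
classify (unsigned char byte, wchar_t wc)
{
  struct byte_class r;

  r.ctype_space = isspace (byte) != 0;
  r.is_char = btowc (byte) != WEOF;
  r.wctype_space = iswspace ((wint_t) wc) != 0;
  return r;
}
```

After setlocale (LC_ALL, "en_US.UTF-8"), calling
classify (0xA0, (wchar_t) 0x00A0) gives the three answers side by side.
If the behavior described above holds, ctype_space would come back as 1
even though is_char is 0.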

(I think that's the reason for the behavior Chet noted above, from a
previous thread.)

For example, these outputs would be correct with \uA0 in place of \xA0
below, but I don't think the current behaviour is expected:

$ eval $'printf "<%s>" [\xA0\xA0]'
<[><]>

$ [[ $'\xA0' == [[:space:]] ]]; echo $?
0

Perhaps on platforms like this it would be appropriate to mask ctype
results with something equivalent to `btowc(c) != WEOF'?
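
A minimal sketch of what such masking could look like, assuming a
wrapper is acceptable.  `byte_isspace' is a hypothetical name, not an
existing bash or libc function, and this is not a proposed patch:

```c
#include <ctype.h>
#include <wchar.h>

/* Hypothetical wrapper: report whitespace only when the byte C is, by
   itself, a valid character in the current locale.  btowc() returns
   WEOF for a byte that is not a complete single-byte character, so a
   stray 0xA0 in a UTF-8 locale would be rejected regardless of what
   isspace() says about it. */
static int
byte_isspace (int c)
{
  unsigned char uc = (unsigned char) c;

  return btowc (uc) != WEOF && isspace (uc) != 0;
}
```

ASCII bytes are unaffected, since each is a valid single-byte
character in the locales at issue.  The same mask could be applied to
the other ctype.h predicates.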

(See http://www.openradar.me/FB9973780 for an example of the issue in
an Apple-supplied program.)
