From: vincent-srcware at vinc17 dot net
Subject: [Bug binutils/27551] The default encoding of the strings utility does not conform to POSIX: should honor the current locale.
Date: Fri, 09 Apr 2021 16:48:20 +0000
https://sourceware.org/bugzilla/show_bug.cgi?id=27551
--- Comment #13 from Vincent Lefèvre <vincent-srcware at vinc17 dot net> ---
(In reply to Nick Clifton from comment #12)
> (In reply to Vincent Lefèvre from comment #10)
> Hi Vincent,
>
> > The bug is that:
> >
> > if (encoding == 's')
> > buf[0] = c & 0x7f;
> >
> > So the byte 0xc0 gets changed to 0x40, which is printable.
>
> No - this is the correct behaviour. The 's' encoding says that the
> characters in the file being examined are 7-bits long, not 8-bits. Hence
> when a byte is read only the bottom 7 bits should be considered when
> deciding if the character is printable.
Then the 's' encoding must not be the default for non-ASCII encodings.
> > % printf "\300\300\300\300" | ./strings | iconv
> > iconv: illegal input sequence at position 0
>
> But if we use your original test case and the patched strings:
>
> % printf "abcdéfghi" | ./strings | iconv
> abcdiconv: illegal input sequence at position 4
>
> % echo $LC_CTYPE
> C.UTF-8

With the patched strings, I get under Debian/unstable:

  zira% printf "abcdéfghi" | ./strings | iconv
  abcdéfghi
  zira% echo $LC_CTYPE
  C.UTF-8

Perhaps your system doesn't support the C.UTF-8 locale.

> Are you saying that the length parameter passed to mbtowc() should include
> the first NUL byte ?
No, mbtowc() needs the whole UTF-8 sequence. For "é", that's the two bytes
"c3 a9". This means that mbtowc() should be given up to MB_CUR_MAX bytes to be
sure to support all printable characters (4 is sufficient for Unicode
characters encoded in UTF-8).