bug-binutils



From: vincent-srcware at vinc17 dot net
Subject: [Bug binutils/27551] The default encoding of the strings utility does not conform to POSIX: should honor the current locale.
Date: Wed, 14 Apr 2021 14:35:18 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=27551

--- Comment #15 from Vincent Lefèvre <vincent-srcware at vinc17 dot net> ---
(In reply to Nick Clifton from comment #14)
> But that is the point.  The encoding of characters in the file being scanned
> is not known.  Using LC_CTYPE is incorrect, because that specifies how to
> display characters, not to read them.

This is not what POSIX says. Read again:

LC_CTYPE
    Determine the locale for the interpretation of sequences of bytes of text
    data as characters (for example, single-byte as opposed to multi-byte
    characters in arguments and input files) and to identify printable strings.

It says "interpretation of sequences of bytes of text data as characters", and
it explicitly mentions input files. Thus LC_CTYPE applies precisely to reading
(in addition to displaying).
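To illustrate what "interpretation of sequences of bytes as characters" means
in practice, here is a minimal Python sketch. The codec names only stand in
for the encodings that different LC_CTYPE settings would select (this mapping
is an assumption made for illustration; the sketch does not query the locale
and is not how strings works):

```python
# Illustration only: each codec stands in for the character encoding that
# some LC_CTYPE setting would select (an assumption, not a locale query).
data = b"caf\xc3\xa9"  # 5 bytes

# Multi-byte interpretation: the bytes C3 A9 form one character.
as_utf8 = data.decode("utf-8")

# Single-byte interpretation: every byte is its own character.
as_latin1 = data.decode("latin-1")


def decodes_as_ascii(raw: bytes) -> bool:
    """Return True if every byte of raw is a 7-bit ASCII character."""
    try:
        raw.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


print(len(as_utf8), len(as_latin1), decodes_as_ascii(data))
# prints: 4 5 False
```

The same five bytes yield four characters, five characters, or no valid text
at all, depending on the encoding assumed when reading them.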

Note that UTF-8 is commonly used nowadays, so honoring the locale is very
useful in practice. And if a sequence of 8-bit bytes forms a valid UTF-8
sequence, it is probably a real character. In practice, false positives for
UTF-8 are much rarer than false positives for ASCII (i.e. sequences of 7-bit
bytes that do not actually correspond to text).
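As a rough sketch of the idea (not the binutils implementation), the
following Python code extracts runs of printable characters from a byte
buffer under a given encoding. The helper names `decode_one` and
`printable_runs`, the minimum run length of 4, and the sample buffer are all
hypothetical choices made for this example:

```python
def decode_one(data: bytes, pos: int, encoding: str):
    """Try to decode one character starting at pos.

    Returns (character, byte_width), or (None, 1) if no valid character
    starts at pos. Widths 1..4 cover every UTF-8 sequence length.
    """
    for width in range(1, 5):
        chunk = data[pos:pos + width]
        try:
            text = chunk.decode(encoding)
        except UnicodeDecodeError:
            continue
        if len(text) == 1:
            return text, len(chunk)
    return None, 1


def printable_runs(data: bytes, encoding: str, min_len: int = 4) -> list:
    """Return runs of at least min_len printable characters."""
    runs, current = [], []
    pos = 0
    while pos < len(data):
        ch, width = decode_one(data, pos, encoding)
        if ch is not None and ch.isprintable():
            current.append(ch)
        else:
            # An invalid or non-printable byte ends the current run.
            if len(current) >= min_len:
                runs.append("".join(current))
            current = []
        pos += width
    if len(current) >= min_len:
        runs.append("".join(current))
    return runs


# Hypothetical sample: binary junk around ASCII and UTF-8 text.
sample = b"\x00\x01hello\x00na\xc3\xafve caf\xc3\xa9\xff\x80abc"

print(printable_runs(sample, "ascii"))  # ['hello', 've caf']
print(printable_runs(sample, "utf-8"))  # ['hello', 'naïve café']
```

With the ASCII interpretation the accented word is cut apart, while the
UTF-8 interpretation recovers it whole and still rejects the stray bytes
0xFF and 0x80, which cannot start a valid UTF-8 sequence.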

A user who wishes to stick with ASCII could still set LC_CTYPE to C.
