Re: read fails on null-byte: v4.1.7 FreeBSD 8.0 (amd64)
From: Matthew Story
Subject: Re: read fails on null-byte: v4.1.7 FreeBSD 8.0 (amd64)
Date: Wed, 23 Nov 2011 21:44:58 -0500
On Nov 23, 2011, at 7:09 PM, Chet Ramey wrote:
> On 11/23/11 6:54 PM, Matthew Story wrote:
>> On Nov 23, 2011, at 4:47 PM, Chet Ramey wrote:
>>
>>> On 11/23/11 9:03 AM, Matthew Story wrote:
>>>> [... snip]
>
> Yes, sorry. That's what the "bash treats the line read as a C string"
> was intended to imply. Since the line read is a C string, the NUL
> terminates it and what remains is assigned to the named variables. I
> should have used `line' in my explanation instead of `foo'.
I understand that the bash builtins are implemented in C, and I understand
that C strings are NUL-terminated. It seems unreasonable to me to expect users
to understand this implementation detail when using bash to read streams into
variables via the `read' builtin. Furthermore, neither the man page nor the
GNU website documents this behavior of bash:
read
read [-ers] [-a aname] [-d delim] [-i text] [-n nchars] [-N nchars]
[-p prompt] [-t timeout] [-u fd] [name ...]
One line is read from the standard input, or from the file descriptor fd
supplied as an argument to the -u option, and the first word is assigned to the
first name, the second word to the second name, and so on, with leftover words
and their intervening separators assigned to the last name. If there are fewer
words read from the input stream than names, the remaining names are assigned
empty values. The characters in the value of the IFS variable are used to split
the line into words. The backslash character ‘\’ may be used to remove any
special meaning for the next character read and for line continuation. If no
names are supplied, the line read is assigned to the variable REPLY. The return
code is zero, unless end-of-file is encountered, read times out (in which case
the return code is greater than 128), or an invalid file descriptor is supplied
as the argument to -u.
I personally do not read "One line" as meaning "One string of characters
terminated either by a NUL byte or a newline"; I read it as "One string of
characters terminated by a newline". But even "One string of characters
terminated either by a NUL byte or a newline" is not the actual behavior. The
actual behavior is:
"One line is read from the standard input, or from the file descriptor fd
supplied as an argument to the -u option, then read byte-wise up to the first
contained NUL, or end of string, ..."
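For anyone who wants to check this on their own build, here is a minimal sketch. The variable names `l', `line', and `verdict' are mine and purely illustrative. On the bash discussed in this thread I would expect the "truncated" branch; other versions or shells may take the other branch, so the script reports what it sees rather than assuming a result.

```shell
# Feed one line containing an embedded NUL byte to the read builtin.
# read runs inside a pipeline (possibly a subshell), so the value it
# assigned is printed and captured by the outer command substitution.
line=$(printf 'foo\0bar\n' | { IFS= read -r l; printf '%s' "$l"; })

# Classify the result without assuming a particular shell version.
case "$line" in
  foo)    verdict='truncated at the NUL' ;;
  foobar) verdict='NUL dropped, rest of the line kept' ;;
  *)      verdict='unexpected result' ;;
esac

printf 'read assigned %s characters: %s\n' "${#line}" "$verdict"
printf '%s\n' "$line" | od -a
```

The pipeline-plus-command-substitution shape is only there so the sketch does not need a temporary file; the interesting part is the single call to read.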
Furthermore, I do not see the use-case for this behavior ... I simply cannot
fathom a case of I/O redirection in shell where I would choose to inject a NUL
byte to coerce this sort of behavior from the read builtin, and can't imagine
that anyone is relying on this `C string' feature of read currently in bash,
especially considering that it is not consistent with NUL handling in other
assignments in bash:
[matt@matt0 ~]$ foo=`printf 'foo\0bar'`; echo "$foo" | od -a
0000000 f o o b a r nl
0000007
[bash ~]$ foo=$(printf 'foo\0bar'); echo "$foo" | od -a
0000000 f o o b a r nl
0000007
both of which strip the NUL.
I see one of three possible resolutions here:
1. NUL bytes do not terminate variable assignment from `read'; the behavior of
echo and other variable assignments stays as is.
2. NUL bytes are stripped by read on assignment, and this behavior is
documented as the expected behavior.
3. The existing behavior is documented in the man page and on gnu.org as the
expected behavior.
I would prefer the first, and would be happy to attempt a patch, if that's
useful.
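To make the second option concrete, here is a rough sketch of how it can be emulated from the calling side today, by deleting NUL bytes with tr before read ever sees the line. This is only an illustration of the resulting value, not a proposed patch, and the variable names `l' and `stripped' are hypothetical.

```shell
# Emulate "NUL bytes are stripped on assignment": remove every NUL
# byte from the stream with tr, then let read consume the cleaned line.
stripped=$(printf 'foo\0bar\n' | tr -d '\000' | { IFS= read -r l; printf '%s' "$l"; })

# Both halves of the input survive, matching what the command
# substitution examples above produce.
printf '%s\n' "$stripped" | od -a
```

With the NUL removed up front, the value no longer depends on how read itself treats the byte.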
cheers,
-matt
>
> Chet
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
> ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/
Additional Notes:
The only occurrence of the pattern `NUL' in the FreeBSD man-page for bash is:
Pattern Matching
Any character that appears in a pattern, other than the special pattern
characters described below, matches itself. The NUL character may not
occur in a pattern. A backslash escapes the following character; the
escaping backslash is discarded when matching. The special pattern
characters must be quoted if they are to be matched literally.
All other references in the man page are to the null string (the empty string),
not to an explicit NUL byte (i.e. ASCII 0). The same is true of the gnu.org
documentation.