bug-bash

Re: read fails on null-byte: v4.1.7 FreeBSD 8.0 (amd64)


From: Matthew Story
Subject: Re: read fails on null-byte: v4.1.7 FreeBSD 8.0 (amd64)
Date: Mon, 28 Nov 2011 23:00:50 -0500

Attached is a patch that discards null bytes during read. It preserves the 
functionality Greg demonstrated (not sure if this is desirable ...) wherein a 
delim of '' (e.g. -d '') splits on the null byte.

With patch, read functions this way:

bash-4.2$ printf 'foo\0bar\n' | while read line; do echo "$line"; done
foobar
bash-4.2$ printf 'foo\0bar\0' | while read -d '' line; do echo "$line"; done
foo
bar

I find this behavior incongruent with what I expect from setting things like 
IFS to the empty string (e.g. delim is every character), but it seems to be 
already in use.  I have a patch that makes read terminate the input line after 
every character for -d '', and after a null byte for -d '\0'.  If you are 
interested in that functionality, I'll send that patch for your consideration 
as well.
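
For reference, the existing -d '' behavior described above is commonly used to
consume NUL-terminated records. The following is only a sketch, not part of the
attached patches: the function name read_nul_records is hypothetical, and it
assumes a bash where -d '' terminates read at each null byte, as in the
transcript above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each NUL-terminated record from stdin on its own line.
read_nul_records() {
    local rec
    # IFS= preserves leading/trailing whitespace in each record,
    # -r keeps backslashes literal,
    # -d '' makes read stop at each null byte instead of at newline.
    while IFS= read -r -d '' rec; do
        printf '%s\n' "$rec"
    done
}

printf 'foo\0bar baz\0' | read_nul_records
```

A trailing record that is not followed by a null byte is not printed by this
loop, because read returns non-zero when it reaches end of input before the
delimiter.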


git am patch for read builtin

Attachment: 0001-Strip-null-bytes-from-read-when-DELIM-is-not.patch.gz
Description: GNU Zip compressed data


git am patch for man page and texi

Attachment: 0002-Update-documentation-both-man-and-info-to-reflect-re.patch.gz
Description: GNU Zip compressed data


I have patches for the generated documentation, but they are quite large.  If 
you want them, I'm happy to send them along as well.

cheers,
-matt



On Nov 24, 2011, at 12:08 AM, Chet Ramey wrote:

> On 11/23/11 9:44 PM, Matthew Story wrote:
>> 
>> On Nov 23, 2011, at 7:09 PM, Chet Ramey wrote:
>> 
>>> On 11/23/11 6:54 PM, Matthew Story wrote:
>>>> On Nov 23, 2011, at 4:47 PM, Chet Ramey wrote:
>>>> 
>>>>> On 11/23/11 9:03 AM, Matthew Story wrote:
>>>>>> [... snip]
>>> 
>>> Yes, sorry.  That's what the "bash treats the line read as a C string"
>>> was intended to imply.  Since the line read is a C string, the NUL
>>> terminates it and what remains is assigned to the named variables.  I
>>> should have used `line' in my explanation instead of `foo'.
>> 
>> I understand that the underlying implementation of the bash builtins is
>> `C', and I understand that `C' strings are NUL terminated.  It seems
>> unreasonable to me to expect understanding of this implementation detail
>> when using bash to read streams into variables via the `read' builtin.
> 
> I took a look around at Posix and some other shells.  Posix passes on
> the issue completely: the input to read may not contain NUL bytes at all.
> The Bourne shell, from v7 to SVR4.2, uses NUL as a line terminator.  Other
> shells, including ksh93, ash and pdksh derivatives like dash and mksh,
> discard NUL bytes in read.  zsh doesn't discard NULs and handles them
> pretty well, putting them into a variable's value.
> 
> The discard behavior seems fairly standard, and I will look at putting it
> into the next version of bash.
> 
> Chet
> 
> -- 
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>                ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/
> 
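
Since the shells surveyed above differ in how read treats null bytes, a script
that must behave the same everywhere can strip them explicitly before read sees
the data. This is a minimal sketch under that assumption and not something
proposed in the thread: the function name strip_nuls_first_line is
hypothetical, and tr -d '\000' is used to delete the null bytes.

```shell
# Hypothetical helper: print the first line of stdin with null bytes removed.
strip_nuls_first_line() {
    # tr deletes every null byte, so read never encounters one.
    tr -d '\000' | {
        IFS= read -r line
        printf '%s\n' "$line"
    }
}

printf 'foo\0bar\n' | strip_nuls_first_line
```

With the null bytes removed up front, the result does not depend on whether the
shell truncates at, discards, or preserves them.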

