bug-bash

Re: Possibly incorrect parsing of double doublequotes


From: Chet Ramey
Subject: Re: Possibly incorrect parsing of double doublequotes
Date: Wed, 11 May 2011 15:47:19 -0400
User-agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.17) Gecko/20110414 Thunderbird/3.1.10

On 5/9/11 8:06 PM, Mårten Wikström wrote:

>> Bash Version: 4.2
>> Patch Level: 8
>> Release Status: release
>>
>> Description:
>>        When parsing double double-quotes (i.e. """") it will be replaced by
>>        the value 0x7f, if there are characters before or after it. In bash
>>        4.1 empty double-quotes were simply removed.
>>
>> Repeat-By:
>>        $ echo """"a >t
>>        $ hexdump -C t
>>        00000000  7f 61 0a                                          |.a.|
>>        00000003
>>
> 
> Fix:
> After some debugging it turns out that the problem lies in
> expand_word_internal() in subst.c. In 4.1.0 the "" will be removed in
> expand_word_internal() when we hit line 8040:

Thanks for your investigation and analysis.  You've correctly identified
the place in the code that changed between bash-4.1 and bash-4.2 and the
place that needs to be fixed.

> 
>         /* We do not want to add quoted nulls to strings that are only
>            partially quoted; we can throw them away. */
>         if (temp == 0 && quoted_state == PARTIALLY_QUOTED)
>           continue;
> 
> However, in 4.2.10 the "" will be converted to CTLNUL (0x7f), because the
> above code has changed into
> 
>         /* We do not want to add quoted nulls to strings that are only
>            partially quoted; we can throw them away. */
>         if (temp == 0 && quoted_state == PARTIALLY_QUOTED &&
>             (word->flags & (W_NOSPLIT|W_NOSPLIT2)))
>           continue;
> 
> which won't match our case (the only flag set in word->flags is
> W_QUOTED). So instead we fall down to add_quoted_string: and it will
> add the CTLNUL character. So we end up with two 0x7f bytes in the
> resulting string when we get back to shell_expand_word_list(). Later
> only the first 0x7f will be removed by
> word_list_remove_quoted_nulls().

Correct.

> 
> There are two problems/solutions here. The comment in the code above
> seems to indicate that the quotes should actually be thrown away as is
> done in 4.1. But on the other hand, word_list_remove_quoted_nulls()
> seems to indicate it should remove all nulls, not just the first.
> If I fix word_list_remove_quoted_nulls() to actually remove all
> consecutive nulls, the problem is solved. (At least my simple test-case
> works). If I revert the line above to the 4.1 version it also solves my
> problem.

Unfortunately, that will not work.  You can't throw away the empty strings
unless you're sure that you won't be performing word splitting.  The best
example is

f=" val" e=
echo "$e"$f

which should result in two fields, the first of which is the empty string.
Bash-4.1 got that wrong.

> 
> Alas, my understanding of the bash code is fairly limited so my fixes
> will likely break something. Perhaps someone with a little more
> insight could tell the right(tm) solution.
> 
> Anyway, here are the patches.
> Solution 1, fixing remove_quoted_nulls():
> 
> *** subst.c   2011-05-10 01:48:54.816322136 +0200
> --- ../bash-4.2-patched/subst.c       2011-05-10 01:53:31.350806960 +0200
> *************** remove_quoted_nulls (string)
> *** 3706,3712 ****
>           break;
>       }
>         else if (string[i] == CTLNUL)
> !     i++;
> 
>         prev_i = i;
>         ADVANCE_CHAR (string, slen, i);
> --- 3706,3713 ----
>           break;
>       }
>         else if (string[i] == CTLNUL)
> !         while (string[i] == CTLNUL)
> !           i++;
> 
>         prev_i = i;
>         ADVANCE_CHAR (string, slen, i);

It's the right place, but the wrong fix.  The code as it reads in bash-4.2
skips over each character immediately following a CTLNUL.  If a sequence
of CTLNULs appears, it skips every other one.  I attached a patch that does
the right thing.

This bug has been there for a long time -- I stopped looking when I got
back to bash-3.0.  It was just masked by the code in
expand_word_internal().

> Solution 2, reverting to 4.1 behaviour:

As above, that doesn't do the right thing in all cases.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/

Attachment: quoted-nulls.patch
Description: Source code patch

