bug-bash

From: Stahlman Family
Subject: Re: Unexpected behavior observed when using ${parameter[@]/pattern/string} construct
Date: Sat, 20 Jan 2007 12:00:43 -0600


> Note that the goal in the examples below is to prepend "-iname '" (portion
> within double quotes only) to each of the 2 elements in the original array,
> without changing the number of words. i.e., the new array should contain the
> following 2 words:
> -iname 'abc
> -iname 'def

Thanks for your careful analysis and detailed description of the problem.
With that information, I have been able to fix the bugs reported in your
message.

Excellent! Thanks.


In the meantime, you can use
eval a2="${a[@]/#/"-iname '"}"

to get the behavior you want.

Assuming you meant (note the added parens)

eval a2="(${a[@]/#/"-iname '"})"

which, according to shell-expand-line, expands to

eval a2=("-iname '"abc "-iname '"def)

(i.e., 2 words assigned to the array)

I agree that this works. However, it relies on the fact that the `stripdq'
argument is *not* set in a call to string_extract_double_quoted when a patsub
rhs is being parsed. (In fact, string_extract_double_quoted does not appear to
be called at all for the nested double quoted string in the rhs of a patsub.)
When the rhs of a "+", "-", or "=" param construct is being processed,
parameter_brace_expand_rhs calls string_extract_double_quoted with the
`stripdq' arg set; the stripped string is eventually processed by
expand_word_internal. For a patsub, however, the nested string in the rhs will
reach expand_word_internal without a prior call to
string_extract_double_quoted with `stripdq' arg set (call sequence:
parameter_brace_patsub -> expand_string_to_string_internal ->
expand_string_unsplit -> expand_string_internal -> call_expand_word_internal
-> expand_word_internal), and expand_word_internal *never* calls
string_extract_double_quoted with the `stripdq' arg set. The nested double
quotes are simply added to the string via the add_character mechanism.

The result of all this is that for the "+", "-", and "=" parameter expansion
constructs, nested double-quote pairs in the rhs are stripped, but for patsub
constructs they are not. Is this difference by design?

Here are a couple of command-line examples showing the different handling of
nested double-quotes in two different types of parameter expansions:

$ a=(1 2 3)

# Note that the nested double quotes are preserved in this case...
$ echo "${a[@]/#/"var"}"
"var"1 "var"2 "var"3

# ... but are stripped in this case.
$ echo "${undef_var-"var"}"
var
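For anyone who wants to avoid depending on how nested double quotes are treated in a patsub rhs, one possible approach is to keep the prefix in a variable so that no quote characters appear inside the replacement at all. This is only a sketch of that idea, not something proposed in this thread; the variable name `prefix` and the sample array contents are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Illustrative sample data (the thread's example uses the elements abc and def).
a=(abc def)

# Keep the awkward text (a space and an unbalanced single quote) in a variable
# so that no quote characters appear inside the ${.../#/...} replacement.
prefix="-iname '"

# An empty pattern anchored with # matches at the start of each element,
# so the replacement is prepended; the outer double quotes keep one word per element.
a2=("${a[@]/#/$prefix}")

printf '%s\n' "${#a2[@]}"    # number of words in the new array
printf '<%s>\n' "${a2[@]}"   # one bracketed line per element
```

With the two-element array above, this should report 2 words, with `-iname '` prepended to `abc` and to `def`, and no eval is involved.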


I apologize for the delay; the demands on my time are such that I do not
generally respond to bug reports until I have had a chance to investigate
them.

I understand completely. I appreciate your having taken the time to look into
it...


[...]

The POSIX section on "double-quotes" requires that both single and double
quotes be balanced within the rhs of the parameter construct. This at least
implies that something akin to single and double quoted strings are permitted
within the parameter replacement. However, both the examples I have tried and
inspection of parameter_brace_expand_rhs indicate that if the entire parameter is
within double quotes, nested quotes don't really begin a nested string. The
rhs is parsed as a double quoted string in which unquoted double quotes are
simply discarded, and single quotes are retained literally. The only reason I
can see for using double quotes within the replacement is to allow an
unbalanced single quote to appear in the rhs. Single quotes themselves are not
treated specially at all. This raises the question: If single quotes are not
special within the rhs of a double quoted parameter construct, why are they
required to be balanced? Perhaps it is to allow the same rule to be applied to
the case of double-quoted and non-double-quoted parameter constructs?

It's much simpler (and more complicated) than that.  The POSIX rule is intended
to allow the historical Bourne shell double-quoted string parsing, which, to a
great extent, is maintained in the Korn shell.  The Bourne shell has a single
quoting context, regardless of whether or not it's parsing a parameter
expansion within braces.  A double quote within the braces terminates a
double-quoted string begun outside the braces.

Bash, on the other hand, begins a new quoting context inside braces.  Single-
and double-quoted strings are recursively parsed according to the appropriate
rules.
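The effect of the surrounding quoting context on the rhs of a "-" expansion can be checked with a short script. This is a minimal sketch using the `undef_var` and `var` names from the examples in this thread; the result variables `quoted`, `unquoted`, and `nested` are illustrative names only.

```shell
#!/usr/bin/env bash
unset undef_var
var=foo

# Whole expansion inside double quotes: the single quotes are kept
# literally and $var is expanded.
quoted="${undef_var-'$var'}"

# Expansion not inside double quotes: the single quotes act as quotes,
# so $var stays literal and the quote characters are removed.
unquoted=${undef_var-'$var'}

# Nested double quotes in the rhs of "-" are dropped from the result.
nested="${undef_var-"var"}"

printf '%s\n' "$quoted" "$unquoted" "$nested"
```

The three results should be `'foo'`, `$var`, and `var` respectively, which matches the behavior described in this thread for the "-" construct.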

Understood. (Provided that the phrase "the appropriate rules" does not refer to
the respective top-level rules.) I wonder whether anyone has ever floated the
idea of supporting a `compatible' option, which would determine whether Bourne
shell compatibility is required in such contexts. With such an option, one
could put

set +o compatible

at the top of a script to enable a more natural form of nested string
processing. For example,

var=foo

# Non-compatible parsing
set +o compatible
echo "${undef_var-'$var'}"
$var

# Compatible parsing (current behavior)
set -o compatible
echo "${undef_var-'$var'}"
'foo'

I realize you may be too busy even to contemplate a change like this now
(especially in light of the time and frustration that was surely involved in
attaining the current level of sh compatibility); however, it's just something
for the wish-list...



However, there is a catch:  we have to deal with the sh parsing artifact.
That is the reason for the `stripdq' argument to string_extract_double_quoted
and the immediate call to that function in parameter_brace_expand_rhs.  It
took a long time and a lot of experimenting to get that "right enough".

Understood, but see earlier note on the difference between ${parameter-word}
and ${parameter/pattern/string} constructs with respect to
string_extract_double_quoted and the `stripdq' argument.

Thanks for all your help,
Brett S.


Chet

/~chet/



