
Re: Changing the way bash expands associative array subscripts


From: Koichi Murase
Subject: Re: Changing the way bash expands associative array subscripts
Date: Sat, 17 Apr 2021 01:11:28 +0900

On Fri, Apr 16, 2021 at 1:04, Chet Ramey <chet.ramey@case.edu> wrote:
> On 4/13/21 11:11 PM, Koichi Murase wrote:
> > On Wed, Apr 14, 2021 at 0:24, Chet Ramey <chet.ramey@case.edu> wrote:
> >> On 4/13/21 5:01 AM, Koichi Murase wrote:
> >>> But I expected some design consideration enabling a[$key] for an
> >>> arbitrary key in the indirect expansions and namerefs.
> >>
> >> Why? Why should the shell carry around (and expect the user to remember)
> >> information about the original value assigned to a variable pre-expansion?
> >
> > I didn't mean we should preserve the syntactic information, which I
> > don't think is a good idea either.  Maybe we could always treat
> > `assoc[@]' as a single element reference.
>
> Do you mean, for example, in assignment statements?

In that paragraph, I was just thinking of indirect expansions, name
references, and unset, but this was a subset of my suggestion in
https://lists.gnu.org/archive/html/bug-bash/2021-04/msg00123.html
where I thought of changing 'assoc[@]' to a single element reference
in indirect expansions, namerefs, `unset -v', `test -v', `printf -v',
etc., while keeping the current interpretation in assignments,
parameter expansions, and arithmetic expressions (see the above link
for details).  But this is of course not a perfect solution.

> I can see it in that
> case, and for  `unset', and maybe for `test', because of the semantics of
> builtin commands.
>
> But if you take it farther than that, then there is no way to get the
> elements of an associative array without introducing yet more syntax. My
> guess is there are many, many more cases where ${assoc[@]} is intended to
> mean "all the array elements" instead of "the element '@'", and I'm not
> interested in that kind of incompatibility.

I agree that we shouldn't change the current behavior of ${assoc[@]}
being expanded to all the elements.

> >> Under what circumstances should they not be expanded? Because the
> >> indirect expansions of array subscripts still undergo a single set
> >> of word expansions.
> >
> > From the users' point of view, indirect expansions and name references
> > currently undergo "double expansions", once at assignment time and
> > once at reference time; I mean naive users will write « iref=a[$key] »
> > instead of « iref='a[$key]' » and run « echo "${!iref}" » to find that
> > the original right-hand side of the assignment has been
> > double-expanded by reference time.
>
> That's the point of indirect expansions! Even in the most basic use case:
>
> foo=bar
> bar=qux
> v=$foo
> echo ${!v}
>
> nobody should be surprised to see a `double expansion'.

In that case, I agree that no one would be surprised by the double
expansion of the *variable names* because that is the purpose of
indirect expansions and namerefs.  But I don't think everyone would
necessarily expect the *subscripts* to be double-expanded as well.
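
To make the distinction concrete, here is a minimal sketch.  The names
a, x, key, first, and second are illustrative only, and the behavior
is assumed to be that of bash without `assoc_expand_once' enabled:

```shell
declare -A a
x=hello
key='$x'                  # the key is the two characters $ and x
a[$key]='literal key'     # in an assignment the subscript is expanded once
a[hello]='expanded key'   # a second element, keyed by the value of x

iref=a[$key]              # unquoted: iref now holds the text a[$x]
first=${!iref}            # the subscript $x is expanded again, selecting a[hello]

iref='a[$key]'            # quoted: iref holds the text a[$key]
second=${!iref}           # $key is expanded once, selecting the element keyed $x

printf '%s\n' "$first" "$second"
```

With the unquoted assignment the subscript is expanded at assignment
time and again at reference time; with the quoted one it is expanded
only at reference time.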

> I suppose you can't protect people from their bad assumptions,

Right.

> Don't limit yourself to `unset'. The same thing happens with test and
> other builtin commands because of the set of word expansions they undergo
> before the builtin even sees its arguments.

Yes. I think the other builtins that take variable names should also
behave the same as `unset', i.e., they should expand the variable
names themselves so that « test -v 'a[$key]' », « printf -v 'a[$key]'
... », « read 'a[$key]' », etc. would work.

> > I actually agree with konsolebox that assoc_expand_once for unset
> > shouldn't be defaulted.  The option `assoc_expand_once' is incomplete
> > in the sense that the behavior of `a[@]' and `a[*]' is subtle.  I see
> > the current default behavior (with `assoc_expand_once' turned off)
> > as more consistent and cleaner.
>
> Yeah, maybe. But explaining the requirements for quoting things in multiple
> ways is confusing, even to experienced users, and leads to knee-jerk
> overreactions like "don't use associative arrays ever" that don't help
> anyone. That's one of the motivations for this entire discussion.

Yes, but if `unset' is defaulted to `assoc_expand_once' behavior while
indirect expansions and namerefs aren't changed, users still need to
remember two different ways of quoting: « unset "a[$key]" » versus «
iref='a[$key]'; echo "${!iref}" » and « declare -n nref='a[$key]';
echo "$nref" ». My initial understanding was that indirect expansions
and namerefs would also be changed, so that the quoting rules are
unified on « "a[$key]" » rather than « 'a[$key]' ».

> > and we can tell users to always write « unset 'a[$key]' », «
> > iref='a[$key]'; echo "${!iref}" » and « declare -n nref='a[$key]';
> > echo "$nref" ».
>
> That's what I thought people would do with arrays in the first place. It
> obviously hasn't worked out the way I anticipated.
>
> So the current state has people doing
>
> declare -A assoc
> key='x]'
>
> assoc[$key]=hello
> declare -p assoc
>
> unset assoc["$key"]

Those whom I called ``naive users'' in previous replies include people
writing `unset' in the above way.  In this sense, to me, making «
unset "assoc[$key]" » safer is precisely a change that makes the
behavior friendlier to naive users.

> declare -p assoc
>
> and it just plain doesn't work, even with assoc_expand_once.
>
> `assoc_expand_once' wasn't intended to "make the behavior more friendly
> to naive users." It was intended to make the behavior safer and avoid
> unintended word expansions that possibly included command substitutions.

From my point of view, indirect expansions and name references are also
places where unintended word expansions (including command
substitutions) can be triggered by naive users.  For example, something
like

key=$(< untrustworthy-file.txt) # can be e.g. key='$(echo injected >&2)'
iref=a[$key]
echo "${!iref}"
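
The following variant is a minimal sketch of the same problem with a
harmless payload, so that the effect can be observed directly.  The
payload string and the names unsafe and safe are made up for
illustration, and bash without `assoc_expand_once' is assumed:

```shell
declare -A a
key='$(echo pwned)'       # stand-in for the contents of an untrusted file
a[$key]='stored under the literal key'
a[pwned]='reached only by running the payload'

iref=a[$key]              # unquoted: the payload text is copied into iref
unsafe=${!iref}           # the subscript is expanded again, so the payload runs

iref='a[$key]'            # quoted: iref holds only the text a[$key]
safe=${!iref}             # $key is expanded once and used as a literal key

printf '%s\n' "$unsafe" "$safe"
```

In the unquoted case the command substitution inside the key is
executed when the indirect reference is resolved; in the quoted case
the key is only ever treated as data.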

Anyway, I'd like to vote for the "rational" semantics for the default
behavior in which we can write « unset -v 'a[$key]' », « test -v
'a[$key]' », « printf -v 'a[$key]' ...», « read 'a[$key]' », «
iref='a[$key]' », « declare -n nref='a[$key]' », etc. (though I admit
that the details of "rational" semantics can be slightly different for
each person).  For the option `assoc_expand_once', I don't know of a
perfectly consistent solution, but one possible (incomplete) approach
might be to treat 'assoc[@]' in many places as a single element
reference under `shopt -s assoc_expand_once':

  https://lists.gnu.org/archive/html/bug-bash/2021-04/msg00123.html

Or another solution might be introducing special syntactic treatment
of `unset' arguments:

  https://lists.gnu.org/archive/html/bug-bash/2021-03/msg00068.html
  https://lists.gnu.org/archive/html/bug-bash/2021-04/msg00058.html
  https://lists.gnu.org/archive/html/bug-bash/2021-04/msg00066.html
  https://lists.gnu.org/archive/html/bug-bash/2021-04/msg00134.html
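
For reference, a minimal sketch of the single-quoted forms mentioned
above, as they apply to indirect expansions and namerefs.  The names
via_iref and via_nref are illustrative only, and the key is the one
from the quoted example:

```shell
declare -A a
key='x]'                    # a key containing a closing bracket
a[$key]=hello               # in an assignment the subscript is expanded once

iref='a[$key]'              # single quotes: store the text a[$key] itself
via_iref=${!iref}           # $key is expanded when the reference is resolved

declare -n nref='a[$key]'   # the nameref target is also stored unexpanded
via_nref=$nref              # and its subscript is expanded at reference time

printf '%s\n' "$via_iref" "$via_nref"
```

Because the subscript is stored unexpanded, both references follow
whatever value key has at the time they are resolved.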

--
Koichi


