Re: Incorrect alias expansion within command substitution

From: Robert Elz
Subject: Re: Incorrect alias expansion within command substitution
Date: Thu, 03 Feb 2022 01:30:48 +0700

    Date:        Wed, 2 Feb 2022 11:38:30 -0500
    From:        Chet Ramey <chet.ramey@case.edu>
    Message-ID:  <7c422cbb-bba8-0a57-a565-eeb115120e45@case.edu>

  | > How accurately can you reconstitute?   That is, can you maintain the
  | > difference between $(a b) and $( a b ) for example ?   How about $(a  b) ?
  | Does it make a semantic difference?

No, but that's not the point.

  | The only thing I can think of that
  | might mess it up is when you have something like
  | alias l='(' r=')'
  | echo $(l 4+4 r)
  | and you intend, for whatever perverse reason, to have this parsed as an
  | arithmetic expansion. I'm not sure that's worth supporting.

Supporting that would be wrong.   Aliases are only expanded in the command
word position (which includes the 'l' there but never the 'r'; that would need
a newline before it, though you could add one, it wouldn't affect the
interpretation as arithmetic).   But to be in the command word position for
either of them, the $( must be a command substitution; that decision must
already have been made, and once it is, it is...   It cannot turn back into
arithmetic because the l happens to become '(', leaving two (( chars together;
tokenisation does not go back and re-examine chars that were already examined.
The $( was already found, the next ( is a whole new token.

That is, you should never save the alias expanded input as literal text,
to be combined with anything else, and scanned again later.  That's simply
wrong.  When an alias is expanded, the right side can combine with what
follows, as that hasn't been read yet (beyond it ending the token which turns
out to be the alias), but there can never be a joining to the left.

So, no, definitely not worth "supporting", that can never be arithmetic.
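
To illustrate the command word rule, here is a minimal sketch.  The alias
name greet is made up for the example, and the shopt line assumes bash (other
shells have no shopt, hence the guard):

```shell
# Assumption: a non-interactive bash needs expand_aliases turned on.
# Shells without shopt fail quietly here and carry on.
shopt -s expand_aliases 2>/dev/null || :

alias greet='echo hello'   # hypothetical alias, for illustration only

# Aliases are looked up when a command is read, so the alias is used on
# lines that come after the one that defines it.
first=$(greet)         # "greet" is the command word: it is expanded
second=$(echo greet)   # "greet" is an argument: it is left alone

printf '%s\n' "$first" "$second"
```

In a shell that behaves as described, this should print hello and then greet:
only the word in command position is replaced.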

The case I had in question with the question about $( a b ) was this

        cat <<$( a b )
        $( a b )

which works in bash 5.1.16, but doesn't in 5.2-(December 16).  In the
latter the end line needs to be $(a b).   The spaces are lost - reconstituting
really doesn't work for this purpose.   (nb: no aliases involved anywhere.)

Reconstituting is fine for -x tracing, and other similar purposes, but
not here where the exact literal text is needed.
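
A rough way to check what a given shell does with that example is to hand it
to a child shell and see where the here-document ends.  This is only a sketch:
the function name heredoc_delim_probe and its variable names are invented
here, and sh is assumed to be whatever shell is to be tested.

```shell
# heredoc_delim_probe [SHELL]
# Prints "verbatim" if SHELL (default: sh) ends the here-document at a line
# that repeats the delimiter word exactly as typed, else "not verbatim".
heredoc_delim_probe() {
    probe_shell=${1:-sh}
    # Single quotes: nothing in the script is expanded by the calling shell.
    probe_script='cat <<$( a b )
body
$( a b )
echo after'
    # Errors from the child shell are discarded; only its stdout is compared.
    probe_out=$("$probe_shell" -c "$probe_script" 2>/dev/null)
    if [ "$probe_out" = "body
after" ]; then
        echo verbatim
    else
        echo 'not verbatim'
    fi
}

heredoc_delim_probe sh
```

"not verbatim" covers both a shell that rejects the redirection and one that
only accepts some other spelling of the end line; telling those apart would
need a second probe.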

For what it is worth, ksh93 (the version I have anyway) doesn't even
get started...

ksh93 $ cat <<$( a b )
/usr/pkg/bin/ksh93: syntax error at line 1: `<<b' here-document not contained 
within command substitution

And I hate to even imagine what state it got itself into to produce that.

yash and zsh get this example correct, along with released bash.   Nothing
else I tested does - everything other than bosh (everything includes the
NetBSD sh) generates an error of some kind on the here doc redirect.   Bosh
generates one at the first extra newline after the apparent end delimiter
(ie: given the above input, it doesn't complain, just sits there waiting
for more - type a \n at it, and then it complains about a syntax error).

  | You need to be able to reconstruct the text of arbitrary commands for
  | `set -x'. It's a very short step from that to rebuilding a command
  | substitution.

There are some things that -x typically does not show anything like
as input (in bash, try a case statement for example).   What appears
is fine for -x output, but not even close for reconstitution.
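
The gap between what was typed and what -x shows can be seen with a small
sketch like the following.  The command in src is just a made-up example with
odd spacing and a redirection, and which shell sh happens to be is an
assumption; the exact trace differs from shell to shell.

```shell
# Hypothetical one-liner with deliberately odd spacing and a redirection.
src='cat   </dev/null'

# Run it under xtrace in a child shell; the trace is written to stderr.
trace=$(sh -c "set -x; $src" 2>&1)

printf 'source: %s\n' "$src"
printf 'trace:  %s\n' "$trace"
```

Whatever the trace looks like, it is not the source text: the spacing is
normalised and, in several shells, the redirection is missing altogether.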

This one works in released bash, but not in 5.2-xxx (and no white space
tricks with this one to mess things up --- and I also did not try to guess
what end delimiter might actually work, if anything):

cat <<$(case $x in a) echo found A;; b) echo found B;& *) echo found $x ;; esac)
$(case $x in a) echo found A;; b) echo found B;& *) echo found $x ;; esac)

(Doesn't matter what 'x' is for this, the "command substitution" is never
actually executed - all that is simply text.)  yash cannot parse that one
properly, only released bash and zsh get this right.   ksh93 failed in a
similar way to above.   I didn't bother testing any of the others, none will
work.

Bash's -x output doesn't include redirections either, so unsurprisingly
this one doesn't work in 5.2-xxx:

cat <<$(cat</dev/null)

It does in zsh, released bash, and perhaps surprisingly, ksh93 (nothing else).
Our shell does include redirections in -x output, but that output is 
post-expansion, and with all the redirects appended after the command,
rather than wherever they appeared originally, so useless to reconstitute the
original, but fine for -x purposes (even better than the original, as the
rearrangement can expose bugs hidden by the way it was originally written).
And no, we do not include here docs in -x output, just (with -x enabled,
and PS4 initially ''):

sh $ PS4=+
sh $ cat <<foo
+cat <<...
sh $ <<foo cat
+cat <<...
sh $ 

  | You were in default mode, since you did not take the active step to run in
  | posix mode.

OK, I wasn't sure what was meant by default mode - wondered about
possible shopt settings, or whatever...   But yes, definitely not in posix
mode.

  | Yes, since aliases get expanded while reading the WORD that is the here-doc
  | delimiter. That's what we're talking about changing here.

Yes, that's what I was expecting to happen, after seeing Martijn's example,
and why I asked about (and tested) that version (and why I was anticipating
the alias expanded form to work as the end of the here-doc).

If you abandoned reconstituting, and simply saved the string, however
difficult that might be, none of these issues would arise.

It must be possible, the lexer is reading chars from somewhere, all that
needs to happen is for it to be told when to start saving, and when to stop.
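
As a toy illustration of "start saving, stop saving" (not how bash or the
NetBSD sh do it), the sketch below walks a string one char at a time and keeps
every raw char of a leading $( ... ).  The function name save_cmdsub_text is
invented, and it only counts parentheses: quotes, comments and case patterns
with an unbalanced ) are assumed not to occur, which a real lexer cannot
assume.

```shell
# save_cmdsub_text STRING
# Prints the raw text of a leading "$( ... )" in STRING, char for char.
# Naive: nesting is tracked by counting "(" and ")" only.
# Note: the variables are globals, as plain POSIX sh has no "local".
save_cmdsub_text() {
    rest=$1 saved='' depth=0
    case $rest in
        '$('*) ;;            # saving starts at the "$("
        *) return 1 ;;       # nothing to save
    esac
    while [ -n "$rest" ]; do
        after=${rest#?}          # everything after the first char
        ch=${rest%"$after"}      # the first char itself
        rest=$after
        saved=$saved$ch          # record the char exactly as read
        case $ch in
            '(') depth=$((depth + 1)) ;;
            ')') depth=$((depth - 1))
                 if [ "$depth" -eq 0 ]; then
                     printf '%s\n' "$saved"   # saving stops here
                     return 0
                 fi ;;
        esac
    done
    return 1   # input ran out before the closing ")"
}

save_cmdsub_text '$( a  b ) and more'
```

The call at the end prints $( a  b ) with its spaces intact, which is the
property the here-doc end word needs.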

This much I can easily make work in the NetBSD sh - my problem is where
reasonably (and cheaply) to save those bytes - the "usual location" is
building the parse tree, and cannot simply have random text written on
top of it.   This is so little used that an expensive solution isn't worth
it (since it needs to happen for all here doc end words, just in case,
though most of those are simple text and are already being saved, like any
other command word would be).

  | It works in
  | previous versions of bash because those versions don't expand aliases while
  | (ad-hoc) parsing the command substitutions at all. It will work in default
  | (non-posix) mode versions going forward because I'm going to err on the
  | side of backwards compatibility, at least for a while.

This one needs to work in posix mode as well, to be posix compat.
There's very little point being posix compat in default mode, but not
in posix mode, is there?

The question here isn't about whether aliases are expanded in command
substitutions, but that posix requires the exact text entered by the
user to be the end delimiter.

  | But it's good to have the discussion about command substitution
  | parsing and when alias expansion happens nevertheless.

Yes, implementors need to worry about obscure cases like this, even though
no real users will ever encounter them, because there are Martijns around
who keep finding stuff like this!

