bug-bash

Re: Unclosed quotes on heredoc mode


From: Robert Elz
Subject: Re: Unclosed quotes on heredoc mode
Date: Sun, 21 Nov 2021 05:54:10 +0700

    Date:        Sat, 20 Nov 2021 15:19:33 -0500
    From:        Chet Ramey <chet.ramey@case.edu>
    Message-ID:  <c804ce20-5b65-14e2-9601-616abedae715@case.edu>

  | Right. Purposeful.

There's a difference between done intentionally for pragmatic reasons,
and done intentionally because it is the right thing to do and people
should depend upon it remaining that way.

  | How about this. You show me examples where bash (devel bash) does what you
  | think is the wrong thing, and we agree it's a bug, I'll fix it.

I'll run our tests against the newest (released) bash (5.1.12(1)-release)
[what does the (1) represent??   It always seems to be (1) in versions I see.]

  | The devel bash already does this.

What the devel one does is unknown to me, I don't think I even have
the means to obtain it (I have nothing at all git related, and no interest
in changing that state of affairs).

  | > and a newline token in the middle of
  | > a command substitution counts for a here doc operator that occurred before
  | > it, 
  |
  | What does `counts' mean? You're not really reading the lines as shell
  | words,

"counts" means "is the one that matters"  (ie: do not ignore this one).

But, no, not this...

  | cat << EOF
  | echo $(echo this EOF is
  | not the end of
  | the command substitution
  | EOF
  | but it is the end of the
  | here-document
  | )

though that is a mildly interesting case, and I agree on how that
gets parsed (the contents of the here doc are not examined until it
is expanded when used for a redirection).   That should result in a
redirection error for cat, then (probably) "but: not found" (if the
shell didn't already exit), "here-document: not found" and a syntax
error on the ')'.  (The "not found" errors are, naturally, assuming
that commands of those names aren't found in a PATH search).

What I meant was this one:

        cat <<EOF && grep $(
         foobar
        EOF
        echo barfoo) *.c

where bash just sits at a PS2 prompt.  Or this one

        cat $( cat <<FILES ) >/dev/null
        abc
        def
        FILES

which doesn't get beyond the first line...

jinx$ cat $( cat <<FILES ) >/dev/null
bash: warning: here-document at line 13 delimited by end-of-file (wanted `FILES')
bash: warning: here-document at line 1 delimited by end-of-file (wanted `FILES')

2 warnings??   There's just one heredoc redirection present.

From the line numbers, I assume the first is when scanning the outer cat
command, and detecting its cmdsub arg, and the 2nd is from rescanning the
command substitution.   The first one clearly knows there is a heredoc,
it also knows it is yet to encounter a newline token (or any newline in
this example) hence the heredoc data cannot possibly be expected yet, it
must wait until after that newline - eventually it gets past >/dev/null,
finds the newline (token), and should start reading the heredoc text.
At that point it looks to see where the << redirection occurred (the first on
the line since this is the first heredoc read) and associates the data
with that redirection operator.   When the cmdsub is ready to be executed
it finds the heredoc data already read and available.

I never got to enter the lines starting "abc" ... (I could have, but I know
I would have just seen 3 command not found errors, one for each line, so I
didn't bother.)

In both of those, the first newline token following the << operator (and its
word) is the one at the end of the first line (of each).  The heredoc data
for each therefore starts on the 2nd line.
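
As a minimal sketch of that rule in its uncontested form (no command
substitution involved), the operator below sits in the middle of a line and
the data still starts on the line after the newline token. The function
name after_newline_demo is hypothetical, chosen only for this illustration.

```shell
# after_newline_demo is a hypothetical name used only for this illustration.
after_newline_demo() {
    # The << operator and its word are followed by more tokens (a pipeline)
    # before the newline token; the here-doc data starts on the next line.
    cat <<EOF | tr '[:lower:]' '[:upper:]'
hello
EOF
}
after_newline_demo
```

The rest of the first line is parsed as ordinary command text; only after the
newline token does the shell start collecting the here-doc data.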

What should happen:

[jinx]{3}$ cat <<EOF && grep $(
>  foobar
> EOF
> echo barfoo) *.c
 foobar
[jinx]{3}$ cat $( cat <<FILES ) >/dev/null
> abc
> def
> FILES
cat: abc: No such file or directory
cat: def: No such file or directory

For the first, there are a couple of .c files in $PWD but they don't contain
"barfoo". Neither "abc" nor "def" exists in $PWD.


  | > and a here doc operator in a command substitution might not encounter
  | > a newline until after the cmdsub text has ended - the next following
  | > newline token provides there here doc text.
  |
  | I can't imagine a useful example of this that isn't an error.

That's the 2nd example above, and a very normal thing to want to do: very
short command substitutions (most of them) are best kept complete within 1 line.

Note that neither in POSIX, nor anywhere else, has there ever been any
requirement on the heredoc data other than that it comes after the next
newline (which should, we agree, be newline token, not newline character).
Since heredocs are a lexical object, this processing is totally unaffected
by whatever semantics the grammar is extracting from the tokens the lexer is
returning to it, the grammar just increments the "number of heredocs needed"
counter, supplies the end words for each, and the lexer takes care of the rest.
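
The token versus character distinction can be illustrated with a small
sketch, assuming a shell that collects here-doc data at the next newline
token (as described above). The newline inside the double-quoted string is a
newline character but not a newline token, so the data starts only after the
line that closes the quotes. The function name newline_token_demo is
hypothetical.

```shell
# newline_token_demo is a hypothetical name used only for this illustration.
newline_token_demo() {
    # The quoted word "a<newline>b" contains a newline character, but the
    # first newline token is the one after the closing quote.
    cat <<EOF; echo "a
b"
body
EOF
}
newline_token_demo
```

Under that assumption cat prints "body" first, and echo then prints "a" and
"b" on separate lines.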

And then there is of course the combination of the two of those examples:

cat <<EOF && grep xyx $( cat <<END 

I'll stop it there, probably what follows is ')' on the same
line, but whatever happens next (assumed syntactically correct),
if your requirement is that END precedes EOF in what follows you're
clearly wrong, as POSIX is quite clear that the order in which the
heredocs are to be read is left to right across the line (regardless
of which commands they're attached to), so the EOF ending one *must*
appear first, and the END ending one second.   And the two of them
follow one newline token.
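
A minimal sketch of the left to right ordering, using only the uncontested
case where both operators are at the top level (it does not exercise the
command substitution form discussed above). The function name two_heredocs
is hypothetical.

```shell
# two_heredocs is a hypothetical name used only for this illustration.
two_heredocs() {
    # Two here-doc operators on one line, both followed by the same
    # newline token. The data for EOF comes first, then the data for END.
    cat <<EOF && cat <<END
body for EOF, read first
EOF
body for END, read second
END
}
two_heredocs
```

Swapping the two data sections would leave the first cat reading through to
the EOF line, with the END line included as ordinary data.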

kre

ps: none of this stops people writing, if they prefer

cat <<EOF &&
 foobar
EOF
grep $( echo barfoo ) *.c

but in that form it is much harder to see immediately what is
the command that comes after the "&&" (particularly if the heredoc
is a long one - perhaps hundreds of lines).

It could of course be

cat <<EOF && grep $( echo barfoo ) *.c
 foobar
EOF

which is probably what I'd use in a simple case like
that where the grep all fits in the initial line, but
if that gets ugly, for example if the cmdsub gets to
be a long one, then the earlier form is nicer.


Similarly

cat $( cat <<FILES
abc
def
FILES
) >/dev/null

but that's just plain ugly (even if the > redirect is moved before the cmdsub).
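
For anyone wanting to check that this multi-line form behaves as expected, a
self-contained sketch follows. It swaps the outer cat for printf so that no
files named abc or def are needed; that substitution, and the function name
list_args, are assumptions made for the illustration only.

```shell
# list_args is a hypothetical name used only for this illustration.
list_args() {
    # The here-doc data and its end word sit inside the command
    # substitution; the ')' closing it comes after the FILES line.
    printf '<%s>\n' $( cat <<FILES
abc
def
FILES
    )
}
list_args
```

Each line of the here-doc becomes one argument after field splitting, so
printf emits one bracketed word per line.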




