Re: Unclosed quotes on heredoc mode

From: Chet Ramey
Subject: Re: Unclosed quotes on heredoc mode
Date: Tue, 23 Nov 2021 11:09:51 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.2.1

On 11/20/21 5:54 PM, Robert Elz wrote:
     Date:        Sat, 20 Nov 2021 15:19:33 -0500

   | How about this. You show me examples where bash (devel bash) does what you
   | think is the wrong thing, and we agree it's a bug, I'll fix it.

I'll run our tests against the newest (released) bash (5.1.12(1)-release)

OK. However, since, as I said, the devel branch has a completely different
implementation, this is not particularly useful.

[what does the (1) represent??   It always seems to be (1) in versions I see.]

It's the build version: how many times have you built in this build tree?
I get into the hundreds before I recycle it.
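
A minimal sketch of how to look at that number, assuming a `bash` binary is
on PATH: bash exposes the build version as element 3 of its BASH_VERSINFO
array, the same value that appears in parentheses in $BASH_VERSION. The
variable name `build` is only for illustration.

```shell
# Ask bash for its full version string, e.g. 5.1.12(1)-release
bash -c 'printf "%s\n" "$BASH_VERSION"'

# BASH_VERSINFO[3] holds the build version (the number in parentheses)
build=$(bash -c 'printf "%s" "${BASH_VERSINFO[3]}"')
printf 'build version: %s\n' "$build"
```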

   | The devel bash already does this.

What the devel one does is unknown to me, I don't think I even have
the means to obtain it (I have nothing at all git related, and no interest
in changing that state of affairs).

Whatever. You do you. Don't be surprised if many of my answers turn out to
be "that's already fixed in the devel branch."

It just seems like a tremendous amount of wasted effort to point out things
that have already been changed.

What I meant was this one:

        cat <<EOF && grep $(
        echo barfoo) *.c

where bash just sits at a PS2 prompt.

So does everyone else, except the netbsd shell. Refer to my previous
message about the reading-full-lines strategy. If you run it as a script,
everyone who lets EOF terminate a here-document produces some variant of

"foobar: not found
EOF: not found"
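
For illustration only, a small sketch of what "letting EOF terminate a
here-document" looks like when the input is a script rather than a terminal.
It assumes the `sh` on PATH is one of the shells that accepts this (bash and
dash do; bash also prints a warning on stderr, discarded here). The script
text and the variable name `out` are made up for the example.

```shell
# Feed sh a two-line script whose here-document never sees its EOF delimiter.
# The shell reaches end of input and uses what it has read as the body.
out=$(printf 'cat <<EOF\nhello\n' | sh 2>/dev/null)
printf '%s\n' "$out"
```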

        cat $( cat <<FILES ) >/dev/null

which doesn't get beyond the first line...

The waiting-for-more-input from cat in the command substitution is common
to many shells, including bash-5.1 (dash, yash, zsh, etc.). It's not just
the ones that allow EOF to terminate the here document.

The devel branch produces

TRACE: pid 78934: parse_comsub: need_here_doc = 1 after yyparse()?
cat: abc: No such file or directory
cat: def: No such file or directory

which is the result of a conscious choice indicated by the debug message.
It's kind of inconsistent on my part -- see below -- but the message
reminds me of the choice I made and where to change it.

Interestingly, ksh93 makes it a syntax error -- one of the few places it
doesn't allow EOF to terminate a here-document.

jinx$ cat $( cat <<FILES ) >/dev/null
bash: warning: here-document at line 13 delimited by end-of-file (wanted 
bash: warning: here-document at line 1 delimited by end-of-file (wanted `FILES')

2 warnings??   There's just one heredoc redirection present.

The devel branch produces a single warning, because the here-document in
the command substitution is not closed.

I never got to enter the lines starting "abc" ... (I could have, but I know
I would have just seen 3 command not found errors, one for each line, so I
didn't bother.)

You wouldn't have, since it was waiting for `cat'. It already gave up on
the command substitution at that point, since it ended before terminating
the here-document.

In both of those, the first newline token following the << operator (and its
word) is the one at the end of the first line (of each).  The heredoc data
for each therefore starts on the 2nd line.

We talked about this. The command substitution starts a new parsing context
to implement the "any valid shell script" part of the standard.

What should happen:

[jinx]{3}$ cat <<EOF && grep $(
echo barfoo) *.c

The netbsd shell appears to be the outlier here. The parser reads the
command substitution so it can parse the entire and-or list before trying
to gather any here-documents.

[jinx]{3}$ cat $( cat <<FILES ) >/dev/null
cat: abc: No such file or directory
cat: def: No such file or directory

See above. This is making me reconsider that choice.

For the first there are a couple of .c files in $PWD but they don't contain
"barfoo". Neither "abc" nor "def" exists in $PWD.

   | > and a here doc operator in a command substitution might not encounter
   | > a newline until after the cmdsub text has ended - the next following 
   | > token provides the here doc text.
   | I can't imagine a useful example of this that isn't an error.

That's the 2nd example above, and a very normal thing to want to do: very
short command substitutions (most of them) prefer to be complete within 1 line.

If you want the text of the here-document to apply to the command
substitution, put it inside the command substitution. Otherwise, you
violate the "any valid shell script" clause and the behavior varies there.
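
A minimal sketch of the portable form, with the here-document body and its
delimiter both inside the command substitution. The names `abc` and `def`
come from the example in this thread; the variable `files` is only for
illustration.

```shell
# The here-document starts and ends inside $( ), so the command
# substitution is a complete script on its own.
files=$(cat <<FILES
abc
def
FILES
)
printf '%s\n' "$files"
```

Written this way, the closing parenthesis comes after the delimiter line, so
nothing is left for the outer parser to collect.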

Note that neither in POSIX, nor anywhere else, has there ever been any
requirement on the heredoc data other than that it comes after the next
newline (which should, we agree, be newline token, not newline character).


Since heredocs are a lexical object, this processing is totally unaffected
by whatever semantics the grammar is extracting from the tokens the lexer is
returning to it, the grammar just increments the "number of heredocs needed"
counter, supplies the end words for each, and the lexer takes care of the rest.

The fundamental point of disagreement is what to do if the lexer (after,
presumably, calling the parser recursively) finds that it still has here-
documents to read after reading the end of the command substitution.

What happens in the command substitution stays in the command substitution.
If you subscribe to that, you need to specify your here-documents inside
the command substitution.

And then there is of course the combination of the two of those examples:

cat <<EOF && grep xyx $( cat <<END

Which has the same fundamental disagreement.

I'll stop it there, probably what follows is ')' on the same
line, but whatever happens next (assumed syntactically correct),
if your requirement is that END precedes EOF in what follows you're
clearly wrong, as POSIX is quite clear that the order in which the
heredocs are to be read is left to right across the line (regardless
of which commands they're attached to), so the EOF ending one *must*
appear first, and the END ending one second.   And the two of them
follow one newline token.
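
The uncontroversial case of that ordering rule, with no command substitution
involved, can be sketched as follows. Both here-documents are at the same
parsing level, so their bodies are read left to right after the newline. The
body text and the variable `out` are made up for the example.

```shell
# Two here-documents on one line: the EOF body comes first, then END.
out=$(
cat <<EOF && cat <<END
first
EOF
second
END
)
printf '%s\n' "$out"
```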

The logical conclusion of this line of thinking is that a `done' in a
command substitution can terminate a `for' loop that starts outside it.
Either you reset the parsing state, including the "hey I need a here-
document now," or you don't (or you do some partial half-assed job of it
that doesn't help anyone).
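
A small sketch of the `for`/`done` analogy, assuming an `sh` on PATH. The
`done` below sits inside the command substitution, and the shells I am aware
of refuse to let it close the loop that starts outside. The variable name
`result` is only for illustration.

```shell
# The only `done` is inside $( ), so the outer `for` is never closed
# and the shell reports a syntax error instead of running the loop.
if sh -c 'for i in 1 2; do echo $(echo "$i"; done)' 2>/dev/null; then
    result="accepted"
else
    result="rejected"
fi
printf '%s\n' "$result"
```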

``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
