bug-bash

Re: Unclosed quotes on heredoc mode


From: Chet Ramey
Subject: Re: Unclosed quotes on heredoc mode
Date: Thu, 18 Nov 2021 15:46:10 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.3.0

On 11/17/21 7:01 PM, Robert Elz wrote:
>     Date:        Wed, 17 Nov 2021 15:47:37 -0500
>     From:        Chet Ramey <chet.ramey@case.edu>
>     Message-ID:  <420281e7-f3c4-8054-d390-9378080c2815@case.edu>
> 
>   | Every modern shell uses `$PATH' as the here-document delimiter
> 
> Depends what you call modern shells - some ash derived shells (at least)
> don't, because they parse the $PATH into an internal form (in all words
> where that makes sense, before knowing what the word is to be used for)
> and then cannot match that properly.   While that isn't actually expanding
> the word, it still makes things fail badly.

Yeah, that's a bug. But it's probably baked in.

> But:
> 
> [D] sh-current $ cat foo <<$PATH
> sh: 80: Syntax error: Illegal eof marker for << redirection
> 
> at least we error out when the user tries, not just fail to ever
> find the end of the here doc.

OK, that's clearly a bug. Is this specific to the literal string `$PATH',
or are there more things that trigger it? The error message uses some
sloppy language, but that's neither here nor there -- this is a perfectly
valid script:

cat <<$PATH
hello
$PATH

that should echo `hello'.
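
A minimal way to try this without touching the interactive shell, assuming
an executable named `bash` is on PATH. The function name is purely
illustrative, and the single quotes keep the calling shell from expanding
`$PATH` before bash sees the script:

```shell
# Hypothetical helper: feed the script above to a fresh bash via -c.
# The delimiter word is the literal five characters $PATH; it is not
# expanded, so the line consisting of exactly $PATH ends the here doc.
heredoc_literal_delim() {
    bash -c 'cat <<$PATH
hello
$PATH
'
}

heredoc_literal_delim
```

A shell that rejects or mis-parses the delimiter will instead report an
error like the one quoted above, or never find the end of the here doc.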


>   | > First, the EOF should not work, that's a bash bug (IMO) - that should
>   | > generate an error, not just a warning.
>   |
>   | It's not. The historical shells used for the basis of the POSIX standard
> 
> I didn't say it was a standards violation, I said it was a bug.
> That the same bug exists in some other ancient shells isn't a justification.

"Some other ancient shells?" Like dash, or (the current and actively-
developed) ksh93, or the FreeBSD sh, or zsh? The ones I listed in the part
of my message you chopped? You're certainly free to consider it a bug, and
not to consider the compatibility concerns that inspired its inclusion in
bash in the first place, but let's not pretend that this is something that
died out a long time ago.

> 
> Further, no-one (not anyone I have ever seen) deliberately relies upon
> the here doc ending at EOF (not even if a here doc is in a -c command
> string or similar).

You never really know, do you?
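
For reference, the case being argued about looks roughly like this. It is
only a sketch: it assumes `bash` is on PATH, and the function name is made
up for illustration. Bash prints a warning on stderr and still hands the
body to `cat`:

```shell
# Hypothetical helper: a -c command string whose here doc is never closed.
# The script ends right after "hello", so end of input is reached before
# any line matching the delimiter EOF.
heredoc_at_eof() {
    bash -c 'cat <<EOF
hello'
}

# Discard the warning and keep only what cat writes.
heredoc_at_eof 2>/dev/null
```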

>   | > OK, here we have another of the oddities of shell syntax.   The spec
>   | > says that a here document starts at the next newline after the <<
>   | > operator, but that's not what it really means.
>   |
>   | I think the intent there is that the here document starts at the next
>   | newline after the delimiter.
> 
> You mean at the newline after the ola" in the example given?   Really?

Which instance of `ola"'? The first or the second? This cannot be a serious
question unless you mean the second. The delimiter is a `word', and we both
know what a shell word is. In other messages from both of us, we agree that
the delimiter is "ola\nI,\nola\nola". The here document body starts at the
next newline following that delimiter. If you want to reject it because the
delimiter contains a newline, that's fine, but let's also not pretend we
don't know what the delimiter is.

> Surely it must mean newline token, not newline character, mustn't it?

The newline after the delimiter is both, but sure, newline token would
probably work better.

> (Even then, there are more, messier, issues, which I know you're aware of;
> if we could make it as simple as "after the lexically next newline token"
> it would make everything much simpler - that's what it should be.)
> 
>   | > Being able to do that (include embedded newline characters
>   | > do in some other shells).
>   |
>   | I couldn't find one where it does.
> 
> They work in (at least) the NetBSD shell, FreeBSD too I expect, since the
> two use essentially the same mechanism for recognising the end of the
> here doc -- (effectively) after a newline, read chars (from a buffer) one
> at a time, comparing them with the end delimiter, until either there is a
> match failure, or until the end of the end delimiter (after which one more
> char from the buffer is compared to \n).   (Add tab stripping as required).

So it doesn't read `lines' in the POSIX sense? Huh. Who knew?
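
The matching described in the quoted paragraph can be sketched in portable
shell as below. This is an illustration of the idea only, not code from
NetBSD or FreeBSD sh: the function and variable names are invented, and tab
stripping is left out.

```shell
# Sketch: succeed if the text in $1 (taken to start just after a newline)
# begins with the end delimiter $2 followed immediately by a newline.
# Characters are compared one at a time, so a newline inside the delimiter
# is treated like any other character.
# Note: the m_* variables are not local; plain POSIX sh has no "local".
at_end_delim() {
    m_buf=$1
    m_delim=$2
    m_nl='
'
    while [ -n "$m_delim" ]; do
        [ -n "$m_buf" ] || return 1            # buffer ran out first
        m_b=${m_buf%"${m_buf#?}"}              # first char of the buffer
        m_d=${m_delim%"${m_delim#?}"}          # first char of the delimiter
        [ "$m_b" = "$m_d" ] || return 1        # match failure
        m_buf=${m_buf#?}
        m_delim=${m_delim#?}
    done
    # End of the delimiter: one more char is compared with \n.
    [ "${m_buf%"${m_buf#?}"}" = "$m_nl" ]
}
```

Because nothing in the loop depends on line boundaries, a delimiter such as
"ola\nI,\nola" matches as long as the same characters, followed by a
newline, appear at the start of a line in the input.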

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
