Re: Unclosed quotes on heredoc mode
From: Alex fxmbsw7 Ratchev
Subject: Re: Unclosed quotes on heredoc mode
Date: Sun, 28 Nov 2021 20:51:33 +0100
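A small comment on that /bin-in-PATH code: it is invalid. The pattern
*:/bin:* needs a ':' on both sides of /bin, so it fails to match when /bin
is the first entry (nothing before it) or the last entry (nothing after it).
Using case :$PATH: would fix it.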
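
A minimal sketch of that fix, assuming a POSIX shell. The helper name in_path and the sample PATH-style lists are made up for illustration; they are not from the original message. Wrapping the list in colons gives every entry, including the first and last, a ':' on each side:

```shell
# Hypothetical helper: succeeds if directory $1 is an entry of the
# colon-separated list $2 (a PATH-style string).
in_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;   # entry found, delimited by ':' on both sides
        *)        return 1 ;;   # not an entry of the list
    esac
}

in_path /bin "/bin:/usr/bin"      && echo "first entry: found"
in_path /bin "/usr/bin:/bin"      && echo "last entry: found"
in_path /bin "/usr/sbin:/usr/bin" || echo "absent: not found"
```

Quoting the pattern part ":$1:" keeps any glob characters in the directory name literal, while the unquoted * on each side still matches the rest of the list.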
On Sun, Nov 28, 2021, 20:31 Robert Elz <kre@munnari.oz.au> wrote:
> Date: Sat, 27 Nov 2021 13:57:57 -0500
> From: Chet Ramey <chet.ramey@case.edu>
> Message-ID: <5217c48e-c989-a163-5673-38995e35a14b@case.edu>
>
> Warning: long message follows, give yourself time to digest it.
>
> | OK, if you do end up building the devel branch, I'd be interested
> | in these results.
>
> Assuming that happens, I shall certainly let you know.
>
> | > Once, of course ... why would I ever build it again?
> |
> | Patches exist. There are vendors who take the original release, apply their
> | own special-sauce patches, then apply the patches I release as they come
> | out, as part of their own distribution release process.
>
> Of course, NetBSD pkgsrc (used on other systems as well) does that too.
> But your patches appear about every 5-6 months, so I end up doing one
> build every 5-6 months. Keeping the object files (even the unpacked
> sources) sitting around waiting for the next patches, in order to save
> perhaps 2-3 minutes of build time isn't worth the bother. Once built
> and installed it all gets trashed.
> [I have also contemplated doing builds in an MFS (or tmpfs)
> which would vanish on a reboot (or just umount) and I do tend
> to reboot more often than bash patches are released ... but I've
> yet to actually do that, for bash, the build time saved wouldn't
> be worth the bother - for some other apps, it might be].
>
> pkgsrc doesn't encourage attempting to retain anything in any case - it
> probably isn't a problem for bash (at least I've never seen it, not that
> I ever looked either) but other applications have a habit of deleting files
> from their distributions - and unless one starts from an empty directory,
> unpacking a tarball doesn't cause those files to be removed ... further,
> some build systems don't pay attention to what is supposed to be there,
> and manage to link all the .o files they can find.
>
> It is easier, and more reliable, to simply start clean every time.
>
> But of course that doesn't apply when you're developing and building
> several times a day (or sometimes, dozens of times an hour). That just
> doesn't apply to me with bash.
>
> | Usually, that's ok. In this instance, where we're discussing a feature
> | whose implementation is substantially different between the released and
> | development versions, it's more relevant.
>
> Sure, though I didn't know this part was changed so much in the
> devel version until you told me just recently (I do not watch what happens
> there).
>
> | So the ultimate question is whether or not the act of reading a command
> | substitution should reset this requirement. That's where we disagree.
> | The grammar is, at that point, reading a different command.
>
> "command" is a loaded word in sh terminology, it is used for all kinds of
> things, but in general it is not at all unusual for here document text to
> appear while a command other than the one with the redirection operator is
> being processed (no command substitutions necessarily involved). What the
> grammar is doing after a here doc redirection operator has been processed,
> until the next newline (token) is encountered is irrelevant - the spec
> imposes no requirements upon that at all.
>
>
> | > Then we get to whether heredoc data is part of a valid shell script
> | > in that sense - when there is yet to be a newline token to introduce it.
> |
> | What does this mean? In all cases, the here-documents are not read until
> | after a newline token. That's not the issue.
>
> Sure, but that's not what I meant. I treat heredoc data as much the same
> as a \newline - something that the lexer deals with, and the grammar never
> knows happened. Heredoc data doesn't appear at all in the sh grammar,
> as nothing in the grammar cares in the slightest about them (once they're
> queued). What I meant was that from that perspective, whether a sh script
> (or sh script fragment) is valid or not, is determined by the grammar, and
> given that here doc data does not appear there, it cannot have any impact
> upon the decision whether some particular part of the sh input is valid or
> not. Of course, if the script ends (completely) without a newline token
> after the last redirect operator then that's an error - but of a subtly
> different kind (more like an unterminated string (mismatched quotes) or
> here doc data without its required terminating word -- all lexical
> constructs).
>
> So, if one does
>
> $( cmd <<END )
>
> there's nothing invalid about that, unless EOF follows that ')' before
> a newline token appears. And if that happens, it isn't the grammar that
> complains, but something beyond that. The syntax "word redirect" is
> perfectly valid, and "<< word" is a perfectly valid redirect. The data
> doesn't need to appear there, if no newline has yet appeared, any more
> than it does in
>
> cmd << EOF ; ...
>
> where the data doesn't need to appear there, when a newline has not yet
> appeared.
>
> You seem to be hung up on the way you have chosen to implement $( )
> (which of itself is OK, but it is not required to be done that way)
> where (it seems) you parse the command inside the $() as if there was no
> world at all outside it. As far as getting the grammar correct that's
> fine, but it doesn't work with here doc data.
>
>
> | > | The netbsd shell appears to be the outlier here. The parser reads the
> | > | command substitution so it can parse the entire and-or list before trying
> | > | to gather any here-documents.
> | >
> | > You cannot possibly really mean that I hope. That is, in
> | >
> | > cmd1 <<EOF &&
> | > data
> | > EOF
> | > cmd2
> | >
> | > you do agree that "data" is stdin to cmd1, that is, the heredoc data
> | > appears splat in the middle of the and-or list. That's certainly the
> | > way it appears to work (in bash) to me.
> |
> | There is no command substitution in this example.
>
> I know. But go back and read the quote from you (still here, above, in
> this message) again: "The parser reads the command substitution so it can
> parse the entire and-or list before trying to gather any here-documents"
>
> ** parse the entire and-or list before trying to gather any here documents **
>
> I don't believe that you really meant that, it isn't the way bash behaves
> (unless this is something different in the devel version, but I doubt that)
> and I was just pointing out that poor phraseology.
>
> | So, again, the question is whether or not input data that is logically
> | part of the command substitution (it appears between the opening and
> | closing parentheses) should affect the `outer' command. That's the
> | question. We have different answers.
>
> We do, because I don't view here doc data as affecting anything except the
> command for which it is input. As far as the script goes, it is just a
> rather weird method (kind of like the original implementation) of creating
> an anonymous file and then passing that file as input (usually stdin, but
> not required to be) to a command.
>
> Consider this alternative, which is (one possibility for) what would be
> needed if here-docs did not exist:
>
> printf '%s\n' 'data' >/tmp/hidden.data.$$
> cmd </tmp/hidden.data.$$
> rm /tmp/hidden.data.$$
>
> whereas with here-docs, we do instead
>
> cmd <<'END'
> data
> END
>
> That's all fine, and either of those would (more or less) work
> with any shell.
>
> Now consider instead that cmd is to be run in a command substitution.
>
> One can certainly do
>
> ... $(
> printf "%s\n" 'data' >/tmp/hidden.data.$$
> cmd </tmp/hidden.data.$$
> rm /tmp/hidden.data.$$
> ) ...
>
> which is the rough equivalent of
>
> ... $( cmd <<END
> data
> END
> ) ...
>
> and that should work. No question.
>
> But one can also do
>
> printf "%s\n" 'data' >/tmp/hidden.data.$$
> .... $( cmd </tmp/hidden.data.$$ ) ...
> rm /tmp/hidden.data.$$
>
> and that would also work everywhere, right? That is, the data for the
> command in the command substitution is created (and removed, but that bit
> of it is generally irrelevant here) outside the command substitution.
>
> This is the rough equivalent of
>
> ... $( cmd << \END ) ...
> data
> END
>
> And then once you allow that to work (which you're apparently now doing
> in the devel version), there cannot really be any objection to
>
> cmd <<END $( cmd1 &&
> data
> END
> cmd2 )
>
> as that's really just the same principle being applied in the other
> direction. Furthermore that means that in
>
> cmd <<END1 $( cmd1 <<END2 &&
>
> (with a newline after the "&&") the data that follows is
>
> data1
> END1
> data2
> END2
>
> keeping the left-to-right order across the input line, which is the order
> in which the standard requires here document data to appear.
>
> Here "input line" is really a logical line, rather than a physical
> one, as we have already agreed that here docs don't appear in the
> middle of quoted strings, and nor do they appear after elided newlines
> (\newline pairs) which are removed, neither of which generates a newline
> token. But it is "line" not "command", or anything else related to the
> grammar which is specified:
>
> The redirection operators "<<" and "<<-" both allow redirection
> of subsequent lines
>
> "subsequent lines" ie: "lines after the current line"
>
> If more than one "<<" or "<<-" operator is specified on a line,
> the here-document associated with the first operator shall be
> supplied first by the application and shall be read first by the
> shell.
>
> Note: "line", not grammatical command, or script, or and-or list, or
> anything related to the grammar at all. (The grammar generally ignores
> lines, a newline token is almost just a ';' - except we're allowed as
> many newlines as we like, where just one ';' (sometimes none) is
> permitted).
>
> Another example (no cmdsubs again) that is kind of weird, and unlikely,
> but should be permitted, and should work:
>
> cat << END; case $PATH
> data
> END
> in
> *:/bin:*) echo /bin is in PATH! ;;
> esac
>
> Bash (5.1.xx) allows that, so does everything else (aside from some old,
> and not even all that old, ash derived shells which had a bug not relevant
> here). The heredoc data for cat appears splat in the middle of the
> unrelated case statement. No problems, it all works, as it should - but
> probably would not if here-doc data was something known to the grammar.
> But it isn't, the lexer removes it, as far as the grammar & its parser are
> concerned the "data" and "END" lines are not there at all.
>
> kre
>
>
>
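
The uncontested part of the ordering rule quoted above (two here-document operators on one line, no command substitution involved) can be checked with a small sketch. The function name show_order is invented for the example, and the delimiters END1/END2 just follow the ones used in the quoted text:

```shell
# Two here-documents are requested on the same line; their bodies must then
# follow in the same left-to-right order as the operators appeared.
show_order() {
    cat <<END1; cat <<END2
data1
END1
data2
END2
}

show_order
```

The first cat reads the body that ends at END1 and the second cat reads the body that ends at END2, so "data1" is printed before "data2". This only illustrates the plain multi-operator case; it says nothing about how a given shell treats here-document bodies that sit outside a $( ) that requested them.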