help-bash

Re: What does `echo xxx 1>&2xxx` do?


From: Chet Ramey
Subject: Re: What does `echo xxx 1>&2xxx` do?
Date: Sat, 8 May 2021 17:50:04 -0400

On 5/8/21 12:22 PM, Peng Yu wrote:
> https://git.savannah.gnu.org/cgit/bash.git/tree/parse.y#n346
>
> How does yylex() know "1" should be treated as a <number> as in "1>&2"
> and "1" should be treated as a <word> as in "1 >&2"? Could anybody
> explain how this context-dependency is resolved by yylex() in detail?

The short answer is that tokens are delimited by metacharacters. Space and
`>' are both metacharacters that delimit the token "1".
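
The effect of the two delimiters can be observed from the command line. The
following is only a sketch: the variable names (out_a, err_a, out_b, err_b)
are made up for illustration, and it assumes a POSIX-style shell such as bash.
Each command is run twice so that its stdout and its stderr can be captured
separately.

```shell
# Case A: "1" is delimited by ">", so it is the fd number of the redirection.
# echo receives one argument (xxx) and its fd 1 is duplicated from fd 2.
out_a=$( { echo xxx 1>&2; } 2>/dev/null )
err_a=$( { echo xxx 1>&2; } 2>&1 >/dev/null )

# Case B: "1" is delimited by the blank, so it is an ordinary word.
# echo receives two arguments (xxx and 1); ">&2" defaults to fd 1.
out_b=$( { echo xxx 1 >&2; } 2>/dev/null )
err_b=$( { echo xxx 1 >&2; } 2>&1 >/dev/null )

printf 'A: stdout=[%s] stderr=[%s]\n' "$out_a" "$err_a"
printf 'B: stdout=[%s] stderr=[%s]\n' "$out_b" "$err_b"
```

In both cases nothing reaches stdout; the difference is whether "1" ends up
in the text written to stderr.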

You can get a long way by simply reading the POSIX grammar and rules for
recognizing and classifying tokens. These are the two relevant rules from

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03 :

6. If the current character is not quoted and can be used as the first
   character of a new operator, the current token (if any) shall be
   delimited. The current character shall be used as the beginning of the
   next (operator) token.

7. If the current character is an unquoted <blank>, any token containing
   the previous character is delimited and the current character shall be
   discarded.

The token delimited by `>' follows this POSIX rule from

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_10_01 :

2. If the string consists solely of digits and the delimiter character is
   one of '<' or '>', the token identifier IO_NUMBER shall be returned.

If you want to see where that happens, look at where read_token_word()
returns NUMBER.

The space-delimited token is a TOKEN and follows this rule:

3. Otherwise, the token identifier TOKEN results.

The TOKEN is further classified as the grammar requires. POSIX puts it like
this:

"Further distinction on TOKEN is context-dependent. It may be that the same
TOKEN yields WORD, a NAME, an ASSIGNMENT_WORD, or one of the reserved words
below, dependent upon the context."

In this case, it's a WORD.
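
As a rough illustration of rules 2 and 3 quoted above, the classification
can be written as a small shell function. This is not how bash implements
it (bash does this in C inside read_token_word()); classify_token is a
hypothetical name, and the sketch assumes the token has already been
delimited and is not itself an operator (rule 1 is left out).

```shell
# classify_token TOKEN DELIMITER
# Prints IO_NUMBER if TOKEN is all digits and DELIMITER is '<' or '>',
# otherwise prints TOKEN (POSIX token classification rules 2 and 3).
classify_token() {
    token=$1
    delim=$2
    case $token in
        ''|*[!0-9]*)            # empty, or contains a non-digit
            echo TOKEN
            return
            ;;
    esac
    case $delim in
        '<'|'>') echo IO_NUMBER ;;
        *)       echo TOKEN ;;
    esac
}

classify_token 1 '>'      # the "1" in "1>&2"
classify_token 1 ' '      # the "1" in "1 >&2"
classify_token 2xxx ''    # not solely digits
```

The first call prints IO_NUMBER and the other two print TOKEN; the grammar
then decides what kind of TOKEN it is.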


> How is the parsing done?
>
> If we literally interpret the manual, the distinction between
> filenames and file descriptors is not processed at the parsing level.

Correct, for the most part. Read

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_05

for instance; its description of `word' is basically identical to the bash
manual text.

The bash tokenizer takes a shortcut and skips the expansion for a `word'
consisting entirely of digits and immediately classifies it as a NUMBER
if it's in the right place in a redirection operator.
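
A small experiment shows the difference between literal digits and words
that only become digits after expansion. It is a sketch with made-up
variable names (fd, n, err_var, err_word), assuming a POSIX-style shell
such as bash.

```shell
# The word after ">&" is expanded first; because it expands to "2",
# the redirection duplicates fd 2 rather than opening a file.
fd=2
err_var=$( { echo xxx >&$fd; } 2>&1 >/dev/null )

# "$n" before ">" is not a string of digits when the token is read, so
# it is an ordinary word: it expands to "1" and becomes an argument.
n=1
err_word=$( { echo xxx $n>&2; } 2>&1 >/dev/null )

printf 'var:  stderr=[%s]\n' "$err_var"
printf 'word: stderr=[%s]\n' "$err_word"
```

The first command writes "xxx" to stderr, the second writes "xxx 1": only a
literal run of digits directly in front of the operator is taken as the
file descriptor number.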

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
