bug-bash

Re: !(.pattern) can match . and .. if dotglob is enabled


From: Chet Ramey
Subject: Re: !(.pattern) can match . and .. if dotglob is enabled
Date: Wed, 2 Jun 2021 15:42:24 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.10.2

On 5/31/21 11:23 AM, Nora Platiel wrote:

How would you improve the wording? What do you think is most important to
cover?

Here is the full paragraph for reference:
When a pattern is used for filename expansion, the character `.' at the
start of a filename or immediately following a slash must be matched
explicitly, unless the shell option dotglob is set. The filenames `.'
and `..' must always be matched explicitly, even if dotglob is set. In
other cases, the `.' character is not treated specially.

First:
The filenames `.' and `..' must always be matched explicitly, even if
dotglob is set.

I agree with gregrwm here.
(https://lists.gnu.org/archive/html/bug-bash/2021-01/msg00251.html)
The sentence seems to imply that you need 2 literal dots to match `..'.

I read your answer and I understand: if that was the case, then the only 
(non-extended) pattern capable of matching `..' would be `..*'.
But my understanding of the expression "matched explicitly" is: matched in its 
entirety via characters that stand for themselves (i.e. not via special pattern 
characters).

The "matched explicitly" refers to the previous sentence, which talks about
the `.' at the start of a filename or path component needing to be matched
explicitly by a pattern beginning with a `.' or containing a `.' at the
right spot (after a `/'). I can add language to clarify that.

The dotglob option basically eliminates that restriction for files that are
not named `.' and `..' (or it tries).
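
A minimal sketch of that rule, assuming bash and mktemp are installed; the
scratch directory and the variable names are illustrative only. With just
`.foo' and `bar' present, `*' should skip `.foo' when dotglob is unset and
pick it up when dotglob is set, while `.' and `..' are left out either way.

```shell
# Work in a scratch directory so existing files cannot affect the result.
tmpdir=$(mktemp -d)
touch "$tmpdir/.foo" "$tmpdir/bar"

# LC_ALL=C pins the sort order; a separate bash -c isolates the option state.
without_dotglob=$(cd "$tmpdir" && LC_ALL=C bash -c 'shopt -u dotglob; echo *')
with_dotglob=$(cd "$tmpdir" && LC_ALL=C bash -c 'shopt -s dotglob; echo *')

echo "dotglob off: $without_dotglob"   # bar
echo "dotglob on:  $with_dotglob"      # .foo bar

rm -rf "$tmpdir"
```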

The intersection of dotglob and extended globbing is where the
implementation gets tricky.

Next, there's nothing in the docs about dot treatment in the specific context 
of extended globbing.

That's true. It's one of the questions we're considering here. One option
is to say that the effect of dotglob on `.' and `..' in extended pattern
matching is ignored. (I am not advocating that.)


I expect @(P1|P2) to expand to the union of the matches of the separate 
subpatterns P1 and P2.

This is not unreasonable. You can say similar things about the rest of the
extended globbing operators.

But let's talk specifically about the treatment of `.' and `..'.


Example of expected results:
$ touch .foo bar
$ shopt -s dotglob
$ echo @(.foo|*)
.foo bar

I can see this. It's consistent with the policy that `.' and `..' can only
be matched by a pattern beginning with a literal `.'.

$ echo !(.foo)
bar

There is an equally compelling argument to be made that `.' and `..' should
be included in the results from the second example, since they do not match
the pattern `.foo'. The question is how much `not matching' you want.
`dotglob' only affects the `matching' state. That's the essence of where we
started with this.


It's not intuitive. The dotglob causes all files starting with `.' to be
in the list, the .foo pattern keeps `.' and `..' from being discarded,
and the `*' matches it (since dotglob disables the requirement that an
initial `.' be matched explicitly).

Ok, if things happen in the way and order you described here, then I can 
understand it.

If you want to look at it from an implementation perspective, think of it
this way:

Given the POSIX fnmatch() interface used to match strings, turning on
`dotglob' causes calls to fnmatch *not* to include the FNM_PERIOD flag.
This removes all special treatment of `.'. (fnmatch() itself is mostly
intended to be used when not matching pathnames, which is why there's a
special FNM_PATHNAME flag to use with it.)

https://pubs.opengroup.org/onlinepubs/9699919799/functions/fnmatch.html#tag_16_154

There's no direct way to treat `.' and `..' specially here.

The bash extglob implementation uses its fnmatch-workalike internally to
match each pattern. Hence the use of heuristics to include and omit `.'
and `..'.

But yes, it's not intuitive. It seems totally arbitrary to me that alternative 
subpatterns in a pattern-list influence each other's behavior concerning `.' 
and `..', but not concerning dot-files in general.

Not being familiar with the actual implementation, I've never considered the
exclusion of dot-files and the exclusion of `.' and `..' as two different
mechanisms.

The special behavior regarding `.' and `..' is the special case. Using the
standard interfaces, you either have all files beginning with `.', or you
don't. You have to check for them separately, and you can do it at a couple
of different levels depending on your policy.
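
As a rough illustration of those two levels, here is a sketch in plain POSIX
sh. It uses `case', which matches strings with no leading-period rule, as a
stand-in for a matcher called without FNM_PERIOD. It is not bash's internal
code, and the variable names are made up for the example. The first pass
shows that `*' alone accepts every name, including `.' and `..'; the second
adds the separate check that a globbing layer would have to apply itself.

```shell
# Pass 1: plain string matching, no special treatment of a leading `.'.
all=""
for name in . .. .foo bar; do
    case $name in
        *) all="$all $name" ;;
    esac
done
all=${all# }

# Pass 2: the same matcher, preceded by an explicit check for `.' and `..'.
filtered=""
for name in . .. .foo bar; do
    case $name in
        .|..) continue ;;   # the policy check, done outside the matcher
    esac
    case $name in
        *) filtered="$filtered $name" ;;
    esac
done
filtered=${filtered# }

echo "matcher only:        $all"        # . .. .foo bar
echo "with separate check: $filtered"   # .foo bar
```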


I would still prefer the behavior I was expecting from the start. I'm having a 
hard time finding good words to document the current behavior, which is 
probably an indication that it is too complex (or that my English sucks :D).

I'm not averse to changing the current behavior. This is a niche case.
Then instead of figuring out language to describe the current behavior,
let's figure out language to describe the desired behavior.

Another minor observation:
it is not documented that the dot in the pattern must also be at the
beginning to be able to "match explicitly".
$ shopt -u dotglob; echo *.foo  # doesn't match `.foo'
$ shopt -s dotglob; echo *..    # doesn't match `..'
Even though you could say that the dot is "matched explicitly" in both cases.

See above.
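
The two commands in that observation can be checked in a scratch directory.
This is a sketch that assumes bash and mktemp are available; nullglob is
added here (it is not in the original commands) so that an unmatched pattern
prints nothing rather than the pattern itself, and the variable names are
illustrative only.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/.foo"

# dotglob off: `*.foo' does not begin with `.', so `.foo' is not matched.
off_star_foo=$(cd "$tmpdir" && bash -c 'shopt -s nullglob; shopt -u dotglob; echo *.foo')

# dotglob on: `*..' still does not begin with `.', so `..' is not matched.
on_star_dots=$(cd "$tmpdir" && bash -c 'shopt -s nullglob dotglob; echo *..')

# dotglob on: for comparison, `*.foo' now does match `.foo'.
on_star_foo=$(cd "$tmpdir" && bash -c 'shopt -s nullglob dotglob; echo *.foo')

echo "dotglob off, *.foo: [$off_star_foo]"
echo "dotglob on,  *..:   [$on_star_dots]"
echo "dotglob on,  *.foo: [$on_star_foo]"

rm -rf "$tmpdir"
```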

I find it also interesting that:
$ shopt -u dotglob; echo !(bar).foo  # doesn't match `.foo'

The pattern doesn't begin with a `.', and it's not special-cased. Bash,
ksh93, and mksh agree on this.

$ shopt -u dotglob; echo ?(bar).foo  # matches `.foo'
$ shopt -u dotglob; echo *(bar).foo  # matches `.foo'

Behavior varies. Bash chooses to go the ksh93 compatibility route, so these
get special-cased because they can match zero times.

Your English is fine. You want to take a shot at a sentence or two
describing your desired behavior? It should not take more than that.

However it works out, I'll have to write more special-case code to
implement it.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/








