bug-bash

Re: Filename Expansion bug


From: Mickael KENIKSSI
Subject: Re: Filename Expansion bug
Date: Thu, 9 Jan 2020 12:09:22 +0100

Thanks for your comment.

I understand this may not seem of primary importance to you, since the two
forms are canonically equivalent, but sometimes what we really care about is
the path as a literal string (be it well- or ill-formed), and not the
filesystem object it points to.

Normalization upon filename expansion is not the default Bash behavior, so
I see no reason why it should be considered acceptable to have it –
partially – happen on what is no more than an edge case in the end.

zsh (and ksh) provide the expected result:

$ mkdir -p a/b/c d/e/f g/h/e; zsh -c 'printf %s\\n .////a//../*///////*'
.////a//../a///////b
.////a//../d///////e
.////a//../g///////h

I suppose it all comes down to an implementation question.
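
Where only string comparison is needed, one possible workaround is to squeeze runs of slashes in both strings before comparing them. The following is a minimal sketch under that assumption; `squeeze_slashes` and `same_spelling` are hypothetical helper names, not part of bash, zsh, or ksh.

```shell
# squeeze_slashes is a hypothetical helper name, not a shell builtin.
# It collapses every run of "/" into a single "/" and prints the result.
squeeze_slashes() {
    printf '%s\n' "$1" | tr -s '/'
}

# Compare two spellings of a path as strings once both are squeezed.
same_spelling() {
    [ "$(squeeze_slashes "$1")" = "$(squeeze_slashes "$2")" ]
}

if same_spelling './/a///b' './///a/b'; then
    echo 'equal after squeezing'
fi
```

This does not recover the exact form of the pattern, which is what the report asks for. It only makes the comparison independent of slash runs, and it leaves `..` components and symbolic links untouched.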

Best,
Mickaël

On Wed, Jan 8, 2020 at 4:09 PM Chet Ramey <chet.ramey@case.edu> wrote:

> On 1/8/20 2:34 AM, Mickael KENIKSSI wrote:
> > Hello,
> >
> > I found a bug regarding how pathnames are processed during filename
> > expansion. The result for non-normalized path patterns may get mangled
> > in such a way that it becomes inconsistent and unpredictable, making it
> > useless for string comparison or any kind of string manipulation where
> > having it in the exact same form as the pattern is required.
> >
> > How to reproduce:
> >
> > $ mkdir -p a/b/c d/e/f g/h/e; printf '%s\n' .////*//*///////*
> > .////a/b/c
> > .////d/e/f
> > .////g/h/e
> >
> > This is correct from a filesystem perspective but not from a string
> > perspective, where you'd need each computed path to keep the form of
> > the pattern:
> >
> > .////a//b///////c
> > .////d//e///////f
> > .////g//h///////e
>
> You're not going to get the path with multiple slashes preceding
> pattern characters, because the pathname has single slashes, those
> slashes are, as POSIX says, "explicitly matched by using one or
> more <slash> characters in the pattern," and the matched pathnames
> that replace the pattern don't have multiple slashes.
>
> The reason that the four leading slashes aren't removed is that those
> directory names don't have any pattern characters and are left
> unchanged. Since the kernel's filename resolution treats multiple
> slashes the same as a single slash, the constructed pathname matches
> what's in the file system.
>
> That means, for instance, you have a directory `.////' and a pattern `*'.
> You opendir `.////' and read it for every filename matching `*' (a, d, g),
> construct the pathnames, and go on with the rest of the pattern.
>
> The intermediate runs of multiple slashes get removed as part of the
> matching algorithm, as described above. They're essentially null pathname
> components.
>
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>                  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/
>
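
The point that pathname resolution treats a run of slashes like a single slash can be checked directly. The sketch below is only an illustration: `same_file` is a hypothetical helper name, and it assumes the `-ef` test operator, which bash, ksh, and zsh provide along with many other shells.

```shell
# same_file is a hypothetical helper name. It relies on the -ef test
# operator, which is true when both paths name the same file.
same_file() {
    [ "$1" -ef "$2" ]
}

# Build the directory tree from the thread in a scratch directory.
scratch=$(mktemp -d) || exit 1
mkdir -p "$scratch/a/b/c" "$scratch/d/e/f" "$scratch/g/h/e"

# Same directory, spelled once with single slashes and once with runs.
if same_file "$scratch/a/b/c" "$scratch////a//b///////c"; then
    echo 'both spellings resolve to the same directory'
fi

rm -rf "$scratch"
```

The check compares filesystem objects, not strings, so it shows only why the expanded pathnames are usable, not that they match the pattern character for character.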

