[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Incorrect parsing of DOS/Windows paths ??

From: Paul Smith
Subject: Re: Incorrect parsing of DOS/Windows paths ??
Date: Sun, 18 Dec 2016 13:29:56 -0500

I should have provided more details, sorry about that.

On Sun, 2016-12-18 at 19:33 +0200, Eli Zaretskii wrote:
> > However the content of the if-statement looks weird to me as well; I've
> > checked and it's been like this almost forever though.  We're trying to
> > find the end of the current path.  Why do we keep iterating as long as
> > there's a colon followed by a slash or backslash?
> Because there's no formal definition of what such a line could
> contain, AFAIR.  All that is known for sure is that on Posix hosts we
> stop at the first colon or blank.  The loop tries to find such a place
> that is not a drive letter, without assuming anything about what else
> can or cannot be on that line.

This function is called wherever the next thing the parser wants to see
is a list of files, and it wants to break them up into multiple
individual filenames.  However, most of those places don't treat colons
specially.  For example in the wildcard function we don't check for
colons at all: only whitespace is considered to be a stop-char.

The only place where we treat colons specially is in parsing the targets
of rules, where we have to stop if we find the colon between the target
and the prerequisite, because this syntax is valid:

  foo:bar ; echo hi

which should be parsed as the same as:

  foo : bar ; echo hi

You can see this because there is only one place we pass MAP_COLON as a
stop-char to parse_file_seq().

> > E.g., from what I can see this will accept the following as a valid,
> > single pathname:
> > 
> >   foo:/bar:\biz
> > 
> > ???
> Yes, it will.  And then Make will properly display an error message
> where the results are used as file names.  Is that wrong?  Won't the
> same happen on Unix, if a file /bar:\biz doesn't exist?

No, in UNIX the above will be broken into separate files like this:


Maybe it would be clearer what the issue was if we removed the second

  foo:/bar ; @echo hi

In UNIX this would be "foo : /bar ; @echo hi" which is a valid rule.

If I understand this code correctly (I haven't tested this) then on
Windows with DOS_PATHS make would parse the target as "foo:/bar" and
then give a "missing separator" error.

> > Why wouldn't the correct algorithm be: if we stopped due to a drive
> > specifier (the pathname starts with "[A-Za-z]:")
> A nit: DOS drive letters can be A-z, including the few characters
> between Z and a.

You can really have a drive named "\" or "]"?  That's crazy.  Well, I
can use that implementation if you like.

> >   #ifdef HAVE_DOS_PATHS
> >     /* If we stopped due to a drive specifier, skip it.
> >        Tokens separated by spaces are treated as separate paths since make
> >        doesn't allow path names with spaces */
> >     if (p && p == s+1 && p[0] == ':' && isalpha ((unsigned char)s[0]))
> >         p = find_char_unquote (p+1, stopmap|MAP_VMSCOMMA|MAP_BLANK);
> >   #endif
> If you say so.  IMO, there are too many assumptions here that only a
> GNU Make maintainer can make ;-)  E.g., who said that the drive spec
> must begin at s?  Can't there be whitespace there?

No, if you look at the code where "s" is set the top of the loop goes
like this:

  if (STOP_SET (*p, stopmap))

  s = p;

The NEXT_TOKEN() skips all whitespace stuff so p definitely points to a
"real" character, then s points to p.

> > Note that this doesn't require the drive specifier to be followed by a
> > slash/backslash: e.g., this:
> > 
> >   C:foo.o:C:foo.c
> > 
> > Breaks down as:
> > 
> >   C:foo.o
> >   C:foo.c
> Which I'd rather not support, because it could also be an error, with
> the real target being just C.  IOW, it's ambiguous, and Make had
> better not second-guessed the luser.

OK, we can make that rule (that the only valid drive specifier syntax in
target list must have a slash or backslash after the colon).  Note,
though, that this won't apply to other places where we parse filenames,
because those places don't treat colons specially so this test would
never succeed (p never equals ':' because colon is not a stopchar).

So, the only place make would enforce the rule that you must use "C:/"
and cannot use "C:foo" would be in the target list: not prerequisites,
not wildcard arguments, etc.

> Bottom line: can you tell what can and cannot be on lines that are
> parsed by that function?  Does it have to be one or two valid file
> names?  Otherwise I feel like we are talking about an issue at least
> one of us -- myself -- doesn't fully understand.

If it's still not clear after my explanation above let me know.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]