Re: Incorrect parsing of DOS/Windows paths ??

From: Paul Smith
Subject: Re: Incorrect parsing of DOS/Windows paths ??
Date: Tue, 20 Dec 2016 00:31:50 -0500

On Sun, 2016-12-18 at 22:10 +0200, Eli Zaretskii wrote:
> > Maybe it would be clearer what the issue was if we removed the second
> > colon:
> > 
> >   foo:/bar ; @echo hi
> > 
> > In UNIX this would be "foo : /bar ; @echo hi" which is a valid rule.
> On Windows, this is ambiguous, i.e. it invokes "undefined behavior".
> There's no reason for Make on Windows to second-guess that the user
> meant "foo : /bar".  (In practice, no one writes such rules, certainly
> not on Windows.)

Well, that's a perfectly valid makefile and the POSIX standard specifies
exactly how it should be parsed.  I can't see any reason why Windows
should not follow the standard in this case.  It's simple for make to
realize that "foo:/bar" is not a valid path on Windows and hence that it
must be parsed as "foo : /bar".

Now, if the rule was this:

  c:/bar ; @echo hi

that would be a lot harder to recognize; in theory we could do so, since
there's only one legal way to parse this.  But I don't propose to
attempt to do that (mainly because it would be extremely difficult); in
this situation we will treat the first word as a single filename and
emit a missing separator error.

> Yes.  And why is that a problem?  Support for target/file names with
> drive letters is already a certain violation of the Makefile grammar,
> so this is uncharted territory.  We can set any rules that are
> convenient to us.

Sure, but insofar as the standard rules are easy to follow, why not just
follow them?

> Well, one issue is that a target name doesn't have to be a valid file
> name, right?  Do we want to support such target names?

It doesn't have to be a valid filename, no.  But we still have to parse
the line.  In the absence of any particular reason (e.g., supporting
drive letters when HAVE_DOS_PATHS is set) I'd prefer to have the parsing
of the line behave the same on Windows and UNIX.

> And I still feel we should have some kind of formal definition of what
> constitutes a valid line parsed by that function.  Leaving this to the
> code to tell is less desirable, IMO.

I can improve the comment for that function; unfortunately, while it
does explain a lot, it doesn't mention the stopmap argument.

The string parsed by that function is interpreted as a sequence of
whitespace-separated filenames.  The parsing of the string will stop at
the end of the string, or else at one of the characters specified by the
stopmap (which may or may not be set), whichever comes first.
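
To make that description concrete, here is a minimal sketch in C.  It is
not the code from the make sources: the stop_map type and the
stop_map_init and count_names functions are hypothetical names invented
for this illustration, and quoting and escaping are ignored.  It only
shows a scan over whitespace-separated names that halts at the end of
the string or at the first stop character, whichever comes first.

```c
#include <stddef.h>

/* Hypothetical stop map: one flag per possible character value.  */
typedef struct
{
  unsigned char stop[256];
} stop_map;

/* Mark every character in STOPCHARS as a stop character.  A NULL
   STOPCHARS leaves the map empty, i.e. "no stop map set".  */
void
stop_map_init (stop_map *map, const char *stopchars)
{
  size_t i;

  for (i = 0; i < 256; ++i)
    map->stop[i] = 0;

  if (stopchars != NULL)
    for (; *stopchars != '\0'; ++stopchars)
      map->stop[(unsigned char) *stopchars] = 1;
}

/* Count the whitespace-separated names in S.  Scanning halts at the end
   of the string or at the first stop character, whichever comes first;
   *END is set to the position where scanning halted.  */
size_t
count_names (const char *s, const stop_map *map, const char **end)
{
  size_t count = 0;
  int in_name = 0;

  for (; *s != '\0' && !map->stop[(unsigned char) *s]; ++s)
    {
      if (*s == ' ' || *s == '\t')
        in_name = 0;
      else if (!in_name)
        {
          in_name = 1;
          ++count;
        }
    }

  *end = s;
  return count;
}
```

With ":" and ";" as stop characters, scanning "foo bar: baz" yields two
names and halts at the colon; with an empty stop map the scan runs to
the end of the string.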

The behavior I propose to implement applies when HAVE_DOS_PATHS is set
and we can determine that we stopped at a ":" which is the second
character of the current filename, that the first character is
[A-Za-z], and that the third character is "\" or "/" (by your request).
In that case we'll assume this colon is actually part of the filename
and we won't stop parsing the string; instead we'll continue until the
next valid stopping point.  If that's another ":" in the string before
the end of the current filename then we won't treat that one specially
and will recognize it as a stop character.
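
As a rough sketch of that rule, and only a sketch: the following C is
not taken from the make sources, the names is_drive_colon and
find_name_end are hypothetical, and the stop characters are passed as a
plain string rather than through the real stopmap.  It assumes we are
building with HAVE_DOS_PATHS set.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper.  NAME is the start of the current filename and P
   is the stop character where scanning halted.  Returns nonzero when P
   is the colon of a drive specification such as "c:/" or "C:\".  */
int
is_drive_colon (const char *name, const char *p)
{
  char first;

  if (p - name != 1 || *p != ':')
    return 0;

  first = name[0];
  return ((first >= 'A' && first <= 'Z') || (first >= 'a' && first <= 'z'))
         && (p[1] == '/' || p[1] == '\\');
}

/* Hypothetical scanner.  Returns a pointer to the character that ends
   the filename starting at NAME: whitespace, the end of the string, or
   a character in STOPCHARS.  A drive colon is treated as part of the
   filename; any later colon in the same filename is an ordinary stop
   character because it is no longer the second character.  */
const char *
find_name_end (const char *name, const char *stopchars)
{
  const char *p = name;

  while (*p != '\0' && !isspace ((unsigned char) *p))
    {
      if (strchr (stopchars, *p) != NULL && !is_drive_colon (name, p))
        break;
      ++p;
    }

  return p;
}
```

With ":" and ";" as stop characters, "foo:/bar" stops at the colon after
"foo", "c:/bar" is consumed as a single filename, and "c:/foo:bar" stops
at the second colon.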
