Incorrect parsing of DOS/Windows paths ??

From: Paul Smith
Subject: Incorrect parsing of DOS/Windows paths ??
Date: Sun, 18 Dec 2016 11:44:52 -0500

Hi all (especially Eli! :)).

A bug (https://savannah.gnu.org/bugs/?49115) came in about the way we
parse filenames in read.c:parse_file_seq.  There is a loop that's
supposed to chop a string into individual filenames, and each time
through the loop we search for the end of the next name like this:

      /* There are names left, so find the end of the next name.
         Throughout this iteration S points to the start.  */
      s = p;
      p = find_char_unquote (p, stopmap|MAP_VMSCOMMA|MAP_BLANK);

Then if we're parsing DOS paths we check to see if the stopmap contains
a colon and if so, we have to determine if we stopped because of a drive
specifier; the idea, I think, is to support things like this correctly:


should parse as two paths:


The code is:

    /* For DOS paths, skip a "C:\..." or a "C:/..." until we find the
       first colon which isn't followed by a slash or a backslash.
       Note that tokens separated by spaces should be treated as separate
       tokens since make doesn't allow path names with spaces */
    if (stopmap | MAP_COLON)
      while (p != 0 && !ISSPACE (*p) &&
             (p[1] == '\\' || p[1] == '/') && isalpha ((unsigned char)p[-1]))
        p = find_char_unquote (p + 1, stopmap|MAP_VMSCOMMA|MAP_BLANK);

As the bug points out, the if condition is clearly broken: it uses a
bitwise OR, so it will always be true.
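
To make the difference concrete, here is a minimal sketch comparing a
bitwise OR test against a bitwise AND test.  The MAP_* values below are
made up for illustration (the real ones are defined in make's headers),
and or_test and and_test are hypothetical helper names:

```c
/* Illustrative bit values only; not the real definitions from make.  */
#define MAP_BLANK 0x0002
#define MAP_COLON 0x0040

/* OR-ing in a nonzero constant can never yield zero, so this test
   succeeds for every stopmap.  */
static int
or_test (int stopmap)
{
  return (stopmap | MAP_COLON) != 0;
}

/* AND-ing succeeds only when the colon bit is actually set.  */
static int
and_test (int stopmap)
{
  return (stopmap & MAP_COLON) != 0;
}
```

With these, or_test (MAP_BLANK) is nonzero even though colon is not a
stop character, while and_test (MAP_BLANK) is zero.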

However, the body of the if-statement looks weird to me as well; I've
checked, and it's been like this almost forever.  We're trying to
find the end of the current path.  Why do we keep iterating as long as
there's a colon followed by a slash or backslash?

E.g., from what I can see this will accept the following as a valid,
single pathname:



Did I misread this code, or is there some reason to accept ":/" and ":\"
in the middle of a path in Windows/DOS that I'm not aware of (I'm not a
guru with Windows filesystems)?
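
One way to check this reading is to run the loop in isolation.  The
sketch below is an approximation, not the real code: find_stop is a
hypothetical stand-in for find_char_unquote that stops only at a colon
or a blank and ignores backslash quoting, old_name_len is a made-up
wrapper, and a "p > s" guard is added so the sketch never reads before
the start of the string.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for find_char_unquote(): return a pointer to the
   first colon, space or tab at or after P, or NULL if there is none.  */
static const char *
find_stop (const char *p)
{
  p += strcspn (p, ": \t");
  return *p ? p : NULL;
}

/* Length of the first name in STR, using the same loop condition as the
   existing DOS-path code.  */
static size_t
old_name_len (const char *str)
{
  const char *s = str;
  const char *p = find_stop (s);

  /* Keep skipping every "<letter>:/" or "<letter>:\", wherever it occurs.  */
  while (p != NULL && p > s && !isspace ((unsigned char) *p)
         && (p[1] == '\\' || p[1] == '/')
         && isalpha ((unsigned char) p[-1]))
    p = find_stop (p + 1);

  return p ? (size_t) (p - s) : strlen (s);
}
```

Under these assumptions, old_name_len ("c:/foo:/bar") returns 11, i.e.
the whole made-up string is taken as one name, while
old_name_len ("c:/foo:bar") returns 6 and stops at the second colon.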

Why wouldn't the correct algorithm be: if we stopped due to a drive
specifier (the pathname starts with "[A-Za-z]:") then look once more
until the next stopchar and then we're done?  E.g., I would think it
should look something like:

    /* If we stopped due to a drive specifier, skip it.
       Tokens separated by spaces are treated as separate paths since make
       doesn't allow path names with spaces */
    if (p && p == s+1 && p[0] == ':' && isalpha ((unsigned char)s[0]))
        p = find_char_unquote (p+1, stopmap|MAP_VMSCOMMA|MAP_BLANK);
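
The proposed check can be exercised the same way.  Again this is only a
sketch under stated assumptions: find_stop is a hypothetical stand-in
for find_char_unquote that stops at a colon or a blank and ignores
backslash quoting, and new_name_len is a made-up wrapper name.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for find_char_unquote(): return a pointer to the
   first colon, space or tab at or after P, or NULL if there is none.  */
static const char *
find_stop (const char *p)
{
  p += strcspn (p, ": \t");
  return *p ? p : NULL;
}

/* Length of the first name in STR, skipping at most one leading drive
   specifier of the form "<letter>:".  */
static size_t
new_name_len (const char *str)
{
  const char *s = str;
  const char *p = find_stop (s);

  /* Only a colon in the second position, after a letter, is a drive
     specifier; search once more and accept whatever stops us next.  */
  if (p && p == s + 1 && p[0] == ':' && isalpha ((unsigned char) s[0]))
    p = find_stop (p + 1);

  return p ? (size_t) (p - s) : strlen (s);
}
```

With the made-up inputs "c:/foo:/bar" and "c:foo:bar" this returns 6 and
5 respectively, so the name ends at the second colon in both cases; for
"foo:bar" it returns 3, since the first colon is not a drive specifier.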

Note that this doesn't require the drive specifier to be followed by a
slash/backslash: e.g., this:


Breaks down as:

