bug-cvs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [bug-gnulib] New GNULIB glob module?


From: Derek Price
Subject: Re: [bug-gnulib] New GNULIB glob module?
Date: Wed, 01 Jun 2005 12:19:34 -0400
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)

Paul Eggert wrote:

>Derek Price <derek@ximbiot.com> writes:
>
>  
>
>>   1. Corrects an incorrect check for a successful return from
>>      getlogin_r to assume only 0 means success, per the POSIX2 spec:
>>      
>> <http://www.opengroup.org/onlinepubs/009695399/functions/getlogin_r.html>.
>>   2. Moves the check for GLOB_MARK directory status (and the append of
>>      `/') into glob_in_dir, where it is more efficient than performing
>>      a second pass and sometimes calling stat a second time on each
>>      file or directory.  All calls to stat are avoided when
>>      dirent->d_type is available.  No call to realloc of the directory
>>      name is ever necessary since room for the slash can be allocated
>>      in the first pass.
>>    
>>
>
>These changes sound reasonable, though we should submit them as
>separate patches.
>  
>

The attached  glibc-understand-getlogin_r.diff may come off as a little
fuzzy, but it should work for (1).  I've also reattached the patch for 2
& 3 minus the above chunk, as glibc-glob-list-links2.diff.

>Is (2) independent of (3)?  (Please see below for why this is important.)
>  
>

(2) could be separated with a little more work, but lets finish the
discussion below before I provide a second patch.

>>   3. Ignores broken links only when GLOB_ONLYDIR is set.  With glibc
>>      versions 2.3.3 through 2.3.5, the following in an empty directory
>>      would return nothing:
>>
>>          ln -s doesnt-exist linkname
>>          glob ("*", ...)
>>
>>      This fix syncs with the comments in the file, syncs with the
>>      POSIX2 spec, restores the pre-glibc-2.3.3 behavior, and simply
>>      makes more sense - why should `ls *' fail to list broken links?
>>    
>>
>
>This change sounds controversial to me.  glibc 2.3.5 behaves similarly
>  
>

It may be.  It looks like the change was intentional
(http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/generic/glob.c?rev=1.52&content-type=text/x-cvsweb-markup&cvsroot=glibc),
but I still disagree.

>to Solaris 8 and to Solaris 10 -- I just checked, with the following
>program and with the working directory containing only a dangling
>symlink:
>
>  #include <glob.h>
>  #include <stdio.h>
>
>  int
>  main (void)
>  {
>    glob_t g;
>    int r = glob ("*", 0, NULL, &g);
>    int i;
>
>    if (r != 0)
>      {
>        fprintf (stderr, "glob failed (%s)\n",
>                 r == GLOB_NOMATCH ? "GLOB_NOMATCH"
>                 : r == GLOB_NOSPACE ? "GLOB_NOSPACE"
>                 : "other glob failure");
>        return 1;
>      }
>
>    for (i = 0; i < g.gl_pathc; i++)
>      puts (g.gl_pathv[i]);
>    return 0;
>  }
>
>Solaris 8 and 10 both report "glob failed (GLOB_NOMATCH)".
>
>Let's separate (3) into a separate patch and think about it more
>carefully before submitting it.
>  
>

This would involve creating at least one new function for (2) which
would just need to be removed again for (3), and the POSIX2 glob spec
states that glob returns GLOB_NOMATCH only when, "the pattern does not
match any existing pathname, and GLOB_NOCHECK was not set in flags." 
(http://www.opengroup.org/onlinepubs/009695399/functions/glob.html)

Nowhere does the POSIX2 glob spec specify that a broken symlink should
not be considered an "existing pathname".  After all, the link exists,
only its target does not.  Interpreted in this way, glob cannot be used
by a program which, for instance, wished to verify that all links
matching a pattern had valid targets since the broken links would not be
returned by glob.

If glob is only going to consider a link as if it is its target, then
what about the case where a matching link points to a file in the same
directory that also matches the pattern?  Should glob only return one or
the other?

Perhaps a GNU extension similar to GLOB_ONLYDIR is in order
(GLOB_VERIFY_SYMLINKS?), but I do not think glob should be making these
value judgements when the user did not request it.  It certainly does
not appear to be implied by the POSIX spec as I read it.

>Have you investigated with FreeBSD glob does?  
>

No, but I ran your test program on NetBSD 1.6.1 and it does return the
broken symlink.

I copied and pasted most of my above commentary on why this should be
considered a bug into
<https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=126460>, where I
found the discussion that apparently caused the original "bug fix" that
broke this.

Cheers,

Derek

--- ../glob-glibc2gnulib/lib/glob.c     2005-05-31 18:12:01.000000000 -0400
+++ lib/glob.c  2005-05-27 13:52:28.000000000 -0400
@@ -538,7 +539,7 @@ glob (const char *pattern, int flags,
                buflen = 20;
              name = __alloca (buflen);
 
-             success = getlogin_r (name, buflen) >= 0;
+             success = getlogin_r (name, buflen) == 0;
              if (success)
                {
                  struct passwd *p;
--- ../glob-glibc2gnulib/lib/glob.c     2005-05-31 18:12:01.000000000 -0400
+++ lib/glob.c  2005-05-27 13:52:28.000000000 -0400
@@ -175,6 +175,7 @@
 # define __glob_pattern_p      glob_pattern_p
 #endif /* _LIBC */
 
+#include <stdbool.h>
 #include <fnmatch.h>
 
 #ifdef _SC_GETPW_R_SIZE_MAX
@@ -868,35 +869,6 @@ glob (const char *pattern, int flags,
        }
     }
 
-  if (flags & GLOB_MARK)
-    {
-      /* Append slashes to directory names.  */
-      size_t i;
-      struct stat st;
-#ifdef HAVE_STAT64
-      struct stat64 st64;
-#endif
-
-      for (i = oldcount; i < pglob->gl_pathc + pglob->gl_offs; ++i)
-       if (((flags & GLOB_ALTDIRFUNC)
-            ? ((*pglob->gl_stat) (pglob->gl_pathv[i], &st) == 0
-               && S_ISDIR (st.st_mode))
-            : (__stat64 (pglob->gl_pathv[i], &st64) == 0
-               && S_ISDIR (st64.st_mode))))
-         {
-           size_t len = strlen (pglob->gl_pathv[i]) + 2;
-           char *new = realloc (pglob->gl_pathv[i], len);
-           if (new == NULL)
-             {
-               globfree (pglob);
-               pglob->gl_pathc = 0;
-               return GLOB_NOSPACE;
-             }
-           strcpy (&new[len - 2], "/");
-           pglob->gl_pathv[i] = new;
-         }
-    }
-
   if (!(flags & GLOB_NOSORT))
     {
       /* Sort the vector.  */
@@ -1054,9 +1026,9 @@ weak_alias (__glob_pattern_p, glob_patte
 /* We put this in a separate function mainly to allow the memory
    allocated with alloca to be recycled.  */
 #if !defined _LIBC || !defined GLOB_ONLY_P
-static int
-link_exists_p (const char *dir, size_t dirlen, const char *fname,
-              glob_t *pglob, int flags)
+static bool
+is_dir_p (const char *dir, size_t dirlen, const char *fname,
+         glob_t *pglob, int flags)
 {
   size_t fnamelen = strlen (fname);
   char *fullname = __alloca (dirlen + 1 + fnamelen + 1);
@@ -1066,9 +1038,9 @@ link_exists_p (const char *dir, size_t d
   mempcpy (mempcpy (mempcpy (fullname, dir, dirlen), "/", 1),
           fname, fnamelen + 1);
 
-  return (((flags & GLOB_ALTDIRFUNC)
-          ? (*pglob->gl_stat) (fullname, &st)
-          : __stat64 (fullname, &st64)) == 0);
+  return ((flags & GLOB_ALTDIRFUNC)
+         ? (*pglob->gl_stat) (fullname, &st) == 0 && S_ISDIR (st.st_mode)
+         : __stat64 (fullname, &st64) == 0 && S_ISDIR (st64.st_mode));
 }
 #endif
 
@@ -1211,20 +1183,35 @@ glob_in_dir (const char *pattern, const 
 
                  if (fnmatch (pattern, name, fnm_flags) == 0)
                    {
-                     /* If the file we found is a symlink we have to
-                        make sure the target file exists.  */
-                     if (
-                         !DIRENT_MIGHT_BE_SYMLINK (d) ||
-                         link_exists_p (directory, dirlen, name, pglob,
-                                        flags))
+                     /* ISDIR will often be incorrectly set to false
+                        when not in GLOB_ONLYDIR || GLOB_MARK mode, but we
+                        don't care.  It won't be used and we save the
+                        expensive call to stat.  */
+                     int need_dir_test =
+                       (GLOB_MARK | (DIRENT_MIGHT_BE_SYMLINK (d)
+                                     ? GLOB_ONLYDIR : 0));
+                     bool isdir = (DIRENT_MUST_BE (d, DT_DIR)
+                                   || ((flags & need_dir_test)
+                                       && is_dir_p (directory, dirlen, name,
+                                                    pglob, flags)));
+
+                     /* In GLOB_ONLYDIR mode, skip non-dirs.  */
+                     if ((flags & GLOB_ONLYDIR) && !isdir)
+                         continue;
+
                        {
                          struct globlink *new =
                            __alloca (sizeof (struct globlink));
+                         char *p;
                          len = NAMLEN (d);
-                         new->name = malloc (len + 1);
+                         new->name =
+                           malloc (len + 1 + ((flags & GLOB_MARK) && isdir));
                          if (new->name == NULL)
                            goto memory_error;
-                         *((char *) mempcpy (new->name, name, len)) = '\0';
+                         p = mempcpy (new->name, name, len);
+                         if ((flags & GLOB_MARK) && isdir)
+                             *p++ = '/';
+                         *p = '\0';
                          new->next = names;
                          names = new;
                          ++nfound;

reply via email to

[Prev in Thread] Current Thread [Next in Thread]