bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: mgetgroups improvements


From: Eric Blake
Subject: Re: mgetgroups improvements
Date: Fri, 04 Dec 2009 19:55:27 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.23) Gecko/20090812 Thunderbird/2.0.0.23 Mnenhy/0.7.6.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Pádraig Brady on 12/4/2009 7:14 PM:
>>> Why does the first element get extra checks.
>>> Is it more likely to be duplicated?
>> Yes, for two reasons.  One, we pre-insert an arbitrary gid_t to the front..
>> Two, some OS's actually list the effective gid twice, once at the front..
> 
> Ah OK, a comment along those lines would be useful.

Good thinking.  I've pushed the first two, and here's a respin of the
third for further review.

> :) May I suggest this comment.
> gid_t *last = g + ng; /* sentinel after the last element */

Or even s/last/sentinel/g

For what it's worth, here's one reason why this patch is worthwhile:
$ id --version | head -n1
id (GNU coreutils) 6.10
$ id
uid=1007(eblake) gid=513(None)
groups=0(root),513(None),544(Administrators),545(Users)
$ src/id --version | head -n1
id (GNU coreutils) 8.1.20-9b4b38
$ src/id
uid=1007(eblake) gid=513(None)
groups=513(None),0(root),513(None),544(Administrators),545(Users)
$ src/id -G
513 0 544 545

So, there has been a change in behavior between coreutils 6.10 and current
coreutils (without the duplicate removal), which lists the effective gid
twice if it was not first in the getgroups() return, and which is evident
on my machine.  Yet, when using id -G, the duplicate is already removed
(id -G goes through a slightly different path than id).  So, for
consistency, applying this patch would make 'id' and 'id -G' both list the
effective gid 513 first, and only once among the entire list.

By the way, I can't see anything in POSIX that forbids duplicates in the
id(1) output, but needlessly listing extra entries is certainly a QoI issue.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksZy58ACgkQ84KuGfSFAYDOQgCfQNgtnzFFDCvyxMJeTqHGQZQ9
HJwAn25qKWyFOLIFbugFdHBtNHBQ4dyR
=kL2k
-----END PGP SIGNATURE-----
>From bd1a8af70d082f6fafaebab3668b5c4131302580 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Fri, 4 Dec 2009 14:07:58 -0700
Subject: [PATCH] mgetgroups: reduce duplicate listings

POSIX doesn't guarantee whether the effective gid is included in
the list of supplementary groups returned by getgroups.  On the
other hand, some platforms include the effective gid twice in
the list.  Meanwhile, mgetgroups can independently add a duplicate.
Rather than spend a full-blown O(n log n) cleanup, we just remove
the most common forms of duplicate groups with an O(n) pass.

* lib/mgetgroups.c (mgetgroups): Reduce duplicates from the
resulting array.
* tests/test-chown.h (test_chown): Simplify client.
* tests/test-lchown.h (test_lchown): Likewise.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog           |    6 ++++++
 lib/mgetgroups.c    |   34 +++++++++++++++++++++++++++++++++-
 tests/test-chown.h  |   27 +++++++++------------------
 tests/test-lchown.h |   33 ++++++++++++---------------------
 4 files changed, 60 insertions(+), 40 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 48e1026..19ba980 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2009-12-04  Eric Blake  <address@hidden>

+       mgetgroups: reduce duplicate listings
+       * lib/mgetgroups.c (mgetgroups): Reduce duplicates from the
+       resulting array.
+       * tests/test-chown.h (test_chown): Simplify client.
+       * tests/test-lchown.h (test_lchown): Likewise.
+
        mgetgroups: add xgetgroups, and avoid ENOSYS failures
        * lib/mgetgroups.h (xgetgroups): New prototype.
        * lib/mgetgroups.c (xgetgroups): New wrapper.
diff --git a/lib/mgetgroups.c b/lib/mgetgroups.c
index 7a61db4..5ca450f 100644
--- a/lib/mgetgroups.c
+++ b/lib/mgetgroups.c
@@ -53,7 +53,8 @@ realloc_groupbuf (gid_t *g, size_t num)
    NULL, store the supplementary groups of the current process, and GID
    should be -1 or the effective group ID (getegid).  Upon failure,
    don't modify *GROUPS, set errno, and return -1.  Otherwise, return
-   the number of groups.  */
+   the number of groups.  The resulting list may contain duplicates,
+   but adjacent members will be distinct.  */

 int
 mgetgroups (char const *username, gid_t gid, gid_t **groups)
@@ -157,6 +158,37 @@ mgetgroups (char const *username, gid_t gid, gid_t 
**groups)
       ng++;
     }
   *groups = g;
+
+  /* Reduce the number of duplicates.  On some systems, getgroups
+     returns the effective gid twice: once as the first element, and
+     once in its position within the supplementary groups.  On other
+     systems, getgroups does not return the effective gid at all,
+     which is why we provide a GID argument.  Meanwhile, the GID
+     argument, if provided, is typically any member of the
+     supplementary groups, and not necessarily the effective gid.  So,
+     the most likely duplicates are the first element with an
+     arbitrary other element, or pair-wise duplication between the
+     first and second elements returned by getgroups.  It is possible
+     that this O(n) pass will not remove all duplicates, but it is not
+     worth the effort to slow down to an O(n log n) algorithm that
+     sorts the array in place, nor the extra memory needed for
+     duplicate removal via an O(n) hash-table.  Hence, this function
+     is only documented as guaranteeing no pair-wise duplicates,
+     rather than returning the minimal set.  */
+  {
+    gid_t first = *g;
+    gid_t *next;
+    gid_t *sentinel = g + ng;
+
+    for (next = g + 1; next < sentinel; next++)
+      {
+        if (*next == first || *next == *g)
+          ng--;
+        else
+          *++g = *next;
+      }
+  }
+
   return ng;
 }

diff --git a/tests/test-chown.h b/tests/test-chown.h
index 098b77d..c4e652a 100644
--- a/tests/test-chown.h
+++ b/tests/test-chown.h
@@ -170,21 +170,12 @@ test_chown (int (*func) (char const *, uid_t, gid_t), 
bool print)
      changing group ownership of a file we own.  If we belong to at
      least two groups, then verifying the correct change is simple.
      But if we belong to only one group, then we fall back on the
-     other observable effect of chown: the ctime must be updated.
-     Be careful of duplicates returned by getgroups.  */
-  gids_count = mgetgroups (NULL, -1, &gids);
-  if (2 <= gids_count && gids[0] == gids[1] && 2 < gids_count--)
-    gids[1] = gids[2];
-  if (1 < gids_count || (gids_count == 1 && gids[0] != st1.st_gid))
+     other observable effect of chown: the ctime must be updated.  */
+  gids_count = mgetgroups (NULL, st1.st_gid, &gids);
+  if (1 < gids_count)
     {
-      if (gids[0] == st1.st_gid)
-        {
-          ASSERT (1 < gids_count);
-          ASSERT (gids[0] != gids[1]);
-          gids[0] = gids[1];
-        }
-      ASSERT (gids[0] != st1.st_gid);
-      ASSERT (gids[0] != (gid_t) -1);
+      ASSERT (gids[1] != st1.st_gid);
+      ASSERT (gids[1] != (gid_t) -1);
       ASSERT (lstat (BASE "dir/link", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);
@@ -193,7 +184,7 @@ test_chown (int (*func) (char const *, uid_t, gid_t), bool 
print)
       ASSERT (st1.st_gid == st2.st_gid);

       errno = 0;
-      ASSERT (func (BASE "dir/link2/", -1, gids[0]) == -1);
+      ASSERT (func (BASE "dir/link2/", -1, gids[1]) == -1);
       ASSERT (errno == ENOTDIR);
       ASSERT (stat (BASE "dir/file", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
@@ -205,10 +196,10 @@ test_chown (int (*func) (char const *, uid_t, gid_t), 
bool print)
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);

-      ASSERT (func (BASE "dir/link2", -1, gids[0]) == 0);
+      ASSERT (func (BASE "dir/link2", -1, gids[1]) == 0);
       ASSERT (stat (BASE "dir/file", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
-      ASSERT (gids[0] == st2.st_gid);
+      ASSERT (gids[1] == st2.st_gid);
       ASSERT (lstat (BASE "dir/link", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);
@@ -238,7 +229,7 @@ test_chown (int (*func) (char const *, uid_t, gid_t), bool 
print)
       ASSERT (l2.st_ctime == st2.st_ctime);
       ASSERT (get_stat_ctime_ns (&l2) == get_stat_ctime_ns (&st2));

-      ASSERT (func (BASE "dir/link2", -1, gids[0]) == 0);
+      ASSERT (func (BASE "dir/link2", -1, st1.st_gid) == 0);
       ASSERT (stat (BASE "dir/file", &st2) == 0);
       ASSERT (st1.st_ctime < st2.st_ctime
               || (st1.st_ctime == st2.st_ctime
diff --git a/tests/test-lchown.h b/tests/test-lchown.h
index 43690ac..93a657c 100644
--- a/tests/test-lchown.h
+++ b/tests/test-lchown.h
@@ -189,21 +189,12 @@ test_lchown (int (*func) (char const *, uid_t, gid_t), 
bool print)
      changing group ownership of a file we own.  If we belong to at
      least two groups, then verifying the correct change is simple.
      But if we belong to only one group, then we fall back on the
-     other observable effect of lchown: the ctime must be updated.
-     Be careful of duplicates returned by getgroups.  */
-  gids_count = mgetgroups (NULL, -1, &gids);
-  if (2 <= gids_count && gids[0] == gids[1] && 2 < gids_count--)
-    gids[1] = gids[2];
-  if (1 < gids_count || (gids_count == 1 && gids[0] != st1.st_gid))
+     other observable effect of lchown: the ctime must be updated.  */
+  gids_count = mgetgroups (NULL, st1.st_gid, &gids);
+  if (1 < gids_count)
     {
-      if (gids[0] == st1.st_gid)
-        {
-          ASSERT (1 < gids_count);
-          ASSERT (gids[0] != gids[1]);
-          gids[0] = gids[1];
-        }
-      ASSERT (gids[0] != st1.st_gid);
-      ASSERT (gids[0] != (gid_t) -1);
+      ASSERT (gids[1] != st1.st_gid);
+      ASSERT (gids[1] != (gid_t) -1);
       ASSERT (lstat (BASE "dir/link", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);
@@ -212,7 +203,7 @@ test_lchown (int (*func) (char const *, uid_t, gid_t), bool 
print)
       ASSERT (st1.st_gid == st2.st_gid);

       errno = 0;
-      ASSERT (func (BASE "dir/link2/", -1, gids[0]) == -1);
+      ASSERT (func (BASE "dir/link2/", -1, gids[1]) == -1);
       ASSERT (errno == ENOTDIR);
       ASSERT (stat (BASE "dir/file", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
@@ -224,7 +215,7 @@ test_lchown (int (*func) (char const *, uid_t, gid_t), bool 
print)
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);

-      ASSERT (func (BASE "dir/link2", -1, gids[0]) == 0);
+      ASSERT (func (BASE "dir/link2", -1, gids[1]) == 0);
       ASSERT (stat (BASE "dir/file", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);
@@ -233,7 +224,7 @@ test_lchown (int (*func) (char const *, uid_t, gid_t), bool 
print)
       ASSERT (st1.st_gid == st2.st_gid);
       ASSERT (lstat (BASE "dir/link2", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
-      ASSERT (gids[0] == st2.st_gid);
+      ASSERT (gids[1] == st2.st_gid);

       /* Trailing slash follows through to directory.  */
       ASSERT (lstat (BASE "dir/link3", &st2) == 0);
@@ -243,13 +234,13 @@ test_lchown (int (*func) (char const *, uid_t, gid_t), 
bool print)
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);

-      ASSERT (func (BASE "dir/link3/", -1, gids[0]) == 0);
+      ASSERT (func (BASE "dir/link3/", -1, gids[1]) == 0);
       ASSERT (lstat (BASE "dir/link3", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
       ASSERT (st1.st_gid == st2.st_gid);
       ASSERT (lstat (BASE "dir/sub", &st2) == 0);
       ASSERT (st1.st_uid == st2.st_uid);
-      ASSERT (gids[0] == st2.st_gid);
+      ASSERT (gids[1] == st2.st_gid);
     }
   else if (!CHOWN_CHANGE_TIME_BUG || HAVE_LCHMOD)
     {
@@ -275,7 +266,7 @@ test_lchown (int (*func) (char const *, uid_t, gid_t), bool 
print)
       ASSERT (l2.st_ctime == st2.st_ctime);
       ASSERT (get_stat_ctime_ns (&l2) == get_stat_ctime_ns (&st2));

-      ASSERT (func (BASE "dir/link2", -1, gids[0]) == 0);
+      ASSERT (func (BASE "dir/link2", -1, st1.st_gid) == 0);
       ASSERT (stat (BASE "dir/file", &st2) == 0);
       ASSERT (st1.st_ctime == st2.st_ctime);
       ASSERT (get_stat_ctime_ns (&st1) == get_stat_ctime_ns (&st2));
@@ -291,7 +282,7 @@ test_lchown (int (*func) (char const *, uid_t, gid_t), bool 
print)
       ASSERT (lstat (BASE "dir/sub", &st1) == 0);
       ASSERT (lstat (BASE "dir/link3", &l1) == 0);
       nap ();
-      ASSERT (func (BASE "dir/link3/", -1, gids[0]) == 0);
+      ASSERT (func (BASE "dir/link3/", -1, st1.st_gid) == 0);
       ASSERT (lstat (BASE "dir/link3", &st2) == 0);
       ASSERT (l1.st_ctime == st2.st_ctime);
       ASSERT (get_stat_ctime_ns (&l1) == get_stat_ctime_ns (&st2));
-- 
1.6.5.rc1


reply via email to

[Prev in Thread] Current Thread [Next in Thread]