bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: hash resizing bug


From: Eric Blake
Subject: Re: hash resizing bug
Date: Wed, 17 Jun 2009 13:02:42 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.21) Gecko/20090302 Thunderbird/2.0.0.21 Mnenhy/0.7.6.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Eric Blake on 6/17/2009 12:21 PM:
> Before the rehash, about 20% of the buckets were empty.  But after changing 
> the 
> prime factor, the hasher function scatters better, and _every single bucket_ 
> is 
> occupied.  Which means ALL subsequent calls to hash_insert will trigger an 
> overflow, rather than inserting into a bucket head.  And since the resize 
> checking logic in hash_rehash is only triggered after insertion into a bucket 
> head, the hash table has effectively been locked into a fixed size that will 
> never grow again, no matter how many hash_insert calls are made, even though 
> the table is certainly exceeding the requested threshold of 20% empty 
> buckets.  
> Ouch.

I'm now playing with this simple code motion patch, also available at:
http://repo.or.cz/w/gnulib/ericb.git
$ git pull git://repo.or.cz/gnulib/ericb.git master

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAko5PdIACgkQ84KuGfSFAYBqZgCeOqttX/POCKvJXLBVubjaee9M
Tr4AoIMsLpf+3LELwthAN6QK3cUpo/9y
=I5iP
-----END PGP SIGNATURE-----
>From d3edc0d1e302af613e8b56396f1f9d0bec15a264 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Wed, 17 Jun 2009 13:01:41 -0600
Subject: [PATCH] hash: check for resize before insertion

* lib/hash.c (hash_insert): Check whether bucket usage exceeds
threshold before insertion, so that a pathological hash_rehash
that fills every bucket can still trigger another rehash.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog  |    5 +++++
 lib/hash.c |   52 ++++++++++++++++++++++++++--------------------------
 2 files changed, 31 insertions(+), 26 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1f57b7e..38c1d45 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2009-06-17  Eric Blake  <address@hidden>

+       hash: check for resize before insertion
+       * lib/hash.c (hash_insert): Check whether bucket usage exceeds
+       threshold before insertion, so that a pathological hash_rehash
+       that fills every bucket can still trigger another rehash.
+
        hash: manage bucket overflows more efficiently
        * lib/hash.c [USE_OBSTACK]: Delete; the optimizations provided by
        an obstack have been folded in to mainline code.
diff --git a/lib/hash.c b/lib/hash.c
index 9d78990..e4310b5 100644
--- a/lib/hash.c
+++ b/lib/hash.c
@@ -999,32 +999,6 @@ hash_insert (Hash_table *table, const void *entry)
   if ((data = hash_find_entry (table, entry, &bucket, false)) != NULL)
     return data;

-  /* ENTRY is not matched, it should be inserted.  */
-
-  if (bucket->data)
-    {
-      struct hash_entry *new_entry = allocate_entry (table);
-
-      if (new_entry == NULL)
-       return NULL;
-      if (table->overflow_size <= new_entry - table->overflow)
-       abort ();
-
-      /* Add ENTRY in the overflow of the bucket.  */
-
-      new_entry->data = (void *) entry;
-      new_entry->next = bucket->next;
-      bucket->next = new_entry - table->overflow;
-      table->n_entries++;
-      return (void *) entry;
-    }
-
-  /* Add ENTRY right in the bucket head.  */
-
-  bucket->data = (void *) entry;
-  table->n_entries++;
-  table->n_buckets_used++;
-
   /* If the growth threshold of the buckets in use has been reached, increase
      the table size and rehash.  There's no point in checking the number of
      entries:  if the hashing function is ill-conditioned, rehashing is not
@@ -1057,6 +1031,32 @@ hash_insert (Hash_table *table, const void *entry)
        }
     }

+  /* ENTRY is not matched, it should be inserted.  */
+
+  if (bucket->data)
+    {
+      struct hash_entry *new_entry = allocate_entry (table);
+
+      if (new_entry == NULL)
+       return NULL;
+      if (table->overflow_size <= new_entry - table->overflow)
+       abort ();
+
+      /* Add ENTRY in the overflow of the bucket.  */
+
+      new_entry->data = (void *) entry;
+      new_entry->next = bucket->next;
+      bucket->next = new_entry - table->overflow;
+      table->n_entries++;
+      return (void *) entry;
+    }
+
+  /* Add ENTRY right in the bucket head.  */
+
+  bucket->data = (void *) entry;
+  table->n_entries++;
+  table->n_buckets_used++;
+
   return (void *) entry;
 }

-- 
1.6.3.rc3.2.g4b51


reply via email to

[Prev in Thread] Current Thread [Next in Thread]