[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: hash resizing bug
From: |
Eric Blake |
Subject: |
Re: hash resizing bug |
Date: |
Wed, 17 Jun 2009 13:02:42 -0600 |
User-agent: |
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.21) Gecko/20090302 Thunderbird/2.0.0.21 Mnenhy/0.7.6.666 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
According to Eric Blake on 6/17/2009 12:21 PM:
> Before the rehash, about 20% of the buckets were empty. But after changing
> the
> prime factor, the hasher function scatters better, and _every single bucket_
> is
> occupied. Which means ALL subsequent calls to hash_insert will trigger an
> overflow, rather than inserting into a bucket head. And since the resize
> checking logic in hash_rehash is only triggered after insertion into a bucket
> head, the hash table has effectively been locked into a fixed size that will
> never grow again, no matter how many hash_insert calls are made, even though
> the table is certainly exceeding the requested threshold of 20% empty
> buckets.
> Ouch.
I'm now playing with this simple code motion patch, also available at:
http://repo.or.cz/w/gnulib/ericb.git
$ git pull git://repo.or.cz/gnulib/ericb.git master
- --
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAko5PdIACgkQ84KuGfSFAYBqZgCeOqttX/POCKvJXLBVubjaee9M
Tr4AoIMsLpf+3LELwthAN6QK3cUpo/9y
=I5iP
-----END PGP SIGNATURE-----
>From d3edc0d1e302af613e8b56396f1f9d0bec15a264 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Wed, 17 Jun 2009 13:01:41 -0600
Subject: [PATCH] hash: check for resize before insertion
* lib/hash.c (hash_insert): Check whether bucket usage exceeds
threshold before insertion, so that a pathological hash_rehash
that fills every bucket can still trigger another rehash.
Signed-off-by: Eric Blake <address@hidden>
---
ChangeLog | 5 +++++
lib/hash.c | 52 ++++++++++++++++++++++++++--------------------------
2 files changed, 31 insertions(+), 26 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 1f57b7e..38c1d45 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
2009-06-17 Eric Blake <address@hidden>
+ hash: check for resize before insertion
+ * lib/hash.c (hash_insert): Check whether bucket usage exceeds
+ threshold before insertion, so that a pathological hash_rehash
+ that fills every bucket can still trigger another rehash.
+
hash: manage bucket overflows more efficiently
* lib/hash.c [USE_OBSTACK]: Delete; the optimizations provided by
an obstack have been folded in to mainline code.
diff --git a/lib/hash.c b/lib/hash.c
index 9d78990..e4310b5 100644
--- a/lib/hash.c
+++ b/lib/hash.c
@@ -999,32 +999,6 @@ hash_insert (Hash_table *table, const void *entry)
if ((data = hash_find_entry (table, entry, &bucket, false)) != NULL)
return data;
- /* ENTRY is not matched, it should be inserted. */
-
- if (bucket->data)
- {
- struct hash_entry *new_entry = allocate_entry (table);
-
- if (new_entry == NULL)
- return NULL;
- if (table->overflow_size <= new_entry - table->overflow)
- abort ();
-
- /* Add ENTRY in the overflow of the bucket. */
-
- new_entry->data = (void *) entry;
- new_entry->next = bucket->next;
- bucket->next = new_entry - table->overflow;
- table->n_entries++;
- return (void *) entry;
- }
-
- /* Add ENTRY right in the bucket head. */
-
- bucket->data = (void *) entry;
- table->n_entries++;
- table->n_buckets_used++;
-
/* If the growth threshold of the buckets in use has been reached, increase
the table size and rehash. There's no point in checking the number of
entries: if the hashing function is ill-conditioned, rehashing is not
@@ -1057,6 +1031,32 @@ hash_insert (Hash_table *table, const void *entry)
}
}
+ /* ENTRY is not matched, it should be inserted. */
+
+ if (bucket->data)
+ {
+ struct hash_entry *new_entry = allocate_entry (table);
+
+ if (new_entry == NULL)
+ return NULL;
+ if (table->overflow_size <= new_entry - table->overflow)
+ abort ();
+
+ /* Add ENTRY in the overflow of the bucket. */
+
+ new_entry->data = (void *) entry;
+ new_entry->next = bucket->next;
+ bucket->next = new_entry - table->overflow;
+ table->n_entries++;
+ return (void *) entry;
+ }
+
+ /* Add ENTRY right in the bucket head. */
+
+ bucket->data = (void *) entry;
+ table->n_entries++;
+ table->n_buckets_used++;
+
return (void *) entry;
}
--
1.6.3.rc3.2.g4b51
- alt hash implementation (was: [PATCH] hash: declare some functions with the warn_unused_result attribute), (continued)
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/18
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Jim Meyering, 2009/06/17
- hash_rehash allocatino failure (was: [PATCH] hash: declare some functions with the warn_unused_result attribute), Eric Blake, 2009/06/17
- Re: hash_rehash allocatino failure, Jim Meyering, 2009/06/17
- Re: hash_rehash allocatino failure, Eric Blake, 2009/06/18
- Re: hash_rehash allocation failure, Eric Blake, 2009/06/19
- hash resizing bug (was: [PATCH] hash: declare some functions with the warn_unused_result attribute), Eric Blake, 2009/06/17
- Re: hash resizing bug,
Eric Blake <=
- Re: hash resizing bug, Eric Blake, 2009/06/17
- Re: hash resizing bug, Jim Meyering, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/17
- Re: hash resizing bug, Jim Meyering, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/17
- Re: hash resizing bug, Jim Meyering, 2009/06/18
- Re: hash resizing bug, Jim Meyering, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/18
- Re: hash resizing bug, Jim Meyering, 2009/06/18
- Re: hash resizing bug, Jim Meyering, 2009/06/18