bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: hash resizing bug


From: Eric Blake
Subject: Re: hash resizing bug
Date: Thu, 18 Jun 2009 17:10:41 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)

Jim Meyering <jim <at> meyering.net> writes:

> Unless I become exceptionally unresponsive,
> please post before pushing changes to hash.c.
> 
> I would have requested that you not mix clean-up like this with
> a semantics-changing change set: 

Sorry for jumping the gun on that.  I'll be a little more patient for the rest 
of my series, then.

> > + hash: provide default callback functions
> > + * lib/hash.c (raw_hasher, raw_comparator): New functions.
> > + (hash_initialize): Use them as defaults.
> > + * tests/test-hash.c (main): Test this.
> 
> This looks fine, too.
> Yes, I've often hashed pointer values.
> However, note that the low bits of a pointer are often zero,
> so that for small table sizes, this may be a very poor choice.

Agreed (and these days, pointers are usually aligned to at least 8 bytes, not 
just 4).

> size_t
> hash_address (const void *addr, size_t table_size)
> {
>   size_t k = (size_t) addr;
>   assert (k % 4 == 0);
>   return (k / 4) % table_size;
> }

That discards bits, in the case where non-malloced values are being used.  The 
assertion is nice when you know that all values are malloc'd, but is harsh for 
a default.  So, how about I squash this slight modification into my patch 
before pushing this commit?

diff --git i/lib/hash.c w/lib/hash.c
index 52a211e..c9866c5 100644
--- i/lib/hash.c
+++ w/lib/hash.c
@@ -483,7 +483,14 @@ hash_reset_tuning (Hash_tuning *tuning)
 static size_t
 raw_hasher (const void *data, size_t n)
 {
-  return (size_t) data % n;
+  /* When hashing unique pointers, it is often the case that they were
+     generated by malloc and thus have the property that the low-order
+     bits are 0.  As this tends to give poorer performance with small
+     tables, we rotate the pointer value before performing division,
+     in an attempt to improve hash quality.  */
+  size_t val = data;
+  val = (val >> 3) | (val << (CHAR_BIT * sizeof val - 3));
+  return val % n;
 }

 /* If the user passes a NULL comparator, we use pointer comparison.  */







reply via email to

[Prev in Thread] Current Thread [Next in Thread]