bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: hash resizing bug


From: Jim Meyering
Subject: Re: hash resizing bug
Date: Thu, 18 Jun 2009 16:18:56 +0200

Eric Blake wrote:

> According to Jim Meyering on 6/17/2009 1:21 PM:
>>> is not quite right.  A hashing function may be somewhat ill-conditioned for 
>>> one
>>> bucket size, but much better conditioned for another size.  At any rate, it
>>> looks like the correct fix is to check for rehashing thresholds based on 
>>> number
>>> of entries, not number of buckets, and that this check needs to happen on 
>>> every
>>> insertion, not just insertions that land in an empty bucket.  Or, at the 
>>> very
>>> least, check for threshold PRIOR to insertion, rather than after, such that 
>>> in
>>> the above case with m4, the very next call to hash_insert would notice the
>>> table was at 100%, and force another resize above 907 buckets.
>>>
>>> Thoughts, before I implement something along these lines?
>>
>> Sounds good.
>
> Hey - I just noticed that this also fixed the bug where hash_insert was
> inconsistent on NULL return whether an entry had been inserted or not.
> Now, you are guaranteed that a NULL return means the entry was not inserted.
>
> Here's two simple patches.  I'm committing the first, as there should be
> enough cases where you end up searching for an element already known to be
> hashed, where making an indirect function call is needless overhead
> (particularly if the comparator function is expensive, and neglected to do
> the pointer comparison filter itself).  And by documenting this behavior,
> the user's comparator functions can now be safely rewritten to avoid the
> pointer comparison filter.

Unless I become exceptionally unresponsive,
please post before pushing changes to hash.c.

I would have requested that you not mix clean-up like this with
a semantics-changing change set:

    +   (hash_do_for_each, hash_clear, hash_free): Use C89 syntax.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]