[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] hash: declare some functions with the warn_unused_result att
From: |
Eric Blake |
Subject: |
Re: [PATCH] hash: declare some functions with the warn_unused_result attribute |
Date: |
Tue, 16 Jun 2009 16:09:48 +0000 (UTC) |
User-agent: |
Loom/3.14 (http://gmane.org/) |
Jim Meyering <jim <at> meyering.net> writes:
> Long ago (like 12-15 years), in an application that made very heavy use
> of hash tables, the obstack code provided a large performance benefit.
Indeed, if you are frequently calling allocate_entry, the cost of malloc'ing
lots of small objects is measurably worse than using an obstack to malloc large
chunks and carve off entries as needed. In other words, it's not the speedup
from freeing multiple entries, so much as the speedup from dealing with
collisions when inserting them along with a reduction in memory usage, that
makes obstack-managed entry allocation attractive even with modern heap
management.
For that matter, my patch to hash_rehash penalizes the USE_OBSTACK case by
calling obstack_alloc and updating the free_entry_list once per needed free
entry, where it could instead use obstack_grow to guarantee enough space
without bothering with free_entry_list, so if we spend time improving
USE_OBSTACK, I will work on a followup to provide this optimization.
> So I'm reluctant to remove it altogether. However, it's been so long
> since I last tested it that it may well have suffered from bit rot.
> And with modern heap management code, it may not be useful anymore.
My code inspection didn't turn up obvious problems, but I have not yet tested
with USE_OBSTACK either, so I wouldn't be surprised to find bit rot. If we do
get it working, would it be worth making the use of obstacks a runtime decision
via hash_tuning, rather than a compile-time decision?
There might also be alternative implementations possible with better
performance. For example, instead of malloc'ing free entries one at a time and
tracking bucket overflows as pointers to malloc'd objects, we could instead
malloc/realloc an array of overflows, with bucket overflows tracked as indices
into that array. This would provide some of the same benefits as obstack
allocation (no per-entry malloc overhead, and fewer calls to malloc and with
larger chunks of malloc claimed per call), such that it is no longer necessary
to use an obstack for free-entry management in the first place.
Another hash implementation I am familiar with treats the slots array as a
circular array, and has no overflow linked lists (thus, the entire hash table
is contained within a single malloc'd block, rather than the hash.c
implementation of one block for bucket heads and other blocks for collisions).
Instead, the hasher merely determines where in the circular array to start
searching, and on collisions, you end up searching until you hit a free slot.
But this has other ramifications; searching for a match means traversing
through the array until you encounter an empty slot, and you no longer have the
guarantee that the comparison function will always be called with two elements
with the same hash. Also, whereas linked lists only search the set of
collisions, a nearly full circular array may end up searching the bulk of the
table before it arrives at the next free entry. However, I have never
benchmarked this implementation to see how it fares when compared with the more
traditional bucket of linked lists.
We may also want to borrow from Bruno's idioms in the gl_list arena, where
hash_initialize is implemented as setting up a vtable optimized to an
appropriate implementation, such that you could then experiment with different
implementations by swapping the vtable at initialization time.
>
> I've begun writing a test module, and may even find the time to test
> the USE_OBSTACK code, too. As you say, it also needs to be improved
> to handle allocation failure, probably the same way remove.c now does:
>
> http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=a5111af33ea6a5d27
Unrelated to this topic, but in looking at that patch, I noticed you added the
following line:
cleanup:;
Any reason for the empty statement after the label? Is this worth a syntax
check in maint.mk?
--
Eric Blake
- [PATCH] hash: declare some functions with the warn_unused_result attribute, Jim Meyering, 2009/06/06
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/07
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/08
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Jim Meyering, 2009/06/08
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/15
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/16
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/16
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Jim Meyering, 2009/06/16
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute,
Eric Blake <=
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Ben Pfaff, 2009/06/16
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Jim Meyering, 2009/06/16
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/16
- alt hash implementation (was: [PATCH] hash: declare some functions with the warn_unused_result attribute), Eric Blake, 2009/06/17
- Re: alt hash implementation, Jim Meyering, 2009/06/17
- Re: alt hash implementation, Eric Blake, 2009/06/17
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Eric Blake, 2009/06/18
- Re: [PATCH] hash: declare some functions with the warn_unused_result attribute, Jim Meyering, 2009/06/17
- hash_rehash allocatino failure (was: [PATCH] hash: declare some functions with the warn_unused_result attribute), Eric Blake, 2009/06/17
- Re: hash_rehash allocatino failure, Jim Meyering, 2009/06/17