[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Resizing hash tables in Guile
From: |
Mikael Djurfeldt |
Subject: |
Re: Resizing hash tables in Guile |
Date: |
Thu, 13 Feb 2003 10:35:40 +0100 |
User-agent: |
Gnus/5.090008 (Oort Gnus v0.08) Emacs/21.2 (i386-pc-linux-gnu) |
Roland Orre <address@hidden> writes:
> On Wed, 2003-02-12 at 18:53, Mikael Djurfeldt wrote:
>> I've been thinking that maybe we should continue the move and let the
>> resizing tables entirely replace the fixed size ones.
[...]
> I think that this is a a good idea, but I would like the optional size
> argument to still be there as an initial guess about the number of
> items to expect
[...]
Agreed.
> Of course this additional level of abstraction is nice as it also gives
> a future option to use trees if the user e.g gives an optional
> comparision operator.
>
> By the way, Joris van der Hoeven mentioned that this type of resizing
> hash tables are faster than trees, which have a time complexity
> of O(log n)+reshuffling time if they are balanced. Do you have any
> numbers showing how much faster and if possible if there are any
> conditions? The reshuffling time will grow at least O(n) when the
> size of our linear tables increases if I have understood right.
I doubt that we'll ever use trees under the hash table API, the reason
being that the user expects O(1) for hash table operations. Even
though O(log n) grows slowly, it is substantially different for large
enough data sets.
Regarding reshuffling time: Yes, rehuffling means that every operation
isn't O(1), but it *does* mean that they are O(1) on average. You can
understand this by a trick sometimes used in algorithm run-time
analysis called "amortization":
The idea is that for every operation on the hash table you "pay" some
extra virtual time, and the sum of this time can then be used to
account for the reshuffling time. In my implementation, the hash
table is roughly doubled for each rehash. Before rehashing occurs,
you have inserted N elements. This has cost you less than cN seconds.
Rehashing is O(2N) = O(N), so we can say it will cost us less than dN
seconds. If we now pay off d seconds per operation in advance, and
note that the argument above holds equally well for each rehashing
point, we realize that each operation costs less than c + d seconds on
average. This means that, on average, operations are O(1).
Mikael
- Re: Efficiency and flexibility of hash-tables, (continued)
- Re: Efficiency and flexibility of hash-tables, Greg Troxel, 2003/02/10
- Re: Efficiency and flexibility of hash-tables, Roland Orre, 2003/02/10
- Re: Efficiency and flexibility of hash-tables, Mikael Djurfeldt, 2003/02/10
- Re: Efficiency and flexibility of hash-tables, Mikael Djurfeldt, 2003/02/10
- Resizing hash tables in Guile, Mikael Djurfeldt, 2003/02/11
- Re: Resizing hash tables in Guile, Roland Orre, 2003/02/11
- Re: Resizing hash tables in Guile, Marius Vollmer, 2003/02/12
- Re: Resizing hash tables in Guile, Marius Vollmer, 2003/02/12
- Re: Resizing hash tables in Guile, Mikael Djurfeldt, 2003/02/12
- Re: Resizing hash tables in Guile, Roland Orre, 2003/02/12
- Re: Resizing hash tables in Guile,
Mikael Djurfeldt <=
- Re: Resizing hash tables in Guile, Harvey J. Stein, 2003/02/13
- Re: Resizing hash tables in Guile, Joris van der Hoeven, 2003/02/13
- Re: Resizing hash tables in Guile, Harvey J. Stein, 2003/02/13
- Re: Resizing hash tables in Guile, Paul Jarc, 2003/02/13
- Re: Resizing hash tables in Guile, Joris van der Hoeven, 2003/02/13
- Re: Resizing hash tables in Guile, Rob Browning, 2003/02/12
- Re: Efficiency and flexibility of hash-tables, Roland Orre, 2003/02/10
- Re: Efficiency and flexibility of hash-tables, Paul Jarc, 2003/02/12
- Re: Efficiency and flexibility of hash-tables, Roland Orre, 2003/02/12