Re: Efficiency and flexibility of hash-tables

From: Joris van der Hoeven
Subject: Re: Efficiency and flexibility of hash-tables
Date: Sat, 8 Feb 2003 15:14:39 +0100 (MET)


Thanks for your reply. Unfortunately, I think that
you did not fully understand my question.

> > When declaring a hash table using
> > 
> >     (define H (make-hash-table 100))
> > 
> > does this mean that the number of slots will *always* remain 100?
> No, the hash table is a vector of entries to lists where the actual
> information is stored. A hash table in guile can therefore contain
> any number of items. The number of entries is merely a choice of what
> performance you need. If you declare too few entries you will get
> a lot of linear searching through the list at each entry.
> I usually choose the size so that the lists are rarely deeper
> than two or three items, which gives reasonable performance.

That is why I distinguished the word 'slots' from the word 'entries'.
The number of slots is the length of the vector you mention.
So the ratio 'nr entries / nr slots' should be small in order to
get good performance.
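
To make the ratio concrete, here is a minimal sketch that counts how many
keys land in each slot for a given number of slots. It assumes Guile's
built-in `hash` procedure, called as (hash key size); the names
`slot-counts`, `longest-chain` and `load-ratio` are made up for this
illustration and are not part of Guile.

```scheme
(use-modules (srfi srfi-1))   ; for iota

;; Count how many of KEYS fall into each of NSLOTS slots,
;; using Guile's built-in hash procedure: (hash key size).
(define (slot-counts keys nslots)
  (let ((counts (make-vector nslots 0)))
    (for-each (lambda (k)
                (let ((i (hash k nslots)))
                  (vector-set! counts i (+ 1 (vector-ref counts i)))))
              keys)
    counts))

;; Length of the longest list hanging off any slot.
(define (longest-chain counts)
  (apply max (vector->list counts)))

;; The ratio 'nr entries / nr slots' (an exact rational).
(define (load-ratio nentries nslots)
  (/ nentries nslots))

;; Example: 300 entries spread over 100 slots.
(define counts (slot-counts (iota 300) 100))
(display (load-ratio 300 100)) (newline)
(display (longest-chain counts)) (newline)
```

The ratio only gives the average list length; the longest list is at
least as long as the average and can be much longer if the keys hash
badly.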

My question was: is the number of slots automatically adapted
as a function of the number of entries, or is it not?
If you cannot have a good estimate for the number of entries,
then this auto-adaptation may be important.
In fact, I think that a good low level implementation of
general purpose hash tables should have this feature.
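
As a rough sketch of the auto-adaptation I have in mind (not the actual
TeXmacs code, and not anything Guile provides under these names), the
table below doubles its number of slots when the number of entries
exceeds a constant times the number of slots, and halves it when the
number of entries drops below another constant times the number of
slots. The names `rtable-*`, `max-load`, `min-load` and `min-slots`, and
the constants chosen, are assumptions for illustration only.

```scheme
(use-modules (srfi srfi-9))   ; define-record-type

(define-record-type <rtable>
  (%make-rtable slots count)
  rtable?
  (slots rtable-slots set-rtable-slots!)   ; vector of association lists
  (count rtable-count set-rtable-count!))  ; number of entries

(define max-load 4)   ; grow when entries > max-load * slots
(define min-load 1)   ; shrink when entries < min-load * slots
(define min-slots 4)  ; never shrink below this many slots

(define (make-rtable) (%make-rtable (make-vector min-slots '()) 0))
(define (rtable-nslots t) (vector-length (rtable-slots t)))

;; Rehash every entry into a fresh vector of NEW-SIZE slots.
(define (rtable-resize! t new-size)
  (let ((old (rtable-slots t))
        (new (make-vector new-size '())))
    (let loop ((i 0))
      (when (< i (vector-length old))
        (for-each (lambda (pair)
                    (let ((j (hash (car pair) new-size)))
                      (vector-set! new j (cons pair (vector-ref new j)))))
                  (vector-ref old i))
        (loop (+ i 1))))
    (set-rtable-slots! t new)))

(define (rtable-ref t key default)
  (let ((pair (assoc key (vector-ref (rtable-slots t)
                                     (hash key (rtable-nslots t))))))
    (if pair (cdr pair) default)))

(define (rtable-set! t key val)
  (let* ((i (hash key (rtable-nslots t)))
         (bucket (vector-ref (rtable-slots t) i))
         (pair (assoc key bucket)))
    (if pair
        (set-cdr! pair val)                 ; existing key: overwrite
        (begin
          (vector-set! (rtable-slots t) i (cons (cons key val) bucket))
          (set-rtable-count! t (+ 1 (rtable-count t)))
          (when (> (rtable-count t) (* max-load (rtable-nslots t)))
            (rtable-resize! t (* 2 (rtable-nslots t))))))))

(define (rtable-remove! t key)
  (let* ((i (hash key (rtable-nslots t)))
         (bucket (vector-ref (rtable-slots t) i))
         (pair (assoc key bucket)))
    (when pair
      (vector-set! (rtable-slots t) i (delq pair bucket))
      (set-rtable-count! t (- (rtable-count t) 1))
      (when (and (> (rtable-nslots t) min-slots)
                 (< (rtable-count t) (* min-load (rtable-nslots t))))
        (rtable-resize! t (quotient (rtable-nslots t) 2))))))
```

Keeping the two constants well apart (here 4 and 1) avoids resizing back
and forth when entries are added and removed around a threshold.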

> Performance also depends on how the hash function interacts with the
> vector length. It is usually advisable to use a prime number to avoid
> systematic hashing to the same entries. I once overlooked this and
> sloppily declared the hash table length as 1000000 when I needed
> about 3000000 items. The run took several hours instead of
> the expected half an hour, which I got after changing the length to
> 1000003. If you have access to a mathematical package like Maple,
> it often has a function such as nextprime that can help.
> The built-in hash functions usually work fine, but you may also
> consider writing a special hash function for special needs if
> the built-in function doesn't spread the keys well enough.
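
Without a package such as Maple at hand, a prime table length can be
found directly in Scheme. This is a minimal sketch using trial division,
which is adequate for table sizes of this order; `prime?` and
`next-prime` are names made up here, not Guile built-ins.

```scheme
;; Trial division: test divisors up to the square root of N.
(define (prime? n)
  (and (> n 1)
       (let loop ((d 2))
         (cond ((> (* d d) n) #t)
               ((zero? (remainder n d)) #f)
               (else (loop (+ d 1)))))))

;; Smallest prime strictly greater than N.
(define (next-prime n)
  (let loop ((k (+ n 1)))
    (if (prime? k) k (loop (+ k 1)))))

;; The table length mentioned above.
(define H (make-hash-table (next-prime 1000000)))
```

Here (next-prime 1000000) gives 1000003, the length quoted above.
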
> > I am frequently dealing with hash tables where I do not
> > have a reasonable estimate of the number of entries in advance.
> > In TeXmacs, I therefore implemented a hash table type which
> > doubles the number of slots each time that the number of entries
> > becomes larger than a constant times the number of slots
> > (and divides by two the number of slots when the number of
> > entries becomes smaller than a constant times the number of slots).
> > Has a similar system been implemented in (an extension of) guile?
> > 
> > Thanks for your help, Joris
