guile-user
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Does anyone have a better scm_string_hash ?


From: Ludovic Courtès
Subject: Re: Does anyone have a better scm_string_hash ?
Date: Mon, 17 Nov 2003 14:01:33 +0100
User-agent: Mutt/1.5.4i [Guile enabled]

Hi,

Today, 3 hours, 58 minutes, 19 seconds ago, Roland Orre wrote:
> The performance increase we obtained with this hash function were
> tremendous. With the problematic symbol table the average lookup
> time went down from 10 ms to 9 us on this specific machine (1 GHz PIII).
> The max number of collisions went down from 13856 to 7 (on a total
> of 14166 strings in a table of length 8209). The number of empty
> entries went down from 8201 to 909. The run time for a data mining
> scan on a database went down from 5 hours to 15 minutes. The reason
> I added i in the hash function was to decrease the risk that two
> strings with same but permuted contents would hash to the same value.
> I have not done any thorough tests on this hash function yet.

Looks cool.

> /* replaced by Roland Orre address@hidden */
> #define hash_mask 0x00FFFFFF
> unsigned long 
> scm_string_hash (const unsigned char *str, size_t len)
> {
>   /* from suggestion at: */
>   /* http://srfi.schemers.org/srfi-13/mail-archive/msg00112.html */
>   int i;
>   /* originally h=0 was suggested. address@hidden */
>   unsigned long h=177;
> 
>   for (i = len-1; i >= 0; i--)
>     {
>       /* h = (str[i] +i+ h*37) & hash_mask; */
>       h = (str[i] + i + (h<<5) + (h<<2) + h) & hash_mask;
>     }
>   return h;
> } /* scm_string_hash */

I don't think you need to compute a modulo (logical and with HASH_MASK)
of the hash key, because scm_hasher () returns scm_string_hash () % n.

Thanks,
Ludovic.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]