guile-user
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Does anyone have a better scm_string_hash ?


From: Roland Orre
Subject: Does anyone have a better scm_string_hash ?
Date: Fri, 14 Nov 2003 16:28:59 +0100

I noticed that our data mining software run very very slow with a
new data base. I localized the problem to scm_string_hash.

A hash table in this case was loaded with 14166 strings. I have
a function which creates a reasonable sized hash table, in this
case the hash table size was 8209.

13856 of these strings were hashed to the same index= 1067.
303 strings got index = 8061.
2 strings got the index = 754.
8201 entries were empty.

We are running guile 1.6 but I checked the scm_string_hash from a recent
1.7 CVS also and the function in hash.c there is identical.

I added a few of the symbols hashing to 1067 below. One can of course
argue that the symbols in this case should be hashed as numbers.
Anyway, does anyone have any hint or have a better string hash function?

        Best regards
        Roland Orre


A few strings hashing to entry 1067 for hash table length 8209:
"01632001" "01627301" "01626801" "01626601" "01626501" "01626401"
"01626301" "01626101" "01626001" "01625901" "01625801" "01625701"
"01625401" "01625301" "01625101" "01625001" "01624801" "01624601"
"01624501" "01624401" "01624301" "01624101" "01624001" "01623901"
"01623801" "01623701" "01623601" "01623501" "01623401" "01623201"
"01623101" "01622901" "01622801" "01622701" "01622601" "01622401"






reply via email to

[Prev in Thread] Current Thread [Next in Thread]