pspp-dev

Re: Merge the "charset" branch?


From: Ben Pfaff
Subject: Re: Merge the "charset" branch?
Date: Thu, 09 Apr 2009 07:13:26 -0700
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux)

Thanks, I rebased and pushed this to master.

In case anyone was concerned about not getting a copyright
assignment on this, essentially the same code appears in GCC:
        http://git.infradead.org/gcc.git?a=blob;f=libiberty/hashtab.c

John Darrington <address@hidden> writes:

> This looks fine to me.
>
> On Wed, Apr 08, 2009 at 09:59:04PM -0700, Ben Pfaff wrote:
>      John Darrington <address@hidden> writes:
>      
>      I've been meaning to replace the PSPP hash functions for a while
>      now.  The FNV hash is not so great, and our implementations lack
>      a "basis" or "initval" argument that can be used to combine
>      hashes in a high-quality way (e.g. not XOR of their results).
>      
>      So I've pushed a branch for review that fixes these problems.
>      It's named "hash", and here is the summary:
>      
>      commit b4e3275011982e29b80589bef705fc8a0a0316dd
>      Author: Ben Pfaff <address@hidden>
>      Date:   Wed Apr 8 21:39:22 2009 -0700
>      
>          NPAR TESTS: Consistently order variables in summary statistics.
>          
>          The set of variables in the NPAR TESTS specs structure was ordered
>          randomly, according to however the hash function happened to
>          arrange them.  Sort them by variable name, instead, so that they
>          always appear in alphabetical order in, e.g., descriptive
>          statistics output.
>          
>          The particular hash function PSPP uses now tends to order variables
>          alphabetically anyhow.  The next commit changes the PSPP hash
>          functions, so fixing this in advance prevents having to update any
>          test output.
>      
>       src/language/stats/npar.q |    4 ++--
>       1 files changed, 2 insertions(+), 2 deletions(-)
>      
>      commit e9c717e43278364a49b68db4718cab5c9229c8fb
>      Author: Ben Pfaff <address@hidden>
>      Date:   Wed Apr 8 21:55:31 2009 -0700
>      
>          Use Bob Jenkins lookup3 hash instead of FNV.
>          
>          The Jenkins lookup3 hash is superior to FNV in collision resistance,
>          avalanching, and performance on systems that do not have fast
>          multiplication.  It also provides a good way to combine the result of
>          a previous hashing step with the current hash, using its "basis"
>          argument.  This commit replaces the PSPP implementation of FNV with
>          the Jenkins lookup3 hash and updates all the current users.
>          
>          In addition, John Darrington pointed out that commit dd2e61b4a
>          "Make create_iconv() properly distinguish converters by name"
>          unintentionally introduced gratuitous hash collisions, by causing
>          all converters where tocode and fromcode were the same to hash to
>          value 0, and converters where tocode and fromcode were swapped to
>          hash to the same value as each other.  Using the "basis" argument to
>          the Jenkins hash properly, instead of just attempting to combine
>          hash values with XOR, fixes this problem.
>      
>       src/data/attributes.c           |    6 +-
>       src/data/file-handle-def.c      |   12 ++-
>       src/data/file-name.c            |    6 +-
>       src/data/short-names.c          |    8 +-
>       src/data/value-labels.c         |    8 +-
>       src/data/value.c                |    6 +-
>       src/data/variable.c             |    4 +-
>       src/language/stats/autorecode.c |    4 +-
>       src/language/stats/crosstabs.q  |    2 +-
>       src/libpspp/hash-functions.c    |  196 ++++++++++++++++++++++++++++-----------
>       src/libpspp/hash-functions.h    |   12 +-
>       src/libpspp/i18n.c              |    2 +-
>       src/math/covariance-matrix.c    |   11 +-
>       13 files changed, 180 insertions(+), 97 deletions(-)
>
> -- 
> PGP Public key ID: 1024D/2DE827B3 
> fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
> See http://pgp.mit.edu or any PGP keyserver for public key.
>
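
For anyone skimming the first quoted commit: the idea is simply to sort
the variable set by name before output, so the order no longer depends
on the hash function.  A minimal sketch, not the actual npar.q change;
"struct var", compare_var_ptrs_by_name() and sort_vars_by_name() are
hypothetical names, and plain strcmp() is an assumed comparison:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a PSPP variable; only the name matters here. */
struct var
  {
    const char *name;
  };

/* qsort() comparator: orders pointers to variables by name.
   Plain strcmp() is an assumption made for this sketch. */
static int
compare_var_ptrs_by_name (const void *a_, const void *b_)
{
  const struct var *const *a = a_;
  const struct var *const *b = b_;
  return strcmp ((*a)->name, (*b)->name);
}

/* Sorts the N variable pointers in VARS alphabetically by name, so that
   output order is independent of how a hash table arranged them. */
static void
sort_vars_by_name (const struct var **vars, size_t n)
{
  assert (vars != NULL || n == 0);
  if (n > 1)
    qsort (vars, n, sizeof *vars, compare_var_ptrs_by_name);
}
```

Calling sort_vars_by_name() on the array just before producing output
gives the same order whichever hash function filled the set.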
>

-- 
Ben Pfaff 
http://benpfaff.org