pspp-dev

Re: Merge the "charset" branch?


From: John Darrington
Subject: Re: Merge the "charset" branch?
Date: Thu, 9 Apr 2009 14:09:59 +0800
User-agent: Mutt/1.5.13 (2006-08-11)

This looks fine to me.

On Wed, Apr 08, 2009 at 09:59:04PM -0700, Ben Pfaff wrote:
     John Darrington <address@hidden> writes:
     
     I've been meaning to replace the PSPP hash functions for a while
     now.  The FNV hash is not so great, and our implementations lack
     a "basis" or "initval" argument that can be used to combine
     hashes in a high-quality way (e.g. not XOR of their results).
     
     So I've pushed a branch for review that fixes these problems.
     It's named "hash", and here is the summary:
     
     commit b4e3275011982e29b80589bef705fc8a0a0316dd
     Author: Ben Pfaff <address@hidden>
     Date:   Wed Apr 8 21:39:22 2009 -0700
     
         NPAR TESTS: Consistently order variables in summary statistics.
         
         The set of variables in the NPAR TESTS specs structure was ordered
         randomly, according to however the hash function happened to
         arrange them.  Sort them by variable name, instead, so that they
         always appear in alphabetical order in, e.g., descriptive
         statistics output.
         
         The particular hash function PSPP uses now tends to order variables
         alphabetically anyhow.  The next commit changes the PSPP hash
         functions, so fixing this in advance prevents having to update
         any test output.
     
      src/language/stats/npar.q |    4 ++--
      1 files changed, 2 insertions(+), 2 deletions(-)
     
     commit e9c717e43278364a49b68db4718cab5c9229c8fb
     Author: Ben Pfaff <address@hidden>
     Date:   Wed Apr 8 21:55:31 2009 -0700
     
         Use Bob Jenkins lookup3 hash instead of FNV.
         
         The Jenkins lookup3 hash is superior to FNV in collision resistance,
         avalanching, and performance on systems that do not have fast
         multiplication.  It also provides a good way to combine the result of
         a previous hashing step with the current hash, using its "basis"
         argument.  This commit replaces the PSPP implementation of FNV
         with the Jenkins lookup3 hash and updates all the current users.
         
         In addition, John Darrington pointed out that commit dd2e61b4a
         "Make create_iconv() properly distinguish converters by name"
         unintentionally introduced gratuitous hash collisions, by causing
         all converters where tocode and fromcode were the same to hash to
         value 0, and converters where tocode and fromcode were swapped to
         hash to the same value as each other.  Using the "basis" argument to
         the Jenkins hash properly, instead of just attempting to combine
         hash values with XOR, fixes this problem.
     
      src/data/attributes.c           |    6 +-
      src/data/file-handle-def.c      |   12 ++-
      src/data/file-name.c            |    6 +-
      src/data/short-names.c          |    8 +-
      src/data/value-labels.c         |    8 +-
      src/data/value.c                |    6 +-
      src/data/variable.c             |    4 +-
      src/language/stats/autorecode.c |    4 +-
      src/language/stats/crosstabs.q  |    2 +-
      src/libpspp/hash-functions.c    |  196 ++++++++++++++++++++++++++++-----------
      src/libpspp/hash-functions.h    |   12 +-
      src/libpspp/i18n.c              |    2 +-
      src/math/covariance-matrix.c    |   11 +-
      13 files changed, 180 insertions(+), 97 deletions(-)
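
The collision problem described in the second commit message can be
illustrated with a minimal sketch.  Everything here is an assumption made
for illustration: the sketch_* and combine_* names are hypothetical, and
the byte hash is a short seeded FNV-1a-style stand-in, not the lookup3
code and not the actual contents of src/libpspp/hash-functions.c.  It
only shows why XOR-ing two independent hashes collides for equal or
swapped inputs, while feeding one hash in as the "basis" of the next
makes the result depend on order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a byte hash that accepts a "basis".
   This is a seeded FNV-1a-style loop chosen for brevity; it is NOT
   the Jenkins lookup3 implementation. */
static uint32_t
sketch_hash_bytes (const void *p_, size_t n, uint32_t basis)
{
  const unsigned char *p = p_;
  uint32_t h = UINT32_C (2166136261) ^ basis;
  size_t i;

  for (i = 0; i < n; i++)
    {
      h ^= p[i];
      h *= UINT32_C (16777619);
    }
  return h;
}

static uint32_t
sketch_hash_string (const char *s, uint32_t basis)
{
  return sketch_hash_bytes (s, strlen (s), basis);
}

/* Combines two independent hashes with XOR.  Equal inputs cancel to 0,
   and swapping the arguments gives the same value. */
static uint32_t
combine_xor (const char *tocode, const char *fromcode)
{
  return sketch_hash_string (tocode, 0) ^ sketch_hash_string (fromcode, 0);
}

/* Chains the hashes: the hash of FROMCODE becomes the basis for
   hashing TOCODE, so the result depends on argument order. */
static uint32_t
combine_basis (const char *tocode, const char *fromcode)
{
  return sketch_hash_string (tocode, sketch_hash_string (fromcode, 0));
}
```

With these definitions, combine_xor ("UTF-8", "UTF-8") is 0 for any
encoding name, and combine_xor ("UTF-8", "ASCII") equals
combine_xor ("ASCII", "UTF-8"), which are the two collision patterns the
commit message describes.  combine_basis has no such built-in symmetry,
although, like any hash, it can still collide by chance.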

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.


