Re: t-test gets NaN instead of 0 for significance on x86-64

From: John Darrington
Subject: Re: t-test gets NaN instead of 0 for significance on x86-64
Date: Thu, 3 Sep 2009 02:14:05 +0800
User-agent: Mutt/1.5.18 (2008-05-17)

On Tue, Sep 01, 2009 at 09:15:49PM -0700, Ben Pfaff wrote:
     Some time ago, Matej Cepl <address@hidden> reported that in 0.6.2-pre5 failed on x86-64 with GCC 4.4:
     > PASS: tests/bugs/
     > 19c19
     > < #Pair 0|A & B#3|      1.000| NaN#
     > ---
     >> #Pair 0|A & B#3|      1.000|.000#
     > compare output
     > FAILED
     > FAIL: tests/bugs/
     This evening, I've run the same test on an x86-64 machine
     ( with GCC 4.3.2 and I reproduce this test
     failure.  I also get the same failure with 0.6.1, although I have
     to run the test by hand there since this test was new in
     After some fussing, I tracked the source of the NaN to this
     calculation in pscbox() in src/language/stats/t-test.q:
           double correlation_t =
             pairs[i].correlation * sqrt (df) /
             sqrt (1 - pow2 (pairs[i].correlation));
     In this particular test case, pairs[i].correlation is almost
     exactly 1.0, such that 1 - pow2 (pairs[i].correlation) comes out
     just slightly negative, making the square root yield NaN.
     John, do you have a suggestion for the correct fix?  I don't know
     enough about the math here to say.

So the cause of the problem is that correlation^2 has a value greater than 1.
This of course is not mathematically possible, because correlation is defined to
lie in the range [-1, +1]. So it must be because of numerical instability
in the calculation of correlation.  This is not particularly surprising, since
the correlation is calculated according to the classical one-pass algorithm,
which as we've discussed before is somewhat unstable.

In the long term, I think that all linear correlations should be calculated by
a common routine, for example src/math/covariance-matrix.c (which is
currently also unstable, but at least the instability would be in one place).

As a short term solution, the best I can suggest is that we 
clamp pow2(correlation) to 1.0.


PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See or any PGP keyserver for public key.

