Re: Oneway Anova from a covariance/cross-product matrix?

From: John Darrington
Subject: Re: Oneway Anova from a covariance/cross-product matrix?
Date: Tue, 6 Jul 2010 19:09:05 +0000
User-agent: Mutt/1.5.18 (2008-05-17)

So what exactly do we pass to reg_sweep?

Passing M doesn't seem to help.  If we need to use g or x, then that
requires access to the raw data.  I understood that ANOVA could be calculated
from M alone.


On Tue, Jul 06, 2010 at 11:25:18AM -0400, Jason Stover wrote:
     Treat the problem as a regression problem and use the SWEEP operator,
     as used in linreg.c and regression.q. More details are below.
     On Mon, Jul 05, 2010 at 03:13:03PM +0000, John Darrington wrote:
     > The cross-product matrix for this data is:
     >       x    g1
     > x   16.0  3.0
     > g1   3.0  1.5
     Call this matrix M, and call the column vector transpose ((1,2,3,4,5,6)) x.
     We will re-express our data of groups as the following matrix:
        1 0
        1 0
        1 0
        1 1
        1 1
        1 1
     So the first column corresponds to a "grand mean" and the second tells
     us the group. Call this matrix 'g'.
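     As a small illustration of the coding just described, the sketch
     below builds g from a list of group labels. The function name
     design_matrix and the labels "a" and "b" are assumptions made for
     this example, not anything taken from PSPP.

```python
def design_matrix(labels, reference):
    """Return rows [1, indicator], where indicator is 0 for the
    reference group and 1 for the other group (hypothetical helper)."""
    return [[1, 0 if label == reference else 1] for label in labels]


# Three observations in group a followed by three in group b.
labels = ["a", "a", "a", "b", "b", "b"]
g = design_matrix(labels, reference="a")

for row in g:
    print(row)
```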
     To get the sums of squares, you must consider this as a regression problem:
        x = g * beta + error
     ...where beta is a 2x1 matrix of unknown parameters and '*' denotes
     matrix multiplication. Another way to write this is x_i =
     beta_0 + beta_1 + error for group b, and x_i = beta_0 + error for
     group a.
     The least-squares estimates of beta_0 and beta_1 must satisfy the
     following relation:
         transpose (g) * g * beta = transpose (g) * x
     ...which gives the solution for beta:
         beta = (transpose (g) * g)^{-1} * transpose (g) * x
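     As a minimal sketch of these normal equations, assuming the data
     x = (1,...,6) and the matrix g given earlier, the plain Python
     below forms transpose (g) * g and transpose (g) * x and solves the
     2x2 system by Cramer's rule. The variable names are illustrative
     only and do not come from PSPP.

```python
x = [1, 2, 3, 4, 5, 6]
g = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]

# transpose (g) * g: the 2x2 matrix of column cross-products.
gtg = [[sum(row[i] * row[j] for row in g) for j in range(2)] for i in range(2)]

# transpose (g) * x: a vector of length 2.
gtx = [sum(row[i] * xi for row, xi in zip(g, x)) for i in range(2)]

# Solve gtg * beta = gtx by Cramer's rule (adequate for a 2x2 system).
det = gtg[0][0] * gtg[1][1] - gtg[0][1] * gtg[1][0]
beta = [
    (gtx[0] * gtg[1][1] - gtg[0][1] * gtx[1]) / det,
    (gtg[0][0] * gtx[1] - gtg[1][0] * gtx[0]) / det,
]

print(beta)
```

     With this coding, beta_0 is the mean of group a and
     beta_0 + beta_1 is the mean of group b.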
     The SWEEP operator will give you the sums of squares and beta. It
     solves the system via Gaussian elimination with partial
     pivoting. Solving for beta in the above system isn't interesting by
     itself for ANOVA, because of the unusual coding we used, but a
     by-product of the SWEEP operator is the sums of squares being left in
     place of the covariance matrix, and these sums of squares are
     independent of the coding of g (for example, using the indicator for
     category a instead of b in the second column of the matrix g would
     have given the same sums of squares).
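     The following is a rough sketch of the idea, not the code in
     sweep.c. It assumes the data x = (1,...,6), uses a textbook
     symmetric sweep with no pivoting, and sweeps the cross-product
     matrix of the columns of g with x appended. The names sweep and
     sums_of_squares, and the sign convention used on the pivot, are
     choices made for this example. It also runs the alternative coding
     (indicator for group a) to check that the sums of squares agree.

```python
def sweep(a, k):
    """Return a copy of the symmetric matrix a swept on pivot k."""
    d = a[k][k]
    n = len(a)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                b[i][j] = -1.0 / d
            elif i == k or j == k:
                b[i][j] = a[i][j] / d
            else:
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return b


def sums_of_squares(g, x):
    """Return (total, within, between) sums of squares for x given g."""
    cols = [row + [xi] for row, xi in zip(g, x)]
    p = len(cols[0])
    # Cross-product matrix of [g | x]; x is the last row and column.
    a = [[float(sum(r[i] * r[j] for r in cols)) for j in range(p)]
         for i in range(p)]
    a = sweep(a, 0)                 # sweep out the grand-mean column
    total = a[p - 1][p - 1]         # corrected total sum of squares
    for k in range(1, p - 1):       # sweep out the group column(s)
        a = sweep(a, k)
    within = a[p - 1][p - 1]        # residual (within-group) sum of squares
    return total, within, total - within


x = [1, 2, 3, 4, 5, 6]
g_b = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]  # indicator for b
g_a = [[1, 1], [1, 1], [1, 1], [1, 0], [1, 0], [1, 0]]  # indicator for a

ss_b = sums_of_squares(g_b, x)
ss_a = sums_of_squares(g_a, x)
print(ss_b)
print(ss_a)
```

     Both codings leave the same residual sum of squares in the
     position of x after sweeping, which is the property described
     above.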
     The code in regression.q and linreg.c does this, if you want to use
     it. It might be more efficient and easier to reach straight for the
     code in sweep.c.

PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See any PGP keyserver for public key.
