pspp-dev

## Re: Oneway Anova from a covariance/cross-product matrix?

From: Jason Stover
Subject: Re: Oneway Anova from a covariance/cross-product matrix?
Date: Wed, 7 Jul 2010 11:25:32 -0400
User-agent: Mutt/1.5.18 (2008-05-17)

```
On Tue, Jul 06, 2010 at 07:09:05PM +0000, John Darrington wrote:
> So what exactly do we pass to reg_sweep ?
>
> Passing M doesn't seem to help.  If we need to use g or x then that
> requires access to the raw data.  I understood that anova could be calculated
> from M alone.

Almost: reg_sweep expects the final column and row to contain the
values related to the dependent variable. So it should work with M
reordered so that x comes last:

      g1      x
g1    1.5     3.0
x     3.0    16.0

Also, that matrix doesn't contain information related to the
intercept, or "grand mean," meaning you would need to either include
such a column in your covariance matrix, or call
post_sweep_computations.

> J'
>
> On Tue, Jul 06, 2010 at 11:25:18AM -0400, Jason Stover wrote:
>      Treat the problem as a regression problem and use the SWEEP operator,
>      as used in linreg.c and regression.q. More details are below.
>
>      On Mon, Jul 05, 2010 at 03:13:03PM +0000, John Darrington wrote:
>      > The cross-product matrix for this data is:
>      >
>      >       x    g1
>      > x   16.0  3.0
>      > g1   3.0  1.5
>
>      Call this matrix M, call the column vector transpose ((1,2,3,4,5,6)) x.
>      We will re-express our data of groups as the following matrix:
>
>         1 0
>         1 0
>         1 0
>         1 1
>         1 1
>         1 1
>
>      So the first column corresponds to a "grand mean" and the second tells
>      us the group. Call this matrix 'g'.
>
>      To get the sums of squares, you must consider this as a regression
>      problem:
>
>         x = g * beta + error
>
>      ...where beta is a 2x1 dimensional matrix of unknown parameters and
>      '*' denotes matrix multiplication. Another way to write this is x_i =
>      beta_0 + beta_1 + error for group b, and x_i = beta_0 + error for
>      group a.
>
>      The least-squares estimates of beta_0 and beta_1 must satisfy the
>      following relation:
>
>          transpose (g) * g * beta = transpose (g) * x
>
>      ...which gives the solution for beta:
>
>        beta = (transpose (g) * g)^{-1} * transpose (g) * x
>
>      The SWEEP operator will give you the sums of squares and beta. It
>      works to solve the system via Gaussian elimination with partial
>      pivoting. Solving for beta in the above system isn't interesting by
>      itself for ANOVA, because of the unusual coding we used, but a
>      by-product of the SWEEP operator is the sums of squares being left in
>      place of the covariance matrix, and these sums of squares are
>      independent of the coding of g (for example, using the indicator
>      vector for category a instead of b in the second column of the
>      matrix g would have given the same sums of squares).
>
>      The code in regression.q and linreg.c does this, if you want to use
>      it. It might be more efficient and easier to reach straight for the
>      code in sweep.c.
>
>      -Jason
>
> --
> PGP Public key ID: 1024D/2DE827B3
> fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
> See http://pgp.mit.edu or any PGP keyserver for public key.
>
>

```
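To make the reordering concrete, here is a minimal sketch of a textbook sweep applied to the 2x2 matrix above, with g1 first and x last. It is not PSPP's reg_sweep: the function name `sweep`, the sign convention, and the variable names are assumptions chosen for illustration, and sweep conventions differ between implementations. It also assumes that M holds centered (mean-corrected) cross-products, so the intercept has already been removed and the swept corner entry can be read as the error sum of squares.

```python
def sweep(a, k):
    """Return a copy of the symmetric matrix `a` swept on pivot index `k`.

    Textbook convention (an assumption; implementations differ in sign):
      pivot entry        -> -1 / d
      pivot row/column   -> a[i][k] / d
      everything else    -> a[i][j] - a[i][k] * a[k][j] / d
    where d = a[k][k].
    """
    d = a[k][k]
    if d == 0:
        raise ValueError("cannot sweep on a zero pivot")
    n = len(a)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                b[i][j] = -1.0 / d
            elif i == k or j == k:
                b[i][j] = a[i][j] / d
            else:
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return b


# Cross-product matrix from the thread, reordered so that the
# dependent variable x occupies the final row and column.
m = [[1.5, 3.0],    # g1
     [3.0, 16.0]]   # x

swept = sweep(m, 0)              # sweep on the predictor g1 only

slope = swept[0][1]              # coefficient of g1 in the regression of x
ss_error = swept[1][1]           # what is left of x after removing g1
ss_total = m[1][1]               # corrected total, under the centering assumption
ss_between = ss_total - ss_error

print(slope, ss_error, ss_between)
```

Under those assumptions this prints `2.0 10.0 6.0`: the corner entry 16.0 is reduced to 10.0 by sweeping on g1, and the difference, 6.0, is the sum of squares attributable to the grouping.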