pspp-dev

Re: GLM and interactions


From: Jason Stover
Subject: Re: GLM and interactions
Date: Thu, 7 Jul 2011 16:09:28 -0400
User-agent: Mutt/1.5.18 (2008-05-17)

On Thu, Jul 07, 2011 at 02:50:07PM +0000, John Darrington wrote:
> 
> Now, I thought that for purposes of the current investigation, I could
> "fake" an interaction term as follows:
> 
> compute interact = drug * 10 + category.
> 
> glm diffrate by category drug interact
>   /intercept=include
>   /design = category drug interact
>   .

This caused the problem below: you can see the number of degrees of
freedom for 'interact' was 5, when it should have been 2. The reason
is that a proper interaction term doesn't have to reproduce all the
possible combinations of the two variables by itself, only enough to
show all possible combinations *in conjunction with the other
variables*. The faked variable has one level for each of the 6
combinations, so what should have been 2 degrees of freedom became 5,
which meant no degrees of freedom left over for the main variables
(category and drug), which meant zero sums of squares for those
variables, since the interaction term had already accounted for those
effects.
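
To make the count concrete, here is a minimal Python sketch (not PSPP
code). It assumes category has levels 1 and 2 and drug has levels 1, 2
and 3, as in the encoding given in this message; the variable names are
made up for illustration.

```python
from itertools import product

categories = [1, 2]    # assumed levels of 'category'
drugs = [1, 2, 3]      # assumed levels of 'drug'

# Levels produced by "compute interact = drug * 10 + category."
fake_levels = {d * 10 + c for c, d in product(categories, drugs)}

# A factor with k levels uses k - 1 degrees of freedom.
fake_df = len(fake_levels) - 1

# A true interaction uses (levels_a - 1) * (levels_b - 1).
true_df = (len(categories) - 1) * (len(drugs) - 1)

print(sorted(fake_levels))
print("faked df:", fake_df, "true interaction df:", true_df)
```

The faked variable is really a single six-level factor, which is why it
soaks up the degrees of freedom of both main effects.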

The binary encoding for category, drug, and the interaction term would
be something like this:

category:
        1 --> 0
        2 --> 1

drug:
        1 --> 0 0
        2 --> 1 0
        3 --> 0 1

interaction (category * drug):
        1 * 1 --> 0 0
        1 * 2 --> 0 0
        1 * 3 --> 0 0
        2 * 1 --> 0 0
        2 * 2 --> 1 0
        2 * 3 --> 0 1

So, I have just multiplied each of the pairs. Notice most are mapped
to the origin. This isn't a problem, though, if we just want to test
for an interaction. If we take X to be our binary variable for
category, and Y_1, Y_2 for our binary variables for drug, we can write
our linear model this way:

        response = intercept + b_1 * X + b_2 * Y_1 + b_3 * Y_2
                   + b_4 * X * Y_1 + b_5 * X * Y_2 + error
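
The encoding and the model can be sketched in a few lines of Python.
This is only an illustration under the 0/1 coding listed earlier, not
how PSPP builds its design matrix; design_row and active_terms are
hypothetical helper names.

```python
from itertools import product

NAMES = ["intercept", "b_1", "b_2", "b_3", "b_4", "b_5"]

def design_row(category, drug):
    """Row of the design matrix for one category/drug cell."""
    x = int(category == 2)   # category: 1 --> 0, 2 --> 1
    y1 = int(drug == 2)      # drug:     2 --> 1 0
    y2 = int(drug == 3)      # drug:     3 --> 0 1
    # Columns: intercept, X, Y_1, Y_2, X * Y_1, X * Y_2
    return [1, x, y1, y2, x * y1, x * y2]

def active_terms(category, drug):
    """Names of the coefficients that contribute to this cell's mean."""
    return [n for n, v in zip(NAMES, design_row(category, drug)) if v]

for c, d in product([1, 2], [1, 2, 3]):
    print(c, "*", d, design_row(c, d), " + ".join(active_terms(c, d)))
```

Each printed line shows the design row for one cell and the
coefficients that are switched on for it.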

Now in spite of the interactions usually being 0, we can still
estimate the means for any factor/level combination:

category * drug    estimated mean
----------------------------------------
1 * 1              intercept
1 * 2              intercept + b_2
1 * 3              intercept + b_3
2 * 1              intercept + b_1
2 * 2              intercept + b_1 + b_2 + b_4
2 * 3              intercept + b_1 + b_3 + b_5

Just enough terms. If we add any more terms to our model, then we
won't have enough degrees of freedom to estimate each coefficient,
which will mean the sums of squares will be 0 for some of the terms.
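
A rough way to see this is to compare the rank of the design matrix
with its number of columns. The Python sketch below uses one row per
cell (repeating rows for replicates would not change the rank) and a
small hand-written rank function; fake_dummies is a hypothetical
reference-level coding of the faked 'interact' variable, not
necessarily the coding PSPP uses.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination on exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

cells = list(product([1, 2], [1, 2, 3]))   # (category, drug) pairs

def model_row(c, d):
    x, y1, y2 = int(c == 2), int(d == 2), int(d == 3)
    return [1, x, y1, y2, x * y1, x * y2]

def fake_dummies(c, d):
    # Five 0/1 columns for the six-level faked variable; cell (1, 1)
    # is the reference level.
    return [int((c, d) == cell) for cell in cells[1:]]

# Product-coded model: 6 columns.
full = [model_row(c, d) for c, d in cells]
# Main effects plus the faked six-level variable: 4 + 5 = 9 columns.
over = [model_row(c, d)[:4] + fake_dummies(c, d) for c, d in cells]

print(len(full[0]), "columns, rank", rank(full))
print(len(over[0]), "columns, rank", rank(over))
```

With the product coding every column is estimable. With the faked
variable there are 9 columns but only rank 6, and the 3 redundant
columns match the 1 + 2 degrees of freedom of category and drug.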

> Doing this, I get:
> 
> #===============#=======================#==#===========#=====#====#
> #     Source    #Type III Sum of Squares|df|Mean Square|  F  |Sig.#
> #===============#=======================#==#===========#=====#====#
> #Corrected Model#                 210.00| 8|      26.25| 2.23| .13#
> #Intercept      #                 882.00| 1|     882.00|74.89| .00#
> #category       #                    .00| 1|        .00|  .00| NaN#
> #drug           #                    .00| 2|        .00|  .00|1.00#
> #interact       #                 144.00| 5|      28.80| 2.45| .12#
> #Error          #                 106.00| 9|      11.78|     |    #
> #Total          #                1198.00|18|           |     |    #
> #Corrected Total#                 316.00|17|           |     |    #
> #===============#=======================#==#===========#=====#====#
> 
> 
> which, as you can see, gives the correct "interact" and Error values.
> It's a bit disappointing that the uninteracted "drug" and "category"
> ssq are now zero.
> 
> So this means that to get all the sums of squares we will have to run
> the get_ssq function twice - once without interactions, and once with.
> And in general, for a NxN design where all the interactions are desired,
> then it'll be necessary to run the function N times.

It should only take one run: first drop the 'X' column from the
design matrix to get the sum of squares for category, then drop the
columns corresponding to Y_1 and Y_2 (simultaneously) to get the sum
of squares for drug, in both cases keeping the columns for the
products of X and (Y_1, Y_2). I guess this would mean get_ssq would
need to know about interactions.
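
As a sketch of the bookkeeping only (this is not the real get_ssq, and
the names below are hypothetical), the function would need a map from
each term to its columns so it can drop exactly those and nothing
else:

```python
# Column labels follow the model written earlier in this message.
columns = ["intercept", "X", "Y_1", "Y_2", "X*Y_1", "X*Y_2"]

# Hypothetical map from each design term to the columns it owns.
terms = {
    "category": ["X"],
    "drug": ["Y_1", "Y_2"],
    "category*drug": ["X*Y_1", "X*Y_2"],
}

def reduced_design(term):
    """Columns left after dropping only those that belong to `term`."""
    dropped = set(terms[term])
    return [c for c in columns if c not in dropped]

for term in terms:
    print(term, "->", reduced_design(term))
```

The sum of squares for a term would then come from comparing the fit
of the full design with the fit of its reduced design.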

-Jason




