Re: regression lib

From: Jason H. Stover
Subject: Re: regression lib
Date: Sun, 1 May 2005 11:43:07 -0400

I got started on a regression lib. You can find it

Let me know if it looks offensive. I just dropped it into lib/ and
compiled it. It doesn't contain much yet, but I thought I should give
people a chance to critique its design before going much further.

I called it 'linreg' because 'regression' could mean 'non-linear
regression'. I also created a struct which can contain a lot of
relevant information about estimation for a linear model, including
coefficients, residuals, sums of squares and whatever else becomes
necessary later.  That information can be passed to other procedures,
making extra data passes unnecessary for some analyses.
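
To make the idea concrete, here is a rough sketch of the kind of struct
I mean. The struct name, the field names and the two helper functions
are hypothetical, made up for illustration only; they are not what is
in the linreg code itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for the results of fitting a linear model.
   All names here are illustrative assumptions. */
struct linreg_cache_sketch
{
  size_t n_obs;       /* Number of observations. */
  size_t n_coeffs;    /* Number of coefficients, including the intercept. */
  double *coeff;      /* Estimated coefficients, n_coeffs entries. */
  double *residual;   /* Residuals, n_obs entries. */
  double ssm;         /* Sum of squares due to the model. */
  double sse;         /* Sum of squared residuals. */
  double sst;         /* Total sum of squares. */
};

/* Allocates a zero-filled cache for N_OBS observations and N_COEFFS
   coefficients (both assumed to be greater than zero).
   Returns NULL if memory runs out. */
struct linreg_cache_sketch *
linreg_cache_sketch_alloc (size_t n_obs, size_t n_coeffs)
{
  struct linreg_cache_sketch *c = calloc (1, sizeof *c);
  if (c == NULL)
    return NULL;

  c->coeff = calloc (n_coeffs, sizeof *c->coeff);
  c->residual = calloc (n_obs, sizeof *c->residual);
  if (c->coeff == NULL || c->residual == NULL)
    {
      free (c->coeff);
      free (c->residual);
      free (c);
      return NULL;
    }

  c->n_obs = n_obs;
  c->n_coeffs = n_coeffs;
  return c;
}

/* Frees C and the arrays it owns.  C may be NULL. */
void
linreg_cache_sketch_free (struct linreg_cache_sketch *c)
{
  if (c == NULL)
    return;
  free (c->coeff);
  free (c->residual);
  free (c);
}
```

A procedure that fits the model would fill in one of these and hand the
pointer to whatever runs next, so the later procedure can read the
residuals or sums of squares without touching the data again.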

On this topic of caching statistics: It would be nice if pspp_linreg()
could accept as an argument the means and standard deviations of all
model variables. That would eliminate the need for pspp_linreg() to
pass through the data to get those values. Under this design, when
pspp_linreg() gets a mean and/or std. dev. for a variable in the
model, it will not compute that mean/std. dev. again. If it doesn't
get the mean/std. dev. for a variable in the model, it will compute
that mean/std. dev.
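
Roughly, the "compute only what is missing" rule could look like the
sketch below. Everything in it is an assumption for illustration: the
struct, the flag names and the function are hypothetical and are not
part of pspp_linreg(). To keep the sketch free of libm it stores the
variance (n - 1 denominator); the std. dev. is its square root.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-variable cache.  The have_* flags record whether
   the caller already supplied the corresponding value. */
struct var_moments_sketch
{
  bool have_mean;
  bool have_variance;
  double mean;
  double variance;    /* Sample variance; std. dev. is its square root. */
};

/* Fills in whichever of M's mean and variance is missing, using the N
   values in X.  Values the caller supplied are left untouched.
   Returns the number of passes made through X: 0 if both values were
   already present, otherwise 1. */
int
fill_missing_moments_sketch (struct var_moments_sketch *m,
                             const double *x, size_t n)
{
  double mean = 0.0;
  double m2 = 0.0;    /* Running sum of squared deviations. */
  size_t i;

  if (m->have_mean && m->have_variance)
    return 0;

  /* One pass computes both moments (Welford's update). */
  for (i = 0; i < n; i++)
    {
      double delta = x[i] - mean;
      mean += delta / (double) (i + 1);
      m2 += delta * (x[i] - mean);
    }

  if (!m->have_mean)
    {
      m->mean = mean;
      m->have_mean = true;
    }
  if (!m->have_variance)
    {
      /* Note: based on the mean of X, not on a caller-supplied mean. */
      m->variance = n > 1 ? m2 / (double) (n - 1) : 0.0;
      m->have_variance = true;
    }
  return 1;
}
```

With both flags set by an earlier procedure, the function returns
without reading the data at all, which is the saving I am after.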

If some PSPP procedure has already computed the means/std. devs by the
time pspp_linreg() is called, can PSPP pass those values to
pspp_linreg()? If so, where does PSPP store that information? What
structure should I look in to figure this all out? I see the variable
structure contains information about a variable like its label and
number of values. Can it also contain a variable's mean and standard


