Re: moving coefficient.[ch] and design-matrix.[ch]
From: Jason Stover
Subject: Re: moving coefficient.[ch] and design-matrix.[ch]
Date: Wed, 11 Jun 2008 12:11:35 -0400
User-agent: Mutt/1.5.10i

On Wed, Jun 11, 2008 at 09:32:50AM +0800, John Darrington wrote:
> I'm worried that src/data is becoming a dumping ground for things
> which don't fit elsewhere. At the same time, src/math is becoming
> depleted.
>
> I always envisaged src/data to be about the definition, storage and
> access of data. The manipulation of data is, as I see it, a separate,
> higher level task.
I agree src/data should not be a dumping ground. If code like that
in coefficient.[ch] and design-matrix.[ch] belongs elsewhere, then I
would like to describe how I think of that code as a
category: This is code for handling data structures specific to
statistical models. It isn't code that knows about one kind of model,
but code that offers some common data structures used when fitting
many different models. Coefficients and design matrices are two
examples, but there could be others. Whenever I think of any other examples,
though, I seem to think of something specific enough that it should
probably belong to code specific to a particular model (such as knots
for splines).
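
As a rough illustration of the kind of model-independent bookkeeping
meant here, the sketch below maps design-matrix columns back to
variables and variables to their columns. Every name in it
(dm_var_entry, dm_sketch, dm_sketch_column_to_var,
dm_sketch_var_to_entry) is hypothetical and invented for this example;
it is not the interface of coefficient.[ch] or design-matrix.[ch]. It
also assumes each variable occupies one contiguous block of columns,
which may not match the real code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch only: these types and functions are illustrative
   and are not taken from PSPP's coefficient.[ch] or design-matrix.[ch]. */

/* One variable and the block of design-matrix columns assumed to hold it.
   A numeric variable would use one column; a categorical variable would
   use several (for example, one per non-reference category). */
struct dm_var_entry
  {
    const char *var_name;   /* Name of the variable. */
    size_t first_column;    /* Index of its first column. */
    size_t n_columns;       /* Number of columns it occupies. */
  };

/* The variable-to-column map for one design matrix. */
struct dm_sketch
  {
    const struct dm_var_entry *entries;
    size_t n_entries;
  };

/* Returns the entry whose column block contains COL,
   or NULL if no variable owns that column. */
const struct dm_var_entry *
dm_sketch_column_to_var (const struct dm_sketch *dm, size_t col)
{
  for (size_t i = 0; i < dm->n_entries; i++)
    {
      const struct dm_var_entry *e = &dm->entries[i];
      if (col >= e->first_column && col < e->first_column + e->n_columns)
        return e;
    }
  return NULL;
}

/* Returns the entry for the variable named NAME,
   or NULL if the design matrix has no such variable. */
const struct dm_var_entry *
dm_sketch_var_to_entry (const struct dm_sketch *dm, const char *name)
{
  for (size_t i = 0; i < dm->n_entries; i++)
    if (strcmp (dm->entries[i].var_name, name) == 0)
      return &dm->entries[i];
  return NULL;
}
```

Nothing in the sketch depends on which model is being fitted. A linear
or logistic regression could look up the same entries and attach its
own coefficient estimates to them.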
So where should coefficient.[ch] and design-matrix.[ch] go? In their
own separate directory?
-Jason
>
> J'
>
> On Tue, Jun 10, 2008 at 01:45:30PM -0400, Jason Stover wrote:
> I want to move coefficient.[ch] and design-matrix.[ch] to
> src/data. Ben and I thought this might be a good idea after discussing
> it on IRC, so I thought I would elicit more discussion here.
>
> coefficient.[ch] and design-matrix.[ch] don't do any computations.
> Their purpose is to offer some common, data-shuffling functionality to
> model-fitting procedures in src/math/. Right now the only
> model-fitting code in src/math is linreg, but we will eventually want
> code for logistic regression and other models. They all need design
> matrices and coefficients to match variables to columns to
> coefficients. That matching goes pretty much the same for each kind of
> model.
>
> So it seems that a model-fitting directory in src/math should depend
> on coefficient.[ch] and design-matrix.[ch]. coefficient.[ch] and
> design-matrix.[ch] probably should not have to know about
> model-fitting: there are many kinds of models, and for each of them, a
> coefficient means the same thing, and a design matrix means the same
> thing, independently of what kind of model it is.
>
> Furthermore, the coefficient and design-matrix code doesn't have any math
> in it. There is no linear algebra, no descent algorithm, etc. That
> code just takes a coefficient and returns the variable it matches, or
> vice versa, or takes a variable and finds its corresponding columns in
> a design matrix. So putting this in src/math doesn't seem as
> appropriate as putting it in a place like src/data.
>
> coefficient.c depends on linreg.h, but that could be changed. By
> contrast, design-matrix.c depends on src/data/category.h, and I don't
> think I could eliminate this dependency.
>
> So how does everyone feel about moving coefficient.[ch] and
> design-matrix.[ch] to src/data? If it sounds good, I'll post a patch.
>
> -Jason