parametric models [Was: Re: musings on performance]

From: John Darrington
Subject: parametric models [Was: Re: musings on performance]
Date: Wed, 10 May 2006 19:43:41 +0800

On Tue, May 09, 2006 at 05:36:16PM -0400, Jason Stover wrote:
     This is something I want to take up soon. I have a rough plan below.
     Please let me know how this sounds. SPSS can now do little of what I
     suggest below. (But what I'm suggesting would make PSPP a good
     model-building tool.)
     I would like to make PSPP able to:
     1. Save models for later use within PSPP. 'Later uses' include
     combining them into other models, and assessing them by comparing many
     models, mostly by checking their performance on 'scratch' data.
     'Later uses' might also include fitting other models that could use
     some of the sufficient statistics (like sample means and covariance
     matrices). Saving models would not take much work if I can use the pool
     allocator to do so.

Here's where I'm going to start showing my ignorance of statistical
methods.  What exactly do you mean by a "model"?  How is it different
from (or similar to) the data saved by SPSS's MATRIX subcommand?

     2. Export models in some external formats so they can be used by another
     program later. The first format I was thinking of was compilable C. I
     suppose other formats like XML ought to be supported too, since SPSS
     can export some models as XML. Right now, REGRESSION has some ugly
     functions that let it write little C programs. I'd like to clean that
     code up and move it to a place where other procedures could use it.

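As a rough illustration of what "compilable C" export could look like,
here is a minimal sketch for a fitted linear model.  It is not what
REGRESSION emits today.  The name m1_predict, the array layout, and the
coefficient values are all hypothetical, chosen only to show the idea
of a self-contained prediction function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical export of a fitted linear model "m1" with two
   predictors, x1 and x2.  The coefficients are made-up values,
   not the result of any real fit. */
static const double m1_intercept = 1.5;
static const double m1_coef[] = { 2.0, -0.5 };
enum { M1_N_PREDICTORS = sizeof m1_coef / sizeof m1_coef[0] };

/* Return the predicted y for one case.
   X must point to M1_N_PREDICTORS values, in the order x1, x2. */
double
m1_predict (const double *x)
{
  assert (x != NULL);

  double y = m1_intercept;
  for (size_t i = 0; i < M1_N_PREDICTORS; i++)
    y += m1_coef[i] * x[i];
  return y;
}
```

Another program would compile this file and call m1_predict on each
case.  An XML export would carry the same coefficients as data rather
than as code.
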
     To learn how to do numbers 1 and 2, I should write a modeling procedure
     that fits a model quite different from that fit by REGRESSION, but one
     whose purpose is, like regression, to find a function f(input) that
     predicts some output. I was thinking of a neural network. Another
     possibility is a regression tree. I don't want this next procedure to
     resemble linear regression too closely, lest I inadvertently write
     model-shuffling procedures closely tailored to manipulation of one
     particular type of model.

If you go down the neural net path, then I would suggest that a radial
basis function net would be the thing to use.  
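
A minimal sketch of how such a net computes a prediction, to make the
suggestion concrete.  The struct layout and the name rbf_predict are
hypothetical, and nothing like this exists in PSPP.  The sketch uses
the inverse quadratic basis function, 1 / (1 + r^2 / width^2), only to
stay free of libm.  A Gaussian basis, exp (-r^2 / (2 * width^2)), is
the more common choice and would change a single line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for a fitted RBF network with one output. */
struct rbf_net
  {
    size_t n_inputs;        /* Number of values in each input case. */
    size_t n_centers;       /* Number of hidden units. */
    const double *centers;  /* n_centers * n_inputs values, row-major. */
    const double *weights;  /* One output weight per center. */
    double bias;            /* Constant term added to the output. */
    double width;           /* Common width of the basis functions, > 0. */
  };

/* Return the network's output for the case X, which must hold
   NET->n_inputs values. */
double
rbf_predict (const struct rbf_net *net, const double *x)
{
  assert (net != NULL && x != NULL && net->width > 0.0);

  double y = net->bias;
  for (size_t j = 0; j < net->n_centers; j++)
    {
      const double *c = net->centers + j * net->n_inputs;

      /* Squared Euclidean distance from X to center J. */
      double dist2 = 0.0;
      for (size_t i = 0; i < net->n_inputs; i++)
        {
          double d = x[i] - c[i];
          dist2 += d * d;
        }

      /* Inverse quadratic basis function. */
      double phi = 1.0 / (1.0 + dist2 / (net->width * net->width));
      y += net->weights[j] * phi;
    }
  return y;
}
```

Fitting is left out here.  Once the centers and width are fixed, the
output is linear in the weights, so they can be estimated by ordinary
least squares.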

     Saving models requires a standard syntax, usable by any procedure that
     fits a model to data, that tells PSPP to save that model. I think the
     SAVE subcommand is a good candidate, as in this possibility:
         REGRESSION /variables y x1 x2 /DEPENDENT y /SAVE model=m1
     ...but maybe something else would be better?

As I mentioned above, this sounds similar to the SPSS /MATRIX
subcommand, so maybe that would be the thing to use.

