pspp-dev

Re: [patch #5690] Clean up case code


From: John Darrington
Subject: Re: [patch #5690] Clean up case code
Date: Wed, 31 Jan 2007 18:43:45 +0900
User-agent: Mutt/1.5.13 (2006-08-11)

On Tue, Jan 30, 2007 at 10:30:43PM -0800, Ben Pfaff wrote:

     So I'm not proposing to
     encourage use of random access where it's not necessary.
     
Would it therefore be worth having a flag passed to the casereader
constructor which declares whether or not the casereader performs
random access? 
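
To make the question concrete, here is a rough sketch of what such a flag might look like. Every name in it (flagged_reader, flagged_reader_create, and so on) is made up for illustration and is not the actual casereader interface; an array of doubles stands in for real cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reader over an in-memory array of values.
   RANDOM_ACCESS is the flag declared at construction time. */
struct flagged_reader
  {
    const double *values;
    size_t n;
    size_t pos;
    bool random_access;
  };

/* Constructor: the caller states up front whether random access
   will be used, so the source could pick a cheaper representation
   when it will not. */
static struct flagged_reader
flagged_reader_create (const double *values, size_t n, bool random_access)
{
  struct flagged_reader r = { values, n, 0, random_access };
  assert (values != NULL || n == 0);
  return r;
}

/* Sequential read: always allowed.  Returns false at end of data. */
static bool
flagged_reader_read (struct flagged_reader *r, double *out)
{
  if (r->pos >= r->n)
    return false;
  *out = r->values[r->pos++];
  return true;
}

/* Random access: refused unless the flag was set at construction,
   or if IDX is out of range. */
static bool
flagged_reader_peek (const struct flagged_reader *r, size_t idx, double *out)
{
  if (!r->random_access || idx >= r->n)
    return false;
  *out = r->values[idx];
  return true;
}
```

With something along these lines, code that only ever reads sequentially would pass false, and any accidental random access would fail visibly instead of silently forcing the data to be kept around.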

     
     Probably the most powerful thing to stack on top of a casereader
     is what I'm tentatively calling a "casegrouper".  A casegrouper
     takes a casereader and a function that classifies consecutive
     pairs of cases as in the same group or different groups.  It then
     hands you a sequence of casereaders, one by one, each of which
     contains a single group.  This is invaluable for SPLIT FILE, for
     break groups on AGGREGATE or RANK or SORT CASES, and so on.
     
Sounds good.  I was thinking about looking at the percentiles code
again.  (The more I look at it, the less I like it.  Also, I'm not
convinced that it gets the right answers in all cases.  It needs more
test cases.)  But in view of the magnitude of the changes you're
making, I think I'll wait.  The new functions will probably make it
simpler.
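
For reference, the casegrouper idea quoted above might be sketched roughly as follows. This is only an illustration under simplifying assumptions: the toy_* names are hypothetical, a "case" is reduced to a key and a value, and a "casereader" is just a cursor over an array, so none of this is the real interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a case: one grouping key, one value. */
struct toy_case
  {
    int key;
    double value;
  };

/* Hypothetical stand-in for a casereader: a forward-only cursor
   over an array of cases. */
struct toy_reader
  {
    const struct toy_case *cases;
    size_t n;
    size_t pos;
  };

/* Classifier: returns true if consecutive cases A and B belong to
   the same group, false if B starts a new group. */
typedef bool same_group_func (const struct toy_case *a,
                              const struct toy_case *b);

struct toy_grouper
  {
    struct toy_reader *source;
    same_group_func *same_group;
  };

/* Stores the next group in *GROUP as a reader of its own and
   returns true, or returns false once the source is exhausted. */
static bool
toy_grouper_next (struct toy_grouper *g, struct toy_reader *group)
{
  struct toy_reader *src = g->source;
  size_t start = src->pos;

  assert (g->same_group != NULL);
  if (start >= src->n)
    return false;

  /* Consume the first case, then keep going for as long as each
     case is classified with its predecessor. */
  src->pos++;
  while (src->pos < src->n
         && g->same_group (&src->cases[src->pos - 1],
                           &src->cases[src->pos]))
    src->pos++;

  group->cases = src->cases + start;
  group->n = src->pos - start;
  group->pos = 0;
  return true;
}

/* Example classifier: cases are in the same group if their keys
   are equal, as with a single SPLIT FILE variable. */
static bool
same_key (const struct toy_case *a, const struct toy_case *b)
{
  return a->key == b->key;
}
```

A caller would loop on toy_grouper_next() and process each group with code that never needs to know about group boundaries, which is the part that looks attractive for the percentiles code.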
     
     What I have left:
     
             * Make the GUI compile and work again.  Currently it does
               neither.  As part of that, finish and test the
               datasheet implementation.
     
               I might need help or advice with some of the GUI stuff,
               but I don't know yet.


No worries.

     
             * Write an extensive section for the manual describing
               best practices for data processing under PSPP.  I'm
               confident that, with this set of changes, PSPP data
               processing will be mature enough that we can provide
               good guidance for future developers this way.
     
               I might break this into a separate developers' guide,
               along with the existing chapter on q2c.  What do you
               think?

I think a developers' guide is a good idea.  The q2c docs really don't
belong in the user manual, so they should be moved there, along with
the *.sav file format description.

     
     I'm really excited about this set of changes.  It feels to me
     like one-third of the important PSPP implementation (the data
     processing) is finally coming together.  The other two-thirds are
     syntax parsing and output formatting (see the end of the PSPP
     README), and I finally have ideas for those that I think will
     really work.
     
     data_model is a really really generic name.  It could be a name
     for the model for any kind of data.  The name datasheet calls to
     my mind a spreadsheet, which more specifically describes what the
     datasheet actually implements.  So I'm not 100% happy with the
     suggestion data_model.

How about datasheetmodel, or would that be too long?

J'     

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.


