From: John Darrington
Subject: Re: [patch #5690] Clean up case code
Date: Wed, 31 Jan 2007 18:43:45 +0900
User-agent: Mutt/1.5.13 (2006-08-11)

On Tue, Jan 30, 2007 at 10:30:43PM -0800, Ben Pfaff wrote:
> So I'm not proposing to
> encourage use of random access where it's not necessary.

Would it therefore be worth having a flag passed to the casereader
constructor which declares whether or not the casereader performs
random access?
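
One way to picture the flag being asked about here is the sketch below. It is only an illustration: the class `CaseReader`, its `random_access` parameter, and the methods `read` and `case_at` are hypothetical names chosen for this example and are not PSPP's actual casereader API. The idea shown is that a reader declared as sequential can stream from its source, while one declared as random-access has to keep the cases around.

```python
from typing import Iterable, List, Optional, Tuple

Case = Tuple[int, int]


class CaseReader:
    """Hypothetical reader; NOT the real PSPP casereader interface."""

    def __init__(self, source: Iterable[Case], random_access: bool = False):
        # The caller declares up front whether it will need random access.
        self.random_access = random_access
        self._cases: Optional[List[Case]] = None
        if random_access:
            # Random access requires every case to be retained.
            self._cases = list(source)
            self._iter = iter(self._cases)
        else:
            # Sequential access can stream without buffering.
            self._iter = iter(source)

    def read(self) -> Optional[Case]:
        """Return the next case, or None when the source is exhausted."""
        return next(self._iter, None)

    def case_at(self, index: int) -> Case:
        """Return the case at `index`; only valid for random-access readers."""
        if self._cases is None:
            raise TypeError("casereader was created for sequential access only")
        return self._cases[index]


def generate_cases():
    # Stand-in data source: five cases of the form (i, i squared).
    for i in range(5):
        yield (i, i * i)


sequential = CaseReader(generate_cases())
buffered = CaseReader(generate_cases(), random_access=True)
```

With this arrangement a misuse shows up immediately: calling `case_at` on `sequential` raises an error instead of silently forcing the whole data set into memory.
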

> Probably the most powerful thing to stack on top of a casereader
> is what I'm tentatively calling a "casegrouper". A casegrouper
> takes a casereader and a function that classifies consecutive
> pairs of cases as in the same group or different groups. It then
> hands you a sequence of casereaders, one by one, each of which
> contains a single group. This is invaluable for SPLIT FILE, for
> break groups on AGGREGATE or RANK or SORT CASES, and so on.

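
The casegrouper idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PSPP implementation: the function name `casegrouper`, the `same_group` callback, and the tuple-shaped cases are all made up for the example, and each group is handed back as a plain list rather than as a casereader.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Case = Tuple[str, float]


def casegrouper(cases: Iterable[Case],
                same_group: Callable[[Case, Case], bool]) -> Iterator[List[Case]]:
    """Yield runs of consecutive cases, one group at a time.

    A new group starts whenever same_group(previous, current) is false.
    """
    group: List[Case] = []
    for case in cases:
        if group and not same_group(group[-1], case):
            yield group
            group = []
        group.append(case)
    if group:
        # Emit the final run, if there was any input at all.
        yield group


# Hypothetical data: (split variable, value).
cases = [("north", 1.0), ("north", 2.5), ("south", 4.0), ("north", 3.0)]
groups = list(casegrouper(cases, lambda a, b: a[0] == b[0]))
```

Here `groups` holds three groups, because the last "north" case is not adjacent to the first two. Only consecutive pairs are compared, so input that is meant to form one group per key has to be sorted on that key first.
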
Sounds good. I was thinking about looking at the percentiles code
again. (The more I look at it the less I like it. Also, I'm not
convinced that it gets the right answers in all cases. It needs more
test cases.) But in view of the magnitude of the changes you're
making, I think I'll wait. The new functions will probably make it
simpler.

> What I have left:
>
>     * Make the GUI compile and work again. Currently it does
>       neither. As part of that, finish and test the
>       datasheet implementation.
>
>       I might need help or advice with some of the GUI stuff,
>       but I don't know yet.

No worries.

>     * Write an extensive section for the manual describing
>       best practices for data processing under PSPP. I'm
>       confident that, with this set of changes, PSPP data
>       processing will be mature enough that we can provide
>       good guidance for future developers this way.
>
>       I might break this into a separate developers' guide,
>       along with the existing chapter on q2c. What do you
>       think?

I think a developers' guide is a good idea. q2c docs really don't
belong in the user manual, so should be moved, along with the *.sav
file format description.

> I'm really excited about this set of changes. It feels to me
> like one-third of the important PSPP implementation (the data
> processing) is finally coming together. The other two-thirds are
> syntax parsing and output formatting (see the end of the PSPP
> README), and I finally have ideas for those that I think will
> really work.

> data_model is a really really generic name. It could be a name
> for the model for any kind of data. The name datasheet calls to
> my mind a spreadsheet, which more specifically describes what the
> datasheet actually implements. So I'm not 100% happy with the
> suggestion data_model.

How about datasheetmodel, or would that be too long?
J'
--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.