pspp-dev

Re: category.c


From: John Darrington
Subject: Re: category.c
Date: Tue, 21 Mar 2006 08:35:52 +0800
User-agent: Mutt/1.5.9i

On Mon, Mar 20, 2006 at 10:03:27AM -0500, Jason Stover wrote:

     
     > 3. cat_value_update seems to do nothing for numeric variables.  Why is
     >    this?  A numeric variable can be used as a categorical variable
     >    just as easily as an alpha one.
     
     Good point. Encoding numeric data as categorical is usually a mistake
     from a statistical standpoint, but there are circumstances when
     treating a numeric variable as categorical makes perfect sense, so
     maybe cat_value_update() shouldn't care what type of variable it is
     looking at. This is where the question 'should we protect the user?'
     comes up. Someone with a numeric variable that has, say, 10^5 distinct
     values and inadvertently treats that variable as categorical could
     wind up running a procedure with 0 or negative degrees of freedom;
     slowing the machine down to a crawl; or, worst of all, finding bugs
     we'd rather not know about. But users should probably have the ability
     to treat numeric data as categorical if they want to.

I'm not a statistician, so I can't comment on whether numeric
variables "ought" to be used as categorical ones.  But I've seen
*many* examples where this is done.  Most demonstrations of T-TEST do
something like 0 = Male, 1 = Female.  I've even seen reports telling
me that a person's average sex is 0.54.  Maybe we could issue a very
mild warning if a categorical variable is numeric.
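
To make the "mild warning" idea concrete, here is a rough sketch of one
way to decide when to warn: count the distinct values of the numeric
variable and compare against a limit.  The names count_distinct and
too_many_categories, and the idea of a max_categories limit, are made
up for illustration; none of this is existing code in category.c, and
it ignores missing values and NaNs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: orders doubles ascending. */
int
compare_doubles (const void *a_, const void *b_)
{
  double a = *(const double *) a_;
  double b = *(const double *) b_;
  return (a > b) - (a < b);
}

/* Hypothetical helper, not part of PSPP.
   Returns the number of distinct values among the N elements of VALUES.
   Returns 0 if N is 0 or if memory cannot be allocated. */
size_t
count_distinct (const double *values, size_t n)
{
  assert (values != NULL || n == 0);
  if (n == 0)
    return 0;

  /* Sort a copy so the caller's data is left untouched. */
  double *copy = malloc (n * sizeof *copy);
  if (copy == NULL)
    return 0;
  memcpy (copy, values, n * sizeof *copy);
  qsort (copy, n, sizeof *copy, compare_doubles);

  /* After sorting, each change between neighbors is a new value. */
  size_t distinct = 1;
  for (size_t i = 1; i < n; i++)
    if (copy[i] != copy[i - 1])
      distinct++;

  free (copy);
  return distinct;
}

/* Hypothetical policy check: true if the variable has more distinct
   values than MAX_CATEGORIES, in which case the caller might warn. */
bool
too_many_categories (const double *values, size_t n, size_t max_categories)
{
  return count_distinct (values, n) > max_categories;
}
```

A 0/1 coded variable would pass quietly under any sensible limit, while
something with 10^5 distinct values would trip the check.  What the
limit should be is a question for the statisticians.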
     
     
     While we're on the topic, is anyone in favor of using a garbage
     collector in PSPP?
     
Using pool.c sort of does something similar.  Perhaps we should make
more use of that.
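
For anyone unfamiliar with the pool idea, the following is a minimal
sketch of the general technique: every allocation is recorded in a
pool, and destroying the pool frees them all at once.  This is NOT the
interface of pool.c; the mini_pool names are invented here purely to
illustrate why a pool gives some of the convenience of a garbage
collector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every allocation.  The max_align_t member
   keeps the payload that follows suitably aligned for any type. */
union mini_block
{
  union mini_block *next;
  max_align_t align;
};

/* Hypothetical pool: a singly linked list of allocated blocks. */
struct mini_pool
{
  union mini_block *blocks;
};

/* Creates an empty pool, or returns NULL if out of memory. */
struct mini_pool *
mini_pool_create (void)
{
  return calloc (1, sizeof (struct mini_pool));
}

/* Allocates SIZE bytes owned by POOL.  Returns NULL on failure. */
void *
mini_pool_alloc (struct mini_pool *pool, size_t size)
{
  assert (pool != NULL);
  if (size > SIZE_MAX - sizeof (union mini_block))
    return NULL;

  union mini_block *block = malloc (sizeof *block + size);
  if (block == NULL)
    return NULL;

  /* Link the block into the pool so it can be freed later. */
  block->next = pool->blocks;
  pool->blocks = block;
  return block + 1;   /* Payload starts right after the header. */
}

/* Frees every block in POOL and the pool itself.
   Returns the number of blocks that were freed. */
size_t
mini_pool_destroy (struct mini_pool *pool)
{
  assert (pool != NULL);
  size_t n_freed = 0;
  union mini_block *block = pool->blocks;
  while (block != NULL)
    {
      union mini_block *next = block->next;
      free (block);
      block = next;
      n_freed++;
    }
  free (pool);
  return n_freed;
}
```

A procedure would create one pool at the start, allocate everything it
needs from it, and call mini_pool_destroy once on every exit path,
rather than tracking each allocation separately.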

J'

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.



