Re: naming data sets

From: John Darrington
Subject: Re: naming data sets
Date: Sat, 17 Dec 2005 10:15:46 +0800
User-agent: Mutt/1.5.9i

On Fri, Dec 16, 2005 at 08:39:02AM -0800, Ben Pfaff wrote:
     Jason Stover <address@hidden> writes:
     > It's time for me to write a routine that saves the residuals of
     > a regression model to the data. I have avoided doing this until
     > now because I want the user to be able to have a choice of saving
     > residuals to different data sets.
     > Right now there is only one data set we can refer to, and that
     > is a serious limitation. How difficult would it be to make PSPP
     > able to recognize different data sets? If there is only one
     > dictionary?

     I've thought about this for a while and I think I have a
     proposal.

     First, I think that my suggestion that all the data sets have the
     same dictionary was flawed.  The problem is that the dictionary
     for whatever data set you're working with can change without the
     other dictionaries changing in a similar manner.  For example,
     COMPUTE can add a variable to the current data set's dictionary,
     but if we want to add that to the other data sets' dictionaries,
     what should be the values?  We'd have to, essentially, perform
     all transformations on all the existing data sets, and I don't
     think that's something that really makes sense (if it does to
     you, please explain).

     Thus, I propose that we support multiple data sets, each of which
     has an independent dictionary.  We introduce a new type of file
     handle for these data sets.  Tentatively I'll call these
     "temporary" file handles and "temporary" data sets (but better
     terminology is welcome).  Access to temporary data sets is
     through temporary file handles, using the usual commands for
     accessing system files (GET, SAVE, XSAVE).

     Here's what I'd add to the PSPP language:

             * Some extra syntax on FILE HANDLE for declaring
               temporary file handles (MODE=TEMPORARY perhaps).
             * A new command for destroying temporary file handles
               (e.g. CLOSE FILE HANDLE or DELETE FILE HANDLE), so that
               the memory or disk space used to store them can be
               freed up.
             * GET, SAVE, XSAVE would be extended to read and write to
               temporary file handles.  I'd introduce some kind of
               syntactic sugar so that it wasn't strictly necessary to
               declare temporary file handles in advance,
               e.g. something like XSAVE OUTFILE=TEMPORARY
               <HANDLENAME> would work properly.

     Internally, a temporary file handle would be represented by a
     dictionary plus a casefile, I think.  When PSPP terminates, the
     data in temporary file handles would automatically disappear,
     just like data in the active file.
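
The representation proposed above (a dictionary plus a casefile per
temporary handle) might be sketched in C roughly as follows. This is
only an illustration under stated assumptions: `struct temp_dataset`,
`MAX_VARS`, and every function name here are hypothetical, the
"casefile" is reduced to plain columns of doubles, and none of it is
PSPP's actual code. It shows why independent dictionaries are simpler:
adding a variable touches only the data set it belongs to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only; PSPP's real dictionary
   and casefile structures are richer than this. */
#define MAX_VARS 16

struct temp_dataset {
    char *handle_name;           /* name of the temporary file handle */
    char *var_names[MAX_VARS];   /* this data set's own dictionary */
    double *columns[MAX_VARS];   /* stand-in casefile, one column per variable */
    size_t n_vars;
    size_t n_cases;
};

/* Portable string copy (strdup is not in every C standard). */
static char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Create an empty data set behind a named handle; NULL on failure. */
struct temp_dataset *temp_dataset_create(const char *handle_name)
{
    struct temp_dataset *ds = calloc(1, sizeof *ds);
    if (ds == NULL)
        return NULL;
    ds->handle_name = copy_string(handle_name);
    if (ds->handle_name == NULL) {
        free(ds);
        return NULL;
    }
    return ds;
}

/* Add a variable to this data set's dictionary only; existing cases
   receive FILL.  Returns 0 on success, -1 on failure. */
int temp_dataset_add_var(struct temp_dataset *ds, const char *name, double fill)
{
    assert(ds != NULL && name != NULL);
    if (ds->n_vars == MAX_VARS)
        return -1;
    char *name_copy = copy_string(name);
    double *column = malloc((ds->n_cases + 1) * sizeof *column);
    if (name_copy == NULL || column == NULL) {
        free(name_copy);
        free(column);
        return -1;
    }
    for (size_t i = 0; i < ds->n_cases; i++)
        column[i] = fill;
    ds->var_names[ds->n_vars] = name_copy;
    ds->columns[ds->n_vars] = column;
    ds->n_vars++;
    return 0;
}

/* Append one case; VALUES must hold n_vars entries.  Returns 0 or -1. */
int temp_dataset_add_case(struct temp_dataset *ds, const double *values)
{
    assert(ds != NULL && values != NULL);
    for (size_t v = 0; v < ds->n_vars; v++) {
        double *grown = realloc(ds->columns[v],
                                (ds->n_cases + 1) * sizeof *grown);
        if (grown == NULL)
            return -1;
        ds->columns[v] = grown;
        grown[ds->n_cases] = values[v];
    }
    ds->n_cases++;
    return 0;
}

/* Counterpart of the proposed CLOSE/DELETE FILE HANDLE: free everything. */
void temp_dataset_destroy(struct temp_dataset *ds)
{
    if (ds == NULL)
        return;
    for (size_t v = 0; v < ds->n_vars; v++) {
        free(ds->var_names[v]);
        free(ds->columns[v]);
    }
    free(ds->handle_name);
    free(ds);
}
```

In this sketch, saving residuals would amount to calling
`temp_dataset_add_var` on one data set; any other data set keeps its
dictionary unchanged, so no fill values have to be invented for it.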

Presumably this is something that would only be available if
--syntax=enhanced is used?


PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See or any PGP keyserver for public key.
