RE: PSPP goals

From: Marshall DeBerry
Subject: RE: PSPP goals
Date: Tue, 2 Aug 2005 21:06:41 -0400

  I think what you've outlined is right on the mark.  I was very pleased to
see the areas you mentioned, particularly with regard to large data files.
I still do some work with what used to be considered very large files from
an ongoing nationwide household survey, but these days the data files might
be thought of as perhaps "small" in size.  The devil is in the details, of
course, but you've made a big start with the current versions of pspp.  I
think one of the hard areas will be the graphical interfaces, as there is
the Windows XP (now Vista) world, the Linux Gnome/KDE world, the Mac OS X
Aqua/Cocoa world, and then say the Solaris world with gtk/gnome.  From a
statistical procedure standpoint, it might be interesting if someone would
think about what's involved in implementing the Complex Samples routines--I
suggested such an addition to SPSS several years ago, and they picked up the
Wesvar package, then dropped it, and have recently implemented their own

Having an open source project like pspp will be of huge benefit to the
statistical/data analysis communities, and the time is right for such a

  I would be happy to help out or assist as best I can.


-----Original Message-----
From: address@hidden
[mailto:address@hidden] On Behalf Of Ben Pfaff
Sent: Tuesday, August 02, 2005 12:58 PM
To: address@hidden
Subject: PSPP goals

Jason Stover and I met over lunch yesterday and talked over some
of the goals for PSPP.  I realized that I haven't ever done a
good job of expressing these on the list, although I've talked
them over with a few individuals at different times.  So I've
written up a statement of my long-term goals for PSPP, included
below.  I think I'd like to include this in the README for 0.4.0.
Comments are welcome--please give feedback.


The long term goals for PSPP are ambitious.  We wish to provide the
following support to users:

        * All of the SPSS transformation language.  PSPP already
          supports a large subset of it.

        * All the statistical procedures that someone is willing to
          implement, whether they exist in SPSS or not.  Currently,
          statistical support is limited, but growing.

        * Compatibility with SPSS syntax, including compatibility with
          known bugs and warts, where it makes sense.  We also provide
          an "enhanced" mode in certain cases where PSPP can output
          better results that may surprise SPSS users.

        * Friendly textual and graphical interfaces.  PSPP does not do
          a good job of this yet.

        * Attractive output, including graphs, in a variety of human-
          and machine-readable formats.  PSPP currently produces
          output in ASCII, PostScript, and HTML formats.  We will
          enhance PSPP's output formatting in the future.

        * Good documentation.  Currently the PSPP manual describes its
          language completely, but we would like to add information on
          how to select statistical procedures and interpret their
          results.

        * Efficient support for very large data sets.  For procedures
          where it is practical, we wish to efficiently support data
          sets many times larger than physical memory.  The framework
          for this feature is already in place, but it has not been
          tuned or extensively tested.
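
To illustrate the last point: a procedure can handle data far larger than memory if it makes a single pass over the cases and keeps only a fixed amount of running state. The following is a minimal sketch of that idea in Python, using Welford's one-pass algorithm for mean and variance. It is not PSPP's actual framework, and the function name `running_mean_variance` is hypothetical.

```python
from typing import Iterable, Tuple


def running_mean_variance(values: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance) in one pass with O(1) memory."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    # Sample variance is undefined for fewer than two cases; report 0.0 here.
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


if __name__ == "__main__":
    # Any iterator works, for example a generator that reads one case
    # at a time from a file on disk.
    print(running_mean_variance(float(v) for v in range(1, 6)))
```

Because the function only consumes an iterator, the caller decides where the cases come from; nothing here requires the full data set to be resident at once.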

Over the long term, we also wish to provide support to developers who
wish to extend PSPP with new statistical procedures, by supplying the
following:

        * Easy-to-use support for parsing language syntax.  Currently,
          parsing is done by writing "recursive descent" code by hand,
          with some support for automated parsing of the most common
          constructs.  We wish to improve the situation by supplying a
          more complete and flexible parser generator.

        * Easy-to-use support for producing attractive output.
          Currently, output is done by writing code to explicitly fill
          in table cells with data.  We should be able to supply a
          more convenient interface that also allows for providing
          machine-readable output.

        * Eventually, a plug-in interface for procedures.  Over the
          short term, the interface between the PSPP core and
          statistical procedures is evolving quickly enough that a
          plug-in model does not make sense.  Over the long term, it
          may make sense to introduce plug-ins.
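
For readers unfamiliar with the "recursive descent" style mentioned in the first item, here is a minimal hand-written sketch in Python. The grammar is a made-up miniature loosely modeled on SPSS-style commands (`NAME /KEY=value value ... .`); the `tokenize` function and `Parser` class are hypothetical and do not reflect PSPP's real parser or its full syntax.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed toy token set: identifiers plus the punctuation '/', '=' and '.'.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[/=.])")


def tokenize(text: str) -> List[str]:
    """Split the input into identifier and punctuation tokens."""
    tokens: List[str] = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each consumes the tokens it recognizes."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, literal: str) -> None:
        if self.peek() != literal:
            raise SyntaxError(f"expected {literal!r}, got {self.peek()!r}")
        self.pos += 1

    def expect_identifier(self) -> str:
        token = self.peek()
        if token is None or not (token[0].isalpha() or token[0] == "_"):
            raise SyntaxError(f"expected identifier, got {token!r}")
        self.pos += 1
        return token

    # command := IDENT subcommand* '.'
    def parse_command(self) -> Tuple[str, Dict[str, List[str]]]:
        name = self.expect_identifier()
        subcommands: Dict[str, List[str]] = {}
        while self.peek() == "/":
            key, values = self.parse_subcommand()
            subcommands[key] = values
        self.expect(".")
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r} after '.'")
        return name, subcommands

    # subcommand := '/' IDENT '=' IDENT+
    def parse_subcommand(self) -> Tuple[str, List[str]]:
        self.expect("/")
        key = self.expect_identifier()
        self.expect("=")
        values = [self.expect_identifier()]
        while self.peek() not in (None, "/", "."):
            values.append(self.expect_identifier())
        return key, values


if __name__ == "__main__":
    text = "DESCRIPTIVES /VARIABLES=a b c /STATISTICS=mean."
    print(Parser(tokenize(text)).parse_command())
```

Each grammar rule maps to one method, which is what makes the style easy to write by hand but repetitive across many commands; that repetition is the motivation for wanting a parser generator.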

Only wimps use tape backup: _real_ men just upload their important stuff
on ftp, and let the rest of the world mirror it ;)
        -- Linus Torvalds

pspp-dev mailing list
