
Re: Optimisation of statistic calculations

From: John Darrington
Subject: Re: Optimisation of statistic calculations
Date: Thu, 4 Nov 2004 08:18:17 +0800
User-agent: Mutt/1.3.28i

On Wed, Nov 03, 2004 at 02:03:10PM +0000, Jason Stover wrote:
     Two ways to attenuate eventual bloat of PSPP are:
     1. As you mentioned, cache the common and, most importantly,
     sufficient statistics. Have every statistical procedure cache its
     sufficient statistics for later use. After being computed once,
     the sufficient statistics can be used by that procedure or by
     other procedures later. Sufficient statistics are used
     frequently, so this policy of caching them could avoid a lot of
     recomputation.

That was the basic idea I had in mind.  If the cache is preserved
across PSPP commands, however, then we'd have to ensure that it is
invalidated whenever any transformations are run.  But I think the
cache would be beneficial even within a single command, especially if
many subcommands are used.

     2. Use a generic optimization module. GSL provides one that could
     be hooked in to PSPP.  Different statistical estimation procedures
     use the same backend algorithms (e.g., sorting for nonparametric
     routines and Newton-Raphson for generalized linear models). A
     single optimizer, or other shared backend routines, could
     eliminate a lot of redundancy.
I briefly looked at the GSL manual, but couldn't see any mention of
this.  Can you give me a reference to where it is documented?


PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See or any PGP keyserver for public key.
