Optimisation of statistic calculations

From: John Darrington
Subject: Optimisation of statistic calculations
Date: Wed, 3 Nov 2004 19:35:45 +0800
User-agent: Mutt/1.5.4i

As we add more commands to PSPP, there is considerable repetition
of code that calculates common statistics (e.g. sum, sum of squares,
variance).  An ad hoc approach can lead, and has led, to the same
calculations being repeated unnecessarily, which makes PSPP slower
than it needs to be.

The way it's going, PSPP is going to bloat and accumulate large chunks
of disjointed code that duplicate the same basic algorithms time and
time again.

So I'm looking at introducing a framework to allow the optimal reuse
of statistics already calculated.  Each statistic will be a node in a
DAG, and an engine will traverse the DAG in postorder, so that a
statistic is calculated only after the statistics it depends on.  Two
DAGs will be required, because some stats require others to be
completely precalculated.
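
To make the idea concrete, here is a rough sketch in Python (not PSPP
code, and not a proposed design).  The STATS table, the statistic
names and the evaluate() function are all made up for illustration.
It shows a single DAG only; the second DAG for stats that need others
fully precalculated is not modelled.

```python
import math

# Hypothetical registry, for illustration only:
# statistic name -> (names it depends on, function of the data and those values)
STATS = {
    "n":        ((), lambda data: float(len(data))),
    "sum":      ((), lambda data: math.fsum(data)),
    "sum_sq":   ((), lambda data: math.fsum(x * x for x in data)),
    "mean":     (("sum", "n"), lambda data, s, n: s / n),
    "variance": (("sum_sq", "sum", "n"),
                 lambda data, ss, s, n: (ss - s * s / n) / (n - 1)),
    "std_dev":  (("variance",), lambda data, v: math.sqrt(v)),
}


def evaluate(requested, data):
    """Compute the requested statistics, each node at most once.

    Returns (values, order), where order lists the statistics in the
    sequence they were calculated (a postorder walk of the DAG).
    """
    values = {}
    order = []

    def visit(name):
        if name in values:           # already calculated: reuse it
            return values[name]
        deps, fn = STATS[name]
        args = [visit(d) for d in deps]   # dependencies first
        values[name] = fn(data, *args)
        order.append(name)
        return values[name]

    for name in requested:
        visit(name)
    return values, order


values, order = evaluate(["mean", "std_dev"], [1.0, 2.0, 3.0, 4.0])
```

Here "sum" and "n" are calculated once and shared by "mean" and
"variance", which is the reuse I'm after.  The shortcut formula for
the variance is used only to keep the sketch small.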

In fact, looking at some of the existing PSPP code, it appears that
Ben may have had something similar in mind at one stage.  Anyway, if
anyone is an expert on such things, then please pipe up.


PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See or any PGP keyserver for public key.

