## Optimisation of statistic calculations

**From**: John Darrington

**Subject**: Optimisation of statistic calculations

**Date**: Wed, 3 Nov 2004 19:35:45 +0800

**User-agent**: Mutt/1.5.4i

As we add more commands to PSPP, there is considerable repetition of
code for calculating common statistics, e.g. sum, sum-of-squares,
variance, etc. An ad hoc approach can lead (and has led) to the same
calculations being unnecessarily repeated, which makes PSPP slower than
it needs to be.
The way it's going, PSPP is going to bloat, with large chunks of
disjoint code duplicating the same basic algorithms time and time again.
So I'm looking at introducing a framework to allow optimal reuse of
statistics already calculated. Each statistic will feature as a node
in a DAG, and an engine will traverse the DAG in postorder, so that a
statistic's dependencies are computed (once) before the statistic
itself. In fact, two DAGs will be required, because some stats
require others to be completely precalculated.
In fact, looking at some of the existing PSPP code, it appears that
Ben may have had something similar in mind at one stage. Anyway, if
anyone is an expert on such things, then please pipe up.
J'
--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://wwwkeys.pgp.net or any PGP keyserver for public key.

