pspp-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: linked lists and trees


From: Jason Stover
Subject: Re: linked lists and trees
Date: Wed, 24 Sep 2008 15:11:31 -0400
User-agent: Mutt/1.5.18 (2008-05-17)

On Wed, Sep 24, 2008 at 09:31:02AM -0700, Ben Pfaff wrote:
> The key questions you should be asking are:
> 
>         - How much data can I end up with?  If it is potentially
>           more than can fit in memory, maybe you shouldn't insist
>           on doing it in one pass.  (But if the operations to be
>           performed on it essentially require random access, then
>           it's still reasonable to do it all in-memory in a
>           single pass.)

Yes, the matrix itself needs to be random-access. I'm not storing
any more data in the temporary structure than in the final covariance
matrix.

>         - What are the operations I want to perform on the data
>           structure?  In particular, what do you want to look up
>           a node based on?  Is the key a single value or a range?
>           Does the data consist of single values or ranges?  Do
>           you care whether the data is in sorted order for most
>           operations?
> 
> My guess is that you want to look up nodes based on the exact
> value of a key, and don't need the nodes to be in sorted order
> for that.  Then, the proper data structure is a hash table.  A
> tree would work too, but a hash table will be asymptotically
> faster.

Yes, I just need to store the data and update them as the program
iterates through the data. Sorting isn't important.

> > One node in the tree would look like this:
> >
> > struct node
> > {
> >   struct variable *v1;
> >   struct variable *v2;
> >   double covariance_element;
> >   double center;
> >   union value *val1;
> >   union value *val2;
> >   size_t count;
> >   struct node *right;
> >   struct node *left;
> > };
> 
> Which members are the key that you want to look up to find a
> node?

The key is a quadruple: (v1, v2, val1, val2).

I quickly wrote something to do what I want, but I would prefer to use
something we already have.

So does the hashing in src/libpspp sound like what I need?

-Jason




reply via email to

[Prev in Thread] Current Thread [Next in Thread]