value representation, case indexes, and dictionary indexes

From: Ben Pfaff
Subject: value representation, case indexes, and dictionary indexes
Date: Sun, 29 Mar 2009 23:21:50 -0700
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux)

I pushed a new "value-rep" branch to savannah.  This branch is
for changing the representation of "union value".  The log
message tells the beginning of the story pretty well:

commit bd78a32b882b4c614e83ea8b5aa40ee7ddeba3d6
Author: Ben Pfaff <address@hidden>
Date:   Sun Mar 29 22:53:07 2009 -0700

    Change "union value" to dynamically allocate long strings.
    Until now, a single "union value" could hold a numeric value or a short
    string value.  A long string value (one longer than MAX_SHORT_STRING)
    required a number of contiguous "union value"s.  This situation was
    inconvenient sometimes, because any occasion where a long string value
    might be required (even if it was unlikely) required using dynamic
    memory allocation.
    With this change, a value of any type, regardless of whether it is numeric
    or short or long string, occupies a single "union value".  The internal
    representation of short and long strings is now different, however: long
    strings are now internally represented by a pointer to dynamically
    allocated memory.  This means that "union value"s must now be initialized
    and uninitialized properly, to ensure that memory is properly allocated
    and freed behind the scenes.
    This change thus has a ripple effect on PSPP code that works with values.
    In particular, code that deals with cases is greatly changed, because a
    case now needs to know the type of each value that it contains.  Thus, a
    new concept called a "case prototype", which represents the type and
    width of each value within a case, is introduced, and every place in PSPP
    that creates a case must now create a corresponding prototype to go with
    it.
    This commit is not cleaned up to production standards.  It needs
    additional work on comments, for example, as well as updates to the
    developers' reference.  There might be unrelated debug prints still left
    in, and so on.  Nevertheless it should serve as a place to start.
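
To make the idea in the log message concrete, here is a rough standalone
sketch of a value that stores short strings inline and long strings behind
a pointer, plus a "prototype" that records the width of each value in a
case.  It is not the code from the branch: every sketch_* name, the
threshold of 8, and the convention that width 0 means numeric are
assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_SHORT_STRING 8   /* hypothetical threshold */

/* One value of any type.  Width 0 means numeric (an assumption of this
   sketch); width > 0 means a string of that many bytes. */
union sketch_value {
    double f;                                   /* numeric */
    char short_string[SKETCH_MAX_SHORT_STRING]; /* stored inline */
    char *long_string;                          /* dynamically allocated */
};

/* Prepares V to hold a value of the given WIDTH. */
static void sketch_value_init(union sketch_value *v, int width)
{
    assert(width >= 0);
    if (width > SKETCH_MAX_SHORT_STRING) {
        v->long_string = malloc((size_t) width);
        if (v->long_string == NULL)
            abort();
        memset(v->long_string, ' ', (size_t) width);
    } else if (width > 0)
        memset(v->short_string, ' ', sizeof v->short_string);
    else
        v->f = 0.0;
}

/* Releases whatever sketch_value_init() allocated for V. */
static void sketch_value_destroy(union sketch_value *v, int width)
{
    if (width > SKETCH_MAX_SHORT_STRING)
        free(v->long_string);
}

/* Returns the string bytes of V, wherever they are stored. */
static char *sketch_value_str(union sketch_value *v, int width)
{
    assert(width > 0);
    return width > SKETCH_MAX_SHORT_STRING ? v->long_string : v->short_string;
}

/* Hypothetical case prototype: the width of each value in a case. */
struct sketch_proto {
    size_t n_widths;
    const int *widths;
};

/* Creates a case with one properly initialized value per prototype entry. */
static union sketch_value *sketch_case_create(const struct sketch_proto *p)
{
    assert(p->n_widths > 0);
    union sketch_value *c = malloc(p->n_widths * sizeof *c);
    if (c == NULL)
        abort();
    for (size_t i = 0; i < p->n_widths; i++)
        sketch_value_init(&c[i], p->widths[i]);
    return c;
}

/* Destroys a case; the prototype says which values own memory. */
static void sketch_case_destroy(union sketch_value *c,
                                const struct sketch_proto *p)
{
    for (size_t i = 0; i < p->n_widths; i++)
        sketch_value_destroy(&c[i], p->widths[i]);
    free(c);
}
```

The point of the sketch is that destroying a case requires the widths,
which is why a case has to carry, or be paired with, a prototype.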

Now I'm trying to get the PSPPIRE code working well with it.
This is a little harder than I expected, because I'm not quite
sure what the intended relationships are among the dictionary and
the datasheet and the case indexes and dictionary indexes.

Dictionary indexes are always from 0 through the number of
variables in the dictionary minus 1.  That part is easy.

This is what I think might make sense for the remaining
relationships in the value-rep branch's PSPPIRE:

        The case index is always exactly the same as the
        dictionary index, simply because there is no reason for
        it to be different given that PSPPIRE is working with a
        datasheet, which is capable of permuting its variables
        into whatever order is most convenient for its client,
        and also given that each column in a datasheet has an
        arbitrary width in this branch.

The following is what appears to actually happen:

        Case indexes simply continue to increase as variables are
        added, because deleting a variable from the dictionary
        does not delete any columns from the datasheet (there are
        no references to datasheet_delete_columns() from
        src/ui/gui at all, even in the master branch).

So I'm thinking about, roughly, changing
insert_variable_callback() and delete_variable_callback() to call
dict_compact_values() on the dictionary, and also adding a call
to datasheet_delete_columns() to delete_variable_callback().  I
think that this will enforce the new invariant above, as well as
garbage collecting deleted variables.
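
As a rough illustration of the difference, here is a toy model in
standalone C.  It does not use the PSPP API: the sketch_* types and
functions are hypothetical stand-ins for the dictionary, the datasheet,
and the two callbacks, and the "datasheet" only remembers which variable
each column belongs to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_VARS 16

struct sketch_var {
    const char *name;
    size_t case_index;          /* column that holds this variable's data */
};

struct sketch_dict {
    struct sketch_var vars[SKETCH_MAX_VARS];
    size_t n_vars;              /* dictionary indexes are 0...n_vars - 1 */
};

struct sketch_sheet {
    const char *columns[SKETCH_MAX_VARS];  /* owner of each column */
    size_t n_columns;
};

/* Adds a variable and a new column at the end of the sheet. */
static void sketch_add_var(struct sketch_dict *d, struct sketch_sheet *s,
                           const char *name)
{
    assert(d->n_vars < SKETCH_MAX_VARS && s->n_columns < SKETCH_MAX_VARS);
    d->vars[d->n_vars].name = name;
    d->vars[d->n_vars].case_index = s->n_columns;
    d->n_vars++;
    s->columns[s->n_columns++] = name;
}

/* Removes variable IDX from the dictionary only.  Used alone, this models
   the behavior described above: the column is left behind in the sheet. */
static void sketch_remove_from_dict(struct sketch_dict *d, size_t idx)
{
    assert(idx < d->n_vars);
    for (size_t i = idx; i + 1 < d->n_vars; i++)
        d->vars[i] = d->vars[i + 1];
    d->n_vars--;
}

/* Removes variable IDX, deletes its column, and renumbers the case indexes
   of the remaining variables so that no gap is left. */
static void sketch_delete_var_compact(struct sketch_dict *d,
                                      struct sketch_sheet *s, size_t idx)
{
    assert(idx < d->n_vars);
    size_t col = d->vars[idx].case_index;
    assert(col < s->n_columns);

    for (size_t i = col; i + 1 < s->n_columns; i++)
        s->columns[i] = s->columns[i + 1];
    s->n_columns--;

    sketch_remove_from_dict(d, idx);
    for (size_t i = 0; i < d->n_vars; i++)
        if (d->vars[i].case_index > col)
            d->vars[i].case_index--;
}

/* Returns 1 if case index == dictionary index for every variable and the
   sheet has no leftover columns, 0 otherwise. */
static int sketch_invariant_holds(const struct sketch_dict *d,
                                  const struct sketch_sheet *s)
{
    if (s->n_columns != d->n_vars)
        return 0;
    for (size_t i = 0; i < d->n_vars; i++)
        if (d->vars[i].case_index != i)
            return 0;
    return 1;
}
```

In this model, deleting through sketch_delete_var_compact() keeps the
invariant, while calling sketch_remove_from_dict() alone leaves an orphan
column and a case index that no longer matches the dictionary index.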

Do you have comments on this?  i.e. does it sound reasonable and
a good way to do things?  Or should I do something different?



(By the way, I'm completely ignoring changing the width of
existing variables, e.g. dict_size_change_callback().  There are
obvious bugs in that case in the value-rep branch, which I will
fix once I figure out what's supposed to happen.)
"I was born lazy.  I am no lazier now than I was forty years ago, 
 but that is because I reached the limit forty years ago.  You can't 
 go beyond possibility."
--Mark Twain
