[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: contribution and cvs

From: Ben Pfaff
Subject: Re: contribution and cvs
Date: Sun, 21 Aug 2005 14:39:16 -0700
User-agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)

Jason Stover <address@hidden> writes:

> The first has a linear regression library, the second has
> regression.q and some routines to recode categorical variables
> to vectors. They aren't finished, but they do compile and run
> correctly when called within regression.q. See regress_categorical.stat
> for sample pspp syntax.

I looked at your revised versions of these routines this
afternoon, and I compiled them and tried out the sample .stat
file you provided.  In general it looks good and I'm happy with
it.  The list of comments below may be pretty long, but a lot of
it is nits.

First, some specific comments.  I don't know whether I understand
what's going on well enough for some of these, so please feel free to
tell me where I'm wrong.

Looking at the structures in cat.h, I have some questions.   First, I
*think* that each row in the `struct recoded_categorical' matrix `m'
has nonzero values only along a diagonal.  Is that true?  The big
header comment says that a variable might map to (0 1 0 1 1), and I
don't see how that can happen.

The comment on `m' in `struct recoded_categorical' refers to "hash a".
I don't see any hashes around.

Should the definition of `struct recoded_categorical' be in
cat.h?  I only see a few references to its internals from outside
cat.c.  It might be better to define accessor functions for those
purposes and move the definition to cat.c.  I don't know whether
this applies to other structs in cat.h.

The extensive comment on var_cols in `struct design_matrix' makes me
wonder whether an array of structs would be better, e.g.
    struct design_matrix_var 
       int first_column;            /* First column (see recoded_categorical
                                       struct for full info). */
       struct variable *var;        /* Variable. */

Next, some general comments.


        * Please don't use errno to report errors.  It's an ugly
          interface and it's not very portable.  Even EINVAL is not
          guaranteed to exist on all systems (ANSI C only requires
          EDOM, ERANGE, and EILSEQ).

        * There is no need to cast a `void *' in assigning it to any
          (non-const) pointer to object type.  Thus, please don't cast
          the return value of xmalloc() and similar routines.

        * I saw code like this in a few places:

              errno = ENODATA;
              fprintf( stderr, "%s:%d: %s\n",__FILE__,__LINE__,

          I'm not sure what the goal is here.  Does this code only
          execute if there is a bug in the code?  If so, then I'd
          prefer to see it written as simply `assert (0);' (or a more
          descriptive assertion) or even just `abort ();'.  If not,
          then it would be better to give a meaningful error
          message--users can't be expected to understand this kind of

        * Header files should have header guards: #ifndef
          HEADERNAME_H/#define HEADERNAME_H/.../#endif.


        * Please declare pointers as `type *name', not `type * name'
          or another way.  I know that there are some existing
          instances of the latter in PSPP sources, but I am fixing
          those as I clean up code.

        * Please write array[index] to index an array, not *(array +
          index).  Yes, they are equivalent to the C compiler, but the
          former is easier to read.

        * Please write comments without an extra * at the beginning of
          each line, e.g. like this:

          /* Here's a long multiline comment that explains what this
             chunk of code is supposed to do.  It doesn't need lots of
             extra * characters... */

        * Please don't put extra spaces inside (), e.g. write `if (a > b)'
          instead of `if ( a > b )'.

        * Please put a blank line between function definitions. 

        * Please put a space between a statement name or a function
          name and the following `('.  (Mostly I see this looking
          correct, but a few files tend not to do it, e.g. sweep.c.)

        * (Some of these issues could be fixed just by running GNU

        * At the bottom of a .q file, please put the following:

             Local Variables:
             mode: c

          This allows Emacs to recognize .q files as C.

> I'm not happy with the categorical recode routines, and I don't
> think I ever will be. I suspect a less offensive way to implement
> them would be to put the struct recoded_categorical inside a struct
> variable, but this wouldn't eliminate the need for some of the
> ugly code in cat.c.

I think that you might have missed one of the mechanisms that
PSPP has for adding `hooks' into `struct variable'.  The `aux'
and `aux_dtor' members of `struct variable' are set aside for
procedures to tag variables with auxiliary data.  The idea is
that you attach a structure with var_attach_aux(), then access it
as needed.  Later you can clear the aux data with
var_clear_aux(), or you can clear it from the entire dictionary
with dict_clear_aux().
Ben Pfaff 
email: address@hidden

reply via email to

[Prev in Thread] Current Thread [Next in Thread]