Re: Directory restructuring

From: Ben Pfaff
Subject: Re: Directory restructuring
Date: Thu, 02 Feb 2006 17:20:21 -0800
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)

John Darrington <address@hidden> writes:

> I'm generally happy with the structure you proposed.  Having said
> this, when I've done this sort of exercise in the past, at the end
> of it, I have a much better idea of what is appropriate than at the
> start.  It's difficult to get a good picture of the design while
> everything is mixed up in one place (which is the whole point of
> separating it in the first place).
> The procedure usually ends up as:
> 1.  Remove all -I directives from the Makefiles.
> 2.  Classify the source files according to some reasonable criteria, and put
>     them into respective sub-directories.   
> 3.  Put in whatever -I directives are necessary in order to make
>     the damn thing build.

Steps 1 and 3 together are going to be kind of painful, because
some PSPP source files need a lot of headers.
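
One way to gauge how painful step 3 will be is to list the headers
each source file pulls in.  The following is a minimal sketch, not an
existing PSPP tool: the function names are made up for illustration,
and it assumes that project headers are included with double quotes
while system headers use angle brackets.

```python
import re
from pathlib import Path

# Matches lines such as:  #include "dictionary.h"
# Angle-bracket includes (<stdio.h>) are deliberately skipped.
INCLUDE_RE = re.compile(r'^[ \t]*#[ \t]*include[ \t]*"([^"]+)"', re.MULTILINE)


def quoted_includes(source_text):
    """Return the quoted header names included by one source file, in order."""
    return INCLUDE_RE.findall(source_text)


def headers_by_file(root):
    """Map each .c file under root to the sorted set of headers it includes."""
    result = {}
    for path in sorted(Path(root).rglob("*.c")):
        text = path.read_text(errors="replace")
        result[path.name] = sorted(set(quoted_includes(text)))
    return result
```

Calling headers_by_file("src") and sorting the result by the length
of each header list would show which files need the most -I
directives once the headers move into different directories.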

> 4.  Run a script over the Makefiles, to extract the -I directives and
>     create a diagram of dependencies between directories.  Identify
>     any dependencies which don't seem to make sense.

This sounds like a good idea but I don't know of a script that
does this.  It sounds like you've done this before--any pointers?
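
Lacking a known script, step 4 could be approximated along the
following lines.  This is a rough sketch under stated assumptions, not
a tested tool: it assumes the -I directives appear literally in files
whose names match "Makefile*", it skips any -I path that contains a
make variable (it cannot resolve those), and all function names are
hypothetical.  The output is Graphviz DOT text.

```python
import fnmatch
import os
import re

# An -I flag followed by a path, e.g. "-I../data" or "-I ../lib".
I_FLAG_RE = re.compile(r'(?<!\S)-I[ \t]*([^\s\\]+)')


def include_dirs(makefile_text):
    """Return the sorted, de-duplicated -I paths found in a Makefile's text."""
    return sorted(set(I_FLAG_RE.findall(makefile_text)))


def dependency_edges(root, pattern="Makefile*"):
    """Return (directory, included_directory) pairs, relative to root."""
    root = os.path.abspath(root)
    edges = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in fnmatch.filter(files, pattern):
            with open(os.path.join(dirpath, name), errors="replace") as f:
                text = f.read()
            src = os.path.relpath(dirpath, root)
            for inc in include_dirs(text):
                if "$" in inc:
                    continue  # make variable; cannot be resolved here
                target = os.path.normpath(os.path.join(dirpath, inc))
                dst = os.path.relpath(target, root)
                if dst != src:
                    edges.add((src, dst))
    return edges


def to_dot(edges):
    """Render the edges as a Graphviz digraph."""
    lines = ["digraph deps {"]
    for src, dst in sorted(edges):
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)
```

Feeding the result of to_dot(dependency_edges("src")) to Graphviz's
"dot" program would draw the diagram; edges that point the wrong way
(say, from a data directory into the language directory) are the ones
to look at in step 5.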

> 5.  Rework modules which cause illogical dependencies.  Split up any
>     files which seem to belong in more than one class. 
> 6.  Goto 1.

If you're volunteering... okay.  I personally hate
"organizational" type stuff--I'd much rather write code--which is
one reason it's been put off so long.

I'd propose that, when you get to someplace you think makes sense
to some extent (whether it compiles and links or not), you post a
.tar.gz of it somewhere and we can discuss it.  It's such a pain
dealing with major changes in a CVS tree that it'd be a shame to
have to make major changes more than once.

Here are the major components of PSPP in my opinion:

        * Dictionaries and associated data, including properties
          of variables, processing the active file (vfm),
          casefiles, sorting, missing values.

        * Data I/O, including the dfm, pfm, and sfm modules
          (which should really get better names) and the new
          any-* and scratch-* modules.

                I'm not sure whether format.* and data-{in,out}.*
                belong in the former or the latter category.  I
                suspect the former.

        * Parsing and executing the PSPP language.  (Actually I'd
          like to separate parsing from execution, but that's a
          big project, not something that can be accomplished by
          rearranging files.)

          At the top level, this is things like the line reader
          (getl), the lexer, command.*, vars-prs.c, etc.  There
          would be multiple subcategories:

                - Control structures.

                - Commands that modify the dictionary (as their
                  primary purpose).

                - Commands for data I/O.

                - Statistical procedures (commands that analyze
                  data and produce output based on it).

                - Transformations (commands that modify data).

                - Utilities (commands that don't modify the
                  dictionary or access data).

        * Output: the table formatter, PostScript driver, etc.

                - "charts" as a subdirectory of "output" just
                  because they're easily distinguished and
                  there are a lot of them.

                - I have big plans for output but again that's
                  another big project in itself.

        * Statistical calculation library: these are routines
          that are tied to PSPP but otherwise just mathematics.
          (Routines that are separable from PSPP would presumably
          go in "lib", not "src".)

        * User interface.  Presumably the GUI, when merged, would
          be in a separate directory too, but John can say for sure.

Some files will need to be split to fit this well, e.g. sort.c
currently implements both the SORT CASES BY command and the
infrastructure for sorting.  The former should go into the
"dictionaries and data" directory, the latter into

Other files aren't named well and we'd want to change them,
e.g. I've been a bit irritated with "sfm-read.c" and related
files for a while.  It should really be something like
"sysfile-reader.c", because that makes it a lot more obvious what
it actually does.

Let me propose an initial file split to start out, based on that,
and everyone can criticize it.  I haven't done any file renaming
in this sample, because then nobody would really be sure what
each file actually is:

.:
CVS/       data/   glob.h     lib/    main.h   settings.c  stats/
ChangeLog  glob.c  language/  main.c  output/  settings.h  ui/

./CVS:
Entries  Repository  Root

./data:
case.c           data-in.h          format.c          sort.c      vfm.h
case.h           data-out.c         format.def        sort.h      vfmP.h
casefile-test.c  dictionary.c       format.h          val.h
casefile.c       dictionary.h       io/               var.h
casefile.h       file-handle-def.c  missing-values.c  vars-atr.c
data-in.c        file-handle-def.h  missing-values.h  vfm.c

./data/io:
any-reader.c  dfm-read.h   pfm-write.c       scratch-reader.h  sfm-write.c
any-reader.h  dfm-write.c  pfm-write.h       scratch-writer.c  sfm-write.h
any-writer.c  dfm-write.h  scratch-handle.c  scratch-writer.h  sfmP.h
any-writer.h  pfm-read.c   scratch-handle.h  sfm-read.c
dfm-read.c    pfm-read.h   scratch-reader.c  sfm-read.h

./language:
command.c    dictionary/   io/        lexer.h      sort-prs.c  subclist.h
command.def  expressions/  lex-def.c  q2c.c        sort-prs.h  utilities/
command.h    getl.c        lex-def.h  range-prs.c  stats/      vars-prs.c
control/     getl.h        lexer.c    range-prs.h  subclist.c  xforms/

./language/control:
ctl-stack.c  ctl-stack.h  do-if.c  loop.c  repeat.c  repeat.h

./language/dictionary:
apply-dict.c  modify-vars.c  split-file.c    val-labs.c      var-labs.c
format-prs.c  numeric.c      sysfile-info.c  value-labels.c  vector.c
formats.c     rename-vars.c  temporary.c     value-labels.h  weight.c
mis-val.c     sample.c       title.c         var-display.c

./language/expressions:
CVS/        helpers.c  operations.def  parse.c    public.h
evaluate.c  helpers.h  optimize.c      private.h

./language/expressions/CVS:
Entries  Repository  Root

./language/io:
data-list.c  file-handle.h  file-type.c  inpt-pgm.c  matrix-data.c
data-list.h  file-handle.q  get.c        list.q      print.c

./language/stats:
aggregate.c     crosstabs.q  flip.c         oneway.q      regression_export.h
autorecode.c    descript.c   frequencies.q  rank.q        t-test.q
correlations.q  examine.q    means.q        regression.q

./language/utilities:
copyleft.c  copyleft.h  date.c  echo.c  include.c  permissions.c  set.q

./language/xforms:
compute.c  count.c  recode.c  sel-if.c

./lib:
algorithm.c  calendar.c     hash.c         magic.h   random.c
algorithm.h  calendar.h     hash.h         mkfile.c  random.h
alloc.c      debug-print.h  linked-list.c  mkfile.h  str.c
alloc.h      filename.c     linked-list.h  pool.c    str.h
bitvector.h  filename.h     magic.c        pool.h    version.h

./output:
ascii.c  font.h        html.c   output.c  postscript.c  som.h  tab.h
charts/  groff-font.c  htmlP.h  output.h  som.c         tab.c

./output/charts:
barchart.c     chart.c        histogram.c  plot-chart.c
box-whisker.c  chart.h        histogram.h  plot-hist.c
cartesian.c    dummy-chart.c  piechart.c

./stats:
cat-routines.h   design-matrix.h  group.h       misc.c     percentiles.c
cat.c            factor_stats.c   group_proc.h  misc.h     percentiles.h
cat.h            factor_stats.h   levene.c      moments.c
design-matrix.c  group.c          levene.h      moments.h

./ui:
cmdline.c  cmdline.h  error.c  error.h  readln.c  readln.h
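
A split like this can be written down as data and turned into a move
plan, which also catches a file that has been assigned to two places.
The following is only an illustrative sketch: it covers a small
excerpt of the listing, it assumes the files currently sit together
in one flat source directory, it reads the blocks of the listing as
the directories data, data/io, language, language/stats, output,
output/charts, stats and ui, and the names PROPOSED and move_plan are
invented for the example.

```python
# A small excerpt of the proposed split; the full lists are in the listing.
PROPOSED = {
    "data": ["case.c", "casefile.c", "dictionary.c", "sort.c", "vfm.c"],
    "data/io": ["any-reader.c", "dfm-read.c", "pfm-read.c", "sfm-read.c"],
    "language": ["command.c", "getl.c", "lexer.c", "vars-prs.c"],
    "language/stats": ["crosstabs.q", "t-test.q"],
    "output": ["ascii.c", "html.c", "postscript.c"],
    "output/charts": ["barchart.c", "piechart.c"],
    "stats": ["levene.c", "moments.c"],
    "ui": ["cmdline.c", "readln.c"],
}


def move_plan(proposed):
    """Return (old_name, new_path) pairs; reject a file listed twice."""
    assigned = {}
    plan = []
    for directory in sorted(proposed):
        for name in proposed[directory]:
            if name in assigned:
                raise ValueError(
                    f"{name} is in both {assigned[name]} and {directory}")
            assigned[name] = directory
            plan.append((name, f"{directory}/{name}"))
    return plan


if __name__ == "__main__":
    for old, new in move_plan(PROPOSED):
        print(f"{old} -> {new}")
```

Keeping the plan as data means the split can be criticized and
revised before anything is actually moved in the CVS tree.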

"I admire him, I frankly confess it; and when his time comes
 I shall buy a piece of the rope for a keepsake."
--Mark Twain