Re: Directory restructuring

From: John Darrington
Subject: Re: Directory restructuring
Date: Sun, 5 Feb 2006 08:07:13 +0800
User-agent: Mutt/1.5.4i

On Sat, Feb 04, 2006 at 01:47:13PM -0800, Ben Pfaff wrote:
     First a little commentary on oddities I noticed:

I agree with everything here except:
             math/sort.? -> move to data/sort.?
                     I'd argue that sorting is a fundamental operation
                     on data.

That will make src/data depend upon vfm.c, so until we've got vfm
sorted out, I think we should leave it.

     Now, to propose some renamings.  At top level:

Agreed.  But I suggest that we refrain from renaming anything for the
time being.  Otherwise we'll both get very confused while we're both
working on it.  We can rename things before we commit to CVS.

     John Darrington <address@hidden> writes:
     > 1. I left src/data as one directory instead of src/data and
     > src/data-io, for no particular reason except that it seemed an
     > unnecessary distinction.
     Okay.  I liked the idea of splitting out i/o because that allows
     the "core" data source files to be easily spotted.  Eventually we
     may add support for more data formats too--for example you've
     proposed supporting OpenOffice's spreadsheet format, I believe.
     But I'm not prepared to argue about it; not a big deal.

On the other hand, it is the biggest directory.  We can consider it
again later.

     > 2. The files which I've left in src are those which are
     > currently in the "too hard" basket.  This means:
     >  error.c : Should really be in libpspp, but it has too many 
     >            IMHO this interface needs to be rethought anyway.
     I understand that it's not the right interface for the GUI to
     use.  Can you explain other objections?  Do you have suggestions?

Almost everything depends on error.c yet error.c depends upon
src/output.  Not only is it not the "right" interface for the GUI,
some circumstances which the command line interface considers "errors"
are normal circumstances for the GUI.  For example, when a user is
halfway through typing a date, e.g. "04-July-2001", at the point "04-J"
the data-in function decides that it's an error.

My suggestion would go something like this:

* Define an error struct:  struct error { d_str message; int code; };
  (not sure if code is actually needed).

* Functions which can (possibly) produce error information should take
  an argument struct error *, which the caller passes.  The caller can
  pass NULL if she's not interested in errors.

* It is the responsibility of the function to populate the error
  struct, and the responsibility of the caller to destroy it, or pass
  it to the higher stack frame as appropriate.

That way, the act of reporting the error is left to the user interface
(command line or gui), whereas deciding what the errors are is left to
the individual function.
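
A minimal sketch of the scheme above, in C.  It is only an illustration
under stated assumptions: a plain char * stands in for d_str, and
error_set, error_destroy and parse_day are hypothetical names, not
existing PSPP functions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of the proposed struct.  A plain char * stands in for d_str
   (assumption); code 0 means "no error". */
struct error
  {
    char *message;
    int code;
  };

/* Hypothetical helper: populate ERR, unless the caller passed NULL
   because she is not interested in errors. */
static void
error_set (struct error *err, int code, const char *message)
{
  if (err == NULL)
    return;
  err->code = code;
  err->message = malloc (strlen (message) + 1);
  if (err->message != NULL)
    strcpy (err->message, message);
}

/* Hypothetical helper: the caller destroys the error when done. */
static void
error_destroy (struct error *err)
{
  free (err->message);
  err->message = NULL;
  err->code = 0;
}

/* Hypothetical parser: reads a leading day-of-month from S.
   Returns 1 on success, 0 on failure.  It never prints anything;
   it only decides what counts as an error. */
static int
parse_day (const char *s, int *day, struct error *err)
{
  char *end;
  long value = strtol (s, &end, 10);

  if (end == s || value < 1 || value > 31)
    {
      error_set (err, 1, "day of month must be between 1 and 31");
      return 0;
    }
  *day = (int) value;
  return 1;
}

/* A command-line style caller: it chooses to report the error.
   A GUI caller could pass NULL, or keep the struct and show nothing
   while the user is still typing. */
static int
cli_read_day (const char *s, int *day)
{
  struct error err = { NULL, 0 };

  if (!parse_day (s, day, &err))
    {
      fprintf (stderr, "error %d: %s\n", err.code,
               err.message != NULL ? err.message : "(no message)");
      error_destroy (&err);
      return 0;
    }
  return 1;
}
```

The function decides what is an error; each front end decides whether
and how to report it.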

     >  main.[ch]: I don't know why we need a main.h ?
     main.c exports some interfaces?

I'm of the opinion that main.c should be moved to src/ui.

     >  glob.[ch]: This seems to be the rubbish bin for things that don't fit 
     >             elsewhere.
     At one point I was convinced that all global data should be
     declared in a single source file.  Later I decided that was a
     stupid idea and I've been moving variables out of here since
     then.  It's probably time to finish the job.

     >  getl.c:    This is just a pain, which keeps biting me whenever I kick 
     Maybe you can help me to understand.

I'm confused as to the purpose of this file.  Lots of things depend
upon it (e.g. sfm-read) when they don't seem to have anything to do
with getting lines.

     >  vfm* :     You suggested this ought to go into data.  It certainly 
     >             can't go there in its current form.  It just depends on too many 
     >                    The code in there frightens me ....
     I'm not surprised that it's a bit scary.  The code in here
     touches practically everything else.  I'm not sure what to do
     about it.  Some of it could probably be pushed into the
     dictionary, other parts could be split out: perhaps the
     stream-related bits could be in their own files and perhaps
     transformation handling and SPLIT FILE related code is separable
     as well.
     It might be best to basically postpone dealing with vfm.

Do you mean postpone it indefinitely, or just for a few days?

     > 3.  The dependency diagram shows a couple of nasty circular
     >     dependencies involving src/math, src/output and src/output/chart.  I
     >     need to look at these.  It seems clear to me that the charting API
     >     needs work.
     A few greps don't turn up any direct dependency from src/output
     to src/math for me.  Can you elaborate on that?

src/output/chart/box-whisker depends on, for example,
src/math/weighted_value.

     > 6.  src/output depends on src/language which seems wrong to me.  In
     >     fact src/language/ contains only command.* and it's a very common
     >     dependency, so perhaps this needs to be split or something.
     I think src/output only wants command.h because it wants
     cur_proc, the name of the current command, to include it in the
     output.  We could reverse the dependency by making command.c call
     a function in src/output.  src/language definitely needs to call
     into src/output in any case, so how's that sound to you?

Sounds fine.
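
A rough sketch of the reversed dependency, in C.  All names here
(output_set_command_name, output_get_command_name, run_command) are
hypothetical and only illustrate the direction of the call; the real
code in command.c and src/output may look quite different.

```c
#include <stdio.h>
#include <string.h>

/* --- Would live in src/output (hypothetical interface). --- */

/* Name of the command currently producing output; "" if none. */
static char current_command_name[64] = "";

/* Called by the language layer; src/output no longer needs command.h. */
void
output_set_command_name (const char *name)
{
  if (name == NULL)
    name = "";
  strncpy (current_command_name, name, sizeof current_command_name - 1);
  current_command_name[sizeof current_command_name - 1] = '\0';
}

const char *
output_get_command_name (void)
{
  return current_command_name;
}

/* --- Would live in src/language/command.c (hypothetical). --- */

void
run_command (const char *name)
{
  /* The language layer pushes the name down into the output layer
     instead of the output layer reading cur_proc. */
  output_set_command_name (name);
  /* ... parse and execute the command here ... */
}
```

With this arrangement src/language depends on src/output, which it has
to do anyway, and the reverse include goes away.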

     > 10. src/output is depending on src/data which seems wrong.  I think
     >     this is because the ascii driver needs a filename or something.
     src/output directly uses the following from src/data according to
     a little bit of grepping:
                     These are general-purpose code, not data-specific,
                     so I'd move them to libpspp.

I tried this.  In the end I was forced to move *everything* from
src/data to libpspp.  Perhaps it will help if settings.c is split.

                     tab_value() takes a fmt_spec and formats the
                     table cell accordingly.  tab_float() does
                     something related.  
                     We could use a helper routine in src/language
                     instead, if you'd prefer.
                     This dependency is unneeded.  You can just delete
                     the #include for it from tab.c without problems.

     > I think the most urgent issues are to sort out the remaining files in
     > src (especially vfm*) and to sort out the output/chart business.  I'm
     > sure we'll have to go several more iterations, but it's a start.
     I'm willing to help, let me know what you'd like help with.

Perhaps you can look at glob.c and at point 6 (above).  I think I can
do something better with output vs. output/charts.

I suggest that you send a tarball back to me by ftp or http, and I'll
aggregate the results at this end.


PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See any PGP keyserver for public key.
