axiom-developer

RE: [Axiom-developer] RE: algebra Makefiles with explicit dependencies,


From: Bill Page
Subject: RE: [Axiom-developer] RE: algebra Makefiles with explicit dependencies, bootstrap, fixed-points etc.
Date: Mon, 10 Jan 2005 11:43:49 -0500

Tim,

On Monday, January 10, 2005 1:16 AM you wrote:
> 
> Bill Page wrote:
> > I agree that it is important to make sure that it is right.
> > In fact that is why I started to look at this. I was worried
> > that the current build does not (quite) do it right. The
> > fact that the generated lisp files change after an iteration
> > of the build proves that, I think. So on the contrary I do
> > think there is some urgency.
> 
> If you remember the primary reason why we decided that there
> should be monthly releases of Axiom was to undermine this sort
> of reasoning. The monthly release cycle is intended to ensure
> that we have timely releases of changes. The argument was that
> a 6 month release cycle would delay a bug fix for an average of
> 9 months. A monthly release cycle reduces that to an average of
> 1.5 months.
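
Both figures quoted above fit a simple rule of thumb. The rule is an assumption made here for illustration and is not stated in the thread: a fix waits on average half a cycle for the next release cut-off, then one further full cycle before that release is out, giving 1.5 times the cycle length. A minimal sketch under that assumption:

```python
def average_fix_delay(cycle_months: float) -> float:
    """Average delay under the assumed model (not stated in the thread):
    half a cycle waiting for the next cut-off plus one full cycle."""
    return cycle_months / 2 + cycle_months


print(average_fix_delay(6))  # 6-month cycle
print(average_fix_delay(1))  # monthly cycle
```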

Ok, the March release is early enough for me.

> 
> Yes, I agree that there is an interest in Axiom and, yes, I agree
> that this might be an important issue. Then again, the suggested
> change might have no effect on the algebra at all. I do not believe
> that we have sufficient time to 
> (a) understand the nature of the problem
> (b) decide if it is a bug
> (c) decide that we have the correct fix
> (d) document the problem and the fix
> (e) decide that we haven't broken anything
> (f) correctly merge the changes
> (g) do "round-trip" system builds
> 
> We have 20 days left before the February 1 release.

Ok. No problem.

> At the moment I understand that we have found a difference between
> versions of the BOOTSTRAP code and the later compiled forms.

No. That is not correct. We have known for a long time that the
BOOTSTRAP code is not the same as the later compiled forms. I
presume that that is why you included the final step in the
current build that re-builds the BOOTSTRAP *.o files from the
spad files rather than just leaving the original *.o files
compiled from the BOOTSTRAP *.lsp code.

The BIG ISSUE is that *other files* besides the bootstrap files
are affected. The differences in the bootstrap files *propagate*
to other files that depend on them. That is the reason that some
of the non-bootstrap *.lsp files change during the 1st iteration
of the fixedPoint calculation.
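
The propagation described above can be pictured as a walk over the build's dependency graph. The sketch below is only an illustration: the graph and file names are hypothetical placeholders rather than Axiom's real dependencies, and `affected_files` is a helper invented for this example.

```python
from collections import deque


def affected_files(depends_on, changed):
    """Return every file that transitively depends on a file in `changed`.

    `depends_on` maps a file name to the files it is compiled against.
    """
    # Invert the graph: for each file, which files are built against it?
    dependents = {}
    for target, sources in depends_on.items():
        for source in sources:
            dependents.setdefault(source, set()).add(target)

    # Breadth-first walk outward from the changed files.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen - set(changed)


# Hypothetical graph; these names are placeholders, not real Axiom files.
graph = {
    "B.lsp": ["BOOT1.lsp"],
    "C.lsp": ["B.lsp"],
    "D.lsp": ["BOOT2.lsp"],
}
print(sorted(affected_files(graph, ["BOOT1.lsp"])))
```

In this toy graph a difference in `BOOT1.lsp` reaches `B.lsp` directly and `C.lsp` indirectly, while `D.lsp` is untouched. Rebuilding only the bootstrap files at the end of the build would not repair `B.lsp` or `C.lsp`.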

> I do not yet understand (a) above, the nature of the problem. Why
> are they different?

The original bootstrap *.lsp files are different because you
copied them from a running Axiom system that is different than
the one that we are compiling now. There are some missing function
definitions and some different function definitions.

> Even if they are different the build system intentionally rebuilds
> all of the BOOTSTRAP files as the last step of the build in order
> to guard against some potential problems.

No, that is not sufficient because the differences in the original
bootstrap *.lsp files have already moved on to the other files.

> So I'm not sure that (b) it is a bug.

Surely you accept that it is a bug that compiling the spad files
twice does not give the same *.lsp code each time? This means
that somewhere in the Axiom system, some calculation will produce
different results if we build the current system once or twice
in succession.
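
One way to make the "compile twice and compare" check mechanical is sketched below. It assumes the generated *.lsp files live under one directory and that a `rebuild` callable (hypothetical here, standing in for whatever re-runs the algebra compile) regenerates them in place. It is not the fixedPoint procedure used in the thread.

```python
import hashlib
from pathlib import Path


def snapshot(directory, pattern="*.lsp"):
    """Map each matching file's relative path to a SHA-256 digest of its bytes."""
    root = Path(directory)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob(pattern))
    }


def changed_files(before, after):
    """Names that were added, removed, or whose contents differ."""
    return sorted(
        name for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )


def iterate_to_fixed_point(directory, rebuild, max_rounds=5):
    """Call rebuild() until a round leaves every *.lsp file unchanged.

    Returns one list of changed files per round; the last list is empty
    if a fixed point was reached within max_rounds.
    """
    history = []
    current = snapshot(directory)
    for _ in range(max_rounds):
        rebuild()  # hypothetical: re-compile the spad files in place
        new = snapshot(directory)
        diff = changed_files(current, new)
        history.append(diff)
        if not diff:
            break
        current = new
    return history
```

A build that is already consistent would return `[[]]`: one round, nothing changed. Any non-empty first entry lists the files whose generated code depends on which system compiled them.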

> After all, if we mass-replace 85 files of the build there will
> be one of two cases:
>  1) the input files are the same and no bugs are fixed.
>       so why change the algebra?

You meant the "output files are the same", I think?

The input files are not sufficient to detect the changes that
occur in the non-bootstrap files so the output from these files
does in fact remain the same.

>  2) the input files are different and either bugs are fixed 
> or introduced so why change the algebra? 

The output files do remain the same.

> 
> Clearly case 1 will require a massive change to the system with
> no apparent gain.

The gain is currently invisible because the input file test cases
are not adequate to detect all possible problems (of course they
never can be really adequate for all possible problems).

> If it doesn't fix anything then there is no hurry.

But it does fix something rather serious: in the current build the
*.lsp files are not consistent with the *.spad files.

> We can study the fixed-point problem at our leisure. I agree that
> a change is needed, just not how urgent it is.
> 

Ok. That's a judgement call that I am happy to let you make.

> Clearly case 2 will require further study to decide what broke
> and which version is correct. If the input files break in some way
> then we need to clearly understand how the massive change broke or
> fixed the system. If it fixes things then we need to think about
> the best way to ensure that the problem can't happen again. If it
> breaks things then we need to re-evaluate the change. And we need
> to document it.

Unfortunately Axiom is not being that kind to us. I think what
we should do is design some new input files that are sensitive
to the changes that occur in the *.lsp files. This should be
possible since we know exactly what changes during the
1st iteration. I will try to do at least one such input file
today.

> 
> This is a change to the very heart of the system. It deserves
> careful time and attention and much careful checking. And as yet
> I don't believe we fully understand anything but the symptoms of
> a possible, but not proven, bug.

I hope that my long-winded explanations are helping to convince
you. Tim, the main point is that differences in the bootstrap *.lsp
files propagate to non-bootstrap files during the initial build.
To me, this is very clearly a BUG.

> The changes are not uploaded anywhere, they are not "round-trip"
> tested, they are not hand checked, there is no documentation of
> the problem and its fixes, etc. Given all that I believe February 1
> is too soon.

Ok. However I do consider it very significant that Steve has obtained
the same results. Perhaps you will also have time to repeat the tests?

> 
> Let's upload the changes to axiom--algebra--1 where we can all work 
> on it.
> 
> Quality counts and it takes time.

Agreed.

> 
> > > I'd suggest that the "fixed-point" bootstrap be merged into 
> > > axiom--algebra--1 so we can test it first.
> > 

Ok, Steve has said that he is already merging the changes to
the bootstrap files into his axiom--language--1 branch.

> 
> > > By test it I mean that we have to compare, line by line,
> > > the output of the current input files to see if anything
> > > breaks. There are also a set of known broken input files
> > > (see src/input/Makefile.pamphlet) and we need to see if this
> > > fixes any known problems there.
> > 
> > That should be pretty easy since we already save the output as
> > .output files. I think we just need to cache the first version
> > of these files somewhere and then delete them and re-run 'make'
> > to create a new set. Then run a diff. I will do this later
> > tonight with my current axiom--windows--1 build and let you
> > know the result.
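
The cache-and-diff step described above could be scripted roughly as follows. This is a sketch only: it assumes the first set of .output files has been copied to one directory and the regenerated set sits in another, and `compare_outputs` is a name made up for this example.

```python
import difflib
from pathlib import Path


def read_lines(path):
    """Lines of a file, or an empty list if the file does not exist."""
    if not path.exists():
        return []
    return path.read_text(errors="replace").splitlines()


def compare_outputs(cached_dir, fresh_dir, pattern="*.output"):
    """Return {file name: unified diff lines} for every file that differs."""
    cached, fresh = Path(cached_dir), Path(fresh_dir)
    names = {p.name for p in cached.glob(pattern)}
    names |= {p.name for p in fresh.glob(pattern)}
    report = {}
    for name in sorted(names):
        diff = list(difflib.unified_diff(
            read_lines(cached / name),
            read_lines(fresh / name),
            fromfile=f"cached/{name}",
            tofile=f"fresh/{name}",
            lineterm="",
        ))
        if diff:
            report[name] = diff
    return report
```

An empty result only shows that the existing input files do not exercise the changed code. It does not show that the two builds are equivalent.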
> 
> Ah. The easy part. But you missed the hard part. There are 
> input files listed in src/input/Makefile.pamphlet that do not
> currently get built. They have various errors which I have not
> had time to track yet. If we are going to change the basic
> algebra we need to check each of these files to either make
> sure they are still broken in the same way or are fixed.
>

Ok, great. Is there an easy way to run these in the current
build?
 
> > I think it is *essential* that the build system CHECKS to see
> > if there is a difference between the bootstrap files and the
> > build files. 
> 
> This is reasonable. However the .lsp files contain gensym symbols
> which are very sensitive to any changes. So the diff procedure
> either needs to be written in lisp or needs a filter to ensure
> that gensyms don't cause spurious "failure" reports.

No. The current iteration procedure does not produce random changes
in the gensym symbols. I also worried about this initially, but
it seems that GCL generates the same symbol names in a predictable
manner. None of the changes that occur in the 1st iteration are
due to changing gensym symbols.
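
If a filter of the kind Tim describes were still wanted as a safeguard, one option is to rename gensyms consistently before comparing, so that only structural differences remain. The sketch below assumes gensyms are printed as `#:G` followed by digits. That spelling is an assumption, and the pattern would need adjusting to whatever actually appears in the *.lsp files.

```python
import re

# Assumed gensym spelling; adjust to match the real generated files.
GENSYM = re.compile(r"#:G\d+")


def normalize_gensyms(text, pattern=GENSYM):
    """Rename gensyms to #:G_1, #:G_2, ... in order of first appearance."""
    names = {}

    def rename(match):
        # Reuse the existing alias, or assign the next number to a new symbol.
        return names.setdefault(match.group(0), f"#:G_{len(names) + 1}")

    return pattern.sub(rename, text)


# Same structure with different gensym numbers: equal after normalizing.
first = "(LET ((#:G101 X)) (F #:G101 #:G102))"
second = "(LET ((#:G207 X)) (F #:G207 #:G300))"
print(normalize_gensyms(first) == normalize_gensyms(second))
```

Because each distinct symbol gets its own alias, two files that use gensyms in genuinely different ways still compare as different.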

> 
> > My conclusion is that the current build is already subtly
> > broken and we probably do not want to release a February build
> > *until* we fix this problem. 
> 
> Different is not the same as broken. Broken means the algebra
> gives wrong answers. 

Yes I think it is clear that in certain cases the algebra will
give wrong answers. I say that because of the nature of the
changes in the *.lsp files. But so far the *.input files are
not sufficient to detect these errors.

> 
> > > The February build will be the first "real axiom" version for
> > > most people and it is important that it run properly.
> > 
> > I agree. That is the reason why I think it is quite urgent that
> > we resolve the current problem first.
> 
> We disagree on this. There will always be bugs. There will always
> be an urgency to fix them. Axiom, as it exists, has many known bugs.
> The previous release system I used would update Axiom with every
> fix. The result of the mailing list discussion was that this was
> unacceptable and needed a schedule. So far we have had "one release
> in a row on schedule".  Unless the algebra is shown to be badly
> broken, i.e. giving many NEW wrong answers and that the new BOOTSTRAP
> change fixes them I feel we can release a version on February 1
> without it. It will certainly make the March 1 release.

I don't think the (known) algebra is badly broken; instead there are
some hidden traps somewhere that could result in broken algebra. But
you are right that there are always likely to be many such bugs. So
a March 1 release is soon enough for me.

> 
> So, basically, there's the bar.... show me "stop the press"
> failures that the BOOTSTRAP changes will fix. Or any new failures.
> 
> Otherwise there are people waiting on the newly merged graphics and
> hyperdoc. One person I'm working with offline is developing hyperdoc
> pages for a linear algebra course and needs the working hyperdoc.
> 

I understand that. Thanks for discussing it in detail.

Regards,
Bill Page.




