Re: GOP2: 2 - Stable releases and roadmap (radical change)
From: Keith OHara
Subject: Re: GOP2: 2 - Stable releases and roadmap (radical change)
Date: Wed, 27 Jun 2012 05:33:47 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)
Graham Percival <graham <at> percival-music.ca> writes:
>
> Let’s drop the “any unintended change” thing, and go totally with
> the regression tests. Tests pass? We can make a stable release.
I don't know. Maybe that would be alright. I'm not sure.
The 'Regression' label would become more important, because we will
want to keep track of and fix regressions before too many stable
releases go by. For this purpose, I guess a regression would be
failure of something that worked on purpose in /any/ stable release.
> - any regression test which fails to compile or shows incorrect
> output.
For any changed test then, it is probably worth reading the header, to
see if a subtle change that looks harmless happens to be the point of
the test (and would presumably cause other trouble).
The "incorrect output" should only count where the previous stable
release gave correct output. Lots of tests show behavior that some
people think is wrong (‘accidental-ledger.ly’ ‘ambitus.ly’) or that
looks bad because it is a stress test (‘break.ly’
‘prefatory-separation.ly’ ‘spacing-strict-spacing-grace.ly’).
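As a rough sketch of that rule, assuming we could record for each test
whether the previous stable release gave correct output: the class, the
function, the file name 'hypothetical-slur.ly' and the sample verdicts
below are all made up for illustration, and are not part of the LilyPond
build scripts or any real test results.

```python
from dataclasses import dataclass


@dataclass
class TestResult:
    """One regtest verdict (hypothetical structure, for illustration only)."""
    name: str
    correct_in_previous_stable: bool  # judged by a human reading the test header
    correct_now: bool


def blocks_release(result: TestResult) -> bool:
    """A test counts against a stable release only if it used to be correct
    in the previous stable release and is incorrect now."""
    return result.correct_in_previous_stable and not result.correct_now


# Invented sample verdicts, not real data.
results = [
    TestResult("ambitus.ly", correct_in_previous_stable=False, correct_now=False),
    TestResult("break.ly", correct_in_previous_stable=True, correct_now=True),
    TestResult("hypothetical-slur.ly", correct_in_previous_stable=True, correct_now=False),
]

blockers = [r.name for r in results if blocks_release(r)]
print(blockers)
```

A test whose output was already disputed, or that looks bad by design, never
shows up as a blocker here; only a change from correct to incorrect does.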
> *** Details: Regtests
>
> The current regtests don’t cover enough – that’s why we keep on
> finding new regression-Critical issues. I think it’s worth
> expanding the regtests and splitting them into multiple
> categories.
This cannot be done quickly. Adding a few pieces of music in
various styles might help, but I remember from my regression this
cycle that my patch worked fine on score and parts of a full symphony,
but then version 2.15.37 failed spectacularly on two pages of guitar
music.
> Tiny: these files would test individual features, such as
> printing accidentals or slurs, with a minimum of shared features.
As a goal, I suggest "Targeted" instead of "Tiny" -- testing performance
in one narrow area, but thoroughly. More often than not, after I fix
a bug, I find a regtest that should have caught it, and expand that
test rather than add a new one.