gnu-misc-discuss

Concerns about GNU Bison maintenance.


From: Kaz Kylheku (gnu-misc-discuss)
Subject: Concerns about GNU Bison maintenance.
Date: Wed, 05 Aug 2020 16:06:41 -0700
User-agent: Roundcube Webmail/0.9.2

Hello everyone,

Without a doubt, GNU Bison is a cornerstone piece of the GNU system,
relied upon by many programs.

Developers rely on Bison to be stable. What I mean by this is that a
project which has a mature Bison grammar file that changes very little
or not at all over a long period of time should not have to do anything
to the code because of changes in the Bison upstream.

For example, it should be possible to check out a ten-year-old version
of the code (say during a "git bisect" operation, in uncovering the
commit which caused a bug) and build it without problems with
whatever Bison is installed.

Some developers write the grammar file such that it works with multiple
implementations. That doesn't necessarily mean adhering to the POSIX
Yacc specification. For instance, Berkeley Yacc has some GNU features
like %pure-parser. This works fine with GNU Flex, just like the same
feature in GNU Bison.

However, over some years now there has been an unsettling trend in the
development of Bison which can be summarized as the current maintainer
treating it as a personal research project.

Features are being introduced that are nice, but that nobody requires
from GNU Bison. Tautologically, no existing code depends on a new
feature. (So where are these requirements coming from? Who is
gate-keeping them? What is the "product management" for Bison?) At the
same time, stability and compatibility are showing the hairline cracks
of fracture.

Most recently, Bison 3.7 was just announced. I first saw the posting in
the comp.compilers newsgroup, then in the Bison mailing list.  Not long
afterward, the GNU Awk maintainer reported that it doesn't even
build on Ubuntu 18.04, which is almost a poster child for "popular
GNU/Linux distro". A storm of mailing list posts has ensued.

Here is a problem I ran into fairly recently, after upgrading my
environment to a newer GNU/Linux distribution with a newer Bison.

Once upon a time, Bison introduced an extension to the language for
making a re-entrant parser; it was keyed to the directive
"%pure-parser". This went on to be adopted by other Yacc-like
implementations such as Berkeley Yacc.

The Bison maintainer believes that Bison "owns" this language feature
and is free to deprecate it. Note that deprecating doesn't mean removing
the *feature* of re-entrant parsing; just the *spelling* of the
"%pure-parser" directive. As of some 3.x version, Bison now warns
that it's deprecated, and that one should use a different spelling
for it.

In a mailing list response, I was told that my "problem" is that I'm
trying to write code that works with Byacc and Bison.  (Writing code
targeting multiple implementations is a problem?  Now what are the odds
that someone who thinks that way would end up breaking stuff?)

The maintainer doesn't seem to understand that if I have to switch to
some new spelling for an old feature to avoid the deprecation warning
(and to anticipate the outright removal of the old spelling), the code
then not only doesn't work on Byacc, but it also doesn't work in
older Bisons. The software no longer builds in operating system
installations that have not updated to the latest Bison.

Moreover, if Bison actually drops support for the spelling, then old
baselines of my code will not build. Thus, for instance, I will not be
easily able to do a "git bisect" to find where a bug was introduced. The
old versions won't build unless I patch every commit I visit, or use a
parallel installation of old Bison for the old baselines.

Bison makes careless changes to the skeletons and other generated
material. For instance, in Bison 3, a declaration of yyparse was
introduced to "y.tab.h".  I had to add a sed command into the makefile
build recipe to filter it out textually.

What is the problem with declaring yyparse in "y.tab.h"? The problem is
that if you're using a re-entrant parser, the signature of yyparse
contains custom types. For instance suppose we have this in the .y
grammar file:

  %pure-parser
  %parse-param{scanner_t *scnr}
  %parse-param{parser_t *parser}

The declaration of yyparse is this:

  int yyparse(scanner_t *scnr, parser_t *parser);

It's not just something innocuous like:

  int yyparse(void);

If the former is suddenly plonked into "y.tab.h" by the parser
generator, it means that whoever is including that header now has to
provide declarations of scanner_t and parser_t before the header.
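
To make that concrete, here is a minimal C sketch of what a file
including the header would then have to contain. Only the yyparse
signature comes from the example above; the struct tags, the stub body
of yyparse, and the helper call_parser_once are made up for
illustration.

```c
#include <stddef.h>

/* Forward declarations the includer must now supply. The struct tags
   "scanner" and "parser" are assumptions; only the typedef names come
   from the %parse-param lines above. */
typedef struct scanner scanner_t;
typedef struct parser parser_t;

/* Stand-in for the prototype emitted into "y.tab.h". Delete the two
   typedefs above and this line no longer compiles. */
int yyparse(scanner_t *scnr, parser_t *parser);

/* Stub body so the sketch links without a generated parser. */
int yyparse(scanner_t *scnr, parser_t *parser)
{
    (void) scnr;
    (void) parser;
    return 0;
}

/* Hypothetical caller, present only to exercise the prototype. */
int call_parser_once(void)
{
    return yyparse(NULL, NULL);
}
```

Incomplete struct types are enough here because the prototype only
uses pointers, but every includer still has to know about them.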

yyparse is not necessarily treated as a public function; programs can be
written such that all the calls to yyparse occur in the same file.
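
A rough sketch of that arrangement, assuming a hypothetical wrapper
named parse_input and placeholder struct definitions. The yyparse body
is a stub standing in for the generated parser, which would sit in the
same file as the wrapper.

```c
/* Placeholder definitions; real ones would live with the scanner
   and parser code. */
typedef struct scanner { int pos; } scanner_t;
typedef struct parser { int errors; } parser_t;

/* Stub standing in for the generated parser. */
int yyparse(scanner_t *scnr, parser_t *parser)
{
    (void) scnr;
    return parser->errors;
}

/* Hypothetical entry point: the only function other files call.
   Its signature mentions neither scanner_t nor parser_t. */
int parse_input(void)
{
    scanner_t scnr = { 0 };
    parser_t parser = { 0 };
    return yyparse(&scnr, &parser);
}
```

With this layout, other files need only a declaration of parse_input,
so nothing outside this file depends on the yyparse signature.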

POSIX doesn't say anything about yyparse being declared in "y.tab.h".
It says this:

  The header file shall contain #define statements that associate the
  token numbers with the token names. This allows source files other
  than the code file to access the token codes. If a %union declaration
  is used, the declaration for YYSTYPE and an extern YYSTYPE yylval
  declaration shall also be included in this file.

The bottom line is that you can't just add material into a header file
(whether it is static or generated).  Due to the large number of
programs which depend on it, you don't know what may break.

The Bison project seems to lack proper focus. It now has parser
generators for numerous languages, which distract from the main mission,
which is to be a great replacement for Yacc, with some essential
extensions.

Bison could perhaps benefit from a split; do all the experimental new
stuff and support for new languages and whatnot in a "Bison New
Generation" project; and just keep "Regular Bison" working.

All that said, I believe that the current maintainer is competent, and
the situation can be turned around with a bit of an attitude
readjustment.

I think that understanding issues in software maintenance relating to
backward compatibility is a separate skill apart from other software
skills, and the Bison maintainer is lacking in this area; however, those
things can be easily learned. (Often they can be deduced from first
principles, if you think about the implications of every code change
from that perspective.)

I hasten to add the observation that no matter how much a maintainer
cares about compatibility and stability and all that, one person will
make mistakes anyway.

Software shops nowadays deploy peer review systems, which require every
change to be viewed by several other developers and approved.  While
I'm a good, reasonably conscientious coder with decades of experience,
this has saved my proverbial butt. I can now palpably feel the
disadvantage of not having a peer review crew in my side projects.
The GNU project could benefit from a collaborative review system so that
changes to an important, high-impact program that countless projects
depend on, such as Bison, are not just down to a single fallible human
being.

There is a bit of collaboration in Bison in that some people, like
Paul Eggert, regularly keep up with the baseline and post to the
bug-bison mailing list. I feel that without their efforts, the situation
would be worse.

Lastly, I think it may be a good idea for at least every major release
of Bison to be regression tested by building several GNU/Linux
distributions from scratch with it.  A distro build is a great test
suite for a toolchain component. If that is available, why would you
only rely on the tool's own limited suite when releasing?



