
Re: api.header.include and backward-compatible .y files


From: Kaz Kylheku
Subject: Re: api.header.include and backward-compatible .y files
Date: Mon, 07 Sep 2020 02:30:01 -0700
User-agent: Roundcube Webmail/0.9.2

On 2020-09-06 00:46, Akim Demaille wrote:
Kaz,

On 5 Sep 2020, at 17:58, Kaz Kylheku <kaz@kylheku.com> wrote:

On 2020-08-30 05:21, Akim Demaille wrote:
Hi Adam,
On 22 Aug 2020, at 02:33, Adam Novak <anovak@soe.ucsc.edu> wrote:
Hello,
I'm maintaining a .y file at
https://github.com/vgteam/raptor/blob/master/src/turtle_parser.y that
needs to be backward-compatible with the Bison available in Ubuntu
18.04 (3.0.4), but also work on the latest Bison that our project's
Mac users get supplied from Homebrew (3.7.1).
Back in the day, people were *shipping* the generated files.  That
was awesome, since then maintainers are free from such constraints:
they use whichever version of their favorite generator they like, and
are free from requiring anything from the user; users don't even need
to have the generator (Bison in the present case).
It's a pity that today we have lost this wisdom.

Back in the day, people retained the generated files because
the C language had started to become portable, whereas to get
C from a Yacc grammar, they still had to upload their code
to a Unix box to run the proprietary Yacc program.

Even the person who wrote the program didn't necessarily have
consistent Unix access, not to mention any friends to whom
that person might give the code.

People would upload just their .y file to a Unix system, run
yacc and then download the y.tab.[ch] files.
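
(That workflow was essentially just:

  yacc -d grammar.y    # writes y.tab.c and y.tab.h

with grammar.y standing in for whatever the file was actually named.)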

The only valid reason for having any generated files in version
control or in a distribution is unavailability of the tool.

You are vastly simplifying things.  In particular, you completely
disregard the problems of evolution here.

I see simplifying things as my job, really.

Bison's user is whoever runs Bison.  Bison's user is not the one
who runs the program built with Bison; that is the user's user.
The user's user is not your user.

You cannot assume that your user is just a middleman in
a delivery chain who can deal with any nuisance that comes his way,
because it's his job.  That user may be a free software developer,
like you.

You are misunderstanding my point.  My point is that back in the
day people were shipping releases, and releases are self-contained;
they protect the end user from any non-standard dependency such
as Autoconf, Automake, Bison, Flex, Gperf, Libtool, Gettext, just
to name a few of them close to the GNU project.  Installing a release
was super easy, because you hardly had any dependencies.

Well, Autoconf without question! If the end user of a program is
required to have Autoconf, then the developer has misunderstood the
meaning of Autoconf, which is to generate a configure script that
assumes little about the environment.

I think that Autoconf and Automake are so thoroughly baked into
the "DNA" of Bison, that you may be losing touch with the idea that
a parser generator is not Autoconf.

Look, Bison's tree makes more use of M4 macros than anything I
remember ever seeing.  It's not just for configuration but also for
elements like parser skeletons and test cases.  It's kind of weird!

In the Unix world, Yacc is standard. You can rely on it for building
your program as surely as you can rely on make, awk, sed, or the
shell.

In the GNU/Linux world, Bison is standard.  You will never run into
a situation where you don't have Bison if you rely on Bison
extensions.

Note that even Autoconf doesn't save the user from needing a shell
or make.  Those are going to be whatever the user has.
The compiler is going to be whatever the user has.

Maintainers and contributors had a way more complex task: setting
up a *developer* environment with all the required versions of the
required tools.  And they had to keep their environment fresh.  On
occasion it meant using unreleased versions of these tools.  But
that was not a problem, because it was only on the shoulders of a
few experienced people.

Way too often today people no longer make self-contained releases,
and releases are hardly different from a git snapshot.  That is
wrong.

Nope; what is wrong is thinking there is a difference.

If you're distributing source code, then the user must have tools
to build it.

These should be exactly the same as what is required to work on
the program.

Any differences are confusing and annoying, and create barriers to
entry to the project.

"Oh, you think you've built Foobar and can create a patch for it?
Hahahaha, you tarball-sucking fool. Let me introduce you to the
git repository of Foobar, and the seven-headed development
environment bootstrap process."

This is wrong because now end users need to install tons of tools.

Good incentive to keep that tool count down, right?

If you think the user will hate installing a ton of tools, what
makes you think the contributing user won't hate it?

And most of them don't want to install recent versions of these tools
(and I don't blame them); they just want to use the ones provided by
their distro.

Are you the same person who reminded me not to use GCC-specific
warning options in a patch, because Bison builds with many C compilers?

:)

So today, some maintainers have locked themselves into not being
able to use tools that are not widespread enough.

Not to mention that they
might even have to deal with different behaviors from different
versions of the tool.  Then they find it convenient to blame the
evolution of the tool.

I'd love to see you maintain Bison stoically, without complaining,
if different versions of your compiler (or other tools) were
producing different results.

We have detailed requirements for those things, and international
standards, for good reasons.

But the problem is rather their use of the generator.  *They* are
in charge of generating, say, Bison parsers, and of shipping them in
their releases.  That's a mild effort, but with a huge ROI: you
no longer, ever, have to face the nightmare of having to support
very different versions of the tool, and you can also *immediately*
benefit from new features.

What do you do if you've maintained a program for over a decade,
and none of your old commits have a copy of the generated parser?

When you do a "git bisect", the old builds have to build!

The C parts of the old builds build because you wrote portable
code, and the C compiler people take it seriously.
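
For instance, a bisect session looks something like this (the
known-good tag and the test script here are hypothetical):

  git bisect start
  git bisect bad HEAD
  git bisect good v1.0
  git bisect run sh -c 'make && ./run-tests'

Every revision that bisect checks out is built with whatever Bison
(and compiler) is installed today, not whatever was current when
that revision was committed.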

I know of several projects, some very important ones, that are
stuck with old versions of Bison although they could benefit from

Are they really stuck with old versions of Bison, though?

That only happens if their code actually doesn't work with newer Bison.

*Using* only the features provided by Ubuntu 18's packaged Bison
is not the same as being *stuck*.

I'm using Ubuntu 18's Bison myself, but I'm not stuck. It looks
like my stuff works with 3.7.

newer features, features that have sometimes been written *for them*,
to simplify *their* problems.  But they still have the old hackish
code because the recent releases of Bison are not available "yet"
in Ubuntu 18.04...  Gee.

They will pick that up in due time; that's their decision.

Using the new features takes work. If the parser side of their
stuff works fine, maybe they have higher priority items to work on.
Making the same stuff work fine, like before, but with nicer,
terser Bison code, requires development effort.

They probably like being able to check out an old baseline
and also have that work with whatever Bison they have.

Can't Bison have improvements that are internal? As in, nothing
changes in the input file, but the output is better?

Suppose I want the user to benefit from the newest, shiniest Bison
they can get their hands on.

The following is the actual situation.

My code works with old Bison, as far back as 2.x.

The Bison I'm using is behind; it's the Ubuntu one.

But, I have downstream packagers who are on Bison 3.7.

Maybe that puts out better code. Maybe tables are compressed better,
or the skeleton has some new tricks to run faster or whatever.
Maybe some buffer overflow has been fixed somewhere. I have no idea.

Just because I'm not using new syntax doesn't mean I'm not
using new Bison; just as writing in C99 or C90 doesn't mean
I'm not getting better code generation or diagnostics.

Why would I ship frozen parser output? Why recommend that to me?

The downstream packagers have chosen Bison 3.7 for their distro,
and expect all programs to use that Bison.

If there is some security issue found in Bison-generated code,
they expect to be able to upgrade Bison and rebuild all
packages that name it as a dependency.

Shipping a frozen parser is downright antisocial.

Today, the consumers of the free software developer's code base
are downstream packagers. They have the whole suite of tools, by definition.

The users who run the program get binaries from the packagers;
they need no tools.

If they do need tools, their packagers have them all,
in package form, so they can be almost instantly as well-tooled
as the distro itself.

The imaginary user with the "medium amount of tools" went
extinct in the 1990's.

The modern user has all the tools. He or she just doesn't have the latest
version of all of them, necessarily.

The consumers of programming languages
are programmers. Yet, we broadly value stability of programming
languages. Multiple implementations that adhere to common standards
are also a boon.

True, but moot.  There's one Bison.

There is one C#. So Microsoft should just break all C# code
written before 2014.

The byte-code is environment-independent; users with old
code should just compile it with the old C# compiler and retain
the byte code.

The right approach is rather to see how your need is part of general
pattern, and how that need can be fulfilled in a clean way.

But don't, say, happily sed the generated output, and expect it to
work forever.

If Bison has a test case representing a usage, then that will continue
to work. If a decision is made that it will not continue to work, then
that decision will appear in the form of a commit which removes that
test case, which leaves a very clear record. The users relying on it
break, but if they look at the history of the tool, they will see that
it's not by accident, and just have to suck it up. The tool's project
decided to drop their use case, recorded in a commit, and that is
all there is to it.

sed-ing the output is a poor approach, which was made necessary due to
not having a test case in the parser generator to check the behavior.

Well, not exactly necessary.  A fix was necessary, and there are always
alternative solutions to sed-ing the output.

However, sed-ing the output is (sometimes) the simplest solution, and
it has the virtue of the highest probability of backporting easily
to old baselines whose builds require the fix.

If you go back with "git bisect", that sed-ing is very easy to apply.
Even if there is a conflict in that Makefile rule (rare), the change
can be applied by hand.  A refactoring of the code may not backport
as easily.
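
Concretely, the kind of Makefile rule I have in mind looks something
like this (the file names and the substitution are made up for the
sake of example; the real edit depends on the fix in question):

  y.tab.c: parser.y
          $(YACC) parser.y
          sed -e 's/deprecated_name/fixed_name/g' y.tab.c > y.tab.c.tmp
          mv y.tab.c.tmp y.tab.c

A one-line sed like that tends to apply cleanly even on years-old
baselines, which is exactly what makes it easy to carry through a
bisect.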



