Date: 2020-08-05
From: "Kaz Kylheku (gnu-misc-discuss)" <936-846-2769@kylheku.com>
Subject: [Hangout - NYLXS] Concerns about GNU Bison maintenance.

Hello everyone,
Without a doubt, GNU Bison is a cornerstone piece of the GNU system, relied upon by many programs.
Developers rely on Bison to be stable. What I mean by this is that a project which has a mature Bison grammar file that changes very little or not at all over a long period of time should not have to do anything to the code because of changes in the Bison upstream.
For example, it should be possible to check out a ten-year-old version of the code (say during a "git bisect" operation, in uncovering the commit which caused a bug) and build it without problems with whatever Bison is installed.
Some developers write the grammar file such that it works with multiple implementations. That doesn't necessarily mean adhering to the POSIX Yacc specification. For instance, Berkeley Yacc has some GNU features like %pure-parser. This works fine with GNU Flex, just like the same feature in GNU Bison.
However, over some years now there has been an unsettling trend in the development of Bison which can be summarized as the current maintainer treating it as a personal research project.
Features are being introduced that are nice, but that nobody requires from GNU Bison. Tautologically, no existing code depends on a new feature. (So where are these requirements coming from? Who is gate-keeping them? What is the "product management" for Bison?) At the same time, stability and compatibility are showing the hairline cracks of fracture.
Most recently, Bison 3.7 was just announced. I first saw the posting in the comp.compilers newsgroup, then in the Bison mailing list. Not long afterward, the GNU Awk maintainer reported that it doesn't even build on Ubuntu 18.04, which is almost a poster child for "popular GNU/Linux distro". A storm of mailing list posts has ensued.
Here is a problem I ran into fairly recently, after upgrading my environment to a newer GNU/Linux distribution with a newer Bison.
Once upon a time, Bison introduced an extension to the language for making a re-entrant parser; it was keyed to the directive "%pure-parser". This went on to be adopted by other Yacc-like implementations such as Berkeley Yacc.
The Bison maintainer believes that Bison "owns" this language feature and is free to deprecate it. Note that deprecating doesn't mean removing the *feature* of re-entrant parsing; just the *spelling* of the "%pure-parser" directive. As of some 3.x version, Bison now warns that it's deprecated, and that one should use a different spelling for it.
In a mailing list response, I was told that my "problem" is that I'm trying to write code that works with Byacc and Bison. (Writing code targeting multiple implementations is a problem? Now what are the odds that someone who thinks that way would end up breaking stuff?)
The maintainer doesn't seem to understand that if I have to switch to some new spelling for an old feature to avoid the deprecation warning (and to anticipate the outright removal of the old spelling), the code then not only doesn't work on Byacc, but also doesn't work in older Bisons. The software no longer builds in operating system installations that have not updated to the latest Bison.
Moreover, if Bison actually drops support for the spelling, then old baselines of my code will not build. Thus, for instance, I will not be easily able to do a "git bisect" to find where a bug was introduced. The old versions won't build unless I patch every commit I visit, or use a parallel installation of old Bison for the old baselines.
Bison makes careless changes to the skeletons and other generated material. For instance, in Bison 3, a declaration of yyparse was introduced to "y.tab.h". I had to add a sed command into the makefile build recipe to filter it out textually.
What is the problem with declaring yyparse in "y.tab.h"? The problem is that if you're using a re-entrant parser, the signature of yyparse contains custom types. For instance, suppose we have this in the .y grammar file:
%pure-parser
%parse-param{scanner_t *scnr}
%parse-param{parser_t *parser}
The declaration of yyparse is this:
int yyparse(scanner_t *scnr, parser_t *parser);
It's not just something innocuous like:
int yyparse(void);
If the former is suddenly plonked into "y.tab.h" by the parser generator, it means that whoever is including that header now has to provide declarations of scanner_t and parser_t before the header.
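To make the ordering dependency concrete, here is a minimal single-file sketch. It is an illustration under stated assumptions, not real Bison output: the struct layouts of scanner_t and parser_t, the stand-in body of yyparse, and the parse_string helper are all hypothetical. The prototype in the middle plays the role of the line that a generated "y.tab.h" would now contain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application types; in a real program these would be
   declared in the program's own headers, not by the parser generator. */
typedef struct scanner { const char *input; size_t pos; } scanner_t;
typedef struct parser  { int syntax_errors; } parser_t;

/* Simulated excerpt of a generated "y.tab.h". This prototype compiles only
   because scanner_t and parser_t are already declared above it. Move the
   two typedefs below this line and the compiler rejects the file. */
int yyparse(scanner_t *scnr, parser_t *parser);

/* Stand-in for the generated parser body (an assumption for illustration):
   it walks to the end of the input and returns the recorded error count. */
int yyparse(scanner_t *scnr, parser_t *parser)
{
    assert(scnr != NULL && parser != NULL);
    while (scnr->input[scnr->pos] != '\0')
        scnr->pos++;
    return parser->syntax_errors;
}

/* Hypothetical driver: all calls to yyparse stay in this one file, so no
   other translation unit needs to see its prototype at all. */
int parse_string(const char *text)
{
    scanner_t scnr = { text, 0 };
    parser_t parser = { 0 };
    return yyparse(&scnr, &parser);
}
```

Every other source file that includes the header only for the token codes would need the same two typedefs in scope first, even if it never calls yyparse.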
yyparse is not necessarily treated as a public function; programs can be written such that all the calls to yyparse occur in the same file.
POSIX doesn't say anything about yyparse being declared in "y.tab.h". It says this:
The header file shall contain #define statements that associate the token numbers with the token names. This allows source files other than the code file to access the token codes. If a %union declaration is used, the declaration for YYSTYPE and an extern YYSTYPE yylval declaration shall also be included in this file.
The bottom line is that you can't just add material into a header file (whether it is static or generated). Due to the large number of programs which depend on it, you don't know what may break.
The Bison project seems to lack proper focus. It now has parser generators for numerous languages, which distract from the main mission, which is to be a great replacement for Yacc, with some essential extensions.
Bison could perhaps benefit from a split; do all the experimental new stuff and support for new languages and whatnot in a "Bison New Generation" project; and just keep "Regular Bison" working.
All that said, I believe that the current maintainer is competent, and the situation can be turned around with a bit of an attitude readjustment.
I think that understanding issues in software maintenance relating to backward compatibility is a separate skill apart from other software skills, and the Bison maintainer is lacking in this area; however, those things can be easily learned. (Often they can be deduced from first principles, if you think about the implications of every code change from that perspective.)
I hasten to add the observation that no matter how much a maintainer cares about compatibility and stability and all that, one person will make mistakes anyway.
Software shops nowadays deploy peer review systems, which require every change to be viewed by several other developers and approved. While I'm a good, reasonably conscientious coder with decades of experience, this has saved my proverbial butt. I can now palpably feel the disadvantage of not having a peer review crew in my side projects. The GNU project could benefit from a collaborative review system so that changes to an important, high-impact program that countless projects depend on, such as Bison, are not just down to a single fallible human being.
There is a bit of collaboration in Bison in that some people, like Paul Eggert, regularly keep up with the baseline and post to the bug-bison mailing list. I feel that without their efforts, the situation would be worse.
Lastly, I think it may be a good idea for at least every major release of Bison to be regression tested by building several GNU/Linux distributions from scratch with it. A distro build is a great test suite for a toolchain component. If that is available, why would you only rely on the tool's own limited suite when releasing?