bug-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 0/4] Fix crash when generating IELR


From: Dwight Guth
Subject: Re: [PATCH 0/4] Fix crash when generating IELR
Date: Thu, 9 Jul 2020 14:13:35 -0500

Hi,

I have sent the personal and employer releases, so you should be able to
confirm that internally and proceed to use the test case that I sent you,
hopefully. If anyone lets me know that I messed something up, I will try to
follow up as needed, but I think it's now taken care of.

In response to your other questions:

* We use string aliases solely in order to inform error messages that are
autogenerated by bison. These aliases are autogenerated from the original
grammar, which supports regular expressions in terminals, which is what you
are seeing in this case. the nonterminal that references that terminal is
not actually used, however; it's dead code I never got around to removing
from the autogenerated grammars. Instances of that token will never
actually be created because things that would have been lexed as that token
will instead be removed entirely by the lexer, you just don't see that here
because the lexer isn't part of the code needed to reproduce the bug.
* The original grammar can be found here:
https://github.com/kframework/k/blob/master/k-distribution/include/kframework/builtin/domains.k#L247
* The grammar transformation we do essentially takes the grammar as well as
a list of pairs of productions in the grammar, and modifies the grammar
such that for each pair, the first production cannot appear immediately
under the second production. This is done by duplicating nonterminals and
then iteratively removing certain rules from the duplicated nonterminals
and replacing certain nonterminals in rules with their duplicated variants.
This is similar to how scannerless GLR parsers like SDF deal with
precedence, as I understand, if that's what you're asking about.

On Sat, Jun 27, 2020, 1:41 AM Akim Demaille <akim.demaille@gmail.com> wrote:

> Hi Dwight,
>
> > Le 26 juin 2020 à 18:13, Dwight Guth <
> dwight.guth@runtimeverification.com> a écrit :
> >
> > Hi,
> >
> > I can confirm that the version of bison downloaded from the link you
> provided does not manifest the error in my real example, and valgrind no
> longer reports any use-after-free errors on the larger example, so it looks
> like it's been fixed.
>
> Good!  I just installed the fixes in master.  Thanks!
>
> > Since the grammar it was generated from originally, as well as the code
> that generated it, is completely open source, I'd be happy to sign some
> kind of waiver with respect to the intellectual property on the bison
> grammar I provided you
>
> Thanks, I'll send you a separate email for this.
>
> > About the original grammar: It is an automatically generated bison file,
> but it is generated from a manually written grammar written in a language
> called K (https://github.com/kframework/k).
>
> Interesting.  Thanks for the context.
>
>
> > I figured it shouldn't crash, either...
>
> Indeed :)
>
> > Basically the way it is constructed is by taking the original grammar,
> duplicating certain nonterminals in order to introduce an explicit
> precedence relationship between different productions (ie, that certain
> productions cannot appear directly under certain nonterminals in certain
> parent productions), and then automaticlaly encoding it into bison.
>
> I'm not sure I follow here.  Is this something resembling the way SGLR
> handles the precedence between rules? (not the old way, based on solving
> shift/reduce conflicts, which involves tokens, rather than rules only).


What I meant here is that K provides a way to indicate both "this
production should not appear directly under this other production", which
is a generalization of the total ordering of operator precedence that bison
provides to a partial ordering. We implement this by transforming the
grammar `Exp ::= Exp '+' Exp | Exp '*' Exp`


>
> > This seems to work well in many cases on real-world programming
> languages, but obviously it becomes possible to write arbitrary grammars
> that you would have to use GLR in order to parse with any real accuracy.
>
> Right.
>
> > I am attaching the original bison grammar that the manually modified
> grammar I sent you was derived from,
>
> I see that you are using string aliases with loads of escapes, e.g.,
>
> %token TOK_35 36 "(\\/\\*([^\\*]|(\\*+([^\\*\\/])))*\\*+\\/)"
>
> This looks very much like a regex for C comments.  Do you use this
> string?  I really means the strings that Bison embeds in the generated
> parser (in yytname), or is this just informative, but the actual use of
> this regex done elsewhere?
>
> I'm asking this because the handling of string aliases is changing
> currently, and I would need feedback on these changes.
>
> Also, according to what I saw, you should consider using api.token.raw.
>
>
> > but I think you will find that the minimized grammar probably works
> better as a test case.
>
> Yes, definitely :)
>
> > I can also point you to the original grammar written in K if you are
> curious...
>
> Yes please!
>
> Cheers!


reply via email to

[Prev in Thread] Current Thread [Next in Thread]