[PATCH] doc: cex documentation

bison-patches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] doc: cex documentation

From:	Vincent Imbimbo
Subject:	[PATCH] doc: cex documentation
Date:	Sun, 28 Jun 2020 15:30:17 -0400

Hey Akim,
I think I found a good place for the cex documentation in bison.texi, but it 
might still be a bit awkward. Tell me if you have any ideas for changes.

*NEWS, doc/bison.texi: Add documentation for conflict counterexample generation
---
 NEWS           | 24 +++++++++++++++++
 doc/bison.texi | 71 +++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 91 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 60587f7f..6c5564a7 100644
--- a/NEWS
+++ b/NEWS
@@ -12,6 +12,30 @@ GNU Bison NEWS
   similar to the `-ffile-prefix-map` in GCC.  This option can be used to
   make bison output reproducible.
 
+*** Counterexample Generation
+
+  When given --report=counterexample or -Wcounterexample, Bison will now output
+  counterexamples for conflicts in the grammar. These are strings in the 
grammar
+  which can be parsed in two ways due to the conflict. For example:
+
+    Example              exp '+' exp • '/' exp
+    First derivation     exp ::=[ exp ::=[ exp '+' exp • ] '/' exp ]
+    Example              exp '+' exp • '/' exp
+    Second derivation    exp ::=[ exp '+' exp ::=[ exp • '/' exp ] ]
+
+  This is a shift/reduce conflict caused by none of the operators having
+  precedence, so the example can be parsed in the two ways shown.
+  When bison cannot find an example that can be derived in two ways,
+  it instead generates two examples that are the same up until the dot:
+
+    First example        expr • ID $end
+    First derivation     $accept ::=[ s ::=[ a ::=[ expr • ] ID ] $end ]
+    Second example       expr • ID ',' ID $end
+    Second derivation    $accept ::=[ s ::=[ a ::=[ expr ::=[ expr • ID ',' ] 
] ID ] $end ]
+
+  In these cases, the parser usually doesn't have enough lookahead to
+  differentiate the two given examples.
+
 ** Changes
 
 *** Relocatable installation
diff --git a/doc/bison.texi b/doc/bison.texi
index 678b6efe..9dd60851 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -9793,12 +9793,67 @@ calc.y:19.1-7: @dwarning{warning}: nonterminal useless 
in grammar: useless [@dwa
    19 | @dwarning{useless: STR;}
       | @dwarning{^~~~~~~}
 calc.y: @dwarning{warning}: 7 shift/reduce conflicts 
[@dwarning{-Wconflicts-sr}]
+calc.y: @dwarning{warning}: rerun with option '-Wcounterexamples' to generate 
conflict counterexamples [@dwarning{-Wother}]
 @end example
 
-When given @option{--report=state}, in addition to @file{calc.tab.c}, it
-creates a file @file{calc.output} with contents detailed below.  The
-order of the output and the exact presentation might vary, but the
-interpretation is the same.
+When given @option{-Wcounterexample}, bison will run a search for strings in 
your grammar
+that better demonstrate you conflicts. Counterexample generation was initially 
developed
+by Chinawat Isradisaikul and Andrew Myers (@pxref{Bibliography}).  For 
@file{calc.y},
+the first printed example is:
+
+@example
+Shift/reduce conflict on token '/':
+  Example              exp '+' exp • '/' exp
+  First derivation     exp ::=[ exp ::=[ exp '+' exp • ] '/' exp ]
+  Example              exp '+' exp • '/' exp
+  Second derivation    exp ::=[ exp '+' exp ::=[ exp • '/' exp ] ]
+@end example
+
+This shows two separate derivations in the grammar for the same exp: @samp{e1 
+ e2 / e3}.
+The derivations show how your rules would parse the given example. Here,
+the first derivation completes a reduction when seeing @samp{/}, causing 
@samp{e1 + e2}
+to be grouped as an @code{exp}. The second derivation shifts on @samp{/}, 
resulting in
+@samp{e2 / e3} being grouped as an @code{exp}. Therefore, it is easy to see 
that adding
+@code{%precedence} directives would fix this conflict.
+
+Sometimes, the search will not find an example that can be derived in two ways.
+In these cases, counterexample generation will provide two examples that are
+the same up until the dot.  Most notably, this will happen when your grammar
+requires a stronger parser (more lookahead, lr instead of lalr).
+The following example isn't LR(1):
+
+@example
+%token ID 
+%%
+s: a ID
+a: expr
+expr: %empty | expr ID ','
+@end example
+
+@command{bison} reports:
+
+@example
+Shift/reduce conflict on token ID:
+  First example        expr • ID $end
+  First derivation     $accept ::=[ s ::=[ a ::=[ expr • ] ID ] $end ]
+  Second example       expr • ID ',' ID $end
+  Second derivation    $accept ::=[ s ::=[ a ::=[ expr ::=[ expr • ID ',' ] ] 
ID ] $end ]
+@end example
+
+This conflict is caused by the parser not having enough information
+to know the difference between these two examples. The parser would need
+an additional lookahead token to know whether or not a comma follows
+the @samp{ID} after @code{expr}.  These types of conflicts tend to be more 
difficult to
+fix, and usually need a rework of the grammar.  In this case, it can be
+fixed by changing around the recursion:
+@code{expr: ID | ',' expr ID}
+
+Counterexamples can also be written to a file with @option{--report=cex}
+
+Going back to the calc example, when given @option{--report=state},
+in addition to @file{calc.tab.c}, it creates a file @file{calc.output}
+with contents detailed below.  The order of the output and the exact
+presentation might vary, but the interpretation is the same.
 
 @noindent
 @cindex token, useless
@@ -15266,6 +15321,14 @@ Elizabeth Scott, Adrian Johnstone, and Shamsa Sadaf 
Hussain,
 @cite{Tomita-Style Generalised LR Parsers}, Royal Holloway, University of
 London, Department of Computer Science, TR-00-12 (December 2000).
 
@uref{http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps}
+
+@item [Isradisaikul, Myers 2015]
+Chinawat Isradisaikul, Andrew Myers,
+Finding Counterexamples from Parsing Conflicts,
+in @cite{Proceedings of the 36th ACM SIGPLAN Conference on
+Programming Language Design and Implementation} (PLDI ’15),
+ACM, pp.@: 555--564.
+@uref{https://www.cs.cornell.edu/andru/papers/cupex/cupex.pdf}
 @end table
 
 @node Index of Terms
-- 
2.24.3 (Apple Git-128)

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH] doc: cex documentation, Vincent Imbimbo <=
- Re: [PATCH] doc: cex documentation, Akim Demaille, 2020/06/30

Prev by Date: [PATCH 7/7] yacc.c: push: don't clear the parser state when accepting/rejecting
Next by Date: tests: don't use $VERBOSE
Previous by thread: [PATCH 0/7] RFC: yypstate_clear (ATTENTION push parser users)
Next by thread: Re: [PATCH] doc: cex documentation
Index(es):
- Date
- Thread