help-bison

Re: Enum for token '0', EOF


From: Joel E. Denny
Subject: Re: Enum for token '0', EOF
Date: Wed, 12 Jul 2006 01:53:35 -0400 (EDT)

On Sun, 9 Jul 2006, Joel E. Denny wrote:

> On Sun, 9 Jul 2006, Akim Demaille wrote:
> 
> > I'd like to see its
> > destructor called, just as in glr.
> 
> I'll implement whatever we decide for glr.c.  Maybe someone else can 
> handle yacc.c?

I went ahead.  As far as I can tell, all the skeletons now (after my 
uncommitted patch below) consistently pop EOF one time upon a successful 
parse.

I don't understand why the parser didn't clear the lookahead when it 
shifted EOF (actually, glr.c did clear it but only in non-deterministic 
operation).  That meant that, upon a parse failure, the parser didn't 
destroy an unshifted EOF lookahead for fear that it might have already 
shifted it.  I removed this behavior, and everything seems fine.  Is that 
ok?
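
A minimal sketch of why clearing the lookahead unconditionally is enough, written as a standalone C simulation rather than skeleton code. The names TOK_EMPTY, TOK_EOF, and eof_destructions are hypothetical stand-ins (for YYEMPTY, YYEOF, and the cleanup at yyreturn); the real skeletons are more involved.

```c
#include <assert.h>

/* Illustrative stand-ins for the skeletons' YYEMPTY and YYEOF. */
enum { TOK_EMPTY = -2, TOK_EOF = 0 };

/* Simulate the end of a parse and return how many times the end token's
   destructor would run.  `shifted` says whether EOF was shifted before
   the parser returned (success) or was still the lookahead (failure). */
int eof_destructions(int shifted)
{
  int lookahead = TOK_EOF;  /* the scanner has just returned EOF */
  int on_stack = 0;
  int destroyed = 0;

  if (shifted)
    {
      on_stack = 1;
      lookahead = TOK_EMPTY;  /* cleared unconditionally, even for EOF */
    }

  /* A token is never both on the stack and in the lookahead slot. */
  assert(!(on_stack && lookahead != TOK_EMPTY));

  if (lookahead != TOK_EMPTY)  /* cleanup: discard unshifted lookahead */
    destroyed += 1;
  if (on_stack)                /* cleanup: pop the stack */
    destroyed += 1;
  return destroyed;
}
```

In this model eof_destructions(1) and eof_destructions(0) both return 1: because the slot is emptied on every shift, the cleanup code no longer has to guess whether an EOF lookahead was already shifted.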

> glr.c does something like this, but it uses EOF as both BOF and EOF.  I 
> haven't explored this in detail in a while, but I recall that it's the 
> reason for the double pop of EOF.

I should've done my homework before replying.  I see now that the first 
EOF is just the start state, and glr.c was popping the start state whereas 
yacc.c and lalr1.cc weren't.  I've changed it so glr.c doesn't pop the 
start state.
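
The idea can be sketched as follows, assuming a much simplified state record. struct state and pop_all are hypothetical; they only mirror the yypred test that the patch adds around yydestroyGLRState.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for glr.c's yyGLRState. */
struct state
{
  struct state *pred;  /* NULL only for the initial (start) state */
  int symbol;          /* number of the symbol that led to this state */
};

/* Pop every state from the top down and count destructor calls.  The
   start state is skipped: it carries symbol number 0 but holds no real
   end token, so destroying it would invoke the wrong %destructor. */
int pop_all(struct state *top)
{
  int destroyed = 0;
  while (top != NULL)
    {
      if (top->pred != NULL)
        destroyed += 1;            /* stands in for yydestroyGLRState */
      else
        assert(top->symbol == 0);  /* start state shares EOF's number */
      top = top->pred;
    }
  return destroyed;
}
```

With a stack of three states (the start state plus two shifted symbols), pop_all counts two destructions, so a shifted end token is destroyed once and the start state never is.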

Or maybe we should say that, if the user declares %destructor for EOF (or 
BOF?), he must initialize $$ in %initial-action?  Well, this doesn't 
appear useful for my purposes.  I've removed the trouble at least.  I'll 
let someone else determine whether there's a use for BOF, what its token 
number should be, whether the user can manipulate it, etc.

> However, maybe there's a better fix.  I'm not sure exactly what the 
> Bison-generated parser does now, but I think it does not perform the 
> reduction for rule 0.  Maybe it should?  Reducing rule 0 would mean a 
> successful parse.  Bison could automatically add an action to rule 0 that 
> invokes any %destructor for the start symbol and any %destructor for the 
> end token.  Might be a clean way to block the "unused value" warnings.

This still seems cleaner to me.  I'm imagining those optional warnings 
we've discussed for unused typed values.  If the user gives EOF a type and 
requests those warnings, shouldn't Bison tell him that the value of EOF 
won't be used in the rule 0 action?  That is, unless he specifies a 
%destructor for EOF, which would cause Bison to generate a destructor 
invocation into the rule 0 action.  In general, having the same checks for 
rule 0 as for all other rules just seems right.

Anyway, it's going to take me too much time right now to figure out the 
table construction in order to change the final state to *after* reduction 
0.  Instead, I've just turned off the grammar checks for rule 0.
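
Roughly, the workaround amounts to the following, shown here on a hypothetical, flattened rule list. struct rule and count_warnings are illustrative only; the real packgram walks a symbol list and calls grammar_rule_check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flattened rule record; not the real symbol_list. */
struct rule
{
  int unused_values;  /* rhs values a check would flag as unused */
  struct rule *next;
};

/* Sum the warnings for every rule except the first.  The first rule is
   the generated rule 0, which has no action, so its rhs symbols would
   always look unused even though their %destructor's still run. */
int count_warnings(const struct rule *grammar)
{
  int warnings = 0;
  assert(grammar != NULL);
  for (const struct rule *p = grammar; p != NULL; p = p->next)
    if (p != grammar)  /* skip rule 0 */
      warnings += p->unused_values;
  return warnings;
}
```

The trade-off is that a typed end token without a %destructor goes unreported as well, which is the check a real rule 0 action would bring back.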

> If the user doesn't declare the end token, then it's called $end, and 
> Bison wouldn't associate any %destructor/%printer (not even the default) 
> with it.

My upcoming default %destructor/%printer patch will do that.

Anyway, here's the EOF patch, which I haven't committed yet.  Let me know 
if it seems ok.

Joel

Index: ChangeLog
===================================================================
RCS file: /sources/bison/bison/ChangeLog,v
retrieving revision 1.1532
diff -p -u -r1.1532 ChangeLog
--- ChangeLog   10 Jul 2006 19:36:31 -0000      1.1532
+++ ChangeLog   12 Jul 2006 05:41:45 -0000
@@ -1,3 +1,32 @@
+2006-07-12  Joel E. Denny  <address@hidden>
+
+       Clean up handling of %destructor for the end token (token 0).
+       Discussed starting at
+       <http://lists.gnu.org/archive/html/bison-patches/2006-07/msg00019.html>
+       and
+       <http://lists.gnu.org/archive/html/help-bison/2006-07/msg00013.html>.
+
+       Make the skeletons consistent in how they pop the end token and invoke
+       its %destructor.
+       * data/glr.c (yyrecoverSyntaxError, yyparse): Don't pop the start
+       state, which has token number 0, since this would invoke the
+       %destructor for the end token.
+       * data/lalr1.cc (yy::parser::parse): Don't check for the final state
+       until after shifting the end token, or else it won't be popped.
+       * data/yacc.c (yyparse): Likewise.
+
+       * data/glr.c (yyparse): Clear the lookahead after shifting it even when
+       it's the end token.  Upon termination, destroy an unshifted lookahead
+       even when it's the end token.
+       * data/lalr1.cc (yy::parser::parse): Likewise.
+       * data/yacc.c (yyparse): Likewise.
+
+        * src/reader.c (packgram): Don't check rule 0.  This suppresses unused
+       value warnings for the end token when the user gives the end token a
+       %destructor.
+
+       * tests/actions.at (Printers and Destructors): Test all the above.
+
 2006-07-10  Akim Demaille  <address@hidden>
 
        * src/complain.c (error_message, ERROR_MESSAGE): New.
Index: data/glr.c
===================================================================
RCS file: /sources/bison/bison/data/glr.c,v
retrieving revision 1.185
diff -p -u -r1.185 glr.c
--- data/glr.c  9 Jul 2006 20:36:33 -0000       1.185
+++ data/glr.c  12 Jul 2006 05:41:46 -0000
@@ -2261,7 +2261,8 @@ yyrecoverSyntaxError (yyGLRStack* yystac
            }
        }
 ]b4_locations_if([[      yystackp->yyerror_range[1].yystate.yyloc = yys->yyloc;]])[
-      yydestroyGLRState ("Error: popping", yys]b4_user_args[);
+      if (yys->yypred != NULL)
+       yydestroyGLRState ("Error: popping", yys]b4_user_args[);
       yystackp->yytops.yystates[0] = yys->yypred;
       yystackp->yynextFree -= 1;
       yystackp->yyspaceLeft += 1;
@@ -2373,8 +2374,7 @@ m4_popdef([b4_at_dollar])])dnl
              if (yyisShiftAction (yyaction))
                {
                  YY_SYMBOL_PRINT ("Shifting", yytoken, &yylval, &yylloc);
-                 if (yychar != YYEOF)
-                   yychar = YYEMPTY;
+                 yychar = YYEMPTY;
                  yyposn += 1;
                  yyglrShift (&yystack, 0, yyaction, yyposn, &yylval, &yylloc);
                  if (0 < yystack.yyerrState)
@@ -2490,7 +2490,7 @@ m4_popdef([b4_at_dollar])])dnl
   goto yyreturn;
 
  yyreturn:
-  if (yychar != YYEOF && yychar != YYEMPTY)
+  if (yychar != YYEMPTY)
     yydestruct ("Cleanup: discarding lookahead",
                YYTRANSLATE (yychar),
                &yylval]b4_locations_if([, &yylloc])[]b4_user_args[);
@@ -2512,7 +2512,8 @@ m4_popdef([b4_at_dollar])])dnl
                  {
                    yyGLRState *yys = yystates[yyk];
 ]b4_locations_if([[                yystack.yyerror_range[1].yystate.yyloc = yys->yyloc;]]
-)[                 yydestroyGLRState ("Cleanup: popping", yys]b4_user_args[);
+)[                 if (yys->yypred != NULL)
+                     yydestroyGLRState ("Cleanup: popping", yys]b4_user_args[);
                    yystates[yyk] = yys->yypred;
                    yystack.yynextFree -= 1;
                    yystack.yyspaceLeft += 1;
Index: data/lalr1.cc
===================================================================
RCS file: /sources/bison/bison/data/lalr1.cc,v
retrieving revision 1.137
diff -p -u -r1.137 lalr1.cc
--- data/lalr1.cc       9 Jul 2006 20:36:33 -0000       1.137
+++ data/lalr1.cc       12 Jul 2006 05:41:46 -0000
@@ -569,6 +569,11 @@ m4_popdef([b4_at_dollar])])dnl
   yynewstate:
     yystate_stack_.push (yystate);
     YYCDEBUG << "Entering state " << yystate << std::endl;
+
+    /* Accept?  */
+    if (yystate == yyfinal_)
+      goto yyacceptlab;
+
     goto yybackup;
 
     /* Backup.  */
@@ -618,16 +623,11 @@ m4_ifdef([b4_lex_param], [, ]b4_lex_para
        goto yyreduce;
       }
 
-    /* Accept?  */
-    if (yyn == yyfinal_)
-      goto yyacceptlab;
-
     /* Shift the lookahead token.  */
     YY_SYMBOL_PRINT ("Shifting", yytoken, &yylval, &yylloc);
 
-    /* Discard the token being shifted unless it is eof.  */
-    if (yychar != yyeof_)
-      yychar = yyempty_;
+    /* Discard the token being shifted.  */
+    yychar = yyempty_;
 
     yysemantic_stack_.push (yylval);
     yylocation_stack_.push (yylloc);
@@ -782,9 +782,6 @@ b4_error_verbose_if([, yytoken])[));
        YY_STACK_PRINT ();
       }
 
-    if (yyn == yyfinal_)
-      goto yyacceptlab;
-
     yyerror_range[1] = yylloc;
     // Using YYLLOC is tempting, but would change the location of
     // the lookahead.  YYLOC is available though.
@@ -810,7 +807,7 @@ b4_error_verbose_if([, yytoken])[));
     goto yyreturn;
 
   yyreturn:
-    if (yychar != yyeof_ && yychar != yyempty_)
+    if (yychar != yyempty_)
       yydestruct_ ("Cleanup: discarding lookahead", yytoken, &yylval, &yylloc);
 
     /* Do not reclaim the symbols of the rule which action triggered
Index: data/yacc.c
===================================================================
RCS file: /sources/bison/bison/data/yacc.c,v
retrieving revision 1.150
diff -p -u -r1.150 yacc.c
--- data/yacc.c 9 Jul 2006 20:36:33 -0000       1.150
+++ data/yacc.c 12 Jul 2006 05:41:47 -0000
@@ -1164,6 +1164,9 @@ m4_ifdef([b4_at_dollar_used], [[  yylsp[
 
   YYDPRINTF ((stderr, "Entering state %d\n", yystate));
 
+  if (yystate == YYFINAL)
+    YYACCEPT;
+
   goto yybackup;
 
 /*-----------.
@@ -1213,9 +1216,6 @@ yybackup:
       goto yyreduce;
     }
 
-  if (yyn == YYFINAL)
-    YYACCEPT;
-
   /* Count tokens shifted since error; after three, turn off error
      status.  */
   if (yyerrstatus)
@@ -1224,9 +1224,8 @@ yybackup:
   /* Shift the lookahead token.  */
   YY_SYMBOL_PRINT ("Shifting", yytoken, &yylval, &yylloc);
 
-  /* Discard the shifted token unless it is eof.  */
-  if (yychar != YYEOF)
-    yychar = YYEMPTY;
+  /* Discard the shifted token.  */
+  yychar = YYEMPTY;
 
   yystate = yyn;
   *++yyvsp = yylval;
@@ -1418,9 +1417,6 @@ yyerrlab1:
       YY_STACK_PRINT (yyss, yyssp);
     }
 
-  if (yyn == YYFINAL)
-    YYACCEPT;
-
   *++yyvsp = yylval;
 ]b4_locations_if([[
   yyerror_range[1] = yylloc;
@@ -1461,7 +1457,7 @@ yyexhaustedlab:
 #endif
 
 yyreturn:
-  if (yychar != YYEOF && yychar != YYEMPTY)
+  if (yychar != YYEMPTY)
      yydestruct ("Cleanup: discarding lookahead",
                 yytoken, &yylval]b4_locations_if([, &yylloc])[]b4_user_args[);
   /* Do not reclaim the symbols of the rule which action triggered
Index: src/reader.c
===================================================================
RCS file: /sources/bison/bison/src/reader.c,v
retrieving revision 1.265
diff -p -u -r1.265 reader.c
--- src/reader.c        9 Jul 2006 20:36:33 -0000       1.265
+++ src/reader.c        12 Jul 2006 05:41:47 -0000
@@ -488,7 +488,11 @@ packgram (void)
         rule.  Thus, the midrule actions have already been scanned in order to
         set `used' flags for this rule's rhs, so grammar_rule_check will work
         properly.  */
-      grammar_rule_check (p);
+      /* Don't check the generated rule 0.  It has no action, so some rhs
+        symbols may appear unused, but the parsing algorithm ensures that
+        %destructor's are invoked appropriately.  */
+      if (p != grammar)
+       grammar_rule_check (p);
 
       for (p = p->next; p && p->sym; p = p->next)
        {
Index: tests/actions.at
===================================================================
RCS file: /sources/bison/bison/tests/actions.at,v
retrieving revision 1.60
diff -p -u -r1.60 actions.at
--- tests/actions.at    23 Jun 2006 20:17:28 -0000      1.60
+++ tests/actions.at    12 Jul 2006 05:41:47 -0000
@@ -197,8 +197,9 @@ AT_LALR1_CC_IF([typedef yy::location YYL
 ]AT_LALR1_CC_IF([], [static void yyerror (const char *msg);])
 [}
 
-]m4_ifval([$6], [%type <ival> '(' 'x' 'y' ')' ';' thing line input])[
+]m4_ifval([$6], [%type <ival> '(' 'x' 'y' ')' ';' thing line input END])[
 
+/* This %printer isn't actually tested.  */
 %printer
   {
     ]AT_LALR1_CC_IF([debug_stream () << $$;],
@@ -226,6 +227,11 @@ AT_LALR1_CC_IF([typedef yy::location YYL
   { printf ("Freeing token 'y' (address@hidden)\n", $$, RANGE (@$)); }
   'y'
 
+%token END 0
+%destructor
+  { printf ("Freeing token END (address@hidden)\n", $$, RANGE (@$)); }
+  END
+
 %%
 /*
    This grammar is made to exercise error recovery.
@@ -241,7 +247,7 @@ input:
       printf ("input (address@hidden): /* Nothing */\n", $$, RANGE (@$));
     }
 | line input /* Right recursive to load the stack so that popping at
-               EOF can be exercised.  */
+               END can be exercised.  */
     {
       $$ = 2;
       printf ("input (address@hidden): line (address@hidden) input (address@hidden)\n",
@@ -308,7 +314,7 @@ yylex (]AT_LEX_FORMALS[)
   if (source[c])
     printf ("sending: '%c'", source[c]);
   else
-    printf ("sending: EOF");
+    printf ("sending: END");
   printf (" (address@hidden)\n", c, RANGE (]AT_LOC[));
   return source[c];
 }
@@ -372,9 +378,10 @@ sending: 'x' (address@hidden)
 thing (address@hidden): 'x' (address@hidden)
 sending: ')' (address@hidden)
 line (address@hidden): '(' (address@hidden) thing (address@hidden) ')' (address@hidden)
-sending: EOF (address@hidden)
+sending: END (address@hidden)
 input (address@hidden): /* Nothing */
 input (address@hidden): line (address@hidden) input (address@hidden)
+Freeing token END (address@hidden)
 Freeing nterm input (address@hidden)
 Successful parse.
 ]])
@@ -391,9 +398,10 @@ sending: 'y' (address@hidden)
 Freeing token 'y' (address@hidden)
 sending: ')' (address@hidden)
 line (address@hidden): '(' (address@hidden) error (@10-19) ')' (address@hidden)
-sending: EOF (address@hidden)
+sending: END (address@hidden)
 input (address@hidden): /* Nothing */
 input (address@hidden): line (address@hidden) input (address@hidden)
+Freeing token END (address@hidden)
 Freeing nterm input (address@hidden)
 Successful parse.
 ]])
@@ -445,12 +453,45 @@ input (address@hidden): /* Nothing */
 input (address@hidden): line (address@hidden) input (address@hidden)
 input (address@hidden): line (address@hidden) input (address@hidden)
 input (address@hidden): line (address@hidden) input (address@hidden)
-130-139: syntax error, unexpected 'y', expecting $end
+130-139: syntax error, unexpected 'y', expecting END
 Freeing nterm input (address@hidden)
 Freeing token 'y' (address@hidden)
 Parsing FAILED.
 ]])
 
+
+# Syntax error caught by the parser where lookahead = END
+# --------------------------------------------------------
+# Load the stack and provoke an error that cannot be caught by the
+# grammar, to check that the stack is cleared.  And make sure the
+# lookahead is freed.
+#
+#     '(', 'x', ')',
+#     '(', 'x', ')',
+#     'x'
+AT_PARSER_CHECK([./input '(x)(x)x'], 1,
+[[sending: '(' (address@hidden)
+sending: 'x' (address@hidden)
+thing (address@hidden): 'x' (address@hidden)
+sending: ')' (address@hidden)
+line (address@hidden): '(' (address@hidden) thing (address@hidden) ')' (address@hidden)
+sending: '(' (address@hidden)
+sending: 'x' (address@hidden)
+thing (address@hidden): 'x' (address@hidden)
+sending: ')' (address@hidden)
+line (address@hidden): '(' (address@hidden) thing (address@hidden) ')' (address@hidden)
+sending: 'x' (address@hidden)
+thing (address@hidden): 'x' (address@hidden)
+sending: END (address@hidden)
+70-79: syntax error, unexpected END, expecting 'x'
+Freeing nterm thing (address@hidden)
+Freeing nterm line (address@hidden)
+Freeing nterm line (address@hidden)
+Freeing token END (address@hidden)
+Parsing FAILED.
+]])
+
+
 # Check destruction upon stack overflow
 # -------------------------------------
 # Upon stack overflow, all symbols on the stack should be destroyed.