
[PATCH 04/10] doc: promote YYEOF


From: Akim Demaille
Subject: [PATCH 04/10] doc: promote YYEOF
Date: Mon, 13 Apr 2020 17:43:35 +0200

* NEWS (Deep overhaul of the symbol and token kinds): New.
* doc/bison.texi: Promote YYEOF over "0" in scanners.
(Token Decl): No longer show YYEOF here, it now works by default.
(Token I18n): More details about YYEOF here.
(Calc++): Just use YYEOF.
---
 NEWS           | 35 ++++++++++++++++++++++++++++--
 doc/bison.texi | 59 ++++++++++++++++++++------------------------------
 2 files changed, 56 insertions(+), 38 deletions(-)
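
For readers who want to see the scanner-side effect outside of the manual, here is a minimal sketch of a hand-written yylex that returns YYEOF instead of a bare 0. It is only an illustration under stated assumptions: the enum stands in for what the generated parser header would provide, NUM and its value 258 are hypothetical, and scan_set_input/scan_cursor are helpers invented here so the scanner can read from a string rather than stdin.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stand-in for the enum Bison would emit in the generated header.
   YYEOF is token number 0 per the documentation; NUM and its value
   are assumptions made for this sketch.  */
typedef enum yytoken_kind_t
{
  YYEOF = 0,   /* "end of file" */
  NUM = 258    /* hypothetical user token */
} yytoken_kind_t;

/* Simplified semantic value and input cursor (both hypothetical).  */
static int yylval;
static const char *scan_cursor = "";

/* Point the scanner at a NUL-terminated string.  */
void
scan_set_input (const char *text)
{
  assert (text != NULL);
  scan_cursor = text;
}

int
yylex (void)
{
  /* Skip blanks.  */
  while (*scan_cursor == ' ' || *scan_cursor == '\t')
    ++scan_cursor;

  /* Return end-of-input by name rather than as a literal 0.  */
  if (*scan_cursor == '\0')
    return YYEOF;

  /* Scan an unsigned decimal integer.  */
  if (isdigit ((unsigned char) *scan_cursor))
    {
      yylval = 0;
      while (isdigit ((unsigned char) *scan_cursor))
        yylval = yylval * 10 + (*scan_cursor++ - '0');
      return NUM;
    }

  /* Return a single char.  */
  return (unsigned char) *scan_cursor++;
}
```

With a real grammar, the enum would come from the generated header and the grammar would not need to declare any token numbered 0 for this to work.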

diff --git a/NEWS b/NEWS
index 3fc4eaae..ceaca65b 100644
--- a/NEWS
+++ b/NEWS
@@ -74,7 +74,6 @@ GNU Bison NEWS
     %token
         PLUS   "+"
         MINUS  "-"
-        EOF 0  _("end of file")
       <double>
         NUM _("double precision number")
       <symrec*>
@@ -83,7 +82,7 @@ GNU Bison NEWS
 
   In that case the user must define _() and N_(), and yysymbol_name returns
   the translated symbol (i.e., it returns '_("variable")' rather that
-  '"variable"').
+  '"variable"').  In Java, the user must provide an i18n() function.
 
 *** List of expected tokens (yacc.c)
 
@@ -95,6 +94,38 @@ GNU Bison NEWS
   It makes little sense to use this feature without enabling LAC (lookahead
   correction).
 
+*** Deep overhaul of the symbol and token kinds
+
+  To avoid the confusion with typing in programming languages, we now refer
+  to token and symbol "kinds" instead of token and symbol "types".
+
+**** Token kind
+
+  The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
+  LPAREN, etc.  Users are invited to replace their uses of "enum
+  yytokentype" by "yytoken_kind_t".
+
+  This type now also includes tokens that were previously hidden: YYEOF (end
+  of input), YYUNDEF (undefined token), and YYERRCODE (error token).  They
+  now have string aliases, internationalized if internationalization is
+  enabled.  Therefore, by default, error messages now refer to "end of file"
+  (internationalized) rather than the cryptic "$end".
+
+  In most cases, it is now useless to define the end-of-file token as
+  follows:
+
+    %token EOF 0  _("end of file")
+
+  Rather, simply use "YYEOF" in your scanner.
+
+**** Symbol kinds
+
+  The "symbol kind" is what the parser actually uses.  (Unless the
+  api.token.raw %define variable was used, the internal symbol kind of a
+  terminal differs from the corresponding token kind.)
+
+  They are now exposed as an enum, "yysymbol_kind_t".
+
 *** Modernize display of explanatory statements in diagnostics
 
   Since Bison 2.7, output was indented four spaces for explanatory
diff --git a/doc/bison.texi b/doc/bison.texi
index 8d448e4b..2d6cc327 100644
--- a/doc/bison.texi
+++ b/doc/bison.texi
@@ -1903,7 +1903,7 @@ yylex (void)
 @group
   /* Return end-of-input. */
   else if (c == EOF)
-    return 0;
+    return YYEOF;
   /* Return a single char. */
   else
     return c;
@@ -2352,7 +2352,7 @@ yylex (void)
 
   /* Return end-of-input. */
   if (c == EOF)
-    return 0;
+    return YYEOF;
 
 @group
   /* Return a single char, and update location. */
@@ -2722,7 +2722,7 @@ yylex (void)
     c = getchar ();
 
   if (c == EOF)
-    return 0;
+    return YYEOF;
 @end group
 
 @group
@@ -4926,14 +4926,6 @@ would produce in French @samp{erreur de syntaxe, || inattendu, attendait
 nombre ou (} rather than @samp{erreur de syntaxe, || inattendu, attendait
 number ou (}.
 
-The token numbered as 0 corresponds to the end of file; the following line
-allows for nicer error messages referring to ``end of file''
-(internationalized) instead of ``$end'':
-
-@example
-%token END 0 _("end of file")
-@end example
-
 @node Precedence Decl
 @subsection Operator Precedence
 @cindex precedence declarations
@@ -7812,7 +7804,6 @@ or @code{detailed}, token aliases can be internationalized:
 @example
 %token
     '\n'   _("end of line")
-    EOF 0  _("end of file")
   <double>
     NUM    _("double precision number")
   <symrec*>
@@ -7828,17 +7819,26 @@ If at least one token alias is internationalized, then the generated parser
 will use both @code{N_} and @code{_}, that must be defined
 (@pxref{Programmers, , The Programmer’s View, gettext, GNU @code{gettext}
 utilities}).  They are used only on string aliases marked for translation.
-In other words, even if your catalog features a translation for ``end of
-line'', then with
+In other words, even if your catalog features a translation for
+``function'', then with
 
 @example
 %token
-    '\n'     "end of line"
-    EOF 0  _("end of file")
+  <symrec*>
+    FUN      "function"
+    VAR    _("variable")
 @end example
 
 @noindent
-``end of line'' will appear untranslated in debug traces and error messages.
+``function'' will appear untranslated in debug traces and error messages.
+
+Unless defined by the user, the end-of-file token, @code{YYEOF}, is provided
+``end of file'' as an alias.  It is also internationalized if the user
+internationalized tokens.  To map it to another string, use:
+
+@example
+%token END 0 _("end of input")
+@end example
 
 
 @node Algorithm
@@ -11401,17 +11401,7 @@ Symbols}).  This directive:
 
 @noindent
 requests that Bison generates the functions @code{make_TEXT} and
-@code{make_NUMBER}.  As a matter of fact, it is convenient to have also a
-symbol to mark the end of input, say @code{END_OF_FILE}:
-
-@comment file: c++/simple.yy: 1
-@example
-%token END_OF_FILE 0
-@end example
-
-@noindent
-The @code{0} tells Bison this token is special: when it is reached, parsing
-finishes.
+@code{make_NUMBER}, but also @code{make_YYEOF}, for the end of input.
 
 Everything is in place for our scanner:
 
@@ -11441,7 +11431,7 @@ Everything is in place for our scanner:
 @end group
 @group
         default:
-          return parser::make_END_OF_FILE ();
+          return parser::make_YYEOF ();
 @end group
         @}
     @}
@@ -12439,17 +12429,14 @@ file; it needs detailed knowledge about the driver.
 
 
 @noindent
-The token code 0 corresponds to end of file; the following line
-allows for nicer error messages referring to ``end of file'' instead of
-``$end''.  Similarly user friendly names are provided for each symbol.  To
-avoid name clashes in the generated files (@pxref{Calc++ Scanner}), prefix
-tokens with @code{TOK_} (@pxref{%define Summary}).
+User friendly names are provided for each symbol.  To avoid name clashes in
+the generated files (@pxref{Calc++ Scanner}), prefix tokens with @code{TOK_}
+(@pxref{%define Summary}).
 
 @comment file: calc++/parser.yy
 @example
 %define api.token.prefix @{TOK_@}
 %token
-  END  0  "end of file"
   ASSIGN  ":="
   MINUS   "-"
   PLUS    "+"
@@ -12695,7 +12682,7 @@ The rules are simple.  The driver is used to report errors.
                (loc, "invalid character: " + std::string(yytext));
 @}
 @end group
-<<EOF>>    return yy::parser::make_END (loc);
+<<EOF>>    return yy::parser::make_YYEOF (loc);
 %%
 @end example
 
-- 
2.26.0



