bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH 6/6] examples: bistromathic: demonstrate use of yyexpected_tokens


From: Akim Demaille
Subject: [PATCH 6/6] examples: bistromathic: demonstrate use of yyexpected_tokens
Date: Sun, 1 Mar 2020 12:31:04 +0100

Let's use GNU readline and its TAB autocompletion to demonstrate the
use of yyexpected_tokens.

This shows a number of weaknesses in our current approach:

- some macros (yyssp, etc.) from push parsers "leak" in user code, we
  need to undefine them

- the context needed by yyexpected_tokens does not need the token,
  yypstate actually suffices

- yypstate is not properly setup when first allocated, which results
  in a crash of yyexpected_tokens if fired before a first token was
  read.  We should move initialization from yypush_parse into
  yypstate_new.

* examples/c/bistromathic/parse.y (yylex): Take input as a string, not
a file.
(EXIT): New token.
(input): Adjust to work only on a line.
(line): Remove.
(symbol_count, process_line, expected_tokens, completion)
(init_readline): New.
* examples/c/bistromathic/bistromathic.test: Adjust expectations.
---
 NEWS                                      |   8 +-
 examples/c/README.md                      |   8 +-
 examples/c/bistromathic/README.md         |  26 +--
 examples/c/bistromathic/bistromathic.test |  41 +++--
 examples/c/bistromathic/local.mk          |   2 +-
 examples/c/bistromathic/parse.y           | 213 +++++++++++++++++++---
 6 files changed, 238 insertions(+), 60 deletions(-)

diff --git a/NEWS b/NEWS
index b94ccaf0..e9157594 100644
--- a/NEWS
+++ b/NEWS
@@ -104,9 +104,11 @@ GNU Bison NEWS
   The lexcalc example (a simple example in C based on Flex and Bison) now
   also demonstrates location tracking.
 
-  A new C example, bistromathic, is a fully featured calculator using many
-  Bison features: pure interface, location tracking, internationalized
-  custom error messages, lookahead-correction, rich debug traces, etc.
+  A new C example, bistromathic, is a fully featured interactive calculator
+  using many Bison features: pure interface, push parser, autocompletion
+  based on the current parser state (using yyexpected_tokens), location
+  tracking, internationalized custom error messages, lookahead-correction,
+  rich debug traces, etc.
 
 * Noteworthy changes in release 3.5.2 (2020-02-13) [stable]
 
diff --git a/examples/c/README.md b/examples/c/README.md
index e69004c4..e1e04d84 100644
--- a/examples/c/README.md
+++ b/examples/c/README.md
@@ -50,9 +50,13 @@ This example is a straightforward conversion of the 'calc' 
example to the
 push-parser model.
 
 ## bistromathic - all the bells and whistles
-This example demonstrates the best practices when using Bison.
-- Its interface is pure.
+This example demonstrates best practices when using Bison.
 - Its hand-written scanner tracks locations.
+- Its interface is pure.
+- Its interface is "incremental", well suited for interaction: it uses the
+  push-parser API to feed the parser with the incoming tokens.
+- It features an interactive command line with completion based on the
+  parser state, based on `yyexpected_tokens`.
 - It uses a custom syntax error with location, lookahead correction and
   token internationalization.
 - It supports debug traces with semantic values.
diff --git a/examples/c/bistromathic/README.md 
b/examples/c/bistromathic/README.md
index 8bef5136..6b376b71 100644
--- a/examples/c/bistromathic/README.md
+++ b/examples/c/bistromathic/README.md
@@ -1,7 +1,11 @@
 # bistromathic - all the bells and whistles
-This example demonstrates the best practices when using Bison.
-- Its interface is pure.
+This example demonstrates best practices when using Bison.
 - Its hand-written scanner tracks locations.
+- Its interface is pure.
+- Its interface is "incremental", well suited for interaction: it uses the
+  push-parser API to feed the parser with the incoming tokens.
+- It features an interactive command line with completion based on the
+  parser state, based on `yyexpected_tokens`.
 - It uses a custom syntax error with location, lookahead correction and
   token internationalization.
 - It supports debug traces with semantic values.
@@ -17,16 +21,14 @@ Copyright (C) 2020 Free Software Foundation, Inc.
 
 This file is part of Bison, the GNU Compiler Compiler.
 
-This program is free software: you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation, either version 3 of the License, or
-(at your option) any later version.
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with no
+Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
+Texts.  A copy of the license is included in the "GNU Free
+Documentation License" file as part of this distribution.
 
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-GNU General Public License for more details.
+LocalWords:  bistromathic yyexpected lookahead ispell american
+LocalWords:  MERCHANTABILITY
 
-You should have received a copy of the GNU General Public License
-along with this program.  If not, see <http://www.gnu.org/licenses/>.
 --->
diff --git a/examples/c/bistromathic/bistromathic.test 
b/examples/c/bistromathic/bistromathic.test
index 5ecff945..f42fb29d 100755
--- a/examples/c/bistromathic/bistromathic.test
+++ b/examples/c/bistromathic/bistromathic.test
@@ -18,47 +18,64 @@
 cat >input <<EOF
 1+2*3
 EOF
-run 0 7
+run 0 '> 1+2*3
+7
+> '
 
 cat >input <<EOF
 (1+2) * 3
 EOF
-run 0 9
-run -noerr 0 9 -p
+run 0  '> (1+2) * 3
+9
+> '
+run -noerr 0 '> (1+2) * 3
+9
+> ' -p
 
 cat >input <<EOF
 a = 256
 sqrt (a)
 EOF
-run 0 '256
-16'
+run 0 '> a = 256
+256
+> sqrt (a)
+16
+> '
 
 cat >input <<EOF
 a = .16
 b = 10 ^ 2
 sqrt (a * b)
 EOF
-run 0 '0.16
+run 0 '> a = .16
+0.16
+> b = 10 ^ 2
 100
-4'
+> sqrt (a * b)
+4
+> '
 
 cat >input <<EOF
 *
 EOF
-run 0 'err: 1.1: syntax error: expected end of file or - or ( or end of line 
or double precision number or function or variable before *'
+run 0 '> *
+> err: 1.1: syntax error: expected end of file or - or ( or exit or double 
precision number or function or variable before *'
 
 cat >input <<EOF
 1 + 2 * * 3
 EOF
-run 0 'err: 1.9: syntax error: expected - or ( or double precision number or 
function or variable before *'
+run 0 '> 1 + 2 * * 3
+> err: 1.9: syntax error: expected - or ( or double precision number or 
function or variable before *'
 
 cat >input <<EOF
 100%
 EOF
-run 0 '100
-err: 1.4: error: invalid character'
+run 0 '> 100%
+100
+> err: 1.4: error: invalid character'
 
 cat >input <<EOF
 1 / 0
 EOF
-run 0 'err: 1.1-5: error: division by zero'
+run 0 '> 1 / 0
+> err: 1.1-5: error: division by zero'
diff --git a/examples/c/bistromathic/local.mk b/examples/c/bistromathic/local.mk
index 0a805f9d..920265f6 100644
--- a/examples/c/bistromathic/local.mk
+++ b/examples/c/bistromathic/local.mk
@@ -27,7 +27,7 @@ nodist_%C%_bistromathic_SOURCES = %D%/parse.y %D%/parse.h
 
 # Don't use gnulib's system headers.
 %C%_bistromathic_CPPFLAGS = -I$(top_srcdir)/%D% -I$(top_builddir)/%D%
-%C%_bistromathic_LDADD = -lm
+%C%_bistromathic_LDADD = -lm -lreadline
 
 dist_bistromathic_DATA = %D%/parse.y %D%/Makefile %D%/README.md
 CLEANFILES += %D%/parse.[ch] %D%/parse.output
diff --git a/examples/c/bistromathic/parse.y b/examples/c/bistromathic/parse.y
index 39244f95..4591b64f 100644
--- a/examples/c/bistromathic/parse.y
+++ b/examples/c/bistromathic/parse.y
@@ -5,7 +5,11 @@
   #include <math.h>   // cos, sin, etc.
   #include <stddef.h> // ptrdiff_t
   #include <stdio.h>  // printf
+  #include <stdlib.h> // calloc.
   #include <string.h> // strcmp
+
+  #include <readline/readline.h>
+  #include <readline/history.h>
 }
 
 %code requires {
@@ -31,18 +35,24 @@
 }
 
 %code provides {
-  int yylex (YYSTYPE *yylval, YYLTYPE *yylloc);
-  void yyerror (YYLTYPE *yylloc, char const *);
+  int yylex (const char **line, YYSTYPE *yylval, YYLTYPE *yylloc);
+  void yyerror (YYLTYPE *yylloc, char const *msg);
 }
 
 %code {
 #define N_
 #define _
+
+  // Whether to quit.
+  int done = 0;
 }
 
 // Don't share global variables between the scanner and the parser.
 %define api.pure full
 
+// Generate a push parser.
+%define api.push-pull push
+
 // To avoid name clashes (e.g., with C's EOF) prefix token definitions
 // with TOK_ (e.g., TOK_EOF).
 %define api.token.prefix {TOK_}
@@ -70,7 +80,7 @@
     LPAREN "("
     RPAREN ")"
     EQUAL  "="
-    EOL    _("end of line")
+    EXIT   "exit"
     EOF 0  _("end of file")
   <double>
     NUM _("double precision number")
@@ -99,13 +109,8 @@
 %% // The grammar follows.
 input:
   %empty
-| input line
-;
-
-line:
-  EOL
-| exp EOL   { printf ("%.10g\n", $exp); }
-| error EOL { yyerrok; }
+| exp     { printf ("%.10g\n", $exp); }
+| "exit"  { done = 1; }
 ;
 
 exp:
@@ -134,6 +139,11 @@ exp:
 // End of grammar.
 %%
 
+#undef yyssp
+#undef yyesa
+#undef yyes
+#undef yyes_capacity
+
 /*------------.
 | Functions.  |
 `------------*/
@@ -190,13 +200,23 @@ getsym (char const *name)
   return NULL;
 }
 
+// How many symbols are registered.
+int
+symbol_count (void)
+{
+  int res = 0;
+  for (symrec *p = sym_table; p; p = p->next)
+    ++res;
+  return res;
+}
+
 
 /*----------.
 | Scanner.  |
 `----------*/
 
 int
-yylex (YYSTYPE *yylval, YYLTYPE *yylloc)
+yylex (const char **line, YYSTYPE *yylval, YYLTYPE *yylloc)
 {
   int c;
 
@@ -207,7 +227,7 @@ yylex (YYSTYPE *yylval, YYLTYPE *yylloc)
     yylloc->first_column = yylloc->last_column;
 
     yylloc->last_column += 1;
-    c = getchar ();
+    c = *((*line)++);
   } while (c == ' ' || c == '\t');
 
   switch (c)
@@ -221,40 +241,42 @@ yylex (YYSTYPE *yylval, YYLTYPE *yylloc)
     case '(': return TOK_LPAREN;
     case ')': return TOK_RPAREN;
 
-    case '\n':
-      yylloc->last_column = 1;
-      yylloc->last_line += 1;
-      return TOK_EOL;
+    case 0: return TOK_EOF;
 
-    case EOF: return TOK_EOF;
-
-      // Any other character is a token by itself.
     default:
+      // Numbers.
       if (c == '.' || isdigit (c))
         {
-          ungetc (c, stdin);
           int nchars = 0;
-          scanf ("%lf%n", &yylval->TOK_NUM, &nchars);
+          sscanf (*line - 1, "%lf%n", &yylval->TOK_NUM, &nchars);
+          *line += nchars - 1;
           yylloc->last_column += nchars - 1;
           return TOK_NUM;
         }
+      // Identifiers.
       else if (islower (c))
         {
-          ungetc (c, stdin);
           int nchars = 0;
           char buf[100];
-          scanf ("%99[a-z]%n", buf, &nchars);
-          symrec *s = getsym (buf);
-          if (!s)
-            s = putsym (buf, TOK_VAR);
-          yylval->TOK_VAR = s;
+          sscanf (*line - 1, "%99[a-z]%n", buf, &nchars);
+          *line += nchars - 1;
           yylloc->last_column += nchars - 1;
-          return s->type;
+          if (strcmp (buf, "exit") == 0)
+            return TOK_EXIT;
+          else
+            {
+              symrec *s = getsym (buf);
+              if (!s)
+                s = putsym (buf, TOK_VAR);
+              yylval->TOK_VAR = s;
+              return s->type;
+            }
         }
+      // Stray characters.
       else
         {
           yyerror (yylloc, "error: invalid character");
-          return yylex (yylval, yylloc);
+          return yylex (line, yylval, yylloc);
         }
     }
 }
@@ -291,6 +313,126 @@ void yyerror (YYLTYPE *loc, char const *msg)
 }
 
 
+/*-----------.
+| Readline.  |
+`-----------*/
+
+// Parse (and execute) this line.
+int process_line (YYLTYPE *lloc, const char *line)
+{
+  yypstate *ps = yypstate_new ();
+  int status = 0;
+  do {
+    YYSTYPE lval;
+    status = yypush_parse (ps, yylex (&line, &lval, lloc), &lval, lloc);
+  } while (status == YYPUSH_MORE);
+  yypstate_delete (ps);
+  lloc->last_line++;
+  lloc->last_column = 1;
+  return status;
+}
+
+// Get the list of possible tokens after INPUT was read.
+int
+expected_tokens (const char *input,
+                 int *tokens, int ntokens)
+{
+  YYDPRINTF ((stderr, "expected_tokens(\"%s\")", input));
+
+  // Parse the current state of the line.
+  YYLTYPE lloc;
+  yypstate *ps = yypstate_new ();
+  int status = 0;
+  do {
+    if (!*input)
+      break;
+    YYSTYPE lval;
+    int token = yylex (&input, &lval, &lloc);
+    if (!token)
+      break;
+    status = yypush_parse (ps, token, &lval, &lloc);
+  } while (status == YYPUSH_MORE);
+
+  // Then query for the accepted tokens at this point.
+  yyparse_context_t yyctx
+    = {ps->yyssp, YYEMPTY, &lloc, ps->yyesa, &ps->yyes, &ps->yyes_capacity};
+  return yyexpected_tokens (&yyctx, tokens, ntokens);
+}
+
+/* Attempt to complete on the contents of TEXT.  START and END bound the
+   region of rl_line_buffer that contains the word to complete.  TEXT is
+   the word to complete.  We can use the entire contents of rl_line_buffer
+   in case we want to do some simple parsing.  Return the array of matches,
+   or NULL if there aren't any. */
+char **
+completion (const char *text, int start, int end)
+{
+  YYDPRINTF ((stderr, "completion(\"%.*s[%.*s]%s\")\n",
+              start, rl_line_buffer,
+              end - start, rl_line_buffer + start,
+              rl_line_buffer + end));
+
+  // Get list of token numbers.
+  int tokens[YYNTOKENS];
+  char *line = strndup (rl_line_buffer, start);
+  int ntokens = expected_tokens (line, tokens, YYNTOKENS);
+  free (line);
+
+  // Build MATCHES, the list of possible completions.
+  const int len = strlen (text);
+  // Need initial prefix and final NULL.
+  char **matches = calloc (ntokens + symbol_count () + 2, sizeof *matches);
+  int match = 0;
+  matches[match++] = strdup (text);
+  for (int i = 0; i < ntokens; ++i)
+    if (tokens[i] == YYTRANSLATE (TOK_VAR))
+      {
+        for (symrec *s = sym_table; s; s = s->next)
+          if (s->type == TOK_VAR && strncmp (text, s->name, len) == 0)
+            matches[match++] = strdup (s->name);
+      }
+    else if (tokens[i] == YYTRANSLATE (TOK_FUN))
+      {
+        for (symrec *s = sym_table; s; s = s->next)
+          if (s->type == TOK_FUN && strncmp (text, s->name, len) == 0)
+            matches[match++] = strdup (s->name);
+      }
+    else
+      {
+        const char* token = yysymbol_name (tokens[i]);
+        if (strncmp (token, text, strlen (text)) == 0)
+          matches[match++] = strdup (token);
+      }
+
+  if (yydebug)
+    {
+      fprintf (stderr, "completion(\"%.*s[%.*s]%s\") = ",
+               start, rl_line_buffer,
+               end - start, rl_line_buffer + start,
+               rl_line_buffer + end);
+      for (int i = 1; matches[i]; ++i)
+        fprintf (stderr, "%s%s",
+                 i == 1 ? "{" : ", ",
+                 matches[i]);
+      fprintf (stderr, "}\n");
+    }
+
+  // Don't fall back to proposing file names.
+  rl_attempted_completion_over = 1;
+  return matches;
+}
+
+void init_readline (void)
+{
+  /* Allow conditional parsing of the ~/.inputrc file. */
+  rl_readline_name = "pushcalc";
+
+  /* Tell the completer that we want a crack first. */
+  rl_attempted_completion_function = completion;
+}
+
+
+
 /*-------.
 | Main.  |
 `-------*/
@@ -301,5 +443,16 @@ int main (int argc, char const* argv[])
   if (argc == 2 && strcmp(argv[1], "-p") == 0)
     yydebug = 1;
   init_table ();
-  return yyparse ();
+  init_readline ();
+  YYLTYPE lloc = {1, 1, 1, 1};
+  while (!done)
+    {
+      char *line = readline ("> ");
+      if (!line)
+        return 0;
+      if (*line)
+        add_history (line);
+      process_line (&lloc, line);
+      free (line);
+    }
 }
-- 
2.25.1




reply via email to

[Prev in Thread] Current Thread [Next in Thread]