bison-patches

RFC: a name for the error token


From: Akim Demaille
Subject: RFC: a name for the error token
Date: Sun, 26 Apr 2020 18:40:53 +0200

Hi,

We currently have several ways for the scanner to report an error to the 
parser:

1. return the undefined token (YYUNDEF)
2. return an unknown token kind
3. return the error token

1 and 2 are basically indistinguishable: any token kind which is not known is 
mapped to the YYSYMBOL_YYUNDEF symbol kind by YYTRANSLATE.  The only difference 
arises when you use api.token.raw, in which case the token kind and the symbol 
kind coincide, and YYTRANSLATE is the identity.  In that case it is no longer 
valid to return unknown token kinds (undefined behavior); you must return 
YYSYMBOL_YYUNDEF (aka YYUNDEF).  With either 1 or 2, the parser emits an error 
message and then enters error recovery.
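
To illustrate the mapping described above, here is a minimal sketch of a 
translation step that sends every unknown token kind to the "undefined" symbol 
kind.  It is not the generated YYTRANSLATE: the names (DEMO_TOK_*, DEMO_SYM_*, 
demo_translate) and the numeric values are assumptions made up for the example.

```c
/* Hypothetical token kinds, as a scanner would return them.
   The numeric values are illustrative only.  */
enum demo_token_kind
{
  DEMO_TOK_EOF = 0,
  DEMO_TOK_ERROR = 256,
  DEMO_TOK_UNDEF = 257,
  DEMO_TOK_NUM = 258
};

/* Hypothetical symbol kinds, as used internally by the parser.  */
enum demo_symbol_kind
{
  DEMO_SYM_EOF = 0,
  DEMO_SYM_ERROR = 1,
  DEMO_SYM_UNDEF = 2,
  DEMO_SYM_NUM = 3
};

/* Map a token kind to a symbol kind.  Any value that is not a known
   token kind ends up as DEMO_SYM_UNDEF, which is why returning the
   undefined token and returning an unknown token kind look the same
   to the parser.  */
int
demo_translate (int token)
{
  switch (token)
    {
    case DEMO_TOK_EOF:   return DEMO_SYM_EOF;
    case DEMO_TOK_ERROR: return DEMO_SYM_ERROR;
    case DEMO_TOK_NUM:   return DEMO_SYM_NUM;
    default:             return DEMO_SYM_UNDEF;
    }
}
```

With api.token.raw there is no such translation step, so there is no default 
branch to catch an unknown value.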

3. Until recently the error token used to behave like YYUNDEF, but with my 
recent changes (https://lists.gnu.org/r/bison-patches/2020-04/msg00145.html) it 
no longer emits an error message.
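
As a sketch of how a scanner might use that difference (an assumption about 
typical usage, not code from the patch): when the scanner has already printed 
its own diagnostic, it returns the error token so that the parser stays silent; 
otherwise it returns the undefined token and lets the parser report.  The names 
DEMO_* and demo_lex are hypothetical, and so is the choice of '#' as the 
character the scanner complains about.

```c
#include <stdio.h>

/* Hypothetical token kinds; the values are illustrative only.  */
enum
{
  DEMO_EOF = 0,
  DEMO_ERROR = 256,  /* the error token: parser emits no message */
  DEMO_UNDEF = 257,  /* the undefined token: parser emits a message */
  DEMO_NUM = 258
};

/* Classify a single input character.  */
int
demo_lex (int c)
{
  if (c == '\0')
    return DEMO_EOF;
  if ('0' <= c && c <= '9')
    return DEMO_NUM;
  if (c == '#')
    {
      /* The scanner reports the problem itself, then returns the
         error token so that only error recovery is triggered.  */
      fprintf (stderr, "scanner: stray '#'\n");
      return DEMO_ERROR;
    }
  /* Anything else: let the parser produce the error message.  */
  return DEMO_UNDEF;
}
```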



There is one problem left: having a name for the error token.  Currently it's 
"YYERRCODE", but it is an ugly name.  Since it was never documented (and in 3.6 
it will be documented), we have an opportunity to find a good name for it.  
Actually, because some people have used YYERRCODE in the past and expected an 
error message, we should have a backward-compatibility macro that maps 
YYERRCODE to YYUNDEF.
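
A minimal sketch of what such a compatibility macro could look like.  The 
DEMO_ prefix marks every name as hypothetical (the real ones would be generated 
by Bison), and the numeric values are assumptions.

```c
/* Hypothetical token kinds; the values are illustrative only.  */
enum demo_token_kind
{
  DEMO_YYerror = 256,  /* the error token: no message from the parser */
  DEMO_YYUNDEF = 257   /* the undefined token: message, then recovery */
};

/* Backward compatibility: scanners that used the legacy name expected
   an error message, so the legacy name maps to the undefined token,
   not to the error token.  */
#define DEMO_YYERRCODE DEMO_YYUNDEF

/* A legacy scanner failure path keeps its old behavior.  */
int
demo_legacy_failure (void)
{
  return DEMO_YYERRCODE;
}
```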

So, what name for the error token?

a. There's one quite obvious name: YYERROR.  Unfortunately it collides with the 
YYERROR macro.  We could play #define tricks around user actions so that it is 
the macro only there, but that does not feel right.

b. We can use a name such as YYERROR_TOKEN, but I don't like that much, as it's 
a completely different naming scheme compared to the other tokens (user tokens 
such as NUM, or special tokens such as YYEOF).  Besides, it would differ from 
the name of the symbol kind (YYSYMBOL_YYERROR) unless we also make it 
YYSYMBOL_YYERROR_TOKEN.  Which is erk...

c. In the grammar, the error token is spelled "error", so it would make a lot 
of sense to just name it "error" and "YYSYMBOL_error", but we would be 
infringing on the user's "name space".

d. So it could be simply "YYerror", which does show it's a built-in symbol (as 
YYEOF and YYUNDEF), yet it does not follow the convention of uppercase for 
tokens.  Its symbol would be YYSYMBOL_YYerror of course.


I have been thinking about this issue for weeks, and the more I think about it, 
the more I believe (d) is the least ugly approach.

But maybe someone would have a better option?

