help-bison

Re: about parse-gram of bison itself


From: Akim Demaille
Subject: Re: about parse-gram of bison itself
Date: Wed, 21 Jul 2010 10:43:13 +0200

On 6 June 2010 at 19:39, tys lefering wrote:

> Hi,

Hi Tys!

See, your messages are not lost, just in the queue :)

> In parse-gram.y from GNU bison-2.5 branch.
> 
> The grammar_declaration rule is in 2 parts:
> 
> 1)
> 
> grammar_declaration:
>  precedence_declaration
> | symbol_declaration
> | "%start" symbol
> | "%destructor" "{...}" generic_symlist
> | "%printer" "{...}" generic_symlist
> | "%default-prec"
> | "%no-default-prec"
> | "%code" braceless
> | "%code" ID braceless
> ;
> 
> and 2)
> 
> grammar_declaration:
>  "%union" union_name braceless
> ;
> 
> This is possible according to the manual, but what is
> the reason to do this in this parser ? why not joined?

That's just a question of style.  Historically Bison used to support only the
"all the declarations then all the uses" style, but I disliked it, and when I
moved from the hand-written parser to the bison-based parser, that's one of the
first features I wanted (for my own grammars).  I prefer to favor locality, and
have related issues close to each other.  Then, parts of the Bison grammar
itself were moved to this style, but not all.  And there's plenty of "serious"
work to do in Bison, so I guess time will fly before we convert the whole
grammar to a single, more consistent, style.

> Also there is a '%token PERCENT_UNION "%union"' statement
> in the grammar rules section, why not in the declarations
> section? is this a bison feature or general for yacc grammars?

Bison feature.

> The manual does not say it is possible to have declarations
> in the grammar rules section.

That's something we should fix, thanks.

> Worked with the bison parse-gram.y and did not find bugs
> only some notes:
> a FIXME comment in scan-gram.l in convert_ucn_to_byte()
> /* FIXME: Currently we assume Unicode-compatible unibyte characters
>     on ASCII hosts (i.e., Latin-1 on hosts with 8-bit bytes).  On
>     non-ASCII hosts we support only the portable C character set.
>     These limitations should be removed once we add support for
>     multibyte characters.  */

Be my guest :)

> and no error check for errno at the strtoul().

Why would that be needed?  Its return value _is_ checked.  I might be missing 
something.
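
To illustrate why a range test on the return value can stand in for an errno check, here is a minimal sketch.  It assumes the scanner rejects values that are zero or larger than UCHAR_MAX; the helper name octal_escape_to_byte is hypothetical and this is not Bison's actual code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from scan-gram.l: convert the digits of
   an octal escape such as "101" into a byte.  Returns false when the
   value is zero or does not fit in an unsigned char.  */
bool
octal_escape_to_byte (const char *digits, unsigned char *out)
{
  assert (digits != NULL && out != NULL);

  /* On overflow strtoul returns ULONG_MAX (and sets errno to ERANGE).
     ULONG_MAX is larger than UCHAR_MAX, so the range test below already
     rejects that case without consulting errno.  */
  unsigned long c = strtoul (digits, NULL, 8);
  if (c == 0 || UCHAR_MAX < c)
    return false;

  *out = (unsigned char) c;
  return true;
}
```

Under that assumption, an absurdly long digit string is rejected in the same way as a merely out-of-range one such as "777", so the errno test would add nothing here.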

> in the <INITIAL> section in scan-gram.l most keywords allow
> mixture of '-' and '_' example %pure[-_]parser but not for
> all, like %file-prefix, is there a reason for this ?

We are moving from _ to -, so I guess that whatever was introduced directly in
- style is not supported in _ style.

> the %term and %token return PERCENT_TOKEN in scan-gram.l
> but nowhere this %term is documented. maybe bison can warn
> about %term usage which only is found in very old grammars.

Why not indeed.  That's a bisonism.

> there is a %token GRAM_EOF in parser but not used and
> not in lexer, does it have a function?

If you are leferring, er sorry, referring to

%token GRAM_EOF 0 "end of file"

then the point is to have error messages talk about "end of file", not about
"$end".  Flex returns 0 for us, at eof.

> in grammar is at:
> prologue_declaration:
> | /*FIXME: Err?  What is this horror doing here? */ ";"
> ;
> 
> is it so bad to have a extra ';' floating around ?

I'm certainly responsible for this.  This was a "fight" between Paul and me:
Paul, on the one hand, wanted to support grammars with "missing ; at the end of
the rules" (as I would call them), and I, on the other hand, wanted to drop
this for good.  Of course Paul is right, as @*#&@*(# POSIX mandates it (it is not
Paul who was wrong, it was POSIX :).  So he "reverted" the changes I had made to
require ";", but in such a way that ";" gets out of hand, imho, in the grammar.
I could agree with accepting a missing semicolon at the end of rules, but
certainly not with making the semicolon basically optional anywhere.

> in the parser is another FIXME
> /* Request detailed syntax error messages, and pass them to GRAM_ERROR.
>   FIXME: depends on the undocumented availability of YYLLOC.  */
> #undef  yyerror
> #define yyerror(Msg) \
>       gram_error (&yylloc, Msg)
> what is this about ?

It's about the stupid API Bison provides for yyerror: the location is not 
available, so this macro is actually stealing a local variable whose name is 
not "public".
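
The trick can be reproduced outside Bison with a small mock.  In this sketch the location type, the gram_error body, and mock_parse are all stand-ins invented for illustration; only the shape of the yyerror macro follows the quoted code.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for Bison's location type (assumption: two fields only).  */
typedef struct { int line; int column; } location;

/* Last location passed to gram_error, kept so callers can inspect it.  */
location reported = { 0, 0 };

/* Stand-in for gram_error: report the message with its location.  */
void
gram_error (const location *loc, const char *msg)
{
  assert (loc != NULL && msg != NULL);
  reported = *loc;
  fprintf (stderr, "%d.%d: %s\n", loc->line, loc->column, msg);
}

/* The macro names yylloc without declaring it, so it compiles only
   where a variable of that name happens to be in scope.  */
#undef yyerror
#define yyerror(Msg) \
  gram_error (&yylloc, Msg)

/* Mock of the generated parser function: yylloc is one of its locals,
   which is what the macro silently relies on.  */
int
mock_parse (void)
{
  location yylloc = { 3, 14 };
  yyerror ("syntax error, unexpected end of file");
  return 1;
}
```

Renaming the local inside mock_parse would make the macro expansion fail to compile, which is the fragility the FIXME is pointing at.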

> and because of obstack bison allows unlimited string length
> everywhere, right ? that can be a problem with graph output
> which has a string length max. of 20000 chars for vcg and
> possibly other formats have also limitations and maybe somehow
> it could be a problem in the xml output and xstl, not tested
> and checked yet.

Well, at least the limitation is not from Bison itself.



