help-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Inserting extra tokens


From: Hans Aberg
Subject: Re: Inserting extra tokens
Date: Fri, 18 Aug 2006 15:57:40 +0200

On 18 Aug 2006, at 12:12, Erik Sandberg wrote:

My idea was to check dynamically whether 0 or 1 lookahead tokens has been read, and take different actions depending on this (either queue a token in the lexer, or manipulate the current lookahead token). Would this fail?

The parser must know when it needs to get a lookahead token, so it seems me the information should be present in some of the parser tables.

If so, do you know about any other parser generator that supports insertion of
tokens?

In general, however, people try to avoid this, for any parser. When one sets context switches in the parser .y file to change the behavior of the lexer /as in the .l file), one generally tries to put in some action where the lookahead problem does not matter. Then one checks by hand that it worked out correctly.

You might alos consult the Usenet newsgroup comp.compilers, and its FAQ published there monthly, for more inputs.


3. The argument list uses a grammar similar to
arglist: argument COMMA arglist | argument SEMICOLON ;
(an argument can be a complex expression)

Now, I do not see exactly how this precedence problem relates to
yours, as you have a different syntax. But if you only admit a
limited number of arities, you could list them all in the .y grammar.

We currently use something similar, I'm trying to find a more generic
solution.

The state of the art of these parser generators is that they, with respect to this problem, are quite primitive. So the best solution is to tailor something more special.

Otherwise, you will have to use the other method indicated above:
create a dynamic arguments list object, and then use the token arity
to work it out after the parsing of the rule.

This might be feasible, but I'm not completely sure: There are problems with typechecking, as there are different types (MARKUP, MUSIC) of arguments. If a sequence of MUSIC is expected somewhere, then FUNCTION MUSIC MARKUP MUSIC should give an error if FUNCTION is unary, but not necessary if it's binary. Of course it's possible to work around this, but I fear that lots of logic
will have to be moved out from the parser.

This problem seems to be similar to that of C++, where the choice of function depends on the argument sequence. This is called "name overloading" or sometimes "Koening lookup", after one of the principal C++ designers. I am not sure how people actually solve this problem in the case of C++ - you might want to check in the newsgroup comp.compilers. There is a Yacc'able C++ grammar: http://www.parashift.com/c++-faq-lite/compiler- dependencies.html#faq-38.11

Another method might be to use the GLR (%glr) feature of Bison. When an ambiguity arises in the input grammar, it is split into several parsers, and then these are merged.

But it may suffice first parsing the arguments of the function. After that, do a lookup on the available (= formerly defined, with arity and argument type) function names, to see if there is a match. If not, issue an error, say via YYERROR or YYABORT.

  Hans Aberg






reply via email to

[Prev in Thread] Current Thread [Next in Thread]