
Re: Using bison without Flex


From: Hans Aberg
Subject: Re: Using bison without Flex
Date: Fri, 24 Apr 2015 10:55:45 +0200

> On 24 Apr 2015, at 00:40, John Levine <address@hidden> wrote:
> 
> In article <address@hidden> you write:
>> 
>>> On 23 Apr 2015, at 16:34, brahim sahbi <address@hidden> wrote:
>> 
>>> can we use bison without flex by describing grammatical
>>> rules for tokens(like number <- list_of_digits).
>> 
>> A lexer like generated by Flex scans forward to find the longest match which 
>> may require significant lookahead. A Bison
>> generated parser just looks ahead at most one token.
>> 
>> So in theory yes, but in practise not really.
> 
> You could have a stub yylex that returned individual characters and
> assemble them using bison rules.  In languages with reasonable
> tokenization rules (not Fortran) that can work.
> 
>>> Can it recognize a language's keywords and prohibit an identifier to be a
>>> keyword?
> 
> You might have to resolve that using GLR, or the hack that a
> reduce/reduce conflict is resolved in favor of the earlier rule in the
> bison script:
> 
> whilekwd: 'w' 'h' 'i' 'l' 'e' ;
> ifkwd: 'i' 'f' ;
> thenkwd: 't' 'h' 'e' 'n' ;
> elsekwd: 'e' 'l' 's' 'e' ;
> 
> identifier: letter | identifier letter ;
> letter: 'a' | 'b' | 'c' ... 'z' ;

A problem here is that a lexer scans forward to find the whole identifier; it 
must, otherwise it would miss names like “ifs”. A non-GLR parser working on 
single characters, by contrast, will take the keyword as soon as it sees the “if”.

Then, when the lexer finds that it cannot scan any further, because of a 
non-matching character or end of input (EOS), it takes the longest match 
according to its rules and puts the remaining characters back into the stream 
for rescanning. That is different from GLR, which requires one to resolve with 
what one has in hand.
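
To illustrate the longest-match-then-push-back behaviour described above, here 
is a minimal hand-written sketch in C. It is not what Flex generates; the 
names (set_input, next_char, push_back, scan_token, token_text) and the token 
codes are hypothetical, chosen only for this example, and the input is assumed 
to be an in-memory string containing letters and blanks.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token codes; a real Bison grammar would supply these via %token. */
enum { TOK_EOF = 0, TOK_IF = 258, TOK_IDENT = 259 };

/* Hypothetical in-memory input with one character of pushback. */
static const char *input_pos = "";
static char token_text[64];

static void set_input(const char *text) { input_pos = text; }

static int next_char(void)
{
    return *input_pos ? (unsigned char)*input_pos++ : EOF;
}

static void push_back(int c)
{
    if (c != EOF)
        input_pos--;            /* make the character available for rescanning */
}

/* Scan the longest run of letters first, and only then decide what it is. */
static int scan_token(void)
{
    int c;
    size_t n = 0;

    do {
        c = next_char();
    } while (c == ' ');         /* skip blanks between tokens */

    if (c == EOF)
        return TOK_EOF;
    if (!isalpha(c)) {          /* any other character is returned as itself */
        token_text[0] = (char)c;
        token_text[1] = '\0';
        return c;
    }

    while (c != EOF && isalpha(c)) {
        if (n + 1 < sizeof token_text)
            token_text[n++] = (char)c;
        c = next_char();
    }
    push_back(c);               /* the character that ended the match goes back */
    token_text[n] = '\0';

    /* The keyword check happens after the whole word is known, so "ifs" is
       an identifier and only a bare "if" is the keyword. */
    return strcmp(token_text, "if") == 0 ? TOK_IF : TOK_IDENT;
}
```

With the input "ifs if x" this scanner reads all of "ifs" before classifying 
it, so the leading "if" is never mistaken for the keyword. A character-level 
grammar has to reach the same decision inside the parser instead.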




