help-bison

Combining tokens


From: Søren Andersen
Subject: Combining tokens
Date: Mon, 15 Mar 2010 10:57:11 +0100

Hello all,

Please forgive me if this question has been asked before. I've searched the
mailing list archives but have been unable to find what I was looking for,
possibly because I don't know the correct term to search for. :)

Consider a language with all the normal expressions - you can add, subtract, 
multiply, etc. 
Now suppose you'd like the user to be able to define their own operators, for
instance '+?' or something like that.
To help with ambiguities, you decide these user-defined operators must be at
least 2 "elements" long (I'm specifically NOT using the word "tokens" here, for
reasons that will become clear).
So, you'll allow '++' and '-+', etc.

Now, the problem is that this still ends in shift/reduce conflicts, mainly
because if you write this naturally:

UserOp = PossOp PossOp+;
PossOp = '+' | '-' | '*' | ....;

The parser will look for a succession of separate tokens, so you can write
'-' '+' with whitespace in between. But this is exactly what results in
conflicts: with just 1 token of lookahead, the parser cannot tell whether a '-'
is a complete built-in operator or the start of a user-defined one.
What I really want is for my specification to describe *a single token* rather
than a series of tokens, which is the exact opposite of what you usually want.
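
To make the "single token" idea concrete, here is a minimal sketch of what the
scanner side could look like, written in Python rather than flex purely for
illustration. The token names (NUMBER, IDENT, USER_OP, OP) and the set of
operator characters are assumptions made up for this example, not part of any
real grammar:

```python
import re

# Hypothetical token set; the operator characters are an assumption.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<USER_OP>[-+*/?]{2,})   # two or more operator characters, no whitespace
  | (?P<OP>[-+*/])             # a single built-in operator character
  | (?P<WS>\s+)
""", re.VERBOSE)

def tokenize(text):
    """Split text into (kind, spelling) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        pos = m.end()
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
    return tokens
```

With this, "a -+ b" yields one USER_OP token for "-+", while "a - + b" yields
two separate OP tokens, so the parser never has to glue operator characters
together itself. The price is that something like "a--b" is also scanned as a
single user-defined operator and would have to be written "a - -b".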

You could generate the possible tokens up to a certain length automatically:
'++', '+-', '+*', ...
but the resulting token list would be very large, and you can (obviously) only
do it up to a certain length.
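
As a rough illustration of how quickly that list grows, here is a small Python
sketch. The five-character operator alphabet and the function name
operators_up_to are assumptions chosen only for this example:

```python
from itertools import product

# Assumed operator alphabet, for illustration only.
OP_CHARS = "+-*/?"

def operators_up_to(max_len, chars=OP_CHARS):
    """Return every operator spelling of length 2..max_len over chars."""
    return ["".join(p)
            for n in range(2, max_len + 1)
            for p in product(chars, repeat=n)]
```

With five characters there are 25 operators of length 2, 150 of length up to 3
and 775 of length up to 4, so each extra character of allowed length
multiplies the number of newly added tokens by five.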

I realize that what I'm asking is... somewhat unorthodox. :)
But is it possible, in either Bison or another system? It would seem a
relatively simple change to make, as you basically just need to turn
whitespace-awareness back on for some rules so that whitespace is disallowed
inside them.

Any thoughts very much appreciated!

Regards,

Søren Andersen




