Combining tokens
From: Søren Andersen
Subject: Combining tokens
Date: Mon, 15 Mar 2010 10:57:11 +0100
Hello all,
Please forgive me if this is a question that has been asked before. I've
searched the mailing list archives, but have been unable to find what I was
looking for, possibly because I don't know the correct term to search for. :)
Consider a language with all the normal expressions - you can add, subtract,
multiply, etc.
Now, suppose you'd like users to be able to define their own operators - for
instance, '+?' or something like that.
To help avoid ambiguities, you decide these user-defined operators must
be at least 2 "elements" long (I'm specifically NOT using the word "tokens"
here, for reasons that will become clear).
So, you'll allow '++' and '-+', etc.
Now, the problem is that this still leads to shift/reduce conflicts - mainly
because if you write this naturally:
UserOp = PossOp PossOp+;
PossOp = '+' | '-' | '*' | ....;
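the parser will look for a succession of separate tokens, so you can also write
'-' '+' with whitespace in between. But this is exactly what results in
conflicts: with just 1 token of lookahead, the parser cannot tell whether an
operator character stands alone or begins a user-defined operator.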
What I really want is for my specification to describe *a single token* rather
than a series of tokens, which is the exact opposite of what you usually want
to happen.
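To make concrete what I mean by "a single token", here is a minimal sketch of
the behaviour on the lexer side. It is plain Python rather than Bison or Flex,
and the token names (USEROP, OP, NUM, ID) and the set of operator characters
are assumptions made up for the illustration. A run of two or more operator
characters with no whitespace inside becomes one token:

```python
import re

# Hypothetical token names; the operator character set is an assumption.
TOKEN_RE = re.compile(r"""
    (?P<USEROP>[+\-*/?]{2,})   # two or more operator characters, no whitespace inside
  | (?P<OP>[+\-*/])            # a single built-in operator
  | (?P<NUM>\d+)               # integer literal
  | (?P<ID>[A-Za-z_]\w*)       # identifier
  | (?P<WS>\s+)                # whitespace, skipped below
""", re.VERBOSE)

def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("a +- b"))   # '+-' arrives as one USEROP token
print(tokenize("a + - b"))  # '+' and '-' arrive as two OP tokens
```

With this, the grammar would only ever see one USEROP token where the user
wrote '+-'. One consequence of this sketch is that something like 'a*-b' is
lexed as the user operator '*-' and not as a multiplication by a negated b.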
You could generate the possible tokens up to a certain length automatically:
'++', '+-', '+*', ...
but this would be very large, and you can (obviously) only do it up to a
certain length.
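As a rough illustration of how quickly that list grows, here is a small Python
sketch that enumerates the candidates. The four-character operator set and the
function name are assumptions for the example only; a real language would
likely have more operator characters, which makes the growth worse:

```python
from itertools import product

OP_CHARS = "+-*/"  # assumed operator characters, for illustration only

def operator_tokens(max_len):
    """List every operator string of length 2..max_len over OP_CHARS."""
    return [
        "".join(chars)
        for length in range(2, max_len + 1)
        for chars in product(OP_CHARS, repeat=length)
    ]

for max_len in (2, 3, 4):
    print(max_len, len(operator_tokens(max_len)))
```

With k operator characters there are k**n strings of each length n, so the
number of token definitions grows exponentially with the maximum length.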
I realize that what I'm asking is... somewhat unorthodox. :)
But is it possible, in either Bison or another system? It seems like a
relatively simple change to make: you would basically just need to turn
whitespace-awareness back on for some rules, so that whitespace is disallowed
inside them.
Any thoughts very much appreciated!
Regards,
Søren Andersen