bison-patches

Re: terminal @number vs. @user-number


From: Joel E. Denny
Subject: Re: terminal @number vs. @user-number
Date: Sun, 21 Oct 2007 20:18:55 -0400 (EDT)

On Mon, 22 Oct 2007, Wojciech Polak wrote:

> On 2007-10-21 at 17:46 -0400, Joel E. Denny wrote:
> 
> > Currently, Bison puts a terminal's user number (the one returned by yylex) 
> > in its XML "number" attribute.  I think we should rename that to 
> > "user-number" and add a "number" attribute for Bison's internal symbol 
> > number.  This would be more consistent with nonterminals.
> > I'd be happy to write the patch.  Is all this agreeable to you, Wojciech?
> 
> Can you write more about the practical goal (and its further usage)
> of having two numbers, especially Bison's internal symbol number?
> Maybe it's okay to switch, but to have only one kind of number,
> thus changing nonterminal, and not terminal?

While terminals have both user numbers and internal numbers, nonterminals 
only have internal numbers.  Thus, the only way to change the nonterminal 
element that I know of is to eliminate its @number altogether.  Is that 
what you mean?

On the one hand, I suppose we could argue that the user never really needs 
to know any of the symbol numbers for normal and clean usage of the 
generated parser.  On the other hand, when developing and debugging 
Bison's front end, I know I've found all the numbers useful at different 
times.  The user might find them helpful during low-level debugging of the 
generated parser code as well.

At the moment, I'm mainly bothered that @number isn't guaranteed to have a 
unique value for each symbol since it seems like it should.  If we make 
the change I'm suggesting, it will.
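
A minimal Python sketch of that collision follows.  The symbol table is 
made up for illustration (it is not taken from a real grammar or from 
Bison's actual XML output).  It assumes only what is described above: 
terminals carry both a user number and an internal number, while 
nonterminals carry only an internal number.

```python
from collections import Counter

# Hypothetical symbols: (kind, name, internal_number, user_number).
# Nonterminals have no user number, so that field is None.
SYMBOLS = [
    ("terminal", "$end", 0, 0),
    ("terminal", "'\\n'", 3, 10),
    ("terminal", "NUM", 4, 258),
    ("nonterminal", "$accept", 8, None),
    ("nonterminal", "input", 9, None),
    ("nonterminal", "line", 10, None),
]


def current_number(sym):
    """Current scheme: user number for terminals, internal otherwise."""
    kind, _name, internal, user = sym
    return user if kind == "terminal" else internal


def proposed_number(sym):
    """Proposed scheme: the internal number for every symbol."""
    return sym[2]


def duplicates(numbers):
    """Return the sorted list of values that occur more than once."""
    return sorted(n for n, count in Counter(numbers).items() if count > 1)


current_dups = duplicates(current_number(s) for s in SYMBOLS)
proposed_dups = duplicates(proposed_number(s) for s in SYMBOLS)
print("current scheme duplicates: ", current_dups)
print("proposed scheme duplicates:", proposed_dups)
```

With these illustrative values, the newline terminal and the nonterminal 
"line" share the number 10 under the current scheme, while the proposed 
scheme has no duplicates.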

Unique @number values are important if someone wants to use @number rather 
than @name for symbol references.  For example, consider a URI fragment 
identifier (like s103 in http://www.example.com/index.html#s103).  @name 
might be long and it might contain special characters that would have to 
be escaped in order to be placed there.  @number usually requires less 
space and could be placed there with no extra processing.
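
As a rough illustration of the difference, the Python sketch below builds 
a fragment identifier both ways.  The helper names and the "s" prefix 
(borrowed from the s103 example) are assumptions made for this sketch, 
not anything Bison emits.

```python
from urllib.parse import quote


def fragment_from_number(number):
    """A number needs no escaping; just add a letter prefix."""
    return f"s{number}"


def fragment_from_name(name):
    """A name may contain characters that must be percent-encoded."""
    return quote(name, safe="")


base = "http://www.example.com/index.html"
plus_name = "'+'"          # a character-literal token name
accept_name = "$accept"    # a name containing a special character

print(base + "#" + fragment_from_number(103))
print(base + "#" + fragment_from_name(plus_name))
print(base + "#" + fragment_from_name(accept_name))
```

The number-based fragment can be used as is, whereas both names have to 
pass through an escaping step before they can be placed in the URI.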

Of course, the user could use "n" and "t" prefixes to make @number-based 
fragment identifiers unique, but my point is that it seems unintuitive 
that @number isn't already unique.
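
For reference, the prefix workaround might look like the following Python 
sketch.  The function name and the sample numbers are hypothetical; the 
point is only that the prefix has to be added by whoever consumes the 
XML.

```python
def prefixed_id(kind, number):
    """Build an identifier that stays unique across symbol kinds.

    kind is "terminal" or "nonterminal"; number is whatever value the
    symbol's number attribute holds.
    """
    prefix = "t" if kind == "terminal" else "n"
    return f"{prefix}{number}"


# Hypothetical case: a terminal and a nonterminal that share number 10.
terminal_id = prefixed_id("terminal", 10)
nonterminal_id = prefixed_id("nonterminal", 10)
print(terminal_id, nonterminal_id)
```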

I suppose the user could use generate-id() or position() instead of 
@number in that scenario.  However, I'm guessing there might be situations 
when the user is debugging with the aid of some custom report he generated 
from Bison's XML.  It might be less confusing if the number representing a 
symbol is guaranteed to be consistent between his customized report and 
the C parser tables he's examining.  Maybe.

Researchers have been known to instrument Bison and its generated parsers 
for various purposes.  They might find the numbers in the XML output 
useful for generating code that depends on the C parser tables.

Well, I'm brainstorming, so some of my arguments may be flimsy.  In 
general, it seems like there are scenarios when it would be more 
convenient, more consistent, and cleaner for the user to be able to access 
all the symbol numbers than to have to resort to other techniques.  What 
do you think?



