Re: #line 0.000000oo
From: Paul Eggert
Subject: Re: #line 0.000000oo
Date: Wed, 13 Nov 2002 12:46:52 -0800
> From: Akim Demaille <address@hidden>
> Date: 13 Nov 2002 10:57:25 +0100
>
> exp: exp '+' exp
> {
> /* Something. */
> #line 123 "foo.y"
> $$ = $1 + $4;
> }
>
> should complain in foo.y:123. But it is clear that the #line must be
> forwarded inside the action.
If that's the case, Bison is currently mishandling #line. E.g., for this:
%%
exp: '+'
#line 1000 "foo.y"
{
$$ = $1 + $4;
#line 2000 "foo.y"
$$ = $1 + $4;
};
the Bison output is currently this:
foo.y:1000.1-1001.15: integer out of range: `$4'
foo.y:1000.1-1003.15: integer out of range: `$4'
so the second "#line" directive did not affect the Bison diagnostics.
> Now, you seem to suggest that once the '}' closed, we should restore
> the previous location context,
No, I wasn't proposing that. All I'm saying is that either #line
should consistently affect Bison diagnostics, or it should
consistently be ignored by Bison (other than passing it through to the
C source code).
> We can do this with an intermediary parser between our scanner, and
> the actual parser. The reason why we might want a secondary parser is
> that anyway, some day we want %if, %else, %endif.
OK, it sounds like you are proposing two preprocessors for Yacc C
code: a new one that uses %if/%endif and is preprocessed by Yacc, to
go along with the existing one that uses #if/#endif and is
preprocessed by the C compiler. I can certainly see the need for
having a separate preprocessor for Yacc, and I also see why you
can't use a single C preprocessor for both purposes.
But in that case, we should use "%line" uniformly for Yacc line
numbers, to go along with the future "%if" and "%endif". This makes
it clearer to users that there are two preprocessors here. We should
just pass #line through in C code, as we do now, and make it an error
to use #line outside of C code (just as it's an error to use #if
outside of C code). The "%line" notation makes a lot more sense to me
than the current approach, which uses the same #line notation to mean
two quite different things.
> I'm really against supporting tortured syntax which only concrete
> use will be the test suite.
It's OK to implement a subset of the C rules, so long as we document
whatever we implement. But the current subset is pretty restrictive,
and it will make it a bit of a pain for humans to use %line. For
example, currently trailing white space is not allowed in #line; nor
is the short form supported.
If we implement %line ourselves, how about if we use the following
subset of the C rules instead:
* Comments are not allowed.
* Newline (including backslash-newline) is not allowed.
* There is no preprocessor-like expansion (e.g., you can't say
`%line __STDC__ "foo"' as you can in C).
Another way of putting it is that we alter Bison as follows:
* Arbitrary white space is allowed between tokens.
* The file name must have properly escaped quotes.
* The file name is optional, and defaults to the previous file name.
In other words, the regular expression changes from this:
^"#line "{int}" \"".*"\"\n"
to this:
^{w}*"%line"{w}+{int}{w}*("\""(\\.|[^\n\"\\])*"\""{w}*)?"\n"
where {w} is any horizontal white space character ([ \f\t\v]).
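To make the proposed pattern concrete, here is a rough Python transliteration of the Flex expression above. It is only a sketch, not Bison code: the function name parse_line_directive is made up for illustration, and {int} is assumed to mean [0-9]+.

```python
import re

# Horizontal white space, as in {w} above.
W = r"[ \f\t\v]"

# Assumption: {int} in the scanner is a plain decimal integer.
LINE_DIRECTIVE = re.compile(
    r"^" + W + r"*%line" + W + r"+([0-9]+)" + W + r"*"
    r'(?:"((?:\\.|[^\n"\\])*)"' + W + r"*)?\n"
)


def parse_line_directive(text, previous_file):
    """Return (line, file) for a %line directive, or None if TEXT is not one.

    The file name is returned exactly as written, with its escape
    sequences left unparsed.  If the name is omitted, PREVIOUS_FILE is
    used instead.
    """
    m = LINE_DIRECTIVE.match(text)
    if m is None:
        return None
    line = int(m.group(1))
    file = m.group(2) if m.group(2) is not None else previous_file
    return line, file
```

With this sketch, '%line 123 "foo.y"\n' yields line 123 in foo.y, a bare '  %line\t42  \n' keeps the previous file name, and a directive followed by a comment is rejected.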
If we play our cards right, we won't need to parse the escape
sequences in that string: just print them as-is in error messages
(without quoting them), and escape M4 characters like [ and ] when
sending the string via M4 to C code.
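As an illustration of that last step, the escaping could look something like the sketch below. It assumes the Autoconf-style quadrigraphs @<:@ and @:>@ are what stands in for [ and ] on the M4 side; the helper name m4_escape is hypothetical, and whether other characters need the same treatment is left open.

```python
def m4_escape(file_name):
    """Replace M4 quote characters in FILE_NAME with quadrigraphs.

    Assumption: '[' and ']' are the M4 quote characters in use, and the
    quadrigraphs @<:@ and @:>@ are turned back into them after M4 runs.
    The C escape sequences in FILE_NAME are passed through untouched.
    """
    # '[' is replaced first; its quadrigraph contains no ']', so the
    # second replacement cannot corrupt it.
    return file_name.replace("[", "@<:@").replace("]", "@:>@")
```

For example, a file name written as a[1].y would be sent through M4 as a@<:@1@:>@.y, while a name without brackets is passed through unchanged.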