Re: Bison scanner patch to fix POSIX incompatibilities, etc.
From: Paul Eggert
Subject: Re: Bison scanner patch to fix POSIX incompatibilities, etc.
Date: Tue, 5 Nov 2002 12:00:36 -0800 (PST)
> From: Akim Demaille <address@hidden>
> Date: 05 Nov 2002 09:30:47 +0100
> :) Well, it is not written down, but if they consider C, then yes, I
> suppose they mean this weird thing too.
Perhaps I should file an interpretation request, since it does seem a
little weird now that you mention it. Please see the proposal at the
end of this message, which tries to cover all the bases.
Does anyone know what the C++ rules are with respect to trigraphs,
digraphs, and backslash-newline? Does C++ have trigraphs?
> I do not understand why we would want to have trigraphs.
We don't, and that's why I left them out.
> I'm referring to using `\' to split the two characters of */ and /*.
> I mean, the code is ``more correct'' than it was before, but we
> might have written code that will just never be used.
True.
> The question was: did you ever see a \ splitting tokens in real code
In hand-written code I have seen it only for strings (and in the
International Obfuscated C Code Contest -- I won a prize there once
:-). I have heard of it occurring in machine-generated code, because
some programs that generate Standard C code worry about compilers with
silly line length limits (which the standard does allow).
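For concreteness, here is a small sketch of the kind of token splitting I mean. The function name spliced_value is made up for illustration. Backslash-newline is removed before tokenization, so a conforming C compiler should accept all three splits below.

```c
/* Hypothetical example: backslash-newline is deleted in translation
   phase 2, before tokenization, so it may split a constant, an
   operator, or a comment delimiter. */
int
spliced_value (void)
{
  int total = 4\
0;            /* the constant 40, split across two lines */
  total +\
= 2;          /* the operator +=, split across two lines */
  /\
* the opening and closing delimiters of this comment are split *\
/
  return total;
}
```

Calling spliced_value () yields 42. A generator that wraps its output at a fixed column could emit code like this without ever looking at token boundaries.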
> Paul> While we're on the subject of POSIX conformance, I notice that
> Paul> we're not handling C digraphs correctly. I'll throw that in
> Paul> too; might as well, while I'm at it, since it's easy.
>
> Really, I see no interest at all in making it that perfect. I bet big
> money that it is not Yacc portable
You're probably right. I bet that none of this is Yacc portable, come
to think of it.
> all this is free complications that might make things uselessly more
> complex when we will "parse" other languages with different lexical
> structures.
Languages with different lexical structures will need different
lexical regimes anyway. You can't parse Scheme with a C lexer.
Or perhaps you were thinking we could get away with parsing C++ and/or
Java with a C lexer? That is plausible.
We can certainly construct horrible examples of C statements that no
Yacc parser could reasonably be expected to parse. For example:
  %{
  #define CLOSE_BRACE }
  %}
  %%
  start: 'x' { { $$ = 0; CLOSE_BRACE } ;
Or how about this one?
  %{
  #define PERCENT_CLOSE_BRACE %}
  %}
  %%
  start: 'x' ;
POSIX says that both these are valid input to Yacc! Clearly this is a
bug in the standard, but the question is how far the bug extends.
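To show why the first example is hopeless for any scanner that does not run the C preprocessor, here is a minimal sketch of a brace matcher. The name find_action_end is hypothetical and this is not Bison's actual scanner. It assumes the text starts at an opening brace, skips /* */ comments, string literals, and character constants, does not handle // comments, and otherwise just counts braces.

```c
#include <stddef.h>

/* Hypothetical helper, not Bison's scanner: given text that starts at
   an opening brace, return the index of the matching closing brace,
   or -1 if the braces never balance.  Macros are not expanded,
   because a Yacc implementation does not run the C preprocessor.  */
long
find_action_end (const char *s)
{
  long depth = 0;
  size_t i = 0;

  while (s[i] != '\0')
    {
      char c = s[i];

      if (c == '/' && s[i + 1] == '*')
        {
          /* Skip a comment; give up if it is unterminated.  */
          i += 2;
          while (s[i] != '\0' && !(s[i] == '*' && s[i + 1] == '/'))
            i++;
          if (s[i] == '\0')
            return -1;
          i += 2;
          continue;
        }
      if (c == '"' || c == '\'')
        {
          /* Skip a string literal or character constant,
             honoring backslash escapes.  */
          i++;
          while (s[i] != '\0' && s[i] != c)
            {
              if (s[i] == '\\' && s[i + 1] != '\0')
                i++;
              i++;
            }
          if (s[i] == '\0')
            return -1;
          i++;
          continue;
        }
      if (c == '{')
        depth++;
      else if (c == '}')
        {
          depth--;
          if (depth == 0)
            return (long) i;
        }
      i++;
    }
  return -1;
}
```

On "{ $$ = 0; }" this returns the index of the closing brace. On "{ { $$ = 0; CLOSE_BRACE } ;" it returns -1, because the macro hides the brace that would balance the action.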
How about if we propose the following changes to the POSIX standard:
* The two characters `%' and `}' cannot appear adjacent within %{
  ... %}, other than inside a comment or string literal.
* C-language code must have properly nested occurrences of "{" and
  "}". If braces are spelled any other way (e.g., via a macro like
  CLOSE_BRACE or via a digraph) then the resulting behavior is
  undefined.
* Backslash-newline preprocessing does not apply to Yacc comments or
  literals, or to occurrences of pseudovariables like `$$'.
* Similarly, trigraph preprocessing does not apply to Yacc comments,
  literals, or pseudovariables.
* If the removal of a backslash-newline within C-language code would
  change the boundary of the containing comment, string literal, or
  character constant, the resulting behavior is undefined.
* Similarly, if the replacement of a trigraph by its corresponding
  single character in C-language code would change the boundary of
  the containing comment, string literal, or character constant, the
  resulting behavior is undefined.
Would that be OK with you?
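To make the backslash-newline items concrete, here is a rough sketch of what "removal of a backslash-newline" means for a scanner. The name splice_lines is hypothetical; it only mimics the line splicing a C compiler performs and is not taken from any Yacc implementation.

```c
#include <stddef.h>

/* Hypothetical helper: copy SRC to DST, dropping every backslash that
   is immediately followed by a newline, together with that newline.
   DST must have room for strlen (SRC) + 1 bytes.  Return the number
   of bytes written, not counting the terminating null byte.  */
size_t
splice_lines (const char *src, char *dst)
{
  size_t n = 0;

  while (*src != '\0')
    {
      if (src[0] == '\\' && src[1] == '\n')
        src += 2;               /* drop the backslash-newline pair */
      else
        dst[n++] = *src++;      /* copy everything else unchanged */
    }
  dst[n] = '\0';
  return n;
}
```

Given the input "/", backslash, newline, "* note *", backslash, newline, "/ x", the unspliced text contains no "/*" at all, while the spliced result is "/* note */ x". That is the kind of boundary change the proposal would leave undefined.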