help-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: "nested" languages


From: Hans Aberg
Subject: Re: "nested" languages
Date: Tue, 28 Jun 2005 14:08:30 +0200

At 11:40 +1000 2005/06/28, Neil Conway wrote:
I'm trying to use Bison to construct a parser for a language A that needs to allow syntax from a second language B to be embedded within it (at well-defined positions within A). For example, suppose a production from the grammar of A is:

        if_stmt: IF expr THEN if_body END IF ;

where "expr" is a production in *another* Bison grammar. Language B is defined by a fairly complex grammar that changes with some regularity (in this case, the embedded language B is SQL, defined by an ~8500 line bison grammar). B's grammar is maintained separately -- merging B's grammar into the grammar for A and maintaining two copies is a headache I'd like to avoid, if possible. Ideally I'd like to have A's parser "call into" B's parser, have it accept as much input as it can, and then return control to A's parser.

Is this possible? One idea would be to use hand-written parsers for both languages, although it would be nice to stick with Bison if possible.

Bison only one single .y file as input, and must be presented such a file. One way to do it, might to use a preprocessor (Perl, M4 perhaps) so merge the files.

The problem might be, though, that the combined grammar isn't LALR(1) anymore.

(At present this is implemented by having the lexer look for a delimiter in the input that we know marks the end of the tokens of the embedded language. We then concatenate these tokens into a string, and eventually pass that string to the yyparse() of the embedded language's parser. So in the above example we would basically consume tokens until we see a "THEN", and assume that anything between the "IF" and the "THEN" is a SQL statement. This is ugly, to say the least.)

There are a few methods in use to combine grammars. One is to merge grammars together, as you suggested. Another is what you already is using. Sometimes, one is using a hybrid:

If one has a large number dynamic operator precedences, as in Prolog for example, then it is not really feasible to write that into a large Bison's grammar directly. So one would a small grammar, putting the operators on a stack, and then let a sort out the precedences in the actions. If the number of precedence is small, as Haskell, which only has ten levels, one could write it in directly in a static grammar, though. There, some do it, others use the first method.

In general, no method is better than the other; in the specific case, one chooses the one most convenient, simply.

You might also get more inputs on this question in the newsgroup comp.compilers.
--
  Hans Aberg




reply via email to

[Prev in Thread] Current Thread [Next in Thread]