help-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Two-pass parser or AST with Bison?


From: Luca
Subject: Re: Two-pass parser or AST with Bison?
Date: Tue, 06 Jan 2009 15:01:15 +0100
User-agent: Thunderbird 2.0.0.19 (Windows/20081209)

Matthias Kramm ha scritto:
Hi All,

I'm still busy compiling ActionScript 3.0, which, among
others, supports forward function calls:

    function bar() {
        foo("", 3)
    }
    function foo(s:String, i:int) {
    }

In order to properly resolve these (and check for correct
arguments), I can currently think about three options:
(1) Build an AST, walk that AST two times
Why two times?
In the bison actions, when you're parsing a function, you have to add it to your SymbolTable (or FunctionTable), and build the tree. Once the tree is build, you can traverse it from the root to the leaves checking types. For example, FUNCTION is a token and ID has the semantic value of char*. Function, param_list and function_body are nodes.

function:
FUNCTION ID '(' param_list ')' '{' function_body '}' {AddFunction($2,$4); $$=CreateNode(FUNCTION,$4,$7); SetName($$,$2);}
;

When bison ends to build the tree, you can check all the function bodies. For example, when you're in a expression node + (add), you have to check the left child node (you can insert a "type" field in the node struct/class). If it is a leave, you can direct check it; otherwise invoke the method itself on it (recursion). Then do the same for the right node, and lastly check the node itself.

This is a useful guide:
http://epaperpress.com/lexandyacc/download/lexyacc.pdf
you can also download the source code:
http://epaperpress.com/lexandyacc/download/code.zip

I think 1) is more powerful than 2). Moreover, it is faster because you have to parse the source only one time (store the flex tokens in memory is a good idea but bison is still invoked two times); you can also perform a deep optimization on source using AST.
(2) Do two LALR passes (i.e. store the flex tokens in memory,
    and run bison over them twice)
(3) back-patch (which I don't want to do)

I guess that (2) will use up less memory than (1), and also will be easier to implement.

Still, what are the best practices for AST building or two-pass
compile, if I
(1) (for the AST:)
Don't want to spell out every single AST node class explicitly, but rather, if possible, derive these from the grammar?

(2) (for two-pass:)
    Don't want to surround every single rule with an if like this:
    E = E '+' E {
        if(global.pass==1) {
            // do nothing in first pass
        } else {
            code_append(OP_ADD);
        }
    }

Any pointers to best-practices, existing mailing-list threads etc. are
very welcome.

Thanks,

Matthias




_______________________________________________
address@hidden http://lists.gnu.org/mailman/listinfo/help-bison






reply via email to

[Prev in Thread] Current Thread [Next in Thread]