[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [GSoC] "Parse Makefile.am using AST" Final Report
From: |
Vishal Gupta |
Subject: |
Re: [GSoC] "Parse Makefile.am using AST" Final Report |
Date: |
Thu, 23 Aug 2018 19:05:12 +0530 |
Apology for spacing issues as this was my first time doing the
formatting in plain text mode.
On Thu, Aug 23, 2018 at 6:26 PM Vishal Gupta <address@hidden> wrote:
>
> Hello everyone,
>
> Google Summer of Code provided me the opportunity to work with Automake during
> the summers. Here is the final report on work done during the same.
>
> ___________________________________________
>
> [GSoC'18] Parse Makefile.am using
> Abstract Syntax Tree - Final Report
> ___________________________________________
>
> Branch:- experimental/gsoc/ast
> http://git.savannah.gnu.org/cgit/automake.git/log/?h=experimental/gsoc/ast
>
> Commit History:- cf633fb61 to 5aebec2cc
>
> 1. Introduction
> ============
>
> The goal of the project was to generate an abstract syntax tree after parsing
> the input Makefile.am . It has separate lexer, parser and tree generation
> module so that new features can be added incrementally. For debugging purpose
> the AST generated is displayed as image using graph description language.
>
> The current implementation can parse the following type of statements:
> * Primaries : All the primaries are identified.
> * Multiline statements : Statements on multiple lines ending with backslash
> and its combination with other type of statements
> * Conditional statements : Single or nested conditional statements
> [ if-else-endif block ]
> * Subdirs directive : The parser parses the Makefile.am file in the said
> subdirectory and generates the tree in that subdirectory.
> * Include directive : The parser recursively parses the included file and
> merge the generated tree to the parent tree by serializing at child process
> and deserializing at parent process.
>
> 2. Technical Description
> ===================
>
> * Lexer :- It converts the input file line by line into tokens using perl
> regex. Whenever called by parser, it returns an array of tokens. Each token
> has a name and if required, a value.
>
> * Bison Grammar :- The grammar of Automake is written in GNU Bison. It is used
> to generate state description and state transition diagram. Writing in Bison
> is easy and exact syntax can be captured. Adding new features and debugging
> them becomes easy as no change is required in parser. The tool is only
> required at maintainer end to generate the parsing table.
>
> * Parsing Table :- It is an array of hashes. Each array index indicate the
> state and hash consist of key value pair (Key for token acceptable for that
> state and value is the next transition state). For reduction of grammar and
> generation of tree, a special 'reduce' key is used in Hash. It points to
> array containing number of tokens to reduce and corresponding function to
> call in Tree module. The start state is 0 and acceptance state is stored in
> variable with value decided by converter.
>
> * Converter :- It takes as input the state description and state transition
> file generated from Bison and outputs the parsing table in a format
> understood by perl.
>
> * Parser and AST :- It parses the input file using tokens provided by lexer
> and the parsing table provided by converter. LR parsing algorithm
> similar to this [https://goo.gl/images/a4NWva] is implemented.
> At each reduction of the grammar, the node is generated for the tree
> by calling its corresponding function in tree module specified in parsing
> table. After successful parsing, the parser outputs the generated tree in
> DOT format.
>
> * Graph Description Language and dot utility :- For visualization of abstract
> syntax tree, it is printed in the graph description language. It provide an
> easy interface to display directed and undirected graphs. As the size of
> input file increases, the tree increases in width significantly as compared
> to height as all the statements identified during parsing are attached to
> single master node. Unflatten utility is used to fix this. At last, dot
> utility is used to convert the graph to image file.
>
> * Testing :- Test cases for features implemented are provided in âtâ directory
> A simple script is executed which generates AST for the test files.
> Currently visual inspection of tree is required for validation.
>
> 3. Future Work
> =============
>
> * Make rules :- Currently only basic make rules are identified by parser.
> Grammar needs to be extended to handle complex make rules.
>
> * Using AST to generate Makefile.in :- Current project was to generate AST
> from input file. This AST can be used to generate Makefile.in. Converting
> AST to the output file is another big task. It would require to implement
> how different variables are handled, what to output for corresponding
> statement and how to integrate with the current code base.
>
> 4. Conclusion
> ============
>
> Learnt and enjoyed a lot. Applying the compiler theory learnt in class to
> practical work with a language I was knowing only some basics was a wonderfull
> experience. It also helped me to understand and follow good coding practices.
>
> Most of the automake features are implemented. I will be working to fix the
> grammar to identify complex make rules. With that, AST can be generated for
> most of the files which are identified by Automake. As this is a new feature,
> in the current state it cannot be merged in codebase. When generation of
> Makefile.in from AST be implemented can it be merged. I would like to continue
> working with that part, but some guidance in that direction will be required.
>
> Sorry for delay in presenting the report as I was not well.
>
> ---
> Vishal Gupta