Re: Emacs contributions, C and Lisp

From: Jacob Bachmeyer
Subject: Re: Emacs contributions, C and Lisp
Date: Sat, 10 Jan 2015 17:45:02 -0600

I've been reading this thread in the list archives and, as a long-time GNU/Linux user, feel the need to chime in.

Perhaps there is a better option? I seem to remember efforts to adapt Guile to run Emacs Lisp and then to port Emacs to run on Guile instead of its own runtime. I'm not certain of the difficulty, but perhaps GCC could, over time, gain an option to be built as a set of Guile extensions? I haven't looked far enough into this to know whether it is feasible or how much work it would need.

Obviously, it should still be possible to build "stand-alone GCC", but having the compiler be available from Guile could easily solve the issue at hand here, especially if the extension presents a Lisp-like API for the GCC internal structures. This would also address the concerns about the GCC front-end being abused to feed nonfree backends, since the tree would only be present in memory behind a GPL interface.

But this is years away at best and doesn't solve the immediate problem, which is that Emacs needs a parse tree "now". There are three options for how to get that:

(1)  Write a C parser in Emacs Lisp.
(2)  Get the AST from GCC.
(3)  Get the AST from Clang.

Option (1) leads to an Emacs C Compiler and a fully self-hosting Emacs, which is both interesting and an insane duplication of effort. Option (2) has the advantage that it would ensure full support for GNU C, but leaves the problem of actually getting the parse tree out of GCC without weakening GCC's copyleft. Option (3) has the advantage that no one will object to dumping an AST from Clang, but Clang isn't a GNU project and has incomplete support for GNU C.

A more useful question is "How can GCC most efficiently provide an AST to Emacs?" Part of the answer is that Emacs already has the complete text of every file involved. Emacs doesn't care what the name of a variable is--that's already in the buffer. Emacs cares which part of the buffer corresponds to a variable name. Dumping an AST that contains only annotations to text, referring to positions in the source files instead of actually including text in the AST, looks to me like a good middle ground. Such an AST (which I will call a "refAST" because it contains only references to program source text) would be a significant pain to use as compiler input, since the symbol names are missing, while still being exactly the information that an editor needs. The same expedient that makes the refAST easier for Emacs to read also makes it harder to use for abusing the GCC frontend. Emacs already has the source text; why force it to read duplicates from the AST dump?
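To make the refAST idea concrete, here is a minimal Python sketch. The node kinds, field names, and offsets are all hypothetical, chosen only for illustration; nothing here reflects an actual GCC dump format. The point is that each node carries only a kind and a pair of offsets, and the editor recovers names by slicing the text it already holds.

```python
from dataclasses import dataclass, field


@dataclass
class RefNode:
    """One refAST node: a kind plus a half-open [start, end) span of source text."""
    kind: str                      # hypothetical kind name, e.g. "identifier"
    start: int                     # offset of the first character
    end: int                       # offset one past the last character
    children: list = field(default_factory=list)


def text_of(node, source):
    """Recover the text a node refers to from the buffer contents."""
    return source[node.start:node.end]


def identifiers(node, source):
    """Collect identifier names in tree order by slicing the source."""
    found = []
    if node.kind == "identifier":
        found.append(text_of(node, source))
    for child in node.children:
        found.extend(identifiers(child, source))
    return found


# The "buffer" the editor already has.
source = "int total = count + 1;"

# A hand-written refAST for that buffer; note that no names are stored in it.
tree = RefNode("var_decl", 0, 22, [
    RefNode("identifier", 4, 9),
    RefNode("binary_plus", 12, 21, [
        RefNode("identifier", 12, 17),
        RefNode("int_literal", 20, 21),
    ]),
])

names = identifiers(tree, source)
```

Without `source`, the tree above says only that some identifier occupies offsets 4 through 9, which is what makes such a dump awkward as input to another compiler backend.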

Further, the refAST needs to resemble the source text as closely as possible. Most of GCC's value comes from the optimizer and code generators; parsing is relatively simple compared to the rest of GCC. If the refAST is dumped after optimization, it will be next to useless for editing the source, so the refAST must be dumped prior to any optimization. My knowledge of GCC internals is lacking, but a glance at gccint suggests that Emacs needs a dump of GENERIC, which, incidentally, can "also be used in the creation of source browsers, intelligent editors, ..." (<URL:http://gcc.gnu.org/onlinedocs/gccint/C-and-C_002b_002b-Trees.html>). Further reading reveals that, for better or for worse, this ship has already sailed: GCC has for some time now had an option to dump its GIMPLE representation, which is probably far more useful for abusing the frontend than an AST dump.

In short, the earlier in the GCC pipeline the parse tree is dumped, the more useful the dump is for editing source and the less useful the dump is for feeding a nonfree compiler backend. Dumping references to source text, but not the text itself, simultaneously makes reading the dump into Emacs easier and feeding the dump into another backend harder.

My proposal:

-- GCC option for dumping refAST for editor use
   -- parse tree is dumped as early as is feasible, definitely prior to optimization
-- Near term:
   -- Emacs Lisp ported to Guile
-- Longer term:
   -- GCC buildable as Guile extensions
      -- provides full access to GCC internal structures, but only to Free software
   -- Emacs ported to Guile

PS: Questions have been raised as to the use of a full syntax tree. One feature I would find useful, and that a syntax tree would trivially enable, is making M-C-t next to a binary operator swap its operand expressions. This is a contrived example, but shows a simple case:

      a * b +. c * d
C:    a * c +. b * d
Lisp: c * d +. a * b

The dot represents point. The result labeled "C" is what happens currently. The result labeled "Lisp" is what would happen if Emacs actually understood C syntax on the same level as s-expressions. A refAST would enable this as a side-effect of its other uses.
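The difference between the two results comes down to which spans get exchanged. The following Python sketch assumes the spans are already known (in practice they would come from something like a refAST); the function name and the hard-coded offsets are hypothetical and exist only for this illustration.

```python
def swap_spans(source, left, right):
    """Exchange two non-overlapping [start, end) spans of text.

    `left` must end at or before the point where `right` begins.
    """
    (ls, le), (rs, re_) = left, right
    return (source[:ls] + source[rs:re_] + source[le:rs]
            + source[ls:le] + source[re_:])


source = "a * b + c * d"

# Token-level view: only the tokens adjacent to "+" are known,
# "b" at [4, 5) and "c" at [8, 9).
token_swap = swap_spans(source, (4, 5), (8, 9))

# Tree-level view: the operands of "+" are whole expressions,
# "a * b" at [0, 5) and "c * d" at [8, 13).
operand_swap = swap_spans(source, (0, 5), (8, 13))
```

Here `token_swap` is `"a * c + b * d"` and `operand_swap` is `"c * d + a * b"`, matching the "C" and "Lisp" results above. The swap itself is trivial; what the editor lacks today is the knowledge of where the operand expressions begin and end.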
