emacs-devel

Re: native compilation units


From: Lynn Winebarger
Subject: Re: native compilation units
Date: Sun, 12 Jun 2022 13:38:40 -0400



On Sat, Jun 11, 2022 at 4:34 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> In which sense would it be different from:
>>
>>     (cl-flet
>>         ...
>>       (defun ...)
>>       (defun ...)
>>       ...)
>>
>>
> Good point - it's my scheme background confusing me.  I was thinking defun
> would operate with similar scoping rules as defvar and establish a local
> binding, where fset (like setq) would not create any new bindings.

I was not talking about performance but about semantics (under the
assumption that if the semantics is the same then it should be possible
to get the same performance somehow).
 
I'm trying to determine if there's a set of expressions for which it is semantically sound
to perform the intraprocedural optimizations done at -O3 - that is, where it is correct to
treat functions in operator position as constants rather than as references through a
symbol's function cell.
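
To make the question concrete, here is a minimal sketch of why a call through a symbol is not a constant in general: the callee is looked up in the symbol's function cell at each call, so a later fset is visible to existing callers. The my-demo- names are hypothetical, and the behaviour described is for interpreted or ordinary byte-compiled code.

```elisp
;; Hypothetical names, for illustration only.
(defun my-demo-callee () 'old)
(defun my-demo-caller () (my-demo-callee))

;; The callee is resolved through the function cell at call time.
(defvar my-demo-before (my-demo-caller))

;; Rebind the function cell; the caller itself is left untouched.
(fset 'my-demo-callee (lambda () 'new))

;; The same caller now reaches the new definition.
(defvar my-demo-after (my-demo-caller))
```

An optimizer that inlined my-demo-callee into my-demo-caller would keep returning the old value, which is only sound if nothing can rebind the function cell.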
 

> (1) I don't know how much performance difference (if any) there is between
>      (fsetq exported-fxn #'internal-implementation)
> and
>      (defun exported-fxn (x y ...) (internal-implementation x y ...))

If you don't want the indirection, then use `defalias` (which is like
`fset` but registers the action as one that *defines* the function, for
the purpose of `C-h f` and the likes, and they also have slightly
different semantics w.r.t advice).
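
A small sketch of the two alternatives being compared, assuming a two-argument implementation; all names here are hypothetical stand-ins for the ones in the quoted question.

```elisp
;; Hypothetical names standing in for the ones in the quoted question.
(defun my-internal-implementation (x y) (+ x y))

;; Alias: the exported symbol's function cell holds the other symbol.
(defalias 'my-exported-alias #'my-internal-implementation)

;; Wrapper: one extra function call on every invocation.
(defun my-exported-wrapper (x y) (my-internal-implementation x y))

;; Inspect what the alias actually stores, then call both entry points.
(symbol-function 'my-exported-alias)
(my-exported-alias 1 2)
(my-exported-wrapper 1 2)
```

Note that the alias still stores a symbol, so a call through it is resolved via the function cell of my-internal-implementation at call time.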

What I'm looking for is a function as a first-class value, whether as a byte-code vector,
a symbolic reference to a position in the .text section (or equivalent) of a shared object that may or may
not have been loaded, or a pointer to a region that is allowed to be executed.
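
For the byte-code case only, a minimal sketch of what "function as a value" looks like today: byte-compile accepts a lambda form and returns the compiled function object, which can be stored directly in a function cell. The my-demo- names are hypothetical; the shared-object and executable-region cases are not covered by this sketch.

```elisp
;; `byte-compile' accepts a lambda form and returns the compiled function.
(defvar my-demo-square-code (byte-compile '(lambda (n) (* n n))))

;; Store the object itself (not a symbol) in a function cell.
(fset 'my-demo-square my-demo-square-code)

;; The function cell now holds byte-code, and calls need no indirection
;; through a second symbol.
(byte-code-function-p (symbol-function 'my-demo-square))
(my-demo-square 7)
```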
 
> (2) I'm also thinking of more aggressively forcing const-ness at run-time
> with something like:
> (eval-when-compile
>   (cl-flet ((internal-implementation (x y ...) body ...))
>      (fset exported-fxn #'internal-implementation)))
> (fset exported-fxn (eval-when-compile #'exported-fxn))
>
> If that makes sense, is there a way to do the same thing with defun?

I don't know what the above code snippet is intended to show/do, sorry :-(

I'm trying to capture a function as a first class value.
Better example - I put the following in ~/test1.el and byte-compiled it (with Emacs 28.1 running on Cygwin).
-------------
(require 'cl-lib)
(eval-when-compile
  (cl-labels ((my-evenp (n) (if (= n 0) t (my-oddp (1- n))))
              (my-oddp (n) (if (= n 0) nil (my-evenp (1- n)))))
    (defun my-global-evenp (n) (my-evenp n))
    (defun my-global-oddp (n) (my-oddp n))))
-----------------
I get the following (expected) error when running in batch (or interactively, if only loading the compiled file)
$ emacs -batch --eval '(load "~/test1.elc")' --eval '(message "%s" (my-global-evenp 5))'
Loading ~/test1.elc...
Debugger entered--Lisp error: (void-function my-global-evenp)
  (my-global-evenp 5)
  (message "%s" (my-global-evenp 5))
  eval((message "%s" (my-global-evenp 5)) t)
  command-line-1(("--eval" "(load \"~/test1.elc\")" "--eval" "(message \"%s\" (my-global-evenp 5))"))
  command-line()
  normal-top-level()
The function symbol is only defined at compile time by the defun, so it is undefined when the byte-compiled file is loaded in a clean environment.
When I tried using (fset 'my-global-evenp (eval-when-compile #'my-ct-global-evenp)), it just produced a symbol indirection, which was disappointing.
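
The symbol indirection can be seen in isolation with a minimal sketch: #'name evaluates to the symbol, while symbol-function fetches the function object currently stored in the cell. The my-demo- names are hypothetical.

```elisp
(defun my-demo-target (n) (* 2 n))

;; #'my-demo-target evaluates to the symbol, so this is an indirection.
(fset 'my-demo-by-symbol #'my-demo-target)

;; `symbol-function' fetches the current function object instead.
(fset 'my-demo-by-value (symbol-function 'my-demo-target))

;; Redefine the target: only the symbol-based alias follows it.
(defun my-demo-target (n) (* 3 n))

(my-demo-by-symbol 5)   ; goes through the new definition
(my-demo-by-value 5)    ; still uses the captured object
```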
So in the next attempt (~/test2.el), global compile-time variables are assigned trampolines to the local functions, as values, at compile time.
-------------------------------
(require 'cl-lib)
(eval-when-compile
  (defvar my-ct-global-evenp nil)
  (defvar my-ct-global-oddp nil)
  (cl-labels ((my-evenp (n) (if (= n 0) t (my-oddp (1- n))))
              (my-oddp (n) (if (= n 0) nil (my-evenp (1- n)))))
    (setq my-ct-global-evenp (lambda (n) (my-evenp n)))
    (setq my-ct-global-oddp (lambda (n) (my-oddp n)))))
(fset 'my-global-evenp (eval-when-compile my-ct-global-evenp))
(fset 'my-global-oddp (eval-when-compile my-ct-global-oddp))
-------------------------------
Then I get
$ emacs -batch --eval '(load "~/test2.elc")' --eval '(message "%s" (my-global-evenp 5))'
Loading ~/test2.elc...
Debugger entered--Lisp error: (void-variable --cl-my-evenp--)
  my-global-evenp(5)
  (message "%s" (my-global-evenp 5))
  eval((message "%s" (my-global-evenp 5)) t)
  command-line-1(("--eval" "(load \"~/test2.elc\")" "--eval" "(message \"%s\" (my-global-evenp 5))"))
  command-line()
  normal-top-level()
This I did not expect.  Maybe the variable name is just an artifact of the way cl-labels is implemented and not a fundamental limitation.
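
A minimal sketch of that guess, assuming lexical binding is in effect (forced here by passing t as the second argument to eval): cl-labels binds each local function to a generated variable and calls it through that variable, so the trampoline only works if the variable is captured lexically. The my-demo- names are hypothetical.

```elisp
(require 'cl-lib)

;; Evaluate the cl-labels form with lexical binding forced on, and keep
;; the resulting closure in a global variable (hypothetical name).
(defvar my-demo-labels-evenp
  (eval '(cl-labels ((my-evenp (n) (if (= n 0) t (my-oddp (1- n))))
                     (my-oddp (n) (if (= n 0) nil (my-evenp (1- n)))))
           (lambda (n) (my-evenp n)))
        t))

;; Install the closure object itself as a global function.
(fset 'my-demo-labels-global-evenp my-demo-labels-evenp)

(my-demo-labels-global-evenp 4)
(my-demo-labels-global-evenp 5)
```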
Third attempt to express a statically allocated closure with constant code (which is one way of viewing an ELF shared object):
--------------------------------
(require 'cl-lib)
(eval-when-compile
  (defvar my-ct-global-evenp nil)
  (defvar my-ct-global-oddp nil)
  (let (my-evenp my-oddp)
    (setq my-evenp (lambda (n) (if (= n 0) t (funcall my-oddp (1- n)))))
    (setq my-oddp (lambda (n) (if (= n 0) nil (funcall my-evenp (1- n)))))
    (setq my-ct-global-evenp (lambda (n) (funcall my-evenp n)))
    (setq my-ct-global-oddp (lambda (n) (funcall my-oddp n)))))

(fset 'my-global-evenp (eval-when-compile my-ct-global-evenp))
(fset 'my-global-oddp (eval-when-compile my-ct-global-oddp))
--------------------------------
And the result is worse:
$ emacs -batch --eval '(load "~/test3.elc")' --eval '(message "%s" (my-global-evenp 5))'
Loading ~/test3.elc...
Debugger entered--Lisp error: (void-variable my-evenp)
  my-global-evenp(5)
  (message "%s" (my-global-evenp 5))
  eval((message "%s" (my-global-evenp 5)) t)
  command-line-1(("--eval" "(load \"~/test3.elc\")" "--eval" "(message \"%s\" (my-global-evenp 5))"))
  command-line()
  normal-top-level()
This was not expected with lexical scope. 
$ emacs -batch --eval '(load "~/test3.elc")' --eval "(message \"%s\" (symbol-function 'my-global-evenp))"
Loading ~/test3.elc...
#[(n)   !\207 [my-evenp n] 2]
At least my-global-evenp has byte-code as a value, not a symbol, which was the intent.  I get the same result if I wrap the two lambdas 
stored in the my-ct-* variables with "byte-compile", which is what I intended (for the original to be equivalent to explicitly compiling the form).

However, what I expected would have been the byte-code equivalent of an ELF object with 2 symbols defined for relocation.
So why is the compiler producing code that would correspond to the "let" binding my-evenp and my-oddp being dynamically scoped?
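
One thing that may be worth ruling out, stated as an assumption rather than a diagnosis: the listings above carry no lexical-binding cookie, and in Emacs 28 a file without that cookie is compiled with dynamic binding, in which case a let is not lexical. The sketch below evaluates the same form both ways using the second argument of eval; the my-demo- names are hypothetical.

```elisp
;; The same shape as the third attempt, kept as data so it can be
;; evaluated under both binding disciplines.  Names are hypothetical.
(defvar my-demo-form
  '(let (my-evenp my-oddp)
     (setq my-evenp (lambda (n) (if (= n 0) t (funcall my-oddp (1- n)))))
     (setq my-oddp (lambda (n) (if (= n 0) nil (funcall my-evenp (1- n)))))
     (lambda (n) (funcall my-evenp n))))

;; Second argument t: lexical binding, so the lambdas close over the let.
(defvar my-demo-lexical-evenp (eval my-demo-form t))

;; Second argument nil: dynamic binding, so my-evenp is looked up at call
;; time, after the let has already been exited.
(defvar my-demo-dynamic-evenp (eval my-demo-form nil))

(defvar my-demo-lexical-result (funcall my-demo-lexical-evenp 4))

;; Capture the error data instead of entering the debugger.
(defvar my-demo-dynamic-result
  (condition-case err
      (funcall my-demo-dynamic-evenp 4)
    (void-variable err)))
```

If the dynamic case reproduces the void-variable error from the listing above, adding the cookie to the first line of the source file would be a cheap experiment before concluding anything about the byte-code format.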
That made me curious, so I found https://rocky.github.io/elisp-bytecode.pdf and reviewed it.
I believe I see the issue now.  With the current byte-codes, there's just no way to express a call to an offset in the current byte-vector.
There's not even a way to reference the address of the current byte vector to use as an argument to funcall.  There's no way to reference
symbols that were resolved at compile-time at all, which would require the equivalent of dl symbols embedded in a code vector 
that would be patched at load time.  That forces the compiler to emit a call to a symbol.  And when the manual talks about lexical scope,
it's only for "variables" not function symbols.
That explains a lot.  The reason Andrea had to use LAP as the starting point for optimizations, for example.  I can't find a spec for 
Emacs's version of LAP, but I'm guessing it can still express symbolic names for local function expressions in a way byte-code 
simply cannot.
I don't see how the language progresses without resolving the inconsistency between what's expressible in ELF and what's expressible
in a byte-code object.
Here is one possible set of changes to make the two compatible.  I'd reuse the relative goto byte codes, provided Emacs has not emitted them since v19.
I'd also add a few special registers.  There's already one used to enable GOTO (i.e. the program counter).
  • byte codes for call/returns directly into/from byte code objects
    • CALL-RELATIVE - execute a function call to the current byte-vector object with the pc set to the pc+operand0 - basically PIC code.
      If a return is required, the byte compiler should arrange for the return address to be pushed before other operands to the function being called.
      No additional manipulation of the stack is required, since funcall would just pop the arguments and then immediately push them again.
      Alternatively, you could have a byte-code that explicitly allocates a stack frame (if needed), pushes the return offset, then does a goto.
    • CALL-ABSOLUTE - execute a function call to a specified byte-vector object + pc, given as the first 2 operands.  This is useless until the byte-code object
      supports a notion of relocation symbols, i.e. named compile-time constants that get patched on load in one way or another, e.g. directly by
      modifying the byte-string with the value at run-time (assuming eager loading), or indirectly by adding a "linkage table" of external symbols
      that will be filled in at load and specifying an index into that table.
    • RETURN-RELATIVE - operand is the number of items that have to be popped from the stack to get the return address, which is an offset in the current
      byte-vector object. Alternatively, could be implemented as "discardN <n>; goto" 
    • RETURN-ABSOLUTE - same as return-relative, but the return address is given by two operands, a byte-vector and offset in the byte-vector
  • Alternate formulation
    • RESERVE-STACK operand is a byte-vector object (reference) that will be used to determine how much total stack space will be required for safety, and
      ensure enough space is allocated. 
    • GOTO-ABSOLUTE - operand is a byte-vector object and an offset. Immediate control transfer to the specified context
    • These two are adequate to implement the above
  • Additional registers and related instructions
    • PC - register already exists
      • PUSH-PC - the opposite of goto, which pops the stack into the PC register.  
    • GOT - a table of byte-vectors + offsets corresponding to a PLT section of the byte-vector specifying the compile-time symbols that have to be resolved
      • The byte-vector references + offset in the "absolute" instructions above would be specified as an index into this table.  Otherwise the byte-vector could
        not be saved and directly loaded for later execution.
    • STATIC - a table for the lexical variables allocated and accessible to the closures at compile-time.  The compiler should treat all sexps as occurring at the
      top-level with regard to the run-time lexical environment.  A form like (let ((x 5)) (byte-compile (lambda (n) (+ n (eval-when-compile x))))) should produce
      byte-code with the constant 5, while (let ((x 5)) (byte-compile (lambda (n) (+ n x)))) should produce byte-code adding the argument n to the value of the
      global variable x at run-time.
      • PUSH-STATIC
      • POP-STATIC
    • ENV - the environment register.
      • ENV-PUSH-FRAME - operand is the number of stack items to capture as a (freshly allocated) frame, which is then added as a rib to a new
        environment pointed to by the ENV register
      • PUSH-ENV - push the value of ENV onto the stack
      • POP-ENV - pop the top of the stack into ENV, discarding any value there
  • Changes to byte-code object
    • IMPORTS table of symbols defined at compile-time requiring resolution to constants at load-time, particularly for references to compilation units
      (byte-vector or native code) and exported symbols bound to constants (really immutable)
      Note - the "relative" versions of call and return above could be eliminated if "IMPORTS" includes self-references into the byte-vector object itself
    • EXPORTS table of symbols available to be called or referenced externally
    • Static table with values initialized from the values in the closure at compile-time
    • Constant table and byte string remain
  • Changes to byte-code loader
    • Read the new format
    • Resolve symbols - should link to specific compilation units rather than "features", as compilation units will define specific exported symbols, while
      features do not support that detail.  Source could still use "require", but the symbols referenced from the compile-time environment would have
      to be traced back to the compilation unit supplying them (unless they are recorded as constants by an expression like
      (eval-when-compile (setq v (eval-when-compile some-imported-symbol))))
    • Allocate and initialize the static segment
    • Create a "static closure" for the compilation unit = loaded object + GOT + static frame - record as singleton entry mapping compilation units to closures (hence "static")
  • Changes to funcall
    • invoking a function from a compilation unit would require setting GOT and STATIC, and setting the ENV register to point to STATIC as the first rib (directly or indirectly)
    • invoking a closure with a "code" element pointing to an "exported" symbol from a compilation unit + an environment pointer
      • Set GOT and STATIC according to the byte-vector's static closure
    • Dispatch according to whether compilation unit is native or byte-compiled, but both have the above elements
  • Changes to byte-compiler
    • Correct the issues with compile-time evaluation + lexical scope of function names above
    • Emit additional sections in byte-code
    • Should be able to implement the output of native-compiler pass (pre-libgccjit) with "-O3" flags in byte-code correctly
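
The IMPORTS/EXPORTS and linkage-table part of the list above can be sketched in plain Lisp. This is only a model under stated assumptions: the my-cu- structure, registry and functions are all hypothetical, nothing like them exists in Emacs, and the "code" is an ordinary function object rather than an offset into a byte-vector.

```elisp
(require 'cl-lib)

;; Hypothetical model of a compilation unit; none of this exists in Emacs.
(cl-defstruct (my-cu (:constructor my-cu-create))
  name      ; symbol identifying the unit
  exports   ; alist of (SYMBOL . FUNCTION-OBJECT)
  imports   ; list of (UNIT-NAME . SYMBOL), in linkage-table order
  got)      ; vector of resolved imports, filled in at load time

(defvar my-cu-registry (make-hash-table :test #'eq)
  "Loaded units, keyed by name.")

(defun my-cu-load (cu)
  "Resolve the imports of CU against loaded units, then register CU."
  (let ((got (make-vector (length (my-cu-imports cu)) nil))
        (index 0))
    (dolist (import (my-cu-imports cu))
      (let* ((provider (or (gethash (car import) my-cu-registry)
                           (error "Unit %s is not loaded" (car import))))
             (entry (or (assq (cdr import) (my-cu-exports provider))
                        (error "Unit %s does not export %s"
                               (car import) (cdr import)))))
        ;; Patch the table slot with the exported function object.
        (aset got index (cdr entry)))
      (setq index (1+ index)))
    (setf (my-cu-got cu) got)
    (puthash (my-cu-name cu) cu my-cu-registry)
    cu))

(defun my-cu-call (cu index &rest args)
  "Call import number INDEX of CU through its linkage table."
  (apply (aref (my-cu-got cu) index) args))

;; Unit A exports a function object; unit B imports it by name.
(defvar my-demo-unit-a
  (my-cu-load (my-cu-create
               :name 'unit-a
               :exports (list (cons 'double (lambda (n) (* 2 n)))))))

(defvar my-demo-unit-b
  (my-cu-load (my-cu-create :name 'unit-b
                            :imports '((unit-a . double)))))

(my-cu-call my-demo-unit-b 0 21)
```

After loading, a call from unit B goes through an index into its own table and never through a symbol's function cell, which is the property the proposed "absolute" instructions would rely on.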

Lynn