Hi All,
I am working on a ffi-helper (FH): a program that will read in a C dot-h file
and generate a Guile dot-scm file which defines a module to provide hooks into
the associated C library.
This is a rework of the first part of the documentation. It provides an example
and a section explaining part of the design.
(I have recently dumped my macbook w/ flaky keyboard for a ubuntu laptop. I am
still adjusting. I am missing macports a little.)
Matt
FFI Helper for Guile
********************
Matt Wette
January 2018
With NYACC Version 0.83.0
1 Introduction
**************
The acronym FFI stands for "Foreign Function Interface".
It refers to
the Guile facility for binding functions and variables
from C source
libraries into Guile programs. This distribution
provides utilities for
generating a loadable Guile module from a set of C
declarations and
associated libraries. The C declarations can, and
conventionally do,
come from naming a set of C include files. The nominal
method for use
is to write a _ffi-module_ specification in a file which
includes a
'define-ffi-module' declaration, and then use the command
'guild
compile-ffi' to produce an associated file of Guile
Scheme code.
$ guild compile-ffi ffi/cairo.ffi
wrote `ffi/cairo.scm'
The FH does not generate C code. The hooks to access
functions in the
Cairo library are provided in 100% Guile Scheme via
'(system foreign)'.
The compiler for the FFI Helper (FH) is based on the C
parser and
utilities which are included in the NYACC
(https://www.nongnu.org/nyacc)
package. Development for the FH is currently being
performed in the
'c99dev' branch of the associated git repository. Within
the NYACC
distribution, the relevant modules can be found under the
directory
'examples/'.
Use of the FFI-helper module depends on the
_scheme-bytestructure_
package available from
<https://github.com/TaylanUB/scheme-bytestructures>.
Releases are
available at
<https://github.com/TaylanUB/scheme-bytestructures/releases>.
At runtime, after the FFI Helper has been used to
create Scheme code,
the modules '(system ffi-help-rt)' and '(bytestructures
guile)' are
required. No other code from the NYACC distribution is
needed.
However, note that the process of creating the Scheme
output depends on
reading system headers, so the generated code may well
contain operating
system and machine dependencies. If you copy code to a
new machine, you
should re-run 'guild compile-ffi'.
You are probably hoping to see an example, so let's
try one.
This is a small FH example to illustrate its use. We
will start with
the Cairo (cairographics.org) package because that is the
first one I
started with in developing the FFI Helper. Say you are
an avid Guile
user and want to be able to use Cairo in Guile. On most
systems Cairo
comes with the associated _pkg-config_ support files;
this demo depends
on that support.
Warning: The FFI Helper package is under active
development and there
is some chance the following example will cease to work
in the future.
If you want to follow along and are working in the
distribution tree,
you should source the file 'env.sh' in the 'examples'
directory.
By practice, I like to put all FH generated modules
under a directory
called 'ffi/', so we will do that. We start by
generating, in the 'ffi'
directory, a file named 'cairo.ffi' with the following
contents:
(define-ffi-module (ffi cairo)
#:pkg-config "cairo"
#:include '("cairo.h" "cairo-pdf.h"
"cairo-svg.h"))
To generate a Guile module you execute 'guild' as
follows:
$ guild compile-ffi ffi/cairo.ffi
wrote `ffi/cairo.scm'
Though the file 'ffi/cairo.ffi' is only three lines long, the file
'ffi/cairo.scm' will be over five thousand lines long. It looks like
the following:
(define-module (ffi cairo)
#:use-module (system ffi-help-rt)
#:use-module ((system foreign) #:prefix ffi:)
#:use-module (bytestructures guile))
(define link-libs
(list (dynamic-link "libcairo")))
;; int cairo_version(void);
(define ~cairo_version
(delay (fh-link-proc ffi:int "cairo_version"
(list) link-libs)))
(define (cairo_version)
(let () ((force ~cairo_version))))
(export cairo_version)
...
;; typedef struct _cairo_matrix {
;; double xx;
;; double yx;
;; double xy;
;; double yy;
;; double x0;
;; double y0;
;; } cairo_matrix_t;
(define-public cairo_matrix_t-desc
(bs:struct
(list `(xx ,double) `(yx ,double) `(xy ,double)
`(yy ,double) `(x0 ,double) `(y0
,double))))
(define-fh-compound-type cairo_matrix_t
cairo_matrix_t-desc
cairo_matrix_t? make-cairo_matrix_t)
(export cairo_matrix_t cairo_matrix_t?
make-cairo_matrix_t)
... many, many more declarations ...
;; access to enum symbols and #define'd constants:
(define ffi-cairo-symbol-val
(let ((sym-tab
'((CAIRO_SVG_VERSION_1_1 . 0)
(CAIRO_SVG_VERSION_1_2 . 1)
(CAIRO_PDF_VERSION_1_4 . 0)
(CAIRO_PDF_VERSION_1_5 . 1)
(CAIRO_REGION_OVERLAP_IN . 0)
(CAIRO_REGION_OVERLAP_OUT . 1)
... more constants ...
(CAIRO_MIME_TYPE_JBIG2_GLOBAL_ID
.
"application/x-cairo.jbig2-global-id"))))
(lambda (k) (or (assq-ref sym-tab k)))))
(export ffi-cairo-symbol-val)
(export cairo-lookup)
... more ...
Note that from the _pkg-config_ spec the FH compiler
picks up the
required libraries to bind in. Also, '#define' based
constants, as well
as those defined by enums, are provided in a lookup
function
'ffi-cairo-symbol-val'. So, for example
guile> (use-modules (ffi cairo))
;;; ffi/cairo.scm:6112:11: warning: possibly unbound variable `cairo_raster_source_acquire_func_t*'
;;; ffi/cairo.scm:6115:11: warning: possibly unbound variable `cairo_raster_source_release_func_t*'
guile> (ffi-cairo-symbol-val 'CAIRO_FORMAT_ARGB32)
$1 = 0
We will discuss the warnings later. They are signals
that extra code
needs to be added to the ffi module. But you see how the
constants (but
not CPP function macros) can be accessed.
Let's try something more useful: a real program.
Create the
following code in a file, say 'cairo-demo.scm', then fire
up a Guile
session and 'load' the file.
(use-modules (ffi cairo))
(define srf (cairo_image_surface_create
'CAIRO_FORMAT_ARGB32 200 200))
(define cr (cairo_create srf))
(cairo_move_to cr 10.0 10.0)
(cairo_line_to cr 190.0 10.0)
(cairo_line_to cr 190.0 190.0)
(cairo_line_to cr 10.0 190.0)
(cairo_line_to cr 10.0 10.0)
(cairo_stroke cr)
(cairo_surface_write_to_png srf "cairo-demo.png")
(cairo_destroy cr)
(cairo_surface_destroy srf)
guile> (load "cairo-demo.scm")
...
;;; compiled /.../cairo.scm.go
;;; compiled /.../cairo-demo.scm.go
guile>
If we set up everything correctly we should have
generated the target
file 'cairo-demo.png' which contains the image of a
square. A few items
in the above code are notable. First, the call to
'cairo_image_surface_create' accepted a symbolic form
''CAIRO_FORMAT_ARGB32' for the format argument. It would
have also
accepted the associated constant '0'. In addition,
procedures declared
in '(ffi cairo)' will accept Scheme strings where the C
function wants
"pointer to string."
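The string convenience described above can be sketched with plain '(system foreign)' calls. This is only an illustration of the idea, not the code the FH generates: the helper name 'unwrap-string-arg' is hypothetical, and 'strlen' from the C library stands in for a Cairo function so that the sketch runs without Cairo installed.

```scheme
(use-modules (system foreign))

;; Hypothetical helper (not part of the FH runtime): accept either a
;; Scheme string or a Guile pointer where C expects a `char *'.
(define (unwrap-string-arg arg)
  (cond
   ((string? arg) (string->pointer arg))
   ((pointer? arg) arg)
   (else (error "expected a string or a pointer:" arg))))

;; Raw binding for `size_t strlen(const char *)', looked up among the
;; objects already linked into Guile.
(define raw-strlen
  (pointer->procedure size_t
                      (dynamic-func "strlen" (dynamic-link))
                      (list '*)))

;; Wrapper that converts its argument before calling into C.
(define (c-strlen str)
  (raw-strlen (unwrap-string-arg str)))

(display (c-strlen "cairo-demo.png"))   ; a Scheme string is accepted
(newline)
```

The wrapper also still accepts a pointer produced by 'string->pointer', so code written against the raw binding keeps working.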
Now try this in your Guile session:
guile> srf
$4 = #<cairo_surface_t* 0x7fda53e01880>
guile> cr
$5 = #<cairo_t* 0x7fda54828800>
Note that the FH keeps track of the C types you use.
This can be useful
for debugging but may bloat the namespace. The constants
you see are
the pointer values. But it goes further. Let's generate
a matrix type:
guile> (define m (make-cairo_matrix_t))
guile> m
$6 = #<cairo_matrix_t 0x10cc26c00>
guile> (use-modules (system ffi-help-rt))
guile> (pointer-to m)
$7 = #<cairo_matrix_t* 0x10cc26c00>
When it comes to C APIs that expect the user to allocate
memory for a
structure and pass the pointer address to the C function,
FH provides a
solution:
guile> (cairo_get_matrix cr (pointer-to m))
guile> (fh-object-ref m 'xx)
$9 = 1.0
1.1 The Guile Foreign Function Interface
========================================
Guile has an API, called the Foreign Function Interface,
which allows
one to avoid writing and compiling C wrapper code in
order to access C
coded libraries. The API is based on 'libffi' and is
covered in the
Guile Reference Manual. We review some important bits
here. For more
insight you should read the relevant sections in the
Guile Reference
Manual. For more info on libffi internals visit libffi
(https://github.com/libffi/libffi).
The relevant procedures used by the FH are
'dynamic-link'
links libraries into the Guile session
'dynamic-func'
generates a Scheme-level pointer to a C function
'pointer->procedure'
generates a Scheme lambda given a C function signature
'dynamic-pointer'
provides access to global C variables
Several of the above require import of the module
'(system foreign)'.
In order to generate a Guile procedure wrapper for a
function, say
'int foo(char *str)', in some foreign library, say
'libbar.so', you can
use something like the following:
(use-modules (system foreign))
(define foo (pointer->procedure
int
(dynamic-func "foo" (dynamic-link
"libbar"))
(list '*)))
The argument 'int' is a variable name for the return
type, the next
argument is an _expression_ for the function pointer and
the third
argument is an _expression_ for the function argument
list. To execute
the function, which expects a C string, you use something
like
(define result-code (foo (string->pointer
"hello")))
If you want to try a real example, this should work:
guile> (use-modules (system foreign))
guile> (define strlen
         (pointer->procedure
          int (dynamic-func "strlen" (dynamic-link)) (list '*)))
guile> (strlen (string->pointer "hello, world"))
$1 = 12
It is important to realize that internally Guile takes
care of
converting Scheme arguments to and from C types. Scheme
does not have
the same type system as C and the Guile FFI is somewhat
forgiving here.
When we declare a C function interface with, say, a
uint32 argument
type, in Scheme you can pass an exact numeric integer.
The FH attempts
to be even more forgiving, allowing one to pass symbols
where C enums
(i.e., integers) are expected.
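As a small illustration of this conversion, here is a sketch that binds 'int abs(int)' from the C library. It assumes 'abs' is visible through the handle returned by '(dynamic-link)' with no argument, which is the case on typical GNU/Linux systems. The name 'c-abs' is mine.

```scheme
(use-modules (system foreign))

;; Binding for `int abs(int)' from the C library already linked into Guile.
(define c-abs
  (pointer->procedure int
                      (dynamic-func "abs" (dynamic-link))
                      (list int)))

;; An exact Scheme integer is converted to a C int on the way in, and
;; the C int result is converted back to a Scheme integer.
(display (c-abs -42))   ; prints 42
(newline)

;; A value of the wrong kind is rejected on the Scheme side: a string
;; is not an exact integer, so the call raises an error.
(define rejected?
  (catch #t
    (lambda () (c-abs "forty-two") #f)
    (lambda args #t)))
```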
As mentioned, access to libraries not compiled into
Guile is
accomplished via 'dynamic-link'. To link the shared
library 'libfoo.so'
into Guile one would write something like the following:
(define foo-lib (dynamic-link "libfoo"))
Note that Guile takes care of dealing with the file
extension (e.g.,
'.so'). Where Guile looks for libraries is system
dependent, but
usually it will find shared objects in the following locations:
* '(assq-ref %guile-build-info 'libdir)'
* '(assq-ref %guile-build-info 'extensiondir)'
* '/usr/lib' on GNU/Linux and macOS
* $LD_LIBRARY_PATH on GNU/Linux, $DYLD_LIBRARY_PATH on macOS
* directories listed in /etc/ld.so.conf on GNU/Linux
When used with no argument 'dynamic-link' returns a
handle for objects
already linked with Guile. The procedure 'dynamic-link'
returns a
library handle for acquiring function and variable
handles, or pointers,
for objects (e.g., a pointer for a function) in the
library.
Theoretically, once a library has been dynamically linked
into Guile,
the _expression_ '(dynamic-link)' (with no argument) should
suffice to
provide a handle to acquire object handles, but I have
found this is not
always the case. The FH will try all library handles
defined by a ffi
module to acquire object pointers.
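The idea of trying several library handles in turn can be sketched as follows. This is not the FH implementation (the generated code uses 'fh-link-proc' for this); the helper 'find-func' is hypothetical, and it relies on 'dynamic-func' raising an error when a symbol is missing from a handle.

```scheme
(use-modules (system foreign))

;; Hypothetical helper: return the first function pointer named NAME
;; found among HANDLES, or #f if no handle provides it.
(define (find-func name handles)
  (and (pair? handles)
       (or (false-if-exception (dynamic-func name (car handles)))
           (find-func name (cdr handles)))))

;; A module would list its own library handles first; here the only
;; handle is the one for objects already linked with Guile.
(define link-libs (list (dynamic-link)))

(define c-strlen
  (pointer->procedure size_t
                      (find-func "strlen" link-libs)
                      (list '*)))

(display (c-strlen (string->pointer "hello")))   ; prints 5
(newline)
```

Returning '#f' rather than raising an error lets the caller decide how to report a symbol that none of the libraries provides.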
1.2 The FFI Helper Design
=========================
In this section we hope to provide some insight into how the
FH works. The
FH specification, via the dot-ffi file, determines the
set of
declarations which will be included in the target Guile
module. If
there is no declaration filter, then all the declarations
from the
specified set of include files are targeted. With the
use of a
declaration filter, this set can be reduced. By
declaration we mean
typedefs, aggregate definitions (i.e., structs and
unions), function
declarations, and external variables.
In the C language typedefs define type aliases, so
there is no harm
in expanding typedefs which appear outside the
specification. For
example, say the file 'foo.h' includes a declaration for
the typedef
'foo_t' and the file 'bar.h' includes a declaration for
the typedef
'bar_t'. Furthermore, suppose 'foo_t' is a struct that
references
'bar_t'. Then the FH will preserve the typedef 'foo_t'
but expand
'bar_t'. That is, if the declarations are
typedef int bar_t; /* from bar.h */
typedef struct { bar_t x; double y; } foo_t; /* from
foo.h */
then the FH will treat 'foo_t' as if it had been declared
as
typedef struct { int x; double y; } foo_t; /* from
foo.h */
When it comes to handling C types in Scheme the FH
tries to leave
base types (i.e., numeric types) alone and uses its own
type system
based on Guile's _structs_ and associated _vtables_ for
structs, unions,
function types and pointer types. Enum types are handled
specially as
described below. The FH type system associates with each
type a number
of procedures. One of these is the printer procedure
which provides the
association of type with output seen in the demo above.
One of the challenges in automating C-Scheme type
conversion is that
C code uses a lot of pointers. So as the FH generates
types for
aggregates, it will automatically generate types for
associated
pointers. For example, in the case above with 'foo_t'
the FH will
generate an aggregate type named 'foo_t' and a pointer
type named
'foo_t*'. In addition the FH generates code to link
these two together
so that, given an object 'f1' of type 'foo_t', the
_expression_
'(pointer-to f1)' will generate an object of type
'foo_t*'. This makes
the task of generating an object value in Scheme, and
then passing the
pointer to that value as an argument to a FFI-generated
procedure, easy.
The inverse operation 'value-at' is also provided. Note that sometimes
the C code needs to work with pointer-to-pointer types. The FH does
not produce double pointers; in that case the user must add code to
the FH module definition to support the required additional type
(e.g., 'foo_t**').
In addition, the FH type system provides unwrap and wrap procedures
used internally by ffi-generated modules for function calls. These
convert FH types to and from objects of the types expected by Guile's
FFI. For example, the unwrap procedure associated with the FH pointer
type 'foo_t*' will convert a 'foo_t*' object to a Guile 'pointer'.
Similarly, on return the wrap procedure is applied to convert to FH
types. When the FH generates a type, for example 'foo_t', it also
generates an exported procedure 'make-foo_t' that users can use to
build an object of that type. The FH also generates a predicate
'foo_t?' to determine if an object is of that type. The
'(system
ffi-help-rt)' module provides a procedure 'fh-object-ref'
to convert an
object of type 'foo_t' to the underlying bytestructures
representation.
For numeric and pointer types, this will generate a
number and for
aggregate types, a bytestructure. Additional arguments
to
'fh-object-ref' for aggregates work as with the
bytestructures package
and enable selection of components of the aggregate.
Note that the
underlying type for a bytestructure pointer is an
integer.
Enums are handled specially. In C, enums are
represented by
integers. The FH does not generate types for C enums or
C enum
typedefs. Instead, the FH defines unwrap and wrap
procedures to convert
Scheme values to and from integers, where the Scheme
values can be
integers or symbols. For example, if, in C, the enum
typedef 'baz_t'
has element 'OPTION_A' with value 1, a procedure
expecting an argument
of type 'baz_t' will accept the symbol ''OPTION_A' or the
integer '1'.
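A minimal sketch of such a pair of conversion procedures is shown here. The names 'unwrap-baz_t', 'wrap-baz_t' and 'baz_t-enum-alist' are hypothetical and are not the names the FH generates; 'OPTION_A' with value 1 comes from the example above, while 'OPTION_B' is assumed for illustration. Returning a symbol from the wrap procedure is a choice made for this sketch.

```scheme
;; Hypothetical symbol table for the enum typedef baz_t.  OPTION_A = 1
;; comes from the text; OPTION_B is assumed for illustration.
(define baz_t-enum-alist '((OPTION_A . 1) (OPTION_B . 2)))

;; Scheme -> C: accept a known symbol or an exact integer.
(define (unwrap-baz_t v)
  (cond
   ((symbol? v)
    (or (assq-ref baz_t-enum-alist v)
        (error "unknown baz_t symbol:" v)))
   ((and (integer? v) (exact? v)) v)
   (else (error "expected a symbol or an integer:" v))))

;; C -> Scheme: map an integer back to its symbol, or return the
;; integer unchanged when no symbol is known for it.
(define (wrap-baz_t n)
  (let loop ((entries baz_t-enum-alist))
    (cond
     ((null? entries) n)
     ((= n (cdar entries)) (caar entries))
     (else (loop (cdr entries))))))

(display (unwrap-baz_t 'OPTION_A))   ; prints 1
(newline)
```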
Where the FH generates types, the underlying
representation is a
_bytestructure descriptor_. That is, the FH types are
essentially a
layer on top of a bytestructure. The layer provides
identification seen
at the Guile REPL, unwrap and wrap procedures which are
used in function
handling (not normally visible to the user) and
procedures to convert
types to and from pointer types.
For base types (e.g., 'int', 'double') the FH uses the
associated
Scheme values or the associated bytestructures values.
(I think this is
all bytestructure values now.)
The underlying representation of bytestructure values
is
_bytevectors_. See the Guile Reference Manual for more
information on
this datatype.
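To give a feel for what sits at the bottom of this stack, here is a sketch that lays out the six doubles of 'cairo_matrix_t' in a raw bytevector by hand. It is not how the bytestructures package is implemented; the helper names are hypothetical and the code assumes 8-byte IEEE doubles, so the six fields pack into 48 bytes with no padding.

```scheme
(use-modules (rnrs bytevectors) (srfi srfi-1))

;; Field order of cairo_matrix_t as shown earlier; each field is a double.
(define matrix-fields '(xx yx xy yy x0 y0))
(define double-size 8)   ; assumption: 8-byte IEEE doubles

;; Allocate zero-filled storage for one matrix.
(define (make-matrix-bytes)
  (make-bytevector (* double-size (length matrix-fields)) 0))

;; Byte offset of FIELD within the storage.
(define (matrix-offset field)
  (let ((index (list-index (lambda (f) (eq? f field)) matrix-fields)))
    (unless index (error "unknown field:" field))
    (* double-size index)))

(define (matrix-ref bv field)
  (bytevector-ieee-double-native-ref bv (matrix-offset field)))

(define (matrix-set! bv field value)
  (bytevector-ieee-double-native-set! bv (matrix-offset field) value))

;; Fill in the diagonal of an identity matrix.
(define m (make-matrix-bytes))
(matrix-set! m 'xx 1.0)
(matrix-set! m 'yy 1.0)
```

The bytestructure descriptor does this offset bookkeeping for you, including any padding the C compiler would insert for mixed field types.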
The following routines are user-level procedures
provided by the
runtime module '(system ffi-help-rt)':
'fh-type?'
a predicate to indicate whether an object is a FH
type
'fh-object?'
a predicate to indicate whether an object is a FH
object
'fh-object-val'
the underlying bytestructure value
'fh-object-ref'
a procedure that works like 'bytestructure-ref' on
the underlying
object
'fh-object-set!'
a procedure that works like 'bytestructure-set!' on
the underlying
object
'pointer-to'
a procedure, given a FH object, or a bytestructure,
that returns an
associated pointer object (i.e., a pointer type
whose object value
is the address of the underlying argument); this may
be a FH type
or a bytestructure
'value-at'
a procedure to dereference an object
'fh-cast'
a procedure to cast arguments for variadic C
functions
'make-type'
make base type, as listed below; also used to make
bytestructure
objects for base types (e.g., '(make-double)' for
'double')
Supported base types are
short      unsigned-short  int       unsigned
long       unsigned-long   float     double
size_t     ssize_t         intptr_t  uintptr_t
ptrdiff_t
int8       uint8           int16     uint16
int32      uint32          int64     uint64
These types are useful when a C function returns a value through a
pointer argument. For example
(let ((name (make-char*)))
(some_function (pointer-to name))
(display "name: ") (display (char*->string
name)) (newline))
(let ((return-val (make-double)))
(another_function (pointer-to return-val))
(simple-format #t "val is ~S\n" (fh-object-ref
return-val)))
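For comparison, the same return-by-reference pattern can be written with only '(system foreign)' and bytevectors, without the FH runtime. The sketch below uses the C function 'double modf(double x, double *iptr)', which returns the fractional part of 'x' and stores the integral part through 'iptr'. It assumes 'modf' is visible through '(dynamic-link)' with no argument; the names 'c-modf' and 'split-double' are mine.

```scheme
(use-modules (system foreign) (rnrs bytevectors))

;; Binding for `double modf(double x, double *iptr)'.
(define c-modf
  (pointer->procedure double
                      (dynamic-func "modf" (dynamic-link))
                      (list double '*)))

;; Return two values: the fractional and the integral part of X.
;; The bytevector plays the role that (make-double) plays with the FH,
;; and bytevector->pointer plays the role of pointer-to.
(define (split-double x)
  (let* ((buf (make-bytevector (sizeof double) 0))
         (frac (c-modf x (bytevector->pointer buf))))
    (values frac (bytevector-ieee-double-native-ref buf 0))))

(call-with-values (lambda () (split-double 3.25))
  (lambda (frac whole)
    (simple-format #t "fraction ~S, whole ~S\n" frac whole)))
```

With the FH the buffer allocation and the read-back are hidden behind 'make-double', 'pointer-to' and 'fh-object-ref', which is the point of providing the base types.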