Bytestructures: a "type system" for bytevectors

From: Taylan Ulrich Bayırlı/Kammer
Subject: Bytestructures: a "type system" for bytevectors
Date: Sun, 30 Aug 2015 18:32:53 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

(I don't endorse GitHub, but I gave in after Gitorious went down.)

I had started working on this project around two years ago, but it had a
pretty strange and complex API, an unreadable README, and high overhead for
something you might want to use in high-performance settings (byte
crunching), and it was a bit too hung up on being standards-compliant.

I resumed working on it around a week ago, and solved all these issues!

(I still do my best to keep R7RS compliance, but Guile is my prime
*real* target.)

So what is it?

=== Structured access to bytevector contents ===

Don't tell C-standard pedants, but the type system of C is nothing more
than a thin layer over your computer memory: a huge sequence of bytes.

Compilers are allowed some crazy things under the strict wording of the
standard, but if you define a struct { uint16_t x; int8_t y[3]; }, you
essentially declare that you will be working with memory-snippets of 5
bytes (padding aside), whose first two bytes stand for an unsigned 16-bit
integer, and the other three bytes for an array of three signed 8-bit
integers.
That's all there is to it, and conceptually you could populate those
bytes directly, one by one:

    uint8_t bytes[5] = { a, b, c, d, e };
    the_struct_t *my_struct = (the_struct_t *) bytes;

Not sensible in practice for various reasons (endianness, alignment,
compiler optimizations, unreadable code, etc.), but it's good to be aware
of the basic idea: structs, arrays, unions, integer types, whatever, all
types in C are just windows upon byte-sequences.
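
To make the "window" idea concrete, here is a minimal C sketch that
copies the struct's bytes out and back with memcpy, which avoids the
alignment and aliasing trouble of a pointer cast.  The helper names
(struct_to_bytes, struct_from_bytes, y_offset) are made up for this
illustration.  Note that common compilers pad this struct to 6 bytes, and
the order of the two bytes of x depends on the machine's endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The struct from the text: a 16-bit unsigned integer followed by
   three signed 8-bit integers. */
typedef struct {
    uint16_t x;
    int8_t   y[3];
} the_struct_t;

/* Byte offset at which the "window" onto y begins. */
size_t y_offset(void)
{
    return offsetof(the_struct_t, y);
}

/* Copy the in-memory representation of *s into out, which must have
   room for sizeof(the_struct_t) bytes (padding included). */
void struct_to_bytes(const the_struct_t *s, uint8_t *out)
{
    assert(s != NULL && out != NULL);
    memcpy(out, s, sizeof *s);
}

/* Rebuild a struct from sizeof(the_struct_t) raw bytes. */
the_struct_t struct_from_bytes(const uint8_t *in)
{
    the_struct_t s;
    assert(in != NULL);
    memcpy(&s, in, sizeof s);
    return s;
}
```

Calling struct_to_bytes on a value with y = { -1, 2, -3 } puts the bytes
0xFF, 0x02, 0xFD at offsets y_offset() through y_offset() + 2, and
struct_from_bytes turns those bytes back into the same field values.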

Some programs will offer a stable ABI, indeed making promises about the
byte-by-byte structure of their data types.  It might also be a piece of
hardware giving you bytes under such a structure, and you can declare
the structure via C's type system to put some sanity into your
hardware-interfacing code.
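
As a sketch of the hardware case: suppose a device hands you 5 bytes in
a fixed wire format, a little-endian uint16 followed by three signed
bytes.  The format, the reading_t type, and decode_reading are all
assumptions made up for this example.  Decoding field by field keeps the
result independent of the host's endianness and struct padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Logical view of one reading (hypothetical device). */
typedef struct {
    uint16_t x;
    int8_t   y[3];
} reading_t;

/* Decode 5 raw bytes: bytes 0-1 are x in little-endian order,
   bytes 2-4 are the three signed elements of y. */
reading_t decode_reading(const uint8_t *raw)
{
    reading_t r;
    assert(raw != NULL);
    r.x = (uint16_t)(raw[0] | (raw[1] << 8));
    for (size_t i = 0; i < 3; i++) {
        /* Reinterpret each byte as a two's-complement signed value. */
        r.y[i] = (int8_t)raw[2 + i];
    }
    return r;
}
```

For the bytes 0x34 0x12 0xFF 0x02 0xFD this gives x = 0x1234 and
y = { -1, 2, -3 }.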

Now in Scheme we have some *proper* abstraction.  We can fully forget
about the bytes and work with a purely logical notion of objects.  But
still we have bytevectors, a type encapsulating a raw sequence of bytes,
because they are useful for various purposes, be it talking with a C
library, or with a piece of hardware.

But Scheme doesn't offer anything like C's type system for bytevectors.
There are the bytevector-foo-ref/set! functions, and there's SRFI-4, but
there's nothing as generic as C's type system.

Now there is. :-)

(Bit-fields not yet supported.)

The struct example from above is now:

    (bs:struct `((x ,uint16) (y ,(bs:vector 3 int8))))

and that returns an object (a "bytestructure descriptor") which you can
use as part of other type definitions (the uint16 and int8 there are
types provided for convenience by the library; they could have been
defined by the user as well), or use to access data in a bytevector
logically via struct field names and vector indices.

    (bytestructure my-struct 'y 2)  ;the last int8
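
To illustrate what a descriptor boils down to, here is a rough sketch in
C of a run-time table that maps a field name and an index to a byte
offset.  This is not how the library is implemented; field_desc_t,
my_struct_desc, and desc_ref are hypothetical names, and the table
assumes a packed 5-byte layout with little-endian integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { KIND_U16, KIND_I8 } elem_kind_t;

/* One entry per field: where it starts, what it holds, how many. */
typedef struct {
    const char *name;
    size_t      offset;  /* byte offset of the field in the buffer */
    elem_kind_t kind;    /* element type */
    size_t      count;   /* number of elements (1 for a scalar) */
} field_desc_t;

/* Descriptor for { uint16 x; int8 y[3]; } packed into 5 bytes. */
const field_desc_t my_struct_desc[] = {
    { "x", 0, KIND_U16, 1 },
    { "y", 2, KIND_I8,  3 },
};

/* Look up element `index` of field `name` in `bytes` and store it in
   *out.  Returns 0 on success, -1 for an unknown field or an index
   that is out of range. */
int desc_ref(const field_desc_t *desc, size_t nfields,
             const uint8_t *bytes, const char *name,
             size_t index, long *out)
{
    assert(desc != NULL && bytes != NULL && name != NULL && out != NULL);
    for (size_t i = 0; i < nfields; i++) {
        const field_desc_t *f = &desc[i];
        if (strcmp(f->name, name) != 0)
            continue;
        if (index >= f->count)
            return -1;
        if (f->kind == KIND_U16) {
            const uint8_t *p = bytes + f->offset + 2 * index;
            *out = (long)(p[0] | (p[1] << 8));  /* little-endian */
        } else {
            *out = (long)(int8_t)bytes[f->offset + index];
        }
        return 0;
    }
    return -1;
}
```

With the bytes 0x34 0x12 0xFF 0x02 0xFD, asking for field "y" at index 2
yields -3, the last int8, and field "x" at index 0 yields 0x1234.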

There is even a "pointer" type, made possible by our FFI module.  Be
warned though, you can write Scheme programs that segfault!

For when you need maximal performance from your code, there's a
macro-generating macro which will arrange for all the type stuff to
happen at compile time.  Granted, you lose some flexibility.

For more details, refer to the README.

I'm still shy of offering an absolutely final API, but I'm pretty happy
with the state of things and would love it if some people played with
it.  Consider this a beta phase announcement or so.  Tell me any
nontrivial issues you see with the library.

That's all for now.  Thanks in advance for any feedback!

Happy hacking,
