guile-user

Re: Need help embedding Guile


From: Dimitris Papavasiliou
Subject: Re: Need help embedding Guile
Date: Wed, 22 Dec 2021 22:05:05 +0000

Thanks to everybody for their suggestions.  I'll respond to all in this single
message to keep the discussion from spreading out too much.  Please let me know
if this is inconvenient for you.  I also apologize in advance for my large
messages.  There's a TL;DR of sorts in the last 3 paragraphs.

Let me start by noting that there are really two distinct, though connected,
problems:

1. Handling garbage collection

This problem is tractable, the only question being how best to handle it.  As I
neglected to say explicitly, but as Olivier pointed out:

On Wednesday, December 22nd, 2021 at 4:46 PM, Olivier Dion wrote:

> Since you have a graph of all the primitives in the second phase, you're
> basically doing garbage collection there.

So, yes, for this class of foreign object, I can essentially simply pass plain
pointers to Guile, and let it go about its business with garbage collection as
it sees fit.  I keep these objects in a graph anyway and already handle freeing
them after phase 2.  Then:

> If I understood, objects can be garbage before phase 2, thus not
> appearing in the final graph of operations.

This class of foreign objects is indeed used during the first phase, and these
objects need to be finalized once it's finished.  They can be handled as above, by passing
plain pointers to Guile, and keeping tabs on them to finalize them explicitly.
This is less than ideal, because all objects will be kept live until the Scheme
code terminates and although these are typically small, there's no guarantee
that there won't be very many of them.  But this might be alleviated to some
extent by combining our own collection with that of Guile, e.g. as Mikael
suggests:

On Wednesday, December 22nd, 2021 at 7:37 PM, Mikael Djurfeldt wrote:

> For example, the C++ application could have a doubly linked list of the C++
> objects. When Guile collects an object, make it unlink it from the C++
> list. Then, when you want to enter your second phase, you can go through the
> list, which now only contains the objects not yet collected, and finalize
> them.

Although this would require relying on finalizers, it would no longer be
necessary that every single object has its finalizer called; just that most do,
so that not too much memory is wasted.  This seems to be the idea behind
finalizers in the BDW-GC, as far as I could see from its documentation.
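
A minimal sketch of the list-based bookkeeping Mikael describes, in plain C with
no Guile calls.  All names here (tracked, registry, tracked_release,
registry_sweep) are hypothetical and only illustrate the idea: each foreign
object is linked into a list on creation, a GC finalizer would call
tracked_release to unlink and free it, and registry_sweep frees whatever is left
before phase 2.  It assumes single-threaded access; if finalizers can run on
another thread, the list would need a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical foreign object; `payload` stands in for the real data. */
typedef struct tracked {
    struct tracked *prev, *next;
    int payload;
} tracked;

/* Circular doubly linked list with a sentinel head node. */
typedef struct registry {
    tracked head;
    size_t live;   /* number of objects currently linked */
} registry;

void registry_init(registry *r)
{
    r->head.prev = r->head.next = &r->head;
    r->live = 0;
}

/* Allocate an object and link it in right after the sentinel. */
tracked *tracked_new(registry *r, int payload)
{
    tracked *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->payload = payload;
    t->prev = &r->head;
    t->next = r->head.next;
    r->head.next->prev = t;
    r->head.next = t;
    r->live++;
    return t;
}

/* Unlink and free one object.  This is what a finalizer would call. */
void tracked_release(registry *r, tracked *t)
{
    assert(t != NULL && t != &r->head);
    t->prev->next = t->next;
    t->next->prev = t->prev;
    r->live--;
    free(t);
}

/* Before phase 2: free every object the collector did not get to.
   Returns the number of objects swept. */
size_t registry_sweep(registry *r)
{
    size_t n = 0;
    while (r->head.next != &r->head) {
        tracked_release(r, r->head.next);
        n++;
    }
    return n;
}
```

Note that the sketch does nothing about a finalizer firing for an object that
registry_sweep has already freed; real code would have to invalidate or
unregister the Scheme-side reference first.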

This then leaves the more substantial difficulty:

2. Making sure Guile has terminated after phase 1

First of all, this is related to the previous problem.  Although it *is* true
that the GC is conservative, this is not the ultimate reason why it is not
possible to deterministically ensure that all objects are collected and
finalized.  As far as I can see, the ultimate reason is that the GC in use by
Guile works under the assumption that it will be in charge until process exit,
at which point collection becomes unnecessary, as the OS will take over. If it
were possible to tell Guile to shut down and clean up, the GC would know that
all tracked objects are now up for collection as there's no-one left to use
them.  This is possible with Lua, for instance, another language meant to be
embedded, where the Lua state can be closed, at which point all objects are
collected.  This allowed me to embed Lua without much trouble.

This also precludes this suggestion:

On Wednesday, December 22nd, 2021 at 3:52 PM, Thien-Thi Nguyen wrote:
> Do guardians help for this?

Alas no, because a) as far as I can tell guarded objects still refer to the GC
to tell whether they are collectable and its conservatism will still create
problems, but more importantly b) I can see no documented way to sever all
references the Scheme code might have made to the foreign objects (but see more
below).

On this issue Olivier suggests:

On Wednesday, December 22nd, 2021 at 4:46 PM, Olivier Dion wrote:

> One way I think you could do this is to evaluate all the user operations
> in a sandbox environment.

> If SEVER-MODULE? is true (the default), the module will be unlinked
> from the global module tree after the evaluation returns, to allow MOD
> to be garbage-collected.

This is interesting, but sandboxed environments turn out to be too restrictive
and not meant for this purpose.  As far as I could tell, I cannot even load code
from within:

scheme@(guile-user)> (use-modules (ice-9 sandbox))
scheme@(guile-user)> (eval-in-sandbox '(load "test.scm"))
ice-9/boot-9.scm:1669:16: In procedure raise-exception:
Unbound variable: load

Inspired by the `sever-module?' argument I tried severing the default module, as
returned by `scm_current_module' and `scm_interaction_environment' like this:

    scm_call_1(
        scm_variable_ref(
            scm_c_private_variable("ice-9 sandbox", "sever-module!")),
        env);

This seemed to work (the severing part), but didn't help in allowing collection
of e.g. foreign objects bound to global variables, presumably because other
references are kept on the default interaction environment.

I also tried creating and then severing a custom-built r5rs environment (made
with `scheme-report-environment'), but this couldn't even be severed.

More out of spite than anything, I tried to clear the default module.  Noting
that a module is really a structure (although I have only a very hazy idea what
this really means), I tried:

    SCM env = scm_current_module();
    scm_struct_set_x(env, scm_from_int(0), SCM_EOL);
    scm_struct_set_x(env, scm_from_int(1), SCM_BOOL_F);

Lo and behold, this succeeded in allowing all objects to be collected!  But one
might say: what of it?  This is still a hack, depending on implementation
details.

But there's a bigger (in my view at least) issue here.  As Mikael notes:

On Wednesday, December 22nd, 2021 at 7:37 PM, Mikael Djurfeldt wrote:

> This creates the following problem: What if some Guile code runs *after* you
> have finalized your remnant objects? [...] All of this indicates that it could
> be nice to have some kind of Guile shutdown call in the C API. Such a shutdown
> call could go through live objects and free them.

On the same matter, Maxime said:

On Wednesday, December 22nd, 2021 at 5:29 PM, Maxime Devos wrote:
> > This makes some sort of
> > forcing/ensuring that Guile has terminated desirable.
>
> ... but I don't see how this follows. The only benefit I see from
> ensuring Guile terminates, is freeing a little memory. But since the
> Guile is basically used as a fancy configuration language, I don't see
> the need. (Except for valgrind memory leak detection.)

(Again this memory doesn't have to be a little.  As a simple illustration, if
the program makes a geometry made out of a 3D grid of 10 * 10 * 10 cubes say, it
will have to allocate 1000 transformation objects to translate the cubes into
place, which will be retained needlessly.  Worse situations are easily
imaginable, depending on how optimistic one is.)

But the real issue is the one brought up by Mikael.  Guile is quite large, with
many features and quite complex control flow mechanisms.  As long as Guile is up
and running, one can't be really sure that it won't somehow interfere with the
execution of the embedding program when it shouldn't (in phase 2 in my case) and
in ways that are not predictable.

This may be no more than a psychological problem, a mere pseudo-concern, but
I'm not certain of that.  Some user might have the main code start threads for
instance, which persist past the point of its return and while that's easily
fixed by joining all threads before phase 2 say, other such issues may not be as
tractable.  Embedding Guile requires effort, and the possibility of
discovering hard problems late in the game is not entirely insignificant in
this respect.

So TL;DR:  I think the issue boils down to whether it is possible to shut down
Guile and have it clean up before process exit.  If this is not currently
possible, another interesting question might be how well such a feature would fit
into Guile's current design and whether it would be desirable to implement it.
I would argue that, although perhaps not indispensable, it would certainly not
be unnecessary for a language specifically designed to be embedded.

Dimitris

PS: Let me know if you think I should start a new thread for this.



