
Re: Implementing the guix-daemon in Guile


From: Caleb Ristvedt
Subject: Re: Implementing the guix-daemon in Guile
Date: Thu, 14 Sep 2023 12:31:41 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

My old university email address started demanding a phone number to
"verify the security of my account", which was pretty funny considering
it never had a phone number to begin with, so I'm locked out of
that.  Same GPG public key, though.

Christopher Baines <mail@cbaines.net> writes:

> Hey!
>
> I think this has been talked about for a while [1], but I want to make it
> happen. Currently the guix-daemon is still similar to the nix-daemon
> that it was forked from, and is implemented in C++. I think that a Guile
> implementation of the guix-daemon will simplify Guix and better support
> hacking on and around the daemon to add new features and move Guix
> forward.

I'd like to help with this if at all possible.

> Still though, I'd like to hear what people think about which direction
> the implementation should go, and what features they'd like to see. Even
> if those are not essential to make the Guile implementation viable, it
> still might inform the direction to take.

Okay, brain dump time:

I think that using fibers has a lot of potential, but there are
obstacles that need to be worked around.  In the single-threaded case,
we risk a big slowdown if multiple clients are active at once, since
we're doing what used to be done by n processes with one single thread.
It would be especially noticeable during big disk reads and writes,
since those basically ignore O_NONBLOCK, and most procedures that act on
entire files at once would therefore probably not hit many yield points.
The worst situation would be where multiple worker fibers are attempting
to do reference scanning at the same time.  Posix asynchronous disk IO
could be used, but glibc currently implements it... using a thread pool.
There is the RWF_NOWAIT flag to preadv2, though it's only available on
newer Linux kernels and has bugs in 5.9 and 5.10.
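
To make the starvation concrete, here is a minimal (untested) sketch;
the file name is made up.  With one scheduler thread, the reading fiber
never yields until EOF, so the other fiber stalls for the duration:

  (use-modules (fibers)                 ; run-fibers, spawn-fiber, sleep
               (ice-9 binary-ports))

  (run-fibers
   (lambda ()
     (spawn-fiber
      (lambda ()
        (let loop ((n 0))
          (format #t "tick ~a~%" n)
          (sleep 1)                     ; cooperative fibers sleep
          (loop (+ n 1)))))
     (spawn-fiber
      (lambda ()
        ;; Reads to EOF in one blocking call; O_NONBLOCK is mostly
        ;; meaningless for regular files, so no yield points here.
        (call-with-input-file "/tmp/big-file" get-bytevector-all
                              #:binary #t)
        (display "scan done\n")))
     (sleep 30))
   #:parallelism 1)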

Additionally, being single-threaded means use of F_SETLKW is a no-go, so
you're stuck with polling there.  Granted, that's not such a big issue
given that in 99% of use cases only one process is doing the locking, so it
can all be managed internally.
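
A polling workaround could look roughly like this (untested), using
fcntl-flock from (guix build syscalls), which throws 'flock-error
instead of blocking when given #:wait? #f:

  (use-modules (fibers)                 ; for the fiber-friendly sleep
               (guix build syscalls))

  (define (lock-from-fiber port operation)  ; hypothetical helper
    "Acquire OPERATION ('read-lock or 'write-lock) on PORT, polling."
    (let loop ((delay 0.01))
      (catch 'flock-error
        (lambda ()
          (fcntl-flock port operation #:wait? #f))
        (lambda (key errno)
          ;; Lock is held elsewhere: yield to other fibers, back off.
          (sleep delay)
          (loop (min 1 (* 2 delay)))))))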

Speaking of file locks, the risk of accidental clobbering of locks jumps
way up once it's all moved into one process, and IIRC we already have
bugs with accidental clobbering of locks.  You can get a half-decent
interface by doing what sqlite does, which is a combination of
intra-process locking and holding on to open file descriptors until all
locks on the underlying file are released.  There are some subtle
pathological cases there that are a lot more likely in the guix daemon
than in sqlite, though.  For example, suppose you open a file twice to
get ports p1 and p2, acquire read locks on both of them, then close p1,
then open the file again to get p3, acquire a read lock on it, close p2,
get p4, acquire a read lock on it, close p3, get p5... and so on.  This
will cause unbounded file descriptor usage, and eventually you'll run
out.  There is no workaround in this model other than "hope that usage
pattern doesn't come up much".  Additionally, you need to ensure that
every close of a potentially-locked file goes through a special
close-wrapper.
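
For the record, a bare-bones version of that sqlite-style scheme might
look like the following (untested; all names made up).  It also shows
where the unbounded usage comes from: closing any descriptor would drop
all of this process's POSIX locks on the file, so a not-last port gets
"parked" rather than closed, and the parked list can grow without bound:

  (use-modules (ice-9 match))

  (define %locks (make-hash-table))     ; (dev . ino) -> (count . parked)

  (define (lock-key port)
    (let ((st (stat port)))
      (cons (stat:dev st) (stat:ino st))))

  (define (note-lock! port)             ; call after acquiring a lock
    (match (hash-ref %locks (lock-key port))
      (#f (hash-set! %locks (lock-key port) (cons 1 '())))
      ((count . parked)
       (hash-set! %locks (lock-key port) (cons (+ count 1) parked)))))

  (define (close-locked! port)          ; every close must go through this
    (let ((key (lock-key port)))
      (match (hash-ref %locks key)
        (#f (close-port port))          ; file isn't locked at all
        ((1 . parked)                   ; last lock: now safe to close
         (hash-remove! %locks key)
         (for-each close-port parked)
         (close-port port))
        ((count . parked)               ; park PORT to keep its fd alive
         (hash-set! %locks key
                    (cons (- count 1) (cons port parked)))))))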

I'm actually in the middle of working on a solution for this that
involves a separate locker process that gets passed file descriptors to
lock via a unix socket.

Speaking of file descriptors, running the entire daemon in one process
is going to mean much higher pressure on file descriptor resource usage.
IIRC, while building a derivation, the closure of its inputs needs to be
locked, and that means a file descriptor for each and every store item
in its input closure, simultaneously.  The separate locker process would
make it possible to retain those locks while not having them open in the
main process.

Another issue that will need to be addressed, whether single-threaded or
not, is the use of memoization caches in various places.  These aren't
weak hash tables, so they are not thread-safe, and they retain strong
references to both the cached results and the arguments used to obtain
them for as long as the procedure they belong to remains alive.  In a
long-running server process, this is less than ideal.  One approach
could be to put a bound on how large they can grow, with some eviction
policy for deciding what gets discarded first.  If memoization is used
to ensure pointer equality as a matter of correctness, though, that
probably won't work well.  The simplest solution would probably be to
change them to use weak hash tables, though perhaps with an option
available to bring back non-weak hash tables on the client side.
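
For the cases where eq?-keyed, single-argument memoization suffices
(analogous to mlambdaq in (guix memoization)), the weak-table variant
is small.  Untested sketch:

  (use-modules (ice-9 match))

  (define (weakly-memoize proc)         ; hypothetical helper
    "Like memoize, but entries die when ARG becomes unreferenced."
    (let ((cache (make-weak-key-hash-table)))
      (lambda (arg)
        (match (hashq-get-handle cache arg)
          ((_ . value) value)           ; hit, even when VALUE is #f
          (#f (let ((value (proc arg)))
                (hashq-set! cache arg value)
                value))))))

One caveat: if a cached value references its own key, that entry can
never be collected, so weakness alone doesn't cover every case.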

In the multithreaded case, fork() and clone() become concerns, since
they can no longer be safely run from guile.  One way around this would
be to use posix_spawn to produce a single-threaded guile process, then
have that do the fork or clone as necessary.  The fork case shouldn't
actually be necessary, though, as the child process can just exec
directly.  In the clone case, CLONE_PARENT can be used to make the
resulting process a child of the original, main process, though I don't
know how portable that is to hurd (really, I don't know how namespace
setup works in general on hurd).  Instead of doing this
spawn-two-processes-to-spawn-one routine every time we want to set up a
container, we could create a spawner-helper process once and just keep
it around.  If we can do that before any threads are created, we don't
even need posix_spawn (though it is nice to have around, and I do have
bindings to it).  I remember reading that that's what the Apache web
server did.
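
Roughly (untested; assumes the `spawn' posix_spawn binding that landed
in Guile 3.0.9, and the helper script path and wire protocol are made
up):

  (use-modules (ice-9 match))

  (define (start-spawner-helper)
    "Start the single-threaded helper; call before creating threads."
    (match (pipe)
      ((to-read . to-write)             ; daemon -> helper requests
       (match (pipe)
         ((from-read . from-write)      ; helper -> daemon replies
          (let ((pid (spawn "guile"
                            '("guile" "-q" "--no-auto-compile"
                              "/run/guix/spawner-helper.scm") ; made up
                            #:input to-read
                            #:output from-write)))
            (close-port to-read)
            (close-port from-write)
            ;; The helper reads clone requests from its stdin, calls
            ;; clone with CLONE_PARENT so the new process becomes a
            ;; child of the main daemon, and writes the PID back.
            (list pid to-write from-read)))))))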

This would however mean some places would need to use interfaces like
"eval-with-container" instead of "call-with-container", which is
somewhat less convenient.  But code staging shouldn't be a terribly
foreign concept to most guixers.

Another concern is child process management; a blocking waitpid will of
course block the calling thread, so something like a SIGCHLD handler or
a dedicated reaper thread would be needed in order to simulate a
blocking waitpid.  Personally I think it would be a good idea to go with
something like Shepherd's process monitor approach, but with some
changes.  First, move child reaping into the process monitor itself, so
that all the SIGCHLD handler does is send a notification to the process
monitor (and it should do this via a condition variable, not a channel,
so that it doesn't block: asyncs run on whatever fiber happens to be
current on that thread's scheduler at the time, so a signal's handler
can end up running from within the process monitor fiber itself).
Second, wrap all process-spawning procedures such that
they now return <process> objects instead of PIDs.  A <process> object
contains a PID, a condition variable signaled when the process is
terminated, and a slot for holding the exit status.  Immediately before
spawning a process, send a message to the process monitor temporarily
disabling reaping, then spawn the process, create the <process> object,
and register it with the process monitor, resuming reaping at the same
time.  Then a waitpid replacement can very easily operate on these
process objects.
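
Sketched out roughly (untested; all names made up, and the
disable-reaping-during-spawn handshake is elided).  One wrinkle:
fibers conditions are single-shot, so the monitor has to swap in a
fresh wake-up condition before each reaping pass; the per-process
"exited" condition being single-shot is, on the other hand, exactly
what we want:

  (use-modules (fibers conditions)
               (ice-9 atomic)
               (ice-9 match)
               (srfi srfi-9))

  (define-record-type <process>          ; hypothetical
    (make-process pid exited status)
    process?
    (pid    process-pid)
    (exited process-exited)              ; condition, signalled on exit
    (status process-status))             ; atomic box, #f until reaped

  (define %processes (make-hash-table))  ; pid -> <process>
  (define %wake (make-atomic-box (make-condition)))

  ;; The SIGCHLD handler only notifies the monitor; it must not reap,
  ;; since the async may run on the monitor fiber itself.
  (sigaction SIGCHLD
    (lambda (signum)
      (signal-condition! (atomic-box-ref %wake))))

  (define (reap-children!)
    (match (catch 'system-error          ; ECHILD when no children left
             (lambda () (waitpid WAIT_ANY WNOHANG))
             (const '(0 . #f)))
      ((0 . _) #f)                       ; nothing has exited right now
      ((pid . status)
       (match (hashv-ref %processes pid)
         (#f #f)                         ; not one of ours
         (proc
          (atomic-box-set! (process-status proc) status)
          (signal-condition! (process-exited proc))
          (hashv-remove! %processes pid)))
       (reap-children!))))

  (define (process-monitor)              ; run this in its own fiber
    (let loop ()
      (wait (atomic-box-ref %wake))
      (atomic-box-set! %wake (make-condition))
      (reap-children!)
      (loop)))

  (define (process-wait proc)
    "Blocking-waitpid replacement: suspend until PROC exits."
    (wait (process-exited proc))
    (atomic-box-ref (process-status proc)))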

Sqlite is yet another concern.  I haven't yet looked at how you've
handled this in the build coordinator, but I'm curious.  Any blocking
interface it has, such as a busy handler, isn't going to work very well.
We could wrap the sqlite procedures with ones that retry with
exponential backoff (which is what "PRAGMA busy_timeout = ..." does
internally).  That would work, though not optimally.  I'm not sure of a
better way, though - https://www.sqlite.org/c3ref/unlock_notify.html
looks sort of right, but the documentation says that's just for "shared
cache mode".  It seems a bit of a shame to keep guessing at when the
database "might" next be available when the other entities accessing the
database may well be in the very same process and so could just give us
an earliest-possible checking point directly.
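
Something like this generic retry wrapper could work (untested).  How
to recognize SQLITE_BUSY depends on the sqlite bindings in use, so the
predicate is passed in; the delays mirror busy_timeout's exponential
backoff but sleep only the calling fiber:

  (use-modules (fibers))                ; fiber-friendly sleep

  (define* (call-with-busy-retries thunk busy-error?  ; hypothetical
                                   #:key (max-attempts 8))
    (let loop ((attempt 0) (delay 0.01))
      (catch #t
        thunk
        (lambda args
          (if (and (apply busy-error? args)
                   (< attempt max-attempts))
              (begin
                (sleep delay)
                (loop (+ attempt 1) (* 2 delay)))
              (apply throw args))))))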

Those are all the design concerns that I had off the top of my head;
I might recall some more later on.  Personally I think it would be prudent
to design as if for multiple threads.

On the subject of features, I would like it if downloaders (fixed-output
derivations) had access to /gnu/store/.links so that they can easily
look up whether a file with a given hash already exists, and copy it
over if so.  Often when writing a package definition I'll run "guix
download" to get the hash, and that will put it in the store as a side
effect, but then when it comes time to build the package it will
re-download it all over again because the file name is different.
Technically that should already be achievable just by tweaking
chroot-dirs and the downloaders.  It would also be nice if the same
concept could be applied to directories, such as git repositories -
perhaps a /gnu/store/.dirlinks with symbolic links?  Of course, those
wouldn't be used for deduplication, just for easy access for
fixed-output derivations.  A similar approach could also include e.g. a
mounted DVD with The GNU Source Distribution on it.
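
Untested sketch of the lookup, assuming .links entries are named by
the nix-base32 hash that (guix store deduplication) computes:

  (use-modules (guix base32))

  (define (links-lookup hash-bv)        ; hypothetical helper
    "Return the deduplication link for HASH-BV, or #f if none exists."
    (let ((candidate (string-append
                      "/gnu/store/.links/"
                      (bytevector->nix-base32-string hash-bv))))
      (and (file-exists? candidate) candidate)))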

Oh, another feature that just occurred to me - the ability to
automatically retry a failed derivation with fewer builds in parallel,
or depending on system load or available memory.  It's quite annoying to
run a guix command that is supposed to take multiple days, and have it
end up taking more than a week because it fails several times each day -
often bringing down several hours' worth of progress with it due to very
large derivations - because of some small package's flaky tests that
fail under high load.

- reepca
