
From: Ricardo Wurmus
Subject: leaky pipelines and Guix
Date: Tue, 9 Feb 2016 12:25:23 +0100

Hi Guix,

Although I’m comfortable packaging software for Guix, I’m still not
confident enough to tackle bioinformatics pipelines, as they don’t play
well with isolation.

In the pipeline that I’m currently working on as a consultant packager
I’m trying to treat the pipeline itself as a first-class package.  This
means that the locations of the tools it calls out to are all
configurable (thanks to auto{conf,make}) and they certainly do not have
to be in the PATH.  This allows us to install this pipeline (and the
tools it needs) easily alongside other variants of tools.  The pipeline
is also not just a bare Makefile but has a wrapper script to provide a
simplified user interface.
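For illustration, the configurable tool locations can be expressed with
standard Autoconf macros.  This is only a sketch; the tool name
“samtools” is a stand-in for whatever the pipeline actually calls:

```m4
dnl Hypothetical configure.ac fragment: let the builder override the
dnl location of an external tool instead of relying on PATH at runtime.
dnl "SAMTOOLS" is a placeholder name, not part of the actual pipeline.
AC_ARG_VAR([SAMTOOLS], [Absolute path to the samtools executable])
AC_PATH_PROG([SAMTOOLS], [samtools])
AS_IF([test "x$SAMTOOLS" = "x"],
      [AC_MSG_ERROR([samtools not found; run configure with SAMTOOLS=/path/to/samtools])])
```

A Guix package definition can then pass the store path at configure
time, e.g. “./configure SAMTOOLS=/gnu/store/…-samtools/bin/samtools”,
so nothing is looked up on the PATH when the pipeline runs.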

However, most pipelines do not take this approach.  Pipelines are often
designed as glue (written in Perl, or as Makefiles) that ties together
other tools in some particular order.  These tools are usually assumed
to be available on the PATH.  Pipelines aren’t treated enough like
packages (which will be the subject of an inflammatory, click-baiting
blog post that I’m working on), so they usually come without a
configuration script to override implicit assumptions.

In the context of Guix this means that each pipeline would need its very
own isolated environment where the PATH is set up to contain the
locations of all tools that are needed at runtime (that’s what I mean by
“leaky”).  As many pipelines do not come with wrapper scripts there is
no easy way to sneakily set up such an environment for the duration of
the run.
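If one were willing to add such a wrapper downstream, a minimal sketch
could look like this; the profile location and the “pipeline.mk” entry
point are made-up names for illustration:

```shell
#!/bin/sh
# Hypothetical downstream wrapper (a sketch, not part of any real
# pipeline): confine the PATH extension to the lifetime of one run.
# PIPELINE_PROFILE and pipeline.mk are placeholder names.

# Dedicated profile holding the pipeline's tools.
PROFILE="${PIPELINE_PROFILE:-$HOME/.guix-profile-pipeline}"

# Prepend the profile's bin/ so the glue code finds its tools; the
# modified PATH dies with this process and never leaks into the
# user's login environment.
PATH="$PROFILE/bin:$PATH"
export PATH

# Hand over to the pipeline's entry point, if present.
if [ -f pipeline.mk ]; then
    exec make -f pipeline.mk "$@"
fi
```

Because the change to PATH happens in a child process, the user’s own
environment is untouched once the run finishes.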

So, how could I package something like that?  Is packaging the wrong
approach here and should I really just be using “guix environment” to
prepare a suitable environment, run the pipeline, and then exit?  I know
that there is work in progress to support profile-based environments
that would make this a little more feasible (as the environments
wouldn’t be as volatile as they are now), but it seems somewhat
inconvenient.
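For comparison, the throwaway-environment approach would look roughly
like this (package names and the script name are placeholders, not a
recommendation):

```shell
# Hypothetical one-shot run: spawn an environment containing the
# pipeline's tools, run the glue code inside it, and let the
# environment vanish when the command exits.
guix environment --ad-hoc samtools bwa -- sh ./run-pipeline.sh
```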

This pains me especially in the context of multi-user systems.  I can
easily create a shared profile containing the tools that are needed by a
particular pipeline and provide a wrapper script that does something
like this (pseudo-code):

    eval $(guix package --search-paths=prefix)
    do things

But I wouldn’t want to impose this on individual users: have them
install all the tools into a separate profile for that pipeline, run
something like the above to set up the environment, then fetch the
tarball containing the glue code that constitutes the pipeline (because
we wouldn’t offer a Guix package for something that’s unusable without
so much effort to prepare an environment first), unpack it, and finally
run it inside that environment.
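Spelled out as commands, the per-user ritual I’d rather avoid looks
roughly like this (the profile path, package set, and tarball name are
all placeholders):

```shell
# 1. Install the pipeline's tools into a dedicated profile.
guix package -p ~/pipeline-profile -i samtools bwa perl

# 2. Point the environment at that profile.
eval "$(guix package -p ~/pipeline-profile --search-paths=prefix)"

# 3. Fetch and unpack the glue code by hand.
tar xf pipeline-glue.tar.gz && cd pipeline-glue

# 4. Run it inside the prepared environment.
make all
```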

To me this seems to be in the twilight zone between proper packaging and
a use-case for “guix environment”.  I welcome any comments about how to
approach this and I’m looking forward to the many practical tricks that
I must have overlooked.

~~ Ricardo
