bug-hurd

From: Paul Jarc
Subject: a gentler introduction to slashpackage (was: glibc; introducing slashpackage-foreign)
Date: Thu, 10 Mar 2005 17:01:47 -0500
User-agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.4 (gnu/linux)

Forgive the long-windedness, but there are some misunderstandings I'd
like to straighten out, and I'd like to be thorough.  Off-topic for
bug-hurd; Mail-Followup-To set.  Feel free to ignore this, but in that
case please don't hold too dearly to any bad impressions you have
about slashpackage, because they're probably based on a
misunderstanding.

"Alfred M. Szmidt" <ams@kemisten.nu> wrote:
> This does exactly--from what I can see-- what stowfs/package will
> do, but in a less flexible, less Hurdish, less GNUish way, and in a
> less clean way.

Slashpackage is indeed less Hurdish, because it's based on different
goals, intending to solve different problems.  AIUI, stowfs/packagefs
is like rpm/apt/etc.: it's a way for an admin or a distro to manage
all the packages installed on the system.  Although slashpackage can
be used for that - I and others do that - it was designed with rather
different problems in mind.

Suppose I, a software author, write a package foo.  How will other
software that depends on foo find it?  The traditional approach is to
use search paths: $PATH for commands, the compiler's usual paths for
headers and libraries, etc.  But this only covers certain kinds of
files, and is problematic even so.  Scripts need absolute paths for
their interpreters - #!/usr/local/bin/foo will work if foo was
installed manually by the admin, but it has to be changed to
#!/usr/bin/foo when foo is installed by the distro.  Search paths for
headers and libraries may be incomplete, so the files are not found;
or may be in the wrong order, so the wrong version is found.
Different packages may provide commands, headers, etc. with the same
basename, so any one search path will certainly be wrong some of the
time.  Such conflicts will be resolved differently by different
distros - often by renaming one of the conflicting files - but then
code that depends on one of those files can't use just one name and be
confident that it will find the right file.  Real-life examples:
http://cr.yp.to/slashpackage/studies.html
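
As an illustration of the ordering problem (a minimal sketch, not part
of slashpackage; the helper names and the temporary directory layout
are made up for the demo), the following shows that the same basename
resolves to different files depending only on search order:

```python
import os
import tempfile

def find_in_path(name, search_path):
    """Return the first executable called `name` in `search_path`, or None."""
    for directory in search_path:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

def make_command(directory, name):
    """Create a tiny executable file as a stand-in for a real command."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, 0o755)
    return path

with tempfile.TemporaryDirectory() as root:
    distro_bin = os.path.join(root, "usr", "bin")
    local_bin = os.path.join(root, "usr", "local", "bin")
    make_command(distro_bin, "foo")  # foo as shipped by one package
    make_command(local_bin, "foo")   # an unrelated foo from another package

    # Same basename, different file, depending only on the search order.
    print(find_in_path("foo", [distro_bin, local_bin]))
    print(find_in_path("foo", [local_bin, distro_bin]))
```

Neither ordering is right for every caller, which is the point of the
paragraph above.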

Of course, people have survived with these problems for a long time,
using downstream workarounds.  It's possible to live with them, but
it'd be nice if we could solve them upstream instead.  This requires
two things: two files from different packages cannot use the same
name, and a package's files must be accessible by the same path no
matter whether it was installed by the admin, distro A, distro B, ....
Dependent packages will then have a much easier time finding the files
they need.

The first requirement - no name collisions - means we need a central
registry for names that could collide.  This involves global
coordination, which can be a lot of work, so we'd like to keep it to a
minimum.  Delegation accomplishes that - since we have a hierarchical
namespace in the filesystem, we can assign "ownership" of a directory
to a certain package.  That package can include any files within that
directory without colliding with any other packages.

The second requirement means that everyone must use the default
installation path.  No one can enforce this on anyone else, of course,
but as long as you're not using the path /package/admin/daemontools to
refer to anything else, you might as well make it refer to the
registered daemontools package, even if that package is stored
somewhere else in the filesystem.  (/package -> /usr/local/package is
a common symlink here at my job.)  I can't see any reason other than
taste to insist that this path not exist at all, and that would cause
real problems - dependent packages will have to work harder to find
their dependencies, and may guess wrong.  I have to take utility over
aesthetics.

The registry may seem like it intrudes on admins' authority over their
own systems, but I don't think there's much practical value in that
objection.  DNS is a perfect comparison point: it's entirely possible
for me to set up my network so that www.yahoo.com refers to something
of my choosing, but this namespace is much more useful when everyone
uses the same names to refer to the same things, and delegations
simplify the job enough to make it practical.  I can send you a URL
containing a domain name and be confident that it will give you what
it gives me; I can send you a program containing a pathname and be
confident that it will behave for you as it behaves for me.

Registering package names ensures that one file won't clobber another,
but if we stop there, it means that all files have to be accessed by
their full paths.  For headers, libraries, etc., this really isn't
much trouble with an automatic build system - I've written one.  But
commands are typed quite often, and everyone is accustomed to invoking
them just by their basenames.  So we need to register command names as
well, or else we may end up invoking the wrong one, even though both
files are intact.  Now since we have a collision-free command set, we
symlink all the installed, registered commands into /command so we
only have one directory to add to $PATH.  Symlinks are also created in
/usr/local/bin, for compatibility.  Your old $PATH setting will still
work, unless there's a collision between a registered command name and
an unregistered one - in that case, you'll have to munge $PATH or
specify an absolute path.  The benefit we get out of the system is
proportional to the amount of participation put into it.
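
A rough sketch of the /command step, assuming each package keeps its
commands in a "command" subdirectory (as in the daemontools path
mentioned later in this message).  The function name and the
collision policy are illustrative assumptions, not a description of
any existing tool:

```python
import os
import tempfile

def link_commands(package_dir, command_dir):
    """Symlink every entry of <package_dir>/command into command_dir.

    Returns the names newly linked.  Raises FileExistsError when a name
    is already taken by something other than this package's command.
    """
    source_dir = os.path.join(package_dir, "command")
    os.makedirs(command_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(source_dir)):
        target = os.path.join(source_dir, name)
        link = os.path.join(command_dir, name)
        if os.path.islink(link) and os.readlink(link) == target:
            continue  # already linked to this package, nothing to do
        if os.path.lexists(link):
            raise FileExistsError(f"command name {name!r} is already taken")
        os.symlink(target, link)
        linked.append(name)
    return linked

with tempfile.TemporaryDirectory() as root:
    daemontools = os.path.join(root, "package", "admin", "daemontools")
    os.makedirs(os.path.join(daemontools, "command"))
    open(os.path.join(daemontools, "command", "supervise"), "w").close()
    print(link_commands(daemontools, os.path.join(root, "command")))
```

With registered command names the FileExistsError branch should never
fire between two registered packages; it only guards against
unregistered names.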

So what we're looking at so far is a central registry that keeps track
of package and command names; packages that install under their
registered paths by default; and, hopefully, distros that either leave
the packages' files where they are or, if they move them, leave
symlinks behind so that the registered path can still be used to
access the package's files.

Note how different in spirit this is from traditional package
managers: adoption of slashpackage is something an author does, for
the sake of helping other authors and users.  There isn't much for
distros and admins to do except to not mess with it.  The methods
available for this are determined by the fact that authors are not in
a position to make changes to the rest of the system; they have to use
what's there, which means traditional filesystem semantics.  I don't
know of any other way of accomplishing these goals, from the
author's/maintainer's perspective.

Already, this system is looking different enough from the traditional
layout that there's no point in trying to remain similar for
similarity's sake.  So now we come to the while-we're-at-it features.

Categories: it was decided that package paths would look like
/package/admin/daemontools instead of just /package/daemontools, to
keep any one directory from getting too big, although not everyone
agrees that this is worthwhile.  Personally, I don't have strong
feelings either way.

Multiple concurrent versions: we can have
/package/admin/daemontools-0.75 and /package/admin/daemontools-0.76
both installed at the same time.  /package/admin/daemontools is a
symlink to the current version; other packages access daemontools'
files via the symlink, so they don't need to know which version it is.
For testing a new version, etc., we can explicitly refer to
/package/admin/daemontools-0.76/command/supervise.
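
To make the layout concrete, here is a small sketch that builds it in
a temporary directory and inspects it.  The helper names are invented
for the example, and the version parsing is deliberately naive:

```python
import os
import tempfile

def installed_versions(category_dir, name):
    """List versions present as real directories named <name>-<version>."""
    prefix = name + "-"
    versions = []
    for entry in os.listdir(category_dir):
        path = os.path.join(category_dir, entry)
        is_real_dir = os.path.isdir(path) and not os.path.islink(path)
        if entry.startswith(prefix) and is_real_dir:
            versions.append(entry[len(prefix):])
    return sorted(versions)  # plain string sort, good enough for a sketch

def current_version(category_dir, name):
    """Read which version the unversioned symlink points to."""
    target = os.readlink(os.path.join(category_dir, name))
    return os.path.basename(target)[len(name) + 1:]

with tempfile.TemporaryDirectory() as root:
    admin = os.path.join(root, "package", "admin")
    for version in ("0.75", "0.76"):
        os.makedirs(os.path.join(admin, f"daemontools-{version}", "command"))
    # The unversioned name is a symlink to the current version.
    os.symlink("daemontools-0.75", os.path.join(admin, "daemontools"))

    print(installed_versions(admin, "daemontools"))
    print(current_version(admin, "daemontools"))
    # A specific version stays reachable by its full path:
    print(os.path.join(admin, "daemontools-0.76", "command", "supervise"))
```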

Atomic upgrades: after we've installed 0.76, we play with it for a
while until we're satisfied that it works, while the
/package/admin/daemontools symlink still points to 0.75.  When we're
ready, we can replace the symlink with a new one, pointing to
daemontools-0.76.  This just takes one rename() call, and the whole
package (as seen by the rest of the system) is upgraded in the same
instant.  There's zero (not just low, but zero) risk of invoking a
command and having it link to the wrong version of a library because
only one of the command or the library had been upgraded at that
moment.  (Actually, daemontools in particular doesn't include any
libraries, but the principle applies in general.)  We can also revert
the upgrade if we later decide 0.76 isn't as good.  It's a bit like
revision control for the package database.
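
The symlink switch can be sketched as follows.  This is one way to do
it, under the assumption that the new symlink is first created under a
temporary name in the same directory; the function name and the
temporary-name scheme are made up for the example.  Python's
os.replace() performs the rename:

```python
import os
import tempfile

def switch_version(category_dir, name, version):
    """Atomically repoint <category_dir>/<name> at <name>-<version>."""
    target = f"{name}-{version}"  # relative target in the same directory
    if not os.path.isdir(os.path.join(category_dir, target)):
        raise FileNotFoundError(f"{target} is not installed")
    link = os.path.join(category_dir, name)
    tmp = f"{link}.tmp.{os.getpid()}"
    os.symlink(target, tmp)
    # rename() replaces the old symlink itself in a single step, so
    # readers see either the old version or the new one, never neither.
    os.replace(tmp, link)

with tempfile.TemporaryDirectory() as root:
    admin = os.path.join(root, "package", "admin")
    for version in ("0.75", "0.76"):
        os.makedirs(os.path.join(admin, f"daemontools-{version}"))
    switch_version(admin, "daemontools", "0.75")
    switch_version(admin, "daemontools", "0.76")  # upgrade
    switch_version(admin, "daemontools", "0.75")  # revert
    print(os.readlink(os.path.join(admin, "daemontools")))
```

Reverting is the same operation with the older version number, which
is why the upgrade is as cheap to undo as it is to perform.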

Consistent installation interface: to install a package, you cd to
/package and unpack the tarball - the tarball holds the package's
files under its registered path, so they end up in the right place.
Then you cd to the package's directory and run an install script.
This script is in the same place for all packages.  (Again, it's up to
authors to do this.  Repackaging by distros should not be needed, and
should not make incompatible changes in any case.)  So far, there is
no established standard for representing build-time configuration
options, but there aren't too many different schemes in use, and they
share some significant ideas, so I think it's reasonable to expect
that a standard will be established.
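
The unpacking step might look like the sketch below.  It only covers
"cd to /package and unpack"; running the install script is left out
because this message does not name the script's location.  The
function name, the toy package "admin/foo-1.0" and the check that
every member sits under the registered path are assumptions made for
the example:

```python
import os
import tarfile
import tempfile

def unpack_package(tarball, registered_path, root="/package"):
    """Unpack `tarball` under `root`; all members must live under
    `registered_path` (given relative to root, e.g. "admin/foo-1.0")."""
    top = registered_path.strip("/")
    with tarfile.open(tarball) as archive:
        for member in archive.getmembers():
            if member.name != top and not member.name.startswith(top + "/"):
                raise ValueError(f"{member.name} is outside {top}")
        # Newer Pythons offer extraction filters; use the safe one if present.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(root, filter="data")
        else:
            archive.extractall(root)
    return os.path.join(root, top)

with tempfile.TemporaryDirectory() as scratch:
    # Build a toy tarball whose members carry the registered path.
    src = os.path.join(scratch, "src")
    os.makedirs(src)
    with open(os.path.join(src, "README"), "w") as f:
        f.write("toy package\n")
    tarball = os.path.join(scratch, "foo-1.0.tar")
    with tarfile.open(tarball, "w") as archive:
        archive.add(src, arcname="admin/foo-1.0")

    root = os.path.join(scratch, "package")
    os.makedirs(root)
    print(unpack_package(tarball, "admin/foo-1.0", root=root))
```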

At first, this certainly appears to be more complex, perhaps
unjustifiably so, but really it's a tradeoff.  The traditional layout
is simpler in some obvious ways, but more complex in others
(collisions, incompatible repackaged versions, non-atomic
upgrades...).  You might say that the traditional way is optimized for
reading (a file's path depends on how it is used), while slashpackage
is optimized for writing (a file's path depends on what package it is
installed with).

As for reading, it's not hard to locate files in a package-specific
directory, and to specify those files concisely, if you're using an
automatic build system.  It's already common to see configure scripts
that have options like --with-zlib=/path/to/zlib.  The difference with
slashpackage is that package-specific paths are the default.  It
*seems* harder to find files when they're scattered all over, but I
think a more precise way of putting it is that it's harder to *guess*
which package might contain a particular file.  Once you find out,
though, you'll put that knowledge in your code and finding that file
is trivial thereafter.  And when the file belongs to a slashpackage
package, you probably won't know much about the file without knowing
which package it belongs to anyway.

Some might say that no matter what the default install path is as set
by the original author, OS maintainers *should* move things around
when they integrate packages into their systems.  This, also, is a
tradeoff.  Similarity of packages within one system makes system
administration easier, as long as all your systems are homogeneous.
But it comes at the expense of compatibility of a package on one
system with the same package on another system.  It's easy to say that
compatibility is not worthwhile, because the first steps toward
preserving compatibility make some significant dents in intra-system
consistency.  But just as people have put significant effort into
working around compatibility problems, I think with some work we can
reduce the pain from inconsistency, and I think that result would be
better overall.

One might also object that requiring the same absolute paths
everywhere doesn't work for non-root users installing things in their
home directories.  I think that is a valid objection, and that's the
reason that the packages I've written, while they install to
/package/... by default, can also be unpacked and installed in
~/package/..., etc., instead.  Paths to dependencies also default to
those packages' own default installation paths, but can be
reconfigured at build time.
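
In code, the idea is simply that the root and the dependency paths are
defaults rather than constants.  A minimal sketch (the function names
and the example dependency are hypothetical):

```python
import os

DEFAULT_ROOT = "/package"

def package_path(category, name, root=DEFAULT_ROOT):
    """Path of a package under `root`, which may be e.g. "~/package"."""
    return os.path.join(os.path.expanduser(root), category, name)

def dependency_paths(defaults, overrides=None):
    """Start from each dependency's default path, then apply any
    build-time overrides supplied by whoever is installing."""
    paths = dict(defaults)
    paths.update(overrides or {})
    return paths

# System-wide install: everything at the registered paths.
defaults = {"daemontools": package_path("admin", "daemontools")}
print(dependency_paths(defaults))

# Non-root install: the same layout, rooted in the home directory.
home = {"daemontools": package_path("admin", "daemontools", root="~/package")}
print(dependency_paths(defaults, home))
```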

That covers the original intent of slashpackage.  Whew.  I hope anyone
still reading by now will at least understand and be somewhat
sympathetic to the idea of slashpackage, even if you don't fully
agree with it.

Now we come to slashpackage-foreign.  This is a project I started
where all packages are installed in their own directories under
/package.  So in a way, this is just another distro, though an
unusual-looking one.  I registered the prefix /package/misc/spf (an
unfortunate and ironic name collision with the email antiforgery
effort, there).  I install packages as /package/misc/spf/python-2.4,
with a symlink /package/misc/spf/python pointing to the current
version, and symlinks for commands in the usual places, so $PATH
remains the same.

My installation scripts configure each package to find its
dependencies in the appropriate place.  Specifying all those
dependency paths is indeed extra work compared to what other distro
maintainers have to do (but it's usually not difficult; --prefix= and
--with-foo= are commonplace and work well).  It even violates the
original intent of slashpackage - distros should go with the author's
defaults, and not make incompatible changes - but I consider it
worthwhile for the sake of multiple concurrent versions and atomic,
reversible upgrades.  Anyone who thinks distro maintainers should move
things around shouldn't be too upset here - slashpackage *is* the
unified consistent layout on my system, and I'm making other packages
fit into it. :)
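
For a package with an ordinary configure script, the per-package
configuration amounts to something like the sketch below.  It is not
taken from the actual slashpackage-foreign scripts; the function name
is invented, and "foo" stands for whatever dependency the package's
configure script accepts a --with- option for:

```python
SPF_PREFIX = "/package/misc/spf"

def configure_args(name, version, dependencies=()):
    """Build configure arguments that install into a versioned directory
    and look up each dependency through its unversioned symlink."""
    args = [f"--prefix={SPF_PREFIX}/{name}-{version}"]
    for dep in dependencies:
        args.append(f"--with-{dep}={SPF_PREFIX}/{dep}")
    return args

print(configure_args("python", "2.4", ["foo"]))
```

Pointing dependencies at the unversioned symlink rather than at a
specific version is what lets a dependency be upgraded later without
rebuilding everything that uses it.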

I could go on about more advantages of this system, but I don't want
to flood bug-hurd any more than necessary.  My goal here was just to
help people understand that slashpackage is so different from other
packaging systems because it is aimed at solving fundamentally
different problems, and that these problems pretty nearly dictate that
the solution must look more or less the way slashpackage does.  You
may still not like slashpackage, but I hope that by now you will at
least agree that slashpackage is a decent way to attack the problems
it is meant for, and the disagreement will be confined to the holy war
over which problems are more important to solve. :)


paul



