gnu-arch-users

Re: [Gnu-arch-users] tagging-method explicit implementation


From: Bruce Stephens
Subject: Re: [Gnu-arch-users] tagging-method explicit implementation
Date: Fri, 29 Aug 2003 09:28:04 +0100
User-agent: Gnus/5.1003 (Gnus v5.10.3) Emacs/21.3 (gnu/linux)

"Stephen J. Turnbull" <address@hidden> writes:

>>>>>> "Bruce" == Bruce Stephens <address@hidden> writes:
>
>     Bruce> it's not just that ext2 filesystems don't do small files
>     Bruce> efficiently, it's that the implementation is just wrong.
>
> Yeah, but this is also a problem of the filesystem.  Unix files have
> no logical identity.  There's the inode, or physical file, which has
> identity.  But lots of the ways in which we edit files (eg, Oh,
> damn!  well, OK, restart: "mv file~ file") change the physical
> identity, although the logical identity is the same.  There's the
> name, but that's real useful: "mv /boot/vmlinux /boot/arch.html".
> Now, arch.html is really a Linux kernel, right?  Note that in the
> first case, the logical identity associates with the name, in the
> second, with the inode.  There is a semantic difference between mv
> and rename, but the Unix file system doesn't allow it to be
> expressed.

And then there's copying.  I suspect a filesystem design which
included the semantics of logical identity would be quite
complex---probably it wouldn't get used for that reason.  One of the
advantages of Unix (and more so for Plan 9) was that files were
bytestreams, and programs imposed whatever structure they wanted onto
them.  Making the filesystem more complex means making programs know
about this complexity, and I'm not sure there's enough justification
in this case.

>     Bruce> And the implementation goes right into the changeset and
>     Bruce> archive storage.
>
> There's no choice.  The identity of a file must go with the file.
> Since the file system doesn't implement it, arch must.  Lose, lose,
> yes.  But what are ya gonna do?

I'm going to keep an inventory---an inventory is what I actually need
to use anyway.  When merging (or whatever) Arch doesn't go through
using taglines or .arch-ids files: it scans all of those to form an
inventory, and uses that.  (This information is even stored in the
archive, in the form of the mod-files-index, etc., files.)
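
As a rough illustration of "scan all of those to form an inventory", here is a minimal Python sketch. It is not Arch's actual code. It assumes an explicit id for FILE lives in a sibling `.arch-ids/FILE.id`, that a tagline looks like `arch-tag: ID` somewhere in the first or last 1024 bytes of the file, and that explicit ids win over taglines; the function names and the 1024-byte window are illustrative assumptions.

```python
import os
import re

# Assumed tagline shape: "arch-tag: <id>" up to the end of the line.
TAG_RE = re.compile(rb"arch-tag:\s*(.+)")


def read_tagline(path, window=1024):
    """Return the id from a tagline near the start or end of the file, or None."""
    with open(path, "rb") as f:
        head = f.read(window)
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - window))
        tail = f.read(window)
    for chunk in (head, tail):
        match = TAG_RE.search(chunk)
        if match:
            # Only surrounding whitespace is stripped; anything else on the
            # line (such as a trailing comment delimiter) stays in the id.
            return match.group(1).strip().decode("utf-8", "replace")
    return None


def read_explicit_id(path):
    """Return the id stored in the assumed .arch-ids/<name>.id file, or None."""
    directory, name = os.path.split(path)
    id_file = os.path.join(directory, ".arch-ids", name + ".id")
    try:
        with open(id_file, "r", encoding="utf-8") as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def build_inventory(root):
    """Walk the tree and return a dict mapping file id -> path relative to root."""
    inventory = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the bookkeeping directories themselves.
        dirnames[:] = [d for d in dirnames if d not in (".arch-ids", "{arch}")]
        for name in filenames:
            path = os.path.join(dirpath, name)
            file_id = read_explicit_id(path) or read_tagline(path)
            if file_id is not None:
                inventory[file_id] = os.path.relpath(path, root)
    return inventory
```

Everything downstream (merging, changesets) would then work from the returned dict, which is the point being made above: the tagging method only matters while the inventory is being built.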

> That's why tagline (y amigos) is attractive.  What we really want is
> a virtual file system (ie, dentry) that allows attaching arbitrary
> properties to create a logical file identity, a "resource fork",
> as MacOS has.  We don't have that, so put it in the file.

Indeed, and sticking it in the file makes lots of sense, regardless of
how you use the information.  

I can see the idea catching on in systems which don't currently use
it (it wouldn't be hard to write a scanner which could generate
the appropriate sequence of svn operations to move files, for
example).  Or perhaps it won't catch on, because renaming files is a
sufficiently unusual thing to do that it's not so hard to remember to
tell the CM system that you've done it.
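
A sketch of such a scanner, in Python, might look like the following. It assumes two inventories are already available as dicts mapping a file id to a path (one from the last commit, one from the current tree); how they are obtained is left open, and `rename_commands` is a made-up name. The only real command relied on is `svn mv OLD NEW`.

```python
def rename_commands(old_inventory, new_inventory):
    """Return 'svn mv' argument lists for ids whose path changed.

    Both arguments are dicts mapping file id -> path. Ids present in only
    one of the two inventories (added or deleted files) are ignored here.
    """
    commands = []
    for file_id, old_path in sorted(old_inventory.items()):
        new_path = new_inventory.get(file_id)
        if new_path is not None and new_path != old_path:
            commands.append(["svn", "mv", old_path, new_path])
    return commands


if __name__ == "__main__":
    old = {"x1": "src/foo.c", "x2": "README"}
    new = {"x1": "lib/foo.c", "x2": "README", "x3": "new.txt"}
    for argv in rename_commands(old, new):
        print(" ".join(argv))
```

The commands are only generated, not run. Applying them for real would need more care: the files have already been moved on disk by the time the scanner sees them, and chains or swaps of names would need ordering or temporary names.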

[...]

>     Bruce> if everything just used an inventory (directly or
>     Bruce> indirectly), then the tagging method would be almost
>     Bruce> entirely a user-interface preference.
>
> But "inventory" implies "audit".  Go to any company or school.
> Every desk, every personal computer, in really anal places even the
> stapler and cutting board will have an inventory number stickered or
> stenciled to it.

Sure.  But some places needn't do that---they could keep an inventory
somewhere, and rely on people updating it when they move things (or add
new things or remove things).  That's error-prone, sure, but in some
circumstances it might be sufficient---and it saves putting labels on
everything, so you can start even when you don't have permission to do
that.  Ultimately, the labels aren't useful in themselves
anyway---they're useful because they allow you to build an inventory,
and it's the inventory which allows you to discover that things have
moved or gone missing.

I'm just saying we should make the inventory more visible, so it can
be manipulated explicitly.  Then we don't need .arch-ids.  We can
still use taglines, but the tagline method will become more useful
even when many files don't contain taglines, because you won't need to
guess beforehand which files to explicitly add.
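
To make "manipulated explicitly" concrete, here is a small Python sketch of a single-file inventory with add, move and remove operations. The file format (one `id<TAB>path` line per file), the use of random ids, and all the function names are assumptions made for illustration; nothing here is an existing Arch format or command.

```python
import uuid


def load_inventory(path):
    """Read 'id<TAB>path' lines into a dict; a missing file is an empty inventory."""
    inventory = {}
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if line:
                    file_id, file_path = line.split("\t", 1)
                    inventory[file_id] = file_path
    except FileNotFoundError:
        pass
    return inventory


def save_inventory(path, inventory):
    """Write the inventory back as sorted 'id<TAB>path' lines."""
    with open(path, "w", encoding="utf-8") as f:
        for file_id, file_path in sorted(inventory.items()):
            f.write(f"{file_id}\t{file_path}\n")


def add_file(inventory, file_path):
    """Give a new file a fresh id and record it."""
    if file_path in inventory.values():
        raise ValueError(f"already in inventory: {file_path}")
    file_id = uuid.uuid4().hex
    inventory[file_id] = file_path
    return file_id


def move_file(inventory, old_path, new_path):
    """Record a rename: the id stays the same, only the path changes."""
    for file_id, file_path in inventory.items():
        if file_path == old_path:
            inventory[file_id] = new_path
            return file_id
    raise KeyError(old_path)


def remove_file(inventory, file_path):
    """Drop a file from the inventory and return the id it had."""
    for file_id, recorded in inventory.items():
        if recorded == file_path:
            del inventory[file_id]
            return file_id
    raise KeyError(file_path)
```

A tagline scan could feed the same structure: ids found in files would be merged into the dict, and only files without a tagline would need an explicit `add_file`.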

> I think that the only mistake Tom made was to allow human-readable
> inventory tags.  That leads immediately to issues of "is whitespace
> significant?" etc.  (Worse, comment delimiters---but only the
> trailing ones!---are significant.  Oops!)  All the rest of the warts
> were unavoidable, they came with the frog.

In terms of taglines, I agree.  But .arch-ids aren't
unavoidable---storing all the information in one file is an
alternative implementation.  It also happens to be what you want to do
anyway, even if you use taglines to collate the information.



