From: Greg A. Woods
Date: Sun, 17 Jun 2001 22:03:41 -0400 (EDT)

[ On Sunday, June 17, 2001 at 20:26:34 (-0400), Ralph Mack wrote: ]
> Well, that depends upon what you mean by "an addition", doesn't it,
> Greg? If you mean the deletion of the old object with all of its baggage
> and the creation of a new object _with all of the old object's baggage_,
> I would tend to agree that the sequence implements the concepts of
> renaming or moving an object.

Yes, of course -- a rename is an addition "with all the old object's
baggage".

> In a Unix file system, that baggage consists of a set of permissions,
> owner and group assignments, and perhaps a couple of things that I
> forget, but is nonetheless a relatively light bundle.

Funny you should mention that.  A unix filesystem directory structure is,
from the user's point of view, really just a map over the real
underlying unix filesystem.  When you "rename" a file you have to remove
an inode pointer and name entry from one directory file and add them to
another (or possibly the same) directory file.  In ancient unix versions
the rename was quite literally the creation of a "link" in the new (or
same with a new filename) directory file followed by the deletion of the
link in the old directory file (assuming the first half of the operation
succeeded, and of course assuming both directories are on the same
filesystem).  This of course made one of the most useful and simple
forms of file locking impossible, since it created a race condition
that, though thought to be impossible to exploit when first described,
actually turned out to be big enough to float a large hot-air balloon
through!  The "atomicity" of the modern rename(2) system call is simply
a "hack" to make this type of file locking possible without resorting to
creating new directory files (which has always been an "atomic"
operation due to the sensitivity of filesystem meta-data should the
system crash or otherwise fail during this operation -- underneath of
course it's no more atomic than any other kind of operation that takes
more than one step).
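
The two-step rename described above can be sketched roughly as follows.
This is only an illustration, assuming a POSIX system where Python's
os.link(), os.unlink() and os.rename() map onto link(2), unlink(2) and
rename(2); the helper names old_style_rename and demo are made up for
the example and are not taken from any real implementation.

```python
import os
import tempfile


def old_style_rename(src, dst):
    """Emulate the ancient two-step rename: link(2), then unlink(2)."""
    os.link(src, dst)  # step 1: add the new name (raises if dst exists)
    # Between the two steps both names refer to the same file; this is
    # the window a concurrent process could observe.
    window_seen = os.path.exists(src) and os.path.exists(dst)
    os.unlink(src)     # step 2: drop the old name
    return window_seen


def demo():
    with tempfile.TemporaryDirectory() as d:
        a, b, c = (os.path.join(d, n) for n in ("a", "b", "c"))
        with open(a, "w") as f:
            f.write("data\n")
        window_seen = old_style_rename(a, b)
        os.rename(b, c)  # modern rename(2): one call, no visible midpoint
        return {
            "window_seen": window_seen,
            "names_left": sorted(os.listdir(d)),
        }


result = demo()
print(result)
```

The point of the sketch is only that the link-then-unlink version has
an observable intermediate state, which is what a lock built on rename
cannot tolerate.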

> A file under revision control has a history of revisions (and - one hopes -
> ancestors and migrations) that describe all the important events of its
> colorful life. Does moving or renaming a file by deletion and addition
> retain the file's family history? It sounds to me as if it reduces it to a
> Unix file with none of these attributes and then recreates it as if it were
> newborn. All connection to the past is lost.

In CVS the history stays in place (it must, by design).  It's like the
unix filesystem in the sense that the data blocks, and indeed the inode
structure itself, stay in place when you rename the file (though in some
senses that's a terrible analogy for my purposes since in CVS you do
actually have to carry more of the baggage yourself! ;-).
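
The filesystem half of that analogy is easy to check.  The following is
a small sketch, assuming a POSIX filesystem and a rename within a single
directory; the function name inode_survives_rename is hypothetical and
the file names are just borrowed for illustration.

```python
import os
import tempfile


def inode_survives_rename():
    """Rename a file and report whether its inode and data stayed put."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "oldfile.c")
        new = os.path.join(d, "newfile.c")
        with open(old, "w") as f:
            f.write("int main(void) { return 0; }\n")
        inode_before = os.stat(old).st_ino
        os.rename(old, new)  # only the directory entry changes
        inode_after = os.stat(new).st_ino
        with open(new) as f:
            contents = f.read()
        return inode_before == inode_after, contents


same_inode, contents = inode_survives_rename()
print(same_inode, contents, end="")
```

Only the name moves; the inode number and the contents are the same
before and after.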

In the filesystem analogy the rename simply removes a pointer from one
file and adds it to another.  In CVS you simply record a "stop" in one
file and a "start" in another file (using the old file's contents as
the initial contents of the other file).  Perhaps if you pretend that in
CVS in theory all files exist at all times and in all locations in all
possible directory hierarchies, you'll be in a better position to
understand what's happening (consider the actual implementation simply
an optimisation of this ideal scenario).  I suppose having a bit of a
handle on temporal theory might help.  Try pretending your files are the
worlds from that not-so-great SF television series "Sliders".  They're
all there at the same time but you (or at least one instance of "you")
can only be in one of them at a time.  In CVS of course their history
does not progress while you're not there either -- they're not
preemptively scheduled!  ;-)

(the nifty, and complicating, difference is that with RCS, CVS, and
SCCS, etc. there's actually another dimension in space too [that's five
in total] -- branches!  ;-)

> Do you regard this loss as a frivolous concern?

THERE IS NO LOSS!  In fact if you do the remove and add with appropriate
commit comments there's actually an information gain -- and anyone
looking at the revision history of the files (from either direction)
will be able to follow what has happened with great accuracy.

> To me the power of source
> control lies not just in its ability to store all the versions of software
> but to accurately reflect the historic relationships of those versions.

CVS will not delete any revision history if you do a rename by way of
"cvs add" and "cvs rm" -- on the contrary it will add history entries
which detail this ``change'', provided that you (or your wrapper script
or front-end client) provide the appropriate change comments for each
operation.  Time for a concrete CVS example, I think:

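        cp oldfile.c newfile.c      # carry the current contents to the new name
        cvs rm -f oldfile.c         # delete the working file, schedule removal
        cvs add newfile.c           # schedule the new name for addition
        cvs commit -m 'renamed oldfile.c to newfile.c' oldfile.c newfile.c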

Now if you do "cvs log" on either name you'll find it very hard to miss
what's actually happened.  If the comment is even more detailed the
reason might even be apparent after the fact too!  ;-)

Think of the "cp" and combined "commit" parts of this sequence as the
baggage carrying bits.  The "cp" copies the current contents, and the
"commit" leaves pointers to the past and future revisions.
Hmmm... maybe that analogy with the unix directory tree isn't so bad
after all.  In Unix you have to "carry" the name (transforming it on the
way if it stays in the same directory or changes in the new directory),
while in CVS you have to carry both the name and the current file
contents.  In both the attributes (in Unix the file blocks and the inode
structure; and in CVS the revision history) stay in place.

What I find is really most important about the relationships of
revisions is not how they relate to one another along one "timeline",
but rather how the different revisions of *different* files relate to
one another at any given point in time.  CVS does this (best) with tags.

Certainly the relationships of the revisions for any one file are
important across time too, especially if you do branching and merging.
Obviously in CVS if you do the "Sliders" thing too often with too many
files then you'll make your branching and merging into a much more
difficult job.  Each renamed file will have to have all the appropriate
branches created in it and any merges will require manual handling of
all renamed files (which can turn into a real nightmare if you move
all/many of the files in one directory into another).  These limitations
in the base CVS functionality should encourage users to think very hard
about renames and indeed the multi-step operation is a good reminder of
the potential for future difficulties.

In all my time of tracking various incarnations of *BSD with CVS I've
only renamed about a half-dozen files.  I often do dread merging them,
but it's not really that hard to do once I get in the mood to do
it.... :-)  (I've actually moved whole sets of files to new directories
too, but also transformed them drastically at the same time -- e.g.
moving gnu/awk to bin/awk and changing it into The One True AWK at the
same time, and of course then I don't have to do any merging....)

> After corporate restructurings, when there is nobody to ask, sometimes only
> the source control system can give any clue as to what really happened -
> and sometimes we can even infer why.

Indeed -- which is why CVS, when used carefully, is a very powerful tool!  :-)

                                                        Greg A. Woods

+1 416 218-0098      VE3TCP      <address@hidden>     <address@hidden>
Planix, Inc. <address@hidden>;   Secrets of the Weird <address@hidden>
