info-cvs

Re: checking in links to source control


From: Paul Sander
Subject: Re: checking in links to source control
Date: Tue, 11 Sep 2001 21:23:04 -0700

>--- Forwarded mail from address@hidden

>So here's my proposal:

>1) that CVS have a generic way of storing - and representing in storage - *any*
>   type of data that can occur on an OS, and how they are stored. We already 
>   have two types - 'file', and 'dir'. To that - on unix - we could add 
>   'symlink', 'permission', 'user', and 'group'. On NT, we could add 'shortcut',
>   'properties' (and god knows what else)

Unfortunately, CVS has only one type of artifact that it controls:  files.
It does NOT control directories in the sense that they're under source
control.  Instead, it creates directories as needed in the
repository and in workspaces to contain files.  This is not the same thing
as controlling them; there is no history, no branching, no tags, etc.  This
is borne out by numerous holes in the design in this regard, e.g. the
inconsistent handling of empty directories upon checkout.

I do agree that CVS should add support for more kinds of filesystem
artifacts, particularly directories.  I'm skeptical about symlinks because
they're not portable, and I prefer to be able to store stuff that can be
retrieved on any platform.

On the other hand, I can always ignore the capability to store them if I
don't like it.  But I caution you on this:  If you have similar capabilities
on different platforms, symlinks on Unix, shortcuts on Windows, aliases on
MacOS, you will not be able to store them in the same place in the filesystem
under CVS.  I believe that the temptation is great to lump these all together
as a single artifact given their similarity, hence developers will
prefer to store them in the same place in the filesystem and will complain
if they can't.

I'm also not convinced that permissions, users, and groups are actual
filesystem artifacts.  I see them more as attributes of filesystem artifacts,
though there is something to be said for maintaining a user database
separate from the system's password file.

Finally, I believe that CVS should provide a means of extending its
notion of what's allowed to be stored in it.  Replacing RCS with a storage
mechanism that's more suited to the type of data stored in an artifact
(storing AppleDouble files is one ready application of such a feature,
as is storing entire directory trees such as those produced by NextStep),
changing the diff and merge algorithms for different types of data, and
other stuff come to mind.

>2) that this 'generic way of storing' is all due to work on the client side -
>   the client 'serializes' the special files by making and storing them as 
>   attributes inside the ,v file description itself. If the client can't 
>   serialize them properly, then too bad.

Do you mean that the client should reduce the filesystem artifacts to a
form that the CVS back-end can store?  Interesting notion, but you must
be very careful to make sure that all of the clients agree on how to
manage the artifacts.  My preference is to have the back-end define the
interface, thus requiring the clients to conform to some standard.  The
clients and back-end can negotiate which parts of the standard they agree
upon, perhaps through a simple protocol version number.
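
A minimal sketch of that kind of negotiation, assuming each side simply
advertises the protocol versions it speaks.  The version numbers and the
feature table are hypothetical, not part of any real CVS protocol:

```python
# Hypothetical table: which artifact types each protocol version covers.
FEATURES_BY_VERSION = {
    1: {"file"},
    2: {"file", "directory"},
    3: {"file", "directory", "symlink"},
}


def negotiate(client_versions, server_versions):
    """Return the highest protocol version both sides support, or None."""
    common = set(client_versions) & set(server_versions)
    return max(common) if common else None


def agreed_features(client_versions, server_versions):
    """Return (version, artifact types) that client and back-end agree on."""
    version = negotiate(client_versions, server_versions)
    if version is None:
        raise ValueError("no protocol version in common")
    return version, FEATURES_BY_VERSION[version]
```

Because the back-end owns the table, a client can only ever use artifact
types that the agreed version defines.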

However, I do recognize the need for the client to cooperate with the
server to pack the artifacts properly.  That will be necessary to store
some of the artifacts I've listed above.

>3) that each attribute is handled by the client, but that only certain types of
>   attributes are turned on by 'default', ie: that have intelligent defaults
>   given the OS that they are working in, and that others need to be explicitly
>   stated when put in. For example -

>   cvs commit <link_path>

>   can be on by default because there are no backwards compatibility issues, but

>   cvs commit -p (preserve permissions on files on checkin)

>   can not because old versions of cvs aren't capable of handling it.

Okay, so I add a file using my Unix client, which sets an attribute.  You
check out that file with your Windows client, and it interprets that
attribute.  Given that the client's interpretation of the attribute is
outside the protocol specification, how can either of us be sure that the
interpretation of the attribute is usable?  Worse yet, what if two
separately implemented clients happen to use the same attribute for different
purposes?

>4) On checkout, whatever attributes are set are processed by the client. If a 
>   given attribute can not be processed by a client, a warning is given 
>   (suppressible via an 'ignore' command line option or attribute) saying that 
>   that file is not portable on a given system.

>5) attributes can be programmed in the same way that apache can be extended:
>   by modules that hook on to the client and both do the serialization,
>   and process it upon checkout. 

>   However, I don't think that it should be *trivially* programmable. One should
>   need to write a C-module to hook into the CVS code - this would prevent a lot
>   of frivolous modules and keep the cvs source coherent. I think though that
>   these modules should be released as part of the cvs tree, and not as separate
>   entities.

I think that attributes should be first-class features, which convert to
newphrases in the RCS files.  That way, we can guarantee that they'll work
the same way for all of the clients that support them.
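
As a rough illustration, an attribute could be written out as a newphrase
of the form "id word* ;", with values quoted as RCS strings (delimited by
@, with embedded @ doubled).  The attribute names below are hypothetical,
and whether a given RCS or CVS release preserves unknown newphrases would
need to be checked:

```python
def rcs_string(value):
    """Quote a value as an RCS string: wrap in @ and double any embedded @."""
    return "@" + value.replace("@", "@@") + "@"


def newphrase(name, *values):
    """Format one newphrase: an identifier, zero or more words, then ';'."""
    words = "".join(" " + rcs_string(v) for v in values)
    return name + words + ";"


# Hypothetical attribute names, for illustration only.
print(newphrase("permission", "0644"))
print(newphrase("owner", "user@example.org"))
```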

The down side to putting them in the RCS files is that they can't be
efficiently queried for by the user.  But then, the folks who need that
capability in CVS have traditionally augmented it with a relational database
to handle this.

I also believe that the extensions I've listed above should be trivially
programmable.  Not only that, but there should be some kind of inheritance
mechanism that allows shops to define new types of artifacts that are very
much like other types, except in some small way.
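
One way to picture such an inheritance mechanism, with entirely
hypothetical class names (nothing like this exists in CVS), is a base
artifact type whose diff a derived type overrides in one small way:

```python
import difflib


class TextArtifact:
    """Hypothetical base type: line-oriented content with a textual diff."""
    type_name = "text"

    def diff(self, old, new):
        return list(difflib.unified_diff(
            old.splitlines(), new.splitlines(), "old", "new", lineterm=""))


class SortedListArtifact(TextArtifact):
    """Like text, except that line order is ignored when diffing."""
    type_name = "sorted-list"

    def diff(self, old, new):
        def canon(text):
            return "\n".join(sorted(text.splitlines()))
        # Reuse the inherited diff on the order-normalized content.
        return super().diff(canon(old), canon(new))
```

A shop would define only the difference (here, the canonical ordering)
and inherit everything else from the parent type.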

>6) things like diffs between versions in source control are handled by doing 
>   diffs *with the tags intact*. 

>   For example, suppose someone checks in a file with an attribute showing 
>   permissions 644, and then checks in the same file with permissions 755.

>   If they type:

>   cvs diff -r1.1 -r1.2 <file>

>   then it should show something like:

>   < @permission=0644
>--
>   > @permission=0755

I disagree that the diffs should list all of the attributes, at least
by default.  Diffs should operate on the content of the file, not its
metadata.  However, I can see the value of adding a switch to enable
attribute dumps.  There is also the possibility of refining an existing
artifact type to replace its diff algorithm with one that displays changes
to attributes.
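
A sketch of the switch idea, assuming attributes are available as simple
name/value mappings.  The function name, the show_attrs flag, and the
"@name=value" output format (borrowed from the proposal quoted above) are
illustrative assumptions, not an existing cvs diff option:

```python
import difflib


def artifact_diff(old_text, new_text, old_attrs, new_attrs, show_attrs=False):
    """Diff file content; append attribute changes only when asked to."""
    lines = list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        "old", "new", lineterm=""))
    if show_attrs:
        for key in sorted(set(old_attrs) | set(new_attrs)):
            before, after = old_attrs.get(key), new_attrs.get(key)
            if before != after:
                if before is not None:
                    lines.append(f"< @{key}={before}")
                if after is not None:
                    lines.append(f"> @{key}={after}")
    return lines
```

With the flag off, a permissions-only change produces an empty diff; with
it on, only the changed attributes are listed.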

But I have to ask:  How do you propose storing changes of attributes when
the content of the file itself doesn't change?  Some attributes have the
scope of the entire version tree (e.g. keyword expansion), while others may
change from version to version (e.g. RCS "state").
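
To make the two scopes concrete, here is a small sketch of a store that
keeps tree-wide and per-version attributes apart.  The class and the rule
that a per-version value overrides a tree-wide one are assumptions made
for illustration; it does not settle how an attribute-only change would be
recorded as a revision:

```python
class AttributeStore:
    """Hypothetical store separating tree-wide from per-version attributes."""

    def __init__(self):
        self.tree = {}          # scope: the entire version tree
        self.per_version = {}   # revision number -> {name: value}

    def set_tree(self, name, value):
        self.tree[name] = value

    def set_version(self, revision, name, value):
        self.per_version.setdefault(revision, {})[name] = value

    def lookup(self, revision, name):
        """Per-version value if present, else the tree-wide one, else None."""
        version_attrs = self.per_version.get(revision, {})
        if name in version_attrs:
            return version_attrs[name]
        return self.tree.get(name)


store = AttributeStore()
store.set_tree("expand", "kv")            # keyword expansion: whole tree
store.set_version("1.2", "state", "Rel")  # RCS state: one revision only
```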

>7) things like diffs between a version in source control and a version in a 
>   directory are handled by:

>   a) serializing the version in the directory with all 'default' 
>      serializations for the given os. 

>   b) serializing the version in the directory with 'optional' serializations
>      on the command line.

>   c) comparing the resulting serialized file to the file inside CVS


>Examples - 

>   cvs commit linkname

>head    1.1;
>...
>...
>desc
>@@
>@link=/path/to/link

>   cvs commit shortcut

>head    1.1;
>...
>...
>desc
>@@
>@shortcut=.... \
>       attributes=archive
>       shortcutkey=alt-x


>Now - does anyone have any technical comments on how feasible this would be to 
>implement? Or holes in my design that I am missing that I need to fill? I 
>understand philosophical objections to my style of project management - they 
>are duly noted - but I'd rather that the tool fit my versioning needs. 

>Which at the basic level is to store a bunch of 'stuff' where stuff can be 
>basically anything under the sun and not just files and directories. And which
>is probably - but not necessarily - OS independent.

>--- End of forwarded message from address@hidden



