gnu-arch-users

Re: [Gnu-arch-users] Why we might use subversion instead of arch.


From: Pierce T . Wetter III
Subject: Re: [Gnu-arch-users] Why we might use subversion instead of arch.
Date: Fri, 20 Feb 2004 12:41:05 -0700

  Now for the bad stuff:

  Ok, so I tried experimenting with arch. The first thing I did was
check out something from a public arch repository. I got quite a shock.
Evidently, every arch repository stores the "base code", then
follows that with a series of forward patches. This is quite different
from most other version control systems, which store the head version
as "truth" and then keep reverse patches going backwards. The net
effect of this is that checking out that version required downloading
not just the latest code, but downloading all the patches in between.

That's not quite true.   Arch is actually quite flexible in this area
and is becoming more so.

What you've described could be called the "baseline behavior" of arch
archives.  If nobody takes any other steps, things work as you've
described.

You're obviously Mr. "nuts and bolts", while I'm Mr. "UI", which sort of
implies a 10,000-foot perspective. Half my problem with arch is that every
version control task seems to break down into several steps. It's more
flexible that way, I suppose, but harder to learn for the general case.


However, arch has some facilities that change that baseline
behavior:

"Archive revision caching" is one mechanism -- see `tla cacherev -H'.
That can spare you from downloading all of those intermediate
revisions.

 Ah, I didn't know about that. So basically, as part of our deployment
cycle, we would tag like we already do, then cache revisions in the master
archive.
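To put rough numbers on that, here is a toy model of the trade-off. It is only a sketch under stated assumptions: revisions are numbered 0 (the base import) through N, each revision after the base is stored as one forward patch, and a cached revision is a full tree stored next to its patch. The name `fetch_plan` and the numbering scheme are made up for illustration and are not part of tla.

```python
def fetch_plan(target, cached=()):
    """Return (start, patches) needed to build revision `target`.

    Revision 0 is the base import; revision k is the base plus patches 1..k.
    `cached` lists revisions for which a full tree is stored (hypothetical
    stand-in for what `tla cacherev` provides).
    """
    if target < 0:
        raise ValueError("target revision must be >= 0")
    # Only a cached tree at or before the target helps, because patches
    # in this model run forward only.
    usable = sorted(r for r in cached if 0 <= r <= target)
    start = usable[-1] if usable else 0
    patches = list(range(start + 1, target + 1))
    return start, patches


if __name__ == "__main__":
    for cache in ([], [40], [50]):
        start, patches = fetch_plan(50, cache)
        print(f"cached={cache}: start from revision {start}, "
              f"apply {len(patches)} patches")
```

In this model, checking out revision 50 with no cache means fetching the base plus all 50 patches, while caching revision 40 at a release tag cuts that to one full tree plus 10 patches.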

"Revision libraries" are another mechanism.   They can spare you from
having to build various historic revisions that arch may need during
its operation.

"Local mirrors" are another mechanism.   They can help you minimize
network costs.

In practice, I think most arch users are like me and use a combination
of those to get very good performance.   The specific combination of
those features that's best for your situation varies depending on your
network and hardware topology and usage patterns.


  Ok, so we would have this topology:
   master repository:  available via WebDAV or rsync.
     Per user:
       Local mirror of master (can be one per user's network for telecommuters)
       Work-in-progress archive created as needed from local mirror.

So to commit, I would first have to update my local mirror. (tla archive-mirror)
Then I update my work in progress archive. (tla update)
Then I commit directly to the master repository, since I can't commit to the local mirror?
    (I'm a little iffy on exactly how you do this, but ok.)
Then I update my local mirror, which pulls down my change. (tla archive-mirror)

So that doesn't sound so bad, except I wish that archive mirrors knew they were
mirrors, so that update/commits were a single operation.
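The "single operation" could be approximated by a small wrapper. The following is a dry-run sketch only: it builds the command sequence from the steps listed above and runs nothing. The function name `commit_via_mirror`, the archive name, and the exact argument forms are assumptions for illustration; check `tla archive-mirror -H`, `tla update -H`, and `tla commit -H` against your own setup before relying on them.

```python
import shlex


def commit_via_mirror(archive, summary):
    """Return the tla commands, in order, for one mirror-aware commit.

    Nothing is executed; the caller decides whether to run or print them.
    `archive` is a hypothetical registered archive name.
    """
    refresh_mirror = ["tla", "archive-mirror", archive]
    return [
        refresh_mirror,                    # 1. pull new revisions into the local mirror
        ["tla", "update"],                 # 2. bring the working tree up to date
        ["tla", "commit", "-s", summary],  # 3. commit (assumed to go to the master)
        list(refresh_mirror),              # 4. pull the new commit back into the mirror
    ]


if __name__ == "__main__":
    # Archive name and summary are placeholders.
    for cmd in commit_via_mirror("dev@example.com--2004", "fix typo in README"):
        print(shlex.join(cmd))
```

Returning the plan as data keeps the sketch safe to try: you can print it, review it, and only then hand each list to something like `subprocess.run`.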

Now I read some stuff on the wiki about how you can make all that
faster by making a new archive (which moves the base), but I shouldn't
have to change my work process to make the version control system
efficient.

You aren't really moving the base -- you're just mirroring it
locally.  It's a foreign concept if you are used to CVS but, once you
get the hang of it, you'll find that it's fast and convenient.

Yeah, I guess I'd rather see something like "--make-local-cache ~/tlacaches" that made a sort of transparent mirror of a remote archive, with the understanding
that updates and commits would flow through the mirror.



  The next thing I noticed was that while CVS and Subversion let you
structure your projects and sub projects via the filesystem, arch
really tries to grab the whole filesystem as one unit. You can override
this a bit, but it involves setting up some config files. Config files
that are kind of poorly documented (based on the fact that I couldn't
make heads or tails of the explanation). This makes a lot of sense for
open source projects focused on a single executable, but makes much
less sense for us. I suspect most people deal with this by just having
lots of arch repositories:

    /archrepositories/blessed/tool
    /archrepositories/blessed/library
    /archrepositories/blessed/application

  But that would be a nightmare for us.

I wonder if, at your shop, you have someone whose job it is to manage
your process and set up tools that make it easy for developers.

  That would be one of the developers most likely.

Far from being a nightmare, _if_ you can get over the initial hump of
figuring out how to set things up, I think you'll find that arch
enhances your processes and makes them work better.

Oh, yeah, arch is really intriguing. It in many ways seems closer to how we
actually need to work. It's the learning curve I'm having a problem with.


  The next thing I found was that it was SLOW. tla is kind of brute
force, and all that diff-ing, tar-ing, and compressing can take quite a
while.

It does take a little bit of care to set up a tla environment that is
fast but many people have done so successfully.   Once you do so, I
think you'll find many advantages compared to CVS and SVN -- but it
does take some care.

  It seemed slow even on the local filesystem.


  So at this point, while the distributed repository stuff was cool, I
had to conclude that arch works best for working on open-source
development where you don't submit code so much as you submit patch
files, and you need to merge patches from multiple places. From that
point of view, arch is great. From ours, ugh.

I think you are overlooking how similar the workflow you've described
is to "open source stuff".

Well, everyone has patches that they have to submit to the master. The thing
that bothers me about arch is that I'm not sure it would handle our
three-level-deep nesting of projects well. There's this
archive/category/project hierarchy, but we really have:
 archive/category/sub-category/project.



  How I would improve arch:

    Fundamentally, I think that arch should store HEAD, with reverse
patches, rather than START with forward patches.

Arch already allows you to achieve, in essence, the same effect.   You
just need some help deploying it in your environment.

Arch lets me cache if I understand correctly. My objection is more philosophical:

  The build source should match some set of files on the repository.

(pretend you don't trust patch. In that case, you'd really like a snapshot of HEAD
to always be stored somewhere...)



    The rsync protocol would make more sense than webdav or ftp.

For mirroring?   You can already use rsync.

 Ah, ok.

Improve the documentation. Especially needed is a section on arch concepts,
so that you don't have to pick up everything by osmosis.

Yup.   In the meantime, you might consider a small-scale consulting
gig for one of the experienced arch users.   I would hope that you can
find someone on the list who can help you with your evaluation and
prospective infrastructure design for a few hundred bucks.

  Yeah, I think I'd have them write up a:

   "Getting started with arch for a development group"

 document for public use, rather than set up our archive in particular.

Or, like I said at the bottom of that long mail, have a bunch of higher-level
scripts that automate things from the use-case point of view. They would have
to make some assumptions that tla doesn't make, but they would make switching
to arch easier until you wanted to do something off the beaten path.

 Pierce




