gnu-arch-users

Re: [Gnu-arch-users] Google Summer of Code


From: Stephen J. Turnbull
Subject: Re: [Gnu-arch-users] Google Summer of Code
Date: Sat, 22 Apr 2006 21:54:55 +0900
User-agent: Gnus/5.1007 (Gnus v5.10.7) XEmacs/21.4.19 (linux)

>>>>> "Thomas" == Thomas Lord <address@hidden> writes:

    Thomas> It seems to me that what you are really saying is that
    Thomas> performance was, in general, a barrier to adoption.

Definitely.

    Thomas> Performance of what operations, though?  For example, my
    Thomas> non-scientific impression is that if `commit' were much
    Thomas> faster we would have had a far better chance of winning
    Thomas> the heart and mind of Linus.  That wasn't his only issue
    Thomas> with Arch but I think it was the deciding one.

Sorry to say, but after trying git pretty seriously, I don't think you
had a shot.  I'm not sure I can put my finger on it, but I'll give it
a try.

Arch, like Darcs, is patch-oriented.  It's also history-oriented.
This is inherently a bottleneck in any merge-oriented process, because
when you hit a conflict, it pertains to a given patch in some order.
A patch-oriented SCM needs to stop there to avoid horking things even
worse.  Darcs tries to get around this with "patch theory," but there
are some things (e.g., files with "hot spots," like ChangeLogs) that
Darcs doesn't handle any better than anything else.  And its algebraic
manipulation of the patch chain also has bad algorithmic properties;
sometimes I think it's exp-exp. ;-)

git, on the other hand, is snapshot-oriented, with an efficient
representation of the snapshots.  A patch is defined as the diff of
two snapshots.  It is no better than Arch or Darcs at avoiding
conflicts, of course; in fact, it is probably substantially worse.  In
my limited experience with all three, though, there are two things it
does substantially better than either.  (1) Since you "teleport" directly
from here to there, *all* of the conflicts show up in one shot, giving
the manager a better shot at deciding whether the merge is feasible,
or if he needs to go to "Plan B."  (2) It's much easier (again in my
limited experience) to figure out what "Plan B" is.
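
To make the snapshot model concrete, here is a minimal sketch in Python.
It is a toy, not git's actual data format or merge algorithm: the names
snapshot, diff_snapshots, and conflicting_paths are hypothetical, and
conflicts are detected only at whole-file granularity.  It shows a patch
being derived as the diff of two snapshots, and every conflicting path
of a merge being reported in a single pass.

```python
from __future__ import annotations

import hashlib


def blob_id(content: bytes) -> str:
    """Content-address a file body (plain SHA-1 of the bytes; an assumption)."""
    return hashlib.sha1(content).hexdigest()


def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """A snapshot maps each path to the id of its content."""
    return {path: blob_id(data) for path, data in files.items()}


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """A 'patch' here is nothing more than the difference of two snapshots."""
    return {
        "added": sorted(p for p in new if p not in old),
        "removed": sorted(p for p in old if p not in new),
        "modified": sorted(p for p in old if p in new and old[p] != new[p]),
    }


def conflicting_paths(base: dict[str, str],
                      ours: dict[str, str],
                      theirs: dict[str, str]) -> list[str]:
    """Report every path that both sides changed, and changed differently.

    The whole list is produced at once, with no replay of intermediate
    patches, so the person merging sees the full extent of the damage.
    """
    conflicts = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o != b and t != b and o != t:
            conflicts.append(path)
    return conflicts


if __name__ == "__main__":
    base = snapshot({"a.c": b"1", "b.c": b"2", "ChangeLog": b"x"})
    ours = snapshot({"a.c": b"1", "b.c": b"3", "ChangeLog": b"x ours"})
    theirs = snapshot({"a.c": b"1", "b.c": b"2", "ChangeLog": b"x theirs", "c.c": b"new"})
    print(diff_snapshots(base, theirs))
    print(conflicting_paths(base, ours, theirs))
```

In the usage at the bottom, only the "hot spot" file is changed
differently on both sides, so it is the only path reported as a
conflict; b.c (changed on one side only) and c.c (added on one side
only) merge cleanly.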

Regarding performance, I'm not using git across a network, except to
occasionally update git itself.  However, local operations all seem to
be basically O(diffsize).  Commits are usually instantaneous, diffs
seem to go about as fast as the output device can handle, etc.  I
tried some experiments with micro-branching in Arch; they ran into a
performance bottleneck pretty quickly.  Branches in git are plenty
lightweight for micro-branching (which git people call "topic
branches").  They're easy to make, they're easy to commit to, and
they're fast enough to switch back and forth in a single workspace for
many of my purposes.  I *never* would have tried that in Arch, but
even on the abysmally performing Mac file system, it's very doable in
git.

The other thing that I like about git is that it constructs history
from a tree of objects; it doesn't store it centrally.  This means
that you can back-build a repository fairly easily.  This is important
to me because I'm transitioning from a badly broken CVS repository and
want to untwist the history.
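
A rough sketch of why that makes back-building easy, again in Python and
again a toy rather than git's real object format: the store, commit, and
history names are hypothetical, and objects are hashed as JSON purely
for illustration.  History is not recorded anywhere; it is whatever you
get by following parent pointers from a head.  Inserting an older
revision underneath existing ones means writing new objects with new
ids, and the old chain is left untouched.

```python
from __future__ import annotations

import hashlib
import json

# Hypothetical in-memory object store: object id -> object.
store: dict[str, dict] = {}


def put(obj: dict) -> str:
    """Store an object under the hash of its serialized form."""
    data = json.dumps(obj, sort_keys=True).encode()
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = obj
    return oid


def commit(tree: dict[str, str], parents: list[str], message: str) -> str:
    """A commit is a snapshot (tree) plus pointers to its parent commits."""
    return put({"tree": tree, "parents": parents, "message": message})


def history(head: str) -> list[str]:
    """Derive history by walking parent pointers; nothing central is consulted."""
    seen: set[str] = set()
    stack = [head]
    messages = []
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        messages.append(store[oid]["message"])
        stack.extend(reversed(store[oid]["parents"]))
    return messages


if __name__ == "__main__":
    # First attempt at an import: two revisions.
    c1 = commit({"f.c": "v1"}, [], "initial import")
    c2 = commit({"f.c": "v2"}, [c1], "fix bug")

    # Later, an older revision is recovered; rebuild the chain on top of it.
    c0 = commit({"f.c": "v0"}, [], "recovered older revision")
    c1b = commit({"f.c": "v1"}, [c0], "initial import")
    c2b = commit({"f.c": "v2"}, [c1b], "fix bug")

    print(history(c2))
    print(history(c2b))
```

Because a commit's id covers its parent ids, the rebuilt commits get
different ids from the originals even though their trees and messages
are identical; both chains coexist in the store until one is discarded.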

Do you care?  I don't know.  Good performance on simple operations is
always nice, but might not be the sine qua non if you can get superior
merge capability.  The success of various cvs2* scripts shows that most
CVS repositories aren't as broken as XEmacs's.


-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.



