gnu-arch-users

Re: cvs2arch (was Re: [Gnu-arch-users] an hack.. one night long)


From: Jan Harkes
Subject: Re: cvs2arch (was Re: [Gnu-arch-users] an hack.. one night long)
Date: Sat, 23 Aug 2003 13:17:25 -0400
User-agent: Mutt/1.5.4i

On Sat, Aug 23, 2003 at 04:04:29PM +0200, wave++ wrote:
>   http://www.yuv.info/~wavexx/hacks/cvs2arch-perf.png

Interesting. Did you by any chance disable the pristine tree?

> I managed to place all useful informations in the same graph.
> We have:
> 
>   red: the time (in milliseconds) needed to update source using cvs.
>   green: the time needed by tla to commit the tree.
>   blue: the size (in kilobytes) of the whole arch working tree (includes
>         {arch} size).
>   magenta: the size of the sources (same as blue, but without {arch}).

Is red a CVS checkin, or a checkout?

Very interesting how the tla commit time closely tracks the additional
size of the {arch} directory tree. I guess the only change there should
be the addition of one logfile, which should be very quick for diff.

> As we see, cvs times tend to descend. This is probably thanks to the
> reversed patch format that RCS uses. tla instead tends to take a time
> that's roughly linear to the patchset number (not really affected by the
> sources: note the spike at patchset #50).

I don't think that has anything to do with forward or reverse deltas. In
fact, commits should be faster with forward deltas: only the modified
data is written to disk, instead of having to copy the whole tree to the
head revision. It is probably not noticeable on a 4MB tree, but just try
to import a couple of Linux kernel revisions.

Reverse deltas really only improve a full checkout of the head revision,
which should only be noticeable when a new developer joins a project, or
if you like to throw your working tree away once in a while.
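
To make the trade-off concrete, here is a toy model of the two storage
schemes. It is only a sketch under my own assumptions: a tree is a dict
of path to content, deletions are ignored, and the reverse store is
assumed to rewrite the full head on every commit. ForwardStore,
ReverseStore and tree_size are hypothetical names, not anything taken
from tla or RCS.

```python
from dataclasses import dataclass, field


def tree_size(tree):
    """Total bytes of file content in a {path: content} tree."""
    return sum(len(content) for content in tree.values())


@dataclass
class ForwardStore:
    """Hypothetical store: full base revision plus forward deltas."""
    base: dict
    deltas: list = field(default_factory=list)

    def commit(self, changes):
        """Append one delta; return the bytes written (changed data only)."""
        self.deltas.append(dict(changes))
        return tree_size(changes)

    def checkout_head(self):
        """Rebuild head from the base; return (tree, deltas_applied)."""
        tree = dict(self.base)
        for delta in self.deltas:
            tree.update(delta)
        return tree, len(self.deltas)


@dataclass
class ReverseStore:
    """Hypothetical store: full head revision plus reverse deltas."""
    head: dict
    reverse_deltas: list = field(default_factory=list)

    def commit(self, changes):
        """Save the old content as a reverse delta and rewrite the head.

        Assumption of this model: the whole new head is written out,
        plus the old content of every changed file.
        """
        undo = {path: self.head.get(path, "") for path in changes}
        self.reverse_deltas.append(undo)
        self.head = {**self.head, **changes}
        return tree_size(self.head) + tree_size(undo)

    def checkout_head(self):
        """Head is stored whole, so no deltas need to be applied."""
        return dict(self.head), 0
```

In this model, one 1000-byte change to a 2000-byte tree costs 1000 bytes
written for the forward store and 3000 for the reverse store, while a
head checkout applies one delta in the forward store and none in the
reverse store.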

Arch uses the pristine tree to avoid many performance problems in
situations where it would need to perform a full checkout of the head
revision (update/what-changed/etc.). Updating a tree using an
incremental updating technique (something like tla replay) shouldn't be
better or worse in either situation because it will have to apply the
individual deltas either way.

> arch @ patch 280 is roughly 8 times slower than arch @ patch 5 in this
> case.

Notice how it closely tracks the size of the complete archive, while the
size of the actual source tree isn't changing much. Either you do not
have a pristine tree and tla has to check out the last head revision on
every commit, or something fishy is going on in the code that diffs the
logs in the working directory against those in the pristine tree.

Only one file should be added, and that cannot be all that expensive for
diff. My guess is that all the data in both trees gets read twice: once
by tla because of a bad call to safe_stat in libarch/diffs.c, and a
second time by diff because timestamps are not trusted.
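
To illustrate why trusting timestamps matters here, below is a rough
sketch of a stat-first tree comparison. This is not tla's code and does
not reproduce safe_stat; compare_trees, tree_files and same_content are
hypothetical helpers, and the rule "same size and same mtime means
unchanged" is an assumption of the sketch.

```python
import os


def tree_files(root):
    """Map each relative file path under root to its os.stat() result."""
    found = {}
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            found[os.path.relpath(full, root)] = os.stat(full)
    return found


def same_content(path_a, path_b):
    """Compare two files byte for byte (reads both completely)."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return fa.read() == fb.read()


def compare_trees(working, pristine, trust_timestamps=True):
    """Return (changed_paths, files_read) for two directory trees.

    Files present in only one tree are reported without being read.
    When trust_timestamps is true, a file with equal size and mtime in
    both trees is assumed unchanged and is never opened.
    """
    w, p = tree_files(working), tree_files(pristine)
    changed = sorted(set(w) ^ set(p))  # added or removed files
    files_read = 0
    for rel in sorted(set(w) & set(p)):
        sw, sp = w[rel], p[rel]
        if (trust_timestamps
                and sw.st_size == sp.st_size
                and sw.st_mtime_ns == sp.st_mtime_ns):
            continue  # metadata matches, skip reading the contents
        files_read += 2
        if not same_content(os.path.join(working, rel),
                            os.path.join(pristine, rel)):
            changed.append(rel)
    return changed, files_read
```

With trusted timestamps, a commit that only adds one log file reads no
existing file at all; without them, every file common to both trees is
read, so the cost grows with the total size of the tree.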

Jan




