info-cvs

Re: Reduced performance related to the number of branches


From: Mark D. Baushke
Subject: Re: Reduced performance related to the number of branches
Date: Tue, 07 Feb 2006 08:22:15 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

simmo <address@hidden> writes:

> I have been using CVS for some time now and regularly make use of
> branches to carry out new lines of development without affecting the
> core code base and as far as I am concerned it works very well.
> 
> A friend of mine has been using CVS for some time and also wants to
> start making more use of branches, but he has had some opposition from
> some of his team members. One of the main reasons they have come up
> with, for not using branches is that when lots (not sure how many lots
> are) of branches are created, the performance of CVS is greatly
> reduced! I have never heard this ever mentioned before. I certainly
> haven't read it on any posts on here, in the CVS book I have or after
> doing a quick search on the web.
> 
> As I understand it, there is not a great deal of difference between a
> tag and a branch, both of which are essentially entries in a file
> somewhere in the repository. So I guess that the more entries that are
> made the longer it could take to search for a given branch or tag name,
> but as I see it the cost would be negligible. I also assume the same
> could be said about creating lots of revisions of files as well.
> 
> Unfortunately I can't really offer my friend a cast iron guarantee
> whether I am right or his team are right - which I'm finding
> frustrating! I certainly haven't got the time or the inclination to do
> any tests so I was hoping someone would be able to confirm one way or
> the other, or maybe point me in the direction of a web page or something
> that does.

RCS tags are not expensive. Even lots of tags are not a big problem.

CVS is mostly using RCS format under the covers. A new CVS branch when
created does not actually introduce any new delta records until the
first commit to that branch which diverges from the parent revision.

It is always fastest to check out the latest main trunk revision, as that
revision is essentially kept intact in the RCS ,v file. To go back
to previous revisions means applying reverse deltas.

A branch is a forward delta from the baseline revision on which the
branch is formed. So, as new revisions are added to both the main trunk
and a branch, it becomes necessary for RCS to apply reverse deltas
back to the common ancestor of the main trunk and the branch and then
to apply forward deltas until the tip of the branch revision is reached.

So, over time, long-lived branches will take more work and potentially
more time. However, this will NOT affect the length of time it takes
to check out a new revision and will only minimally affect the amount of
time it takes to rewrite the file when adding a new delta revision into
that file.
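
As a rough illustration of the reverse-then-forward walk described above,
here is a minimal Python sketch that counts the deltas needed to rebuild
a given revision. It is only a model, not how CVS or RCS is implemented:
the function name delta_counts is made up for this example, it assumes
the trunk stays on a single major number (1.N), and it assumes the usual
RCS numbering where 1.N.B.M is the M-th revision on branch B forked from
trunk revision 1.N.

```python
from typing import Tuple


def delta_counts(head: str, target: str) -> Tuple[int, int]:
    """Return (reverse, forward) delta counts needed to rebuild `target`.

    `head` is the newest trunk revision (for example "1.20"), which the
    ,v file keeps essentially intact. `target` is a trunk revision
    ("1.5") or a branch revision ("1.5.2.3").
    """
    head_major, head_minor = (int(p) for p in head.split("."))
    parts = [int(p) for p in target.split(".")]
    if len(parts) < 2 or len(parts) % 2 != 0:
        raise ValueError("not a revision number: " + target)
    if parts[0] != head_major or parts[1] > head_minor:
        raise ValueError("target is not reachable from head in this model")

    # Walk back along the trunk from the head to the branch point.
    reverse = head_minor - parts[1]
    # Then walk forward along each branch level to the requested revision.
    forward = sum(parts[3::2])
    return reverse, forward


# Trunk head: nothing to apply.
print(delta_counts("1.20", "1.20"))
# Old branch forked at 1.5, three commits on the branch.
print(delta_counts("1.20", "1.5.2.3"))
# Branch forked from the current head, three commits on the branch.
print(delta_counts("1.20", "1.20.2.3"))
```

The reverse count grows with every trunk commit made after the branch
point, and the forward count grows with every commit on the branch.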

You may find this document to be of interest:

  http://www.uvm.edu/~ashawley/rcs/tichy1985rcs/html/ar01s03.html

So, generally, you may make the following statements:

   - As you add to the number of revisions in a file, the time it takes
     to rewrite that file will increase. This is true regardless of
     whether the new delta is on the main trunk or on a branch.

   - RCS/CVS is optimized for checking out the main trunk. Any work on a
     branch will take more time than work on the top-of-tree main trunk.

   - Long-lived branches will indeed take longer to check out than the
     main trunk, and that time will grow as the number of reverse and
     forward deltas continues to grow.

In the real world, I have seen a few cases where a branch was forked
for parallel development early on in the project. In a particular
file, the results of a baseline regression test were being committed to
both the branch and to the main trunk. The number of deltas was around
six per day to both the branch and the main trunk.

In a particular file in the repository, after many thousands of deltas
between the top-of-tree main trunk (it had around 10200 revisions on the
top-of-tree and ~25000 deltas total) and the top-of-branch, checking out
or updating the branch began to take a few seconds for just that one
file. Renaming the branch and then forking a new branch from the main
trunk reduced the checkout time to a negligible amount once more.
However, checkins to the file still took more time, as the file had
grown to ~2GB in size. (This approach may not be a suitable one,
depending on your needs.)
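
For a sense of scale, the figures above can be turned into a
back-of-the-envelope estimate. The Python sketch below is only that: it
takes "~2GB" as 2 * 10^9 bytes, assumes every delta is the same size,
ignores the full text of the head revision, and assumes each checkin
rewrites the whole ,v file once. The function names are made up for
this example.

```python
TOTAL_DELTAS = 25_000        # "~25000 deltas total" from the case above
FILE_SIZE_BYTES = 2 * 10**9  # "~2GB", taken as 2 * 10^9 bytes (assumption)


def average_delta_bytes(file_size: int, deltas: int) -> float:
    """Average bytes stored per delta, assuming equally sized deltas."""
    return file_size / deltas


def rewrite_bytes(n_deltas: int, avg_delta: float) -> float:
    """Approximate ,v file size after n_deltas, which is also roughly
    what a single checkin has to rewrite at that point."""
    return n_deltas * avg_delta


avg = average_delta_bytes(FILE_SIZE_BYTES, TOTAL_DELTAS)
print(f"average delta: about {avg / 1000:.0f} KB")
for n in (1_000, 10_000, 25_000):
    mb = rewrite_bytes(n, avg) / 10**6
    print(f"{n:>6} deltas -> about {mb:.0f} MB rewritten per checkin")
```

Under these assumptions the cost of a checkin depends on the total size
of the file, not on which branch the new delta lands on, which is why
re-forking the branch helped checkout time but not checkin time.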

So, I would recommend that CVS branches be used only for a limited time
rather than for extended (multi-year) projects.

The most reasonable choice (in my opinion) is to have development of the
next major release performed on the main trunk, while branches are used
to stabilize a 'release' of the code base, with branches on branches for
bug fixes for a new patch release of the main product.

Your applications may differ.

I hope you find the above information useful.

        Enjoy!
        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQFD6Mk2Cg7APGsDnFERAhtWAJ9lvsf88QhJ7ldB3XVVTk7ru7GYnwCg8eD3
kJBO6gUoKk7HoB7o/pSTwqk=
=DaFB
-----END PGP SIGNATURE-----



