Re: Medium sized binaries, lots of commits and performance

From: Doug Lee
Subject: Re: Medium sized binaries, lots of commits and performance
Date: Wed, 9 Feb 2005 11:04:06 -0500
User-agent: Mutt/1.5.6i

You first asked (or at least seemed to want to know :-) ) why
performance on a large binary CVS file drops so sharply when you
update to a branch revision instead of HEAD.  Answer:  CVS stores the
trunk so that getting the HEAD revision is simply a matter of copying
it out of the ,v file, where it is kept in full.  A branch revision is
not kept in full.  To produce one, CVS has to start from HEAD, work
back through the trunk deltas to the revision the branch started
from, and then apply every delta going forward along the branch up to
the revision you want.  I would therefore regard this performance hit
as a natural, if unfortunate, consequence of using CVS for binary
source code.  For a binary file, as you know, a "delta" can be a
considerable percentage of the original file size.
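
To put rough numbers on this, here is a small sketch in Python.  It
assumes the usual RCS layout behind a ,v file: the newest trunk
revision is stored whole, older trunk revisions are reached by walking
deltas backward from it, and branch revisions by walking deltas
forward from the branch point.  The function name and the split of
revisions between trunk and branch are made up for illustration; only
the 200 MB file size and the roughly 70 commits come from your message.

```python
def deltas_applied(head, branch_point, branch_steps=0):
    """Count the deltas needed to rebuild one revision.

    head          -- number of the newest trunk revision (stored in full)
    branch_point  -- trunk revision wanted, or the one the branch starts from
    branch_steps  -- how many revisions along the branch (0 = stay on trunk)
    """
    if not 1 <= branch_point <= head:
        raise ValueError("branch_point must lie between 1 and head")
    if branch_steps < 0:
        raise ValueError("branch_steps cannot be negative")
    backward = head - branch_point   # trunk deltas walked back from HEAD
    return backward + branch_steps   # plus deltas walked forward on the branch


# Figures from the original message: a 200 MB ,v file after about 70 commits.
RCS_FILE_MB = 200
COMMITS = 70
AVG_DELTA_MB = RCS_FILE_MB / COMMITS  # close to the full 3.5 MB file size

# Hypothetical split: 50 trunk revisions, branch taken at the 30th,
# 20 commits on the branch.
head_cost = deltas_applied(50, 50)
branch_cost = deltas_applied(50, 30, 20)

print(head_cost)                           # 0
print(branch_cost)                         # 40
print(round(branch_cost * AVG_DELTA_MB))   # 114 (MB of delta text, roughly)
```

With near-full-size deltas, a branch checkout in this made-up layout
chews through on the order of a hundred megabytes where HEAD needs
none, which is the kind of gap you are seeing.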

Your second question was how to remove old revisions in order to
improve performance.  I don't have a CVS manual URL handy, as most
participants on this list seem to, but check out the cvs admin
command, in particular its -o (outdate) option.  It can indeed
permanently delete single revisions and ranges of them.  You could,
for example, delete all the revisions from the start of a branch up
to two or so revisions behind its current state, so as to speed up
retrieval of revisions on that branch.
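
As a sketch of that last idea, the Python helper below only builds
the command line for you to inspect; it does not run anything.  It
assumes the rev1:rev2 range form of cvs admin -o, which deletes the
revisions from rev1 to rev2 inclusive on one branch.  The helper name,
the branch number 1.30.2, and the file name are all hypothetical.
Since the deletion is permanent, copy the ,v file somewhere safe
before trying the real command.

```python
def branch_outdate_command(branch, latest, keep, filename):
    """Build a 'cvs admin -o' command that trims the start of a branch.

    branch   -- branch number, e.g. "1.30.2" (hypothetical)
    latest   -- last component of the newest revision on that branch
    keep     -- how many of the newest branch revisions to leave alone
    filename -- working-copy path of the file

    Returns the argument list, or None if there is nothing to delete.
    """
    if latest < 1 or keep < 0:
        raise ValueError("latest must be >= 1 and keep must be >= 0")
    last_to_delete = latest - keep
    if last_to_delete < 1:
        return None
    # rev1:rev2 means "rev1 through rev2, inclusive, on the same branch".
    rev_range = f"{branch}.1:{branch}.{last_to_delete}"
    return ["cvs", "admin", f"-o{rev_range}", filename]


# Hypothetical branch with 20 revisions, keeping the newest two.
cmd = branch_outdate_command("1.30.2", latest=20, keep=2, filename="big_form.fmb")
print(" ".join(cmd))   # cvs admin -o1.30.2.1:1.30.2.18 big_form.fmb
```

Check the printed range against cvs log output for the file before
running it by hand.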

Good luck.

On Wed, Feb 09, 2005 at 04:37:14PM +0100, Jesper Vad Kristensen wrote:
Hi folks,

I've searched the net and mail archives for some help or workaround to
my problem, but most binary issues tend to deal with the impossibility
of diff/merge or whether very large files can be stuffed into CVS.

My colleagues and I work with Oracle Forms, and that means binary
source code. At first I was very suspicious of moving to CVS because
our source is binary, but as it turns out everyone here, myself
included, has become extremely happy with CVS. We can't merge,
granted, but with our external diff application we reap enormous
benefits from using CVS. Even branching is manageable.

But here's the problem, especially with our largest file, which is
3.5 MB and has been committed approx. 70 times. When doing a

        cvs update -r HEAD <filename>

things are really fast (5 seconds). But if we do a

        cvs update -r <branch version> <filename>

performance drops from 5 seconds to a minute and a half. I can imagine
something ugly happening with the "filename,v" file on the CVS server,
which is 200 MB.

The performance isn't killing us right now, but in maybe 6 months to a
year, who knows how bad it may have gotten?

So the question is if there are any administrative tools one can use to
compress/rationalize/index the file so branch access becomes faster? Is
there a way to permanently erase "stuff older than 6 months"?

And if not, opinions on my ideas below would be great. My ideas so far:

MOVE variant: I wouldn't _like_ to lose the history of the application,
but it might be acceptable if performance degrades too much. I figure I
could move the filename,v file on the cvsroot repository (to a "backup"
folder), then delete from client and add a fresh one and the 1-2 active
branches - but can any history be kept if you do this? Will the old
history be in the "backup" folder?

MIGRATE: An alternative would be to create a new folder (while keeping
the old one) and simply migrate _all_ 85 files to the new folder (grab
HEAD, add everything in HEAD to the new folder, grab the endpoints of
the branches, and add all branches as best I can).


Jesper Vad Kristensen
Aarhus, Denmark

Info-cvs mailing list

Doug Lee           address@hidden
Bartimaeus Group   address@hidden
"It is difficult to produce a television documentary that is both
incisive and probing when every twelve minutes one is interrupted by
dancing rabbits singing about toilet paper."  --Rod Serling
