Re: CVS corrupts binary files ...

From: Paul Sander
Subject: Re: CVS corrupts binary files ...
Date: Sat, 5 Jun 2004 20:52:06 -0700

>--- Forwarded mail from address@hidden

>Doug Lee <address@hidden> writes:

>> In other words, if I set -kb on a binary file and then do
>> nothing to it but commit updates and sometimes request an
>> old revision, keeping my sandbox in the OS in which it was
>> checked out, could I ever get a bad result?

>I have not ever noticed such a bad result.

>I have noticed non-optimal checkout and commit
>performance of binary versions of files when the
>,v file starts to exceed a few hundred megabytes.

>If the server is not properly configured with lots
>of memory and temporary space, there can be problems
>doing checkouts of lots of files of this type.

'Course, this holds true for multi-megabyte text files, too.
But those are rare...

>> This discussion of binary files has gone on a long time,
>> but either I missed the answer to this, or I never saw it
>> stated. If Greg Woods is reading this, you implied it once
>> in a rather angry message. I welcome the proof, preferably
>> without the anger. <smile>

>The problem arises when two users concurrently modify a binary
>file. After the first user commits, the second is stuck with
>trying to figure out how to merge in the other set of changes
>by hand with no help.

There's no reason for CVS not to offer help, other than that
nobody has added it yet.  CVS doesn't help because the only merge
tool it knows is diff3, which is text-based.  There's no reason
in the world why it couldn't be replaced with something more
general, as was shown back in 2001.
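
To make "something more general" concrete, here is a minimal sketch of a merge step that dispatches on file type instead of always calling a text-based three-way merge. It is not the 2001 patch and not anything CVS provides: the registry, the type names ("text", "keyvalue"), and every function below are hypothetical, and the key=value format merely stands in for a structured non-text file. Input is assumed to be well-formed.

```python
from typing import Callable, Dict, Optional

# A merger takes (base, mine, theirs) and returns the merged bytes,
# or None when it cannot resolve the merge automatically.
MergeFn = Callable[[bytes, bytes, bytes], Optional[bytes]]

MERGERS: Dict[str, MergeFn] = {}  # hypothetical registry, keyed by file type


def register(file_type: str):
    """Decorator that plugs a merge function in for one file type."""
    def wrap(fn: MergeFn) -> MergeFn:
        MERGERS[file_type] = fn
        return fn
    return wrap


@register("text")
def merge_whole_file(base: bytes, mine: bytes, theirs: bytes) -> Optional[bytes]:
    # Stand-in for a text merge tool: resolves only the trivial cases
    # where at most one side changed the file.
    if mine == theirs:
        return mine
    if mine == base:
        return theirs
    if theirs == base:
        return mine
    return None


def _parse(blob: bytes) -> Dict[bytes, bytes]:
    # Assumes every non-empty line has the form key=value.
    return dict(line.split(b"=", 1) for line in blob.splitlines() if line)


@register("keyvalue")
def merge_keyvalue(base: bytes, mine: bytes, theirs: bytes) -> Optional[bytes]:
    # Type-aware merge: three-way merge per key rather than per file.
    b, m, t = _parse(base), _parse(mine), _parse(theirs)
    merged: Dict[bytes, bytes] = {}
    for key in set(b) | set(m) | set(t):
        bv, mv, tv = b.get(key), m.get(key), t.get(key)
        if mv == tv:
            val = mv
        elif mv == bv:
            val = tv
        elif tv == bv:
            val = mv
        else:
            return None  # both sides changed the same key differently
        if val is not None:  # None means the key was deleted
            merged[key] = val
    return b"".join(k + b"=" + v + b"\n" for k, v in sorted(merged.items()))


def merge(file_type: str, base: bytes, mine: bytes, theirs: bytes) -> Optional[bytes]:
    """Dispatch to the merger registered for file_type, if there is one."""
    fn = MERGERS.get(file_type)
    return fn(base, mine, theirs) if fn else None
```

With a base of `a=1`, `b=2`, one user changing `a` and another adding `c`, the whole-file merger gives up while the type-aware one combines both changes. A type with no registered merger still falls back to doing it by hand, which is no worse than the situation today.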

>Adrian Constantin <address@hidden> writes:

>> > >Adrian Constantin writes:
>> > >>
>> > >> Or maybe projects for Unix/Linux platforms
>> > >> do not usually have binary files, but I
>> > >> don't really think so...
>> >
>> > >CVS is a *source* control system; source
>> > >files are rarely binary.
>> >
>> > I disagree with this statement. Source files
>> > are any files that cannot be reproduced
>> > automatically. That is, changes must be made
>> > to them manually using some editor, and that
>> > editor need not be the likes of vi or emacs.
>> > MS Word (or Frame Maker) documents, images of
>> > various formats, and documents from various
>> > design tools (e.g. GUI builders) are common
>> > examples.
>>   I tend to see this as a serious break in the
>>   cvs design.

>Many people do. This is why I and others
>consistently suggest that cvs may not be the best
>source control system for general use.

Yeah, well, sending such hapless people away is easier
than fixing the tool.  Demonstrating that such support
could be added to CVS took less than eight man-hours;
that's less effort than installing a second tool and
training people on it.

>>   Today it is not realistic to assume serious
>>   projects with many developers involved will
>>   only contain text files.

>It is realistic to suggest that cvs is not
>optimized for many kinds of 'serious' projects
>with constraints of using binary files as a part
>of the 'source' to be controlled.

It is also realistic to suggest that it need not be
this way.

>svn, monotone, gnu arch and others have all arisen
>more recently than cvs and try in various ways to
>address the legacy problems that cvs continues to

And ignore.

>>   Also note that diff has the same problem, only
>>   for diff it might not be as acute as for cvs.

>It is basically the same problem given that diff
>is being used internally to store deltas between

Well, it's certainly possible to abstract out the RCS
layer to provide better storage management for files
that don't make small deltas.  But the problem at hand
is not due to the diff algorithm being used to generate
the deltas; it's due to the diff and merge algorithm
used at the UI level.
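
The distinction can be sketched in a few lines. A storage delta only has to reproduce the next revision byte for byte; it needs no idea what the bytes mean, so it works for any content. The code below illustrates that point with Python's standard difflib. It is not RCS's on-disk delta format, and the function names are made up for the example.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as copies from `old` plus literal byte runs."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse old[i1:i2]
        else:
            ops.append(("data", new[j1:j2]))    # empty for a pure deletion
    return ops


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the new revision from the old one and a delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Round-tripping any pair of byte strings through make_delta and apply_delta gives back the second one exactly, text or not. Merging two users' changes is a different job: it has to decide which changes are compatible, and that depends on the file format.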

>>   Please note this problem is legacy since the
>>   days computer graphics were an advanced
>>   technology. When computers were text-based I
>>   can understand binary source files were a
>>   strange thing in any project.

>Possibly. We have always had binary objects, but
>many of them have source formats that are
>compatible with the older text-based
>representations. It is just that most folks choose
>not to save the files in those formats for source

Possibly.  But what's the point if the text-based
representation is also unmergeable?

>>   I would have expected things to evolve.

>Things have evolved.

>>   I think there are some binary diff algorithms...

>Indeed. Consider svn which uses xdelta internally.

>To be fair, CvsNt also has ways of dealing better
>with binary formats than the cvshome version.
>There is a hope that we can move toward merging
>the feature sets between those two major variations
>of cvs over time, but it will likely take a bit of
>time as no one is being funded full time to work
>on just cvs development.

Which is why it doesn't support capabilities that
have already been demonstrated with alpha-quality
code.  As long as CVS development remains a hobby
project, we can't expect many new features to be
added, no matter what the level of demand is.

>> > >It does support them as an afterthought, but
>> > >that's not what it was designed to do.
>> >
>> > While this may be true, it turns out that CVS'
>> > design can accommodate such files. Support can
>> > be added with a relatively small amount of
>> > effort, which was demonstrated circa Sept. 18,
>> > 2001 in this forum in the form of a patch of
>> > the then-current release. All that's needed is
>> > a pluggable diff/merge tool based on the type
>> > of data.
>> >
>>    The design itself needs few if any
>>    changes. The cvs
>>    design can perfectly accommodate binary
>>    sources. It just has to be done (or
>>    implemented).

>Should I look forward to seeing your patches to
>help out on the project? They would be looked at
>with favor by all of your fellow cvs users.

Well, I've already posted a patch, but it was pretty
much ignored...

>--- End of forwarded message from address@hidden
