RE: how to checkin binary files

From: Jim.Hyslop
Subject: RE: how to checkin binary files
Date: Fri, 23 Apr 2004 10:21:45 -0400

Sarah Gonzales wrote:
> On Apr 22, 2004, at 12:14 PM, Larry Jones wrote:
> > Theoretically, the RTF file would allow CVS to perform its usual
> > merging of changes, but I'm not sure that Word's RTF generation is
> > repeatable enough for it to actually work well.
> We use the RTF format of MS Word with CVS frequently and haven't had 
> any particular issues with it.
The issues Larry is referring to are not ones you would see as an end user,
unless perhaps you use CVS to merge two revisions of a file. The real issue
is how efficiently CVS can store the file.

CVS - or rather, RCS - always stores files using deltas between lines, where
a line is defined as anything delimited by the ASCII Line-feed character
(LF, or 0x0A). The "line" does not have to be human-readable text - it is
any sequence of bytes, including zeros.
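
As a rough illustration of that definition (a sketch in Python, not the
actual RCS code; the split_lines helper and the sample bytes are made up
for the example), splitting on LF alone treats everything else, including
zero bytes and bare CRs, as ordinary line content:

```python
# Illustrative only: split arbitrary bytes into "lines" on LF (0x0A),
# which is roughly how a line-oriented delta tool views a file.
def split_lines(data: bytes) -> list[bytes]:
    parts = data.split(b"\n")
    # Every piece except the last was followed by an LF; put it back.
    lines = [p + b"\n" for p in parts[:-1]]
    if parts[-1]:
        lines.append(parts[-1])  # final line with no trailing LF
    return lines

# Hypothetical sample: zero bytes and a bare CR do not end a line.
sample = b"\x00\x01binary\x00\nsecond \r line\nlast"
lines = split_lines(sample)
print(len(lines))  # 3
```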

RTF, like HTML, does not consider a line-terminator to be part of the actual
text. A new paragraph is indicated by the keyword \par. So if you create a
document that looks like this:

The quick brown fox<CR>
jumped over the lazy dog<EOF>

(where <CR> indicates a hard-paragraph, <EOF> indicates end-of-file)

The RTF text file could contain, among other stuff:
The quick <EOL>
brown fox<EOL>
\par jumped <EOL>
over <EOL>
the lazy <EOL>
dog<EOF>

(<EOL> indicates a line terminator - CRLF on Windows, LF on UNIX, etc.)

CVS considers the file to contain six distinct lines, even though it looks
to you like it contains two lines.

I believe Larry's point was that if you modify the document and then save
it again, there's no guarantee that Microsoft Word will place the line
terminators in the same locations. Suppose you changed "brown" to "black".
MS Word might save it as:

The <EOL>
quick black <EOL>
fox\par jumped over <EOL>
the <EOL>
lazy dog<EOF>

In that case, CVS would believe that every line in the file had changed, and
would have to store a delta covering every line.

If, on the other hand, MS Word is consistent and predictable in where it
places line terminators, then CVS would detect only one changed line, and
store only the lines "brown fox<EOL>"/"black fox<EOL>". Much more efficient.
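
To make the difference concrete, here is a small sketch that counts the
lines a line-based diff would flag in each case. It uses Python's difflib
as a stand-in for the diff that RCS runs, so the numbers are only
indicative; the changed_lines helper is made up, and the wrappings are
illustrative reconstructions of the examples above, not real Word output.

```python
import difflib

def changed_lines(old, new):
    """Return (removed, added) line counts from a line-based comparison."""
    sm = difflib.SequenceMatcher(None, old, new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added

# Assumed wrapping of the original document.
original = ["The quick \n", "brown fox\n", "\\par jumped \n",
            "over \n", "the lazy \n", "dog"]
# Same wrapping, with only "brown" changed to "black".
consistent = ["The quick \n", "black fox\n", "\\par jumped \n",
              "over \n", "the lazy \n", "dog"]
# Same edit, but with the line terminators moved around.
rewrapped = ["The \n", "quick black \n", "fox\\par jumped over \n",
             "the \n", "lazy dog"]

print(changed_lines(original, consistent))  # (1, 1)
print(changed_lines(original, rewrapped))   # (6, 5)
```

With stable wrapping the delta is one line out and one line in. With
shifted wrapping no line survives intact, so the whole file ends up in
the delta.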

In either case, the RTF file will look exactly the same when you view it and
print it. The difference is how big the repository file will be.

Jim Hyslop
Senior Software Designer
Leitch Technology International Inc.
Columnist, C/C++ Users Journal
