monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Re: moving forward on delta storage


From: Nathaniel Smith
Subject: Re: [Monotone-devel] Re: moving forward on delta storage
Date: Mon, 20 Feb 2006 02:03:14 -0800
User-agent: Mutt/1.5.11

On Mon, Feb 20, 2006 at 09:15:22AM +0000, Lapo Luchini wrote:
> I *may* also be the right time to evaluate different binary-diff schemes?

Maybe.  We can always do that later too, though -- holding up a fix
for initial pull speed in order to get minor improvements in database
size is probably not a good idea.

> bsdiff seems very promising to me, tough it is *very* memory-intensive (but we
> could always switch to a "lower grade" binary diff on files that are too big, 
> in
> the worst case).
> 
> It was "strangely licensed" up to release 4.2, but 4.3 is covered by plain old
> good BSD license.
> 
> http://www.daemonology.net/bsdiff/
> regarding size: "bsdiff routinely produces binary patches 50-80% smaller than
> those produced by Xdelta"

However, it also attributes this partly to "taking advantage of how
executable files change".

I guess it's important to test these things on real data.  Matt
Mackall (the main mercurial guy) says that in the benchmarks he ran
when starting out, recursive longest-common-substring matching gave
better results than xdelta on source files.

> regarding memory: "bsdiff is quite memory-hungry. It requires
> max(17*n,9*n+m)+O(1) bytes of memory, where n is the size of the old file and 
> m
> is the size of the new file. bspatch requires n+m+O(1) bytes."

Is there any documentation on the actual algorithm used available?
One very important question in particular is whether it outputs
copy/insert deltas like xdelta does, or something else.

-- Nathaniel

-- 
"If you can explain how you do something, then you're very very bad at it."
  -- John Hopfield




reply via email to

[Prev in Thread] Current Thread [Next in Thread]