monotone-devel

Re: [Monotone-devel] Re: Extensions to automation, again


From: Nathaniel Smith
Subject: Re: [Monotone-devel] Re: Extensions to automation, again
Date: Wed, 18 Oct 2006 14:00:19 -0700
User-agent: Mutt/1.5.13 (2006-08-11)

On Wed, Oct 18, 2006 at 09:44:34PM +0100, Bruce Stephens wrote:
> Huh?  Why?  If you've got a sequence of copies and inserts (presumably
> each including the number of bytes to copy or insert) isn't it easy to
> calculate the size of the result?  (And just as easy for text and
> binary files.)  Not counting any conversions necessary (specifically
> line-ending conversions), anyway.

This is correct -- in terms of the code, you can just implement a
trivial delta_applicator subclass (in fact, I think we used to use
this subclass for some reason or another).
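
A minimal sketch of what such a size-only applicator might look like.
The interface below (delta_applicator_sketch and its begin/copy/insert
members) is a hypothetical stand-in written for this example, not
monotone's actual delta_applicator declaration, which may differ:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an applicator interface; monotone's real
// delta_applicator may declare different members.
struct delta_applicator_sketch
{
  virtual ~delta_applicator_sketch() {}
  virtual void begin(std::string const & base) = 0;
  virtual void copy(std::size_t pos, std::size_t len) = 0;
  virtual void insert(std::string const & str) = 0;
};

// Tracks only the length of the result; never builds the output.
struct size_only_applicator : delta_applicator_sketch
{
  std::size_t base_size;
  std::size_t result_size;

  size_only_applicator() : base_size(0), result_size(0) {}

  void begin(std::string const & base) override
  {
    base_size = base.size();
    result_size = 0;
  }

  void copy(std::size_t pos, std::size_t len) override
  {
    // A copy must stay inside the base text.
    assert(pos <= base_size && len <= base_size - pos);
    result_size += len;
  }

  void insert(std::string const & str) override
  {
    result_size += str.size();
  }
};
```

Each copy or insert instruction only adds its byte count, so the
applicator never touches file contents. It still has to be fed every
delta in the chain, though, which is where the cost discussed below
comes from.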

However, the expensive part of file reconstruction is not applying the
deltas -- it's loading the deltas from disk and uncompressing them.
So an algorithm that calculates only the size of the final file is
probably not much faster than one that reconstructs the whole file,
since both pay the same cost for loading and uncompressing the deltas.

It also depends on the access pattern.  If clients generally call
get_file immediately after get_file_size, then it is actually a win
for get_file_size to be implemented in terms of get_file, because you
get:
  client
    -> get_file_size
      -> get_file
        -> loads deltas (expensive)
        -> reconstructs file
        -> caches it
    -> get_file
      -> cache hit
instead of:
  client
    -> get_file_size
      -> loads deltas (expensive)
      -> calculates size of file
    -> get_file
      -> loads deltas (expensive)
      -> reconstructs file
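
The two call chains above can be mimicked with a toy model that counts
how often the expensive step runs. Everything here (toy_store, its
members, the two get_file_size variants) is hypothetical and written
only for this illustration; it is a sketch of the access pattern, not
monotone's database code:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical toy store; none of these names come from monotone.
struct toy_store
{
  std::map<std::string, std::string> on_disk;    // stands in for stored delta chains
  std::map<std::string, std::string> file_cache; // fully reconstructed files
  int delta_loads = 0;                           // counts the expensive step

  // Stand-in for loading, uncompressing and applying deltas.
  std::string const & load_and_reconstruct(std::string const & id)
  {
    ++delta_loads;
    return on_disk.at(id);
  }

  std::string get_file(std::string const & id)
  {
    auto hit = file_cache.find(id);
    if (hit != file_cache.end())
      return hit->second;             // cache hit, no delta load
    std::string data = load_and_reconstruct(id);
    file_cache[id] = data;
    return data;
  }

  // Variant 1: size implemented in terms of get_file; warms the cache.
  std::size_t get_file_size_via_get_file(std::string const & id)
  {
    return get_file(id).size();
  }

  // Variant 2: size computed from the deltas alone; caches nothing.
  std::size_t get_file_size_direct(std::string const & id)
  {
    return load_and_reconstruct(id).size();
  }
};
```

With variant 1 followed by get_file, delta_loads ends at 1; with
variant 2 followed by get_file, it ends at 2.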

This depends a bit on caching strategy too, of course; if you have a
piece cache then it's the "loading deltas" part that gets cached, so
these become pretty much the same.  get_file_size would still not be
much faster than get_file, though; for that you need to store
file sizes in the database directly.

-- Nathaniel

-- 
In mathematics, it's not enough to read the words
you have to hear the music



