monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] newcomer (rude, but hopefully not to rude) question


From: graydon hoare
Subject: Re: [Monotone-devel] newcomer (rude, but hopefully not to rude) questions
Date: 15 Sep 2003 11:01:13 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Tom Lord <address@hidden> writes:

> Oh... also.... hmm.  The use of a database and "backward" patches
> .... these things are slightly confused from my perspective.

all I mean is that the storage system tends to store deltas in the
opposite direction as time passes: the stored deltas start from full
current versions and lead back into history. this is a common storage
decision, to speed access to recent versions, which are typically of
more interest. it is like a move-to-front list. it is not related to
the direction of deltas calculated for transmission, nor the logical
representation of history (which is an ancestry certificate relating
tree states, not a delta at all).

> There's no good reason not to make changesets symmetric.  The
> question is where are there full-texts and how are they used.  Just
> more of the same: break down the design space a little differently
> and it looks entirely different.)

this is incorrect. you are right that there is a implicit symmetry to
all changes, but choice of representation is a practical issue, and
there are good practical reasons not to store symmetric deltas (by
which I assume you mean "directly invertable"): such deltas take much
more space, and are slightly more expensive to calculate. I don't want
to throw stones, because monotone databases still aren't as space
efficient as they ought to be, but I also don't want you to get the
impression I haven't made a conscious choice here. I have.

I will give an example. suppose we change "good evening wide world of
happiness" to "good evening happy wide world".

in monotone the storage system will calculate and store this:

C 0 13
I 5
happy
C 18 11

this is a 25 byte copy/insert delta, based on byte ranges (so it works
equally well on non-text binaries). it cannot be directly inverted,
because the information which was ignored (" of happiness") is not
written into the delta. you can of course *calculate* inverses when
you want them by re-doing the delta calculation in the opposite
direction.

in RCS files, a different form is used:

d0 1
a0 1
good evening happy wide world

this is a 40 byte delete/add delta, calculated from a line-based
LCS. the best algorithm for this calculation is slower (O(ND)
vs. O(N)) than the best algorithm for calculating a copy/insert
script, but not generally enough to notice. this format is also not
directly invertable, because it also does not store the information
which was deleted. you must calculate the inverse explicitly.

in arch, it seems like you're storing unified diffs, so you would
store it like this:

@@ -0,1 +0,1 @@
- good evening wide world of happiness
+ good evening happy wide world

this is an 87 byte delete/add delta (assuming you've turned off
context), also based on a line-oriented LCS. this representation is
directly invertable, but it has more than three times as much data in
it as the copy/insert delta. 

when a user asks to see a unified diff, monotone will calculate one
and display it. but it is considered a format for human consumption,
not storage. in most other cases, the relevant delta -- forward,
backward, between any two versions -- is calculated on the fly.

-graydon





reply via email to

[Prev in Thread] Current Thread [Next in Thread]