gnu-arch-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gnu-arch-users] [OT] Darcs data model [was: the state of the union]


From: Juliusz Chroboczek
Subject: [Gnu-arch-users] [OT] Darcs data model [was: the state of the union]
Date: 30 Aug 2004 13:26:03 +0200
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1

> Can anyone explain darcs' patch algebra and all that jazz in a way that
> a [non-specialist] can understand?

Sorry for the off-topic -- I'll try to be brief.

Suppose that you have two patches A and B that modify different files,
or separate parts of a file.  Intuitively, the order in which these
patches are applied is pretty much irrelevant -- you could apply A and
then B, or else first B' and then A' (where A' and B' are A and B with
line numbers suitably shifted).  In darcs terminology, A and B /commute/.

On the other hand, if D modifies lines that are inserted by C, D must
be applied after C.  Here, C and D don't commute, in darcs terminology
D /depends/ on C.

In systems such as SCCS, RCS, CVS or Arch, patches are ordered (within
a given branch) even when they commute.  In other words, even though A
and B commute, they are stored in a given order, the ordering persists
forever, and is something the user is aware of.

The basic idea behind darcs is that patches are only ordered when they
depend on each other.  Hence, if the patches A and B are committed
(``recorded'') to a darcs repository, they have no intrinsic ordering
-- the repository contains both A and B, or equivalently both B and A.

Now this is extremely confusing at first -- unlike CVS or Arch, darcs
doesn't record history, it records an unordered set of patches (except
where dependencies force an ordering).  I find that you need a lot of
discipline in darcs just in order to preserve some invariants that are
very difficult to violate by accident in CVS or Arch.

(Side note: darcs actually stores patches in a given order; however,
there are a number of operations that will randomly reorder commuting
patches.  Hence, you cannot rely on patches remaining in the order
they are in.)

On the other hand, this model significantly reduces conceptual load.
For example, there no longer is the notion of merge between two
branches -- there's the conceptually much simpler notion of copying a
patch from one repository into another.  In cases when you want to
keep track of history, you can create an empty patch that is
artificially made to depend on all the patches committed to date (such
a patch is called a ``tag'' in darcs).

The ``patch algebra and all that jazz'' is basically about two things.
First, when do patches commute, and how do you commute them
(i.e. given A and B, construct B' and A'); this part is fairly simple
and elegant.  Second, what to do when patches don't commute, but darcs
needs to commute them anyway -- something that happens when there is a
merge conflict; this part of the algorithm is messy, and I still do
not fully understand it.

Now my personal feeling is that while the notion of changeset used by
darcs is similar to the one used by Arch, the global data model is
fundamentally different, which is why I don't understand Tom's claim
that darcs is a subset of Arch.

                                        Juliusz




reply via email to

[Prev in Thread] Current Thread [Next in Thread]