gnu-arch-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gnu-arch-users] Deconstructing Star-Merge


From: David Allouche
Subject: [Gnu-arch-users] Deconstructing Star-Merge
Date: Sun, 19 Sep 2004 20:40:37 +0200

Deconstructing Star-Merge, or Everything You Always Wanted to Know About
Star-Merge, But Were Afraid to Ask.

In the trail of "star-merge fatally wounded" I decided to put out a
specification of what star-merge really _does_, in a completely
unambiguous form, so I could have a firm foundation to think about what
star-merge really _means_.

I hope this document will set the first building blocks for a precise
formalism for Arch operators, and that it will be the first step towards
a formal specification of Arch.

First, I wrote a specification for star-merge by examining the help
message (the documentation I could find in tla, larch, and the mailing
list did not help) and getting some clarifications from Robert Collins.
Then, I wrote a specification by examining the implementation of star-
merge.

I would like the Ones Who Really Get Arch around to validate this work
and clarify the issues I have found, so we can put out the _real_
specification of star-merge.

Thanks for your help.


= Star-merge =

== Fundamentals ==

The Fundamentals section defines notation and some invariants of the
Arch model.

This notation is used in other sections to provide unambiguous
descriptions of algorithm and observed behaviour.

=== Changesets ===

diff: Tree, Tree --> Changeset
patch: Changeset, Tree --> Tree, Conflicts

diff(TA, TB) = C <=> patch(C, TA) = (TB, None)

=== Patchlogs and Revisions ===

There is a relation of equivalence between Patchlogs and Revisions.

patchlogOf: Revision --> Patchlog
revisionOf: PatchLog --> Revision

patchlogOf(R) = L <=> revisionOf(L) = R

=== Commit ===

getTree: Revision --> Tree
commit: Tree, Revision --> Revision, Tree
getChangeset: Revision --> Changeset

commit(W, RP) = (RN, TN) where
    RN is the immediate successor of RP
    getTree(RN) = TN
    diff(W, TN) = addition of PatchlogOf(RN)

Note: sealing and fixing are not modelled.

=== Versions ===

versionOf: Revision --> Version
versionOf: Patchlog --> Version

R is a Revison, L a Patchlog
If R <=> L, then versionOf(L) = versionOf(R)

Revisions (resp. Patchlogs) of the same Version have total ordering.
If a Revision R exists, R is a base-0 or R-1 exists.
If a Patchlog P is present in a tree T, P-1 may or may not be present in
T.

latest: Version, Date --> Revision
latest(V, d) |--> Latest revision in V at the date d.

The logset of a Tree is the set of Pachlogs present in that Tree.
logSet: Tree --> Set of Patchlogs
logSet: Revision --> Set of Patchlogs
R is a Revision, logSet(R) = logSet(getTree(R))

projection: Set of Revisions, Version --> Set of Revisions
projection: Set of Patchlogs, Version --> Set of Patchlogs
projection(RR, V) = { R in RR \ versionOf(R) = V }

logsFrom: Version, Tree --> Set of Patchlogs
logsFrom: Version, Revision --> Set of Patchlogs
logsFrom(V, T) = projection(logSet(T), V)

Corollary: A logsFrom set has total ordering.

logIn: Patchlog, Tree --> Boolean
Patchlog is in logSet(Tree

lastIn: Version, Tree --> Revision
lastIn: Version, Revision --> Revision
lastIn(V, X) = max(logsFrom(V, X))

=== Merges ===

newPatches: Revision --> Set of Patchlogs
newPatches: Patchlog --> Set of Patchlogs
newPatches(R):
    If R is a COMMIT, set of PatchLogs added by diff(R-1, R).
    If R is a CONTINUATION or IMPORT, set of Patchlogs present in
getTree(R)

Note: A continuation is created by "tla tag" or a "tla commit --base".

Note: The new patches are recorded in the New-patches patchlog header.

newPatchesFrom: Version, Revision --> Set of Patchlogs
newPatchesFrom: Version, Patchlog --> Set of Patchlogs
newPatchesFrom(V, R) = projection(newPatches(R), V)


== Documented Algorithm ==

This is a description of the star-merge algorithm, as understood from
the "tla star-merge" help message and with some help from Robert
Collins. Faults are mine.

Inputs:
  TREE: a WorkingTree
  FROM: a Revision
  REF: a Revision

Note: REFERENCE defaults to the tree-version of TREE

Naming:
    FV: FROM Version
    RV: REF Version
    MA1: MAYBE_ANCESTOR_1
    MA2: MAYBE_ANCESTOR_2
    LMIF: LAST_MERGE_INTO_FROM

FV = versionOf(FROM)
RV = versionOf(REF)

MA1 = RevisionOf(max(logsFrom(FV, REF)
                     inter logsFrom(FV, FROM) 
                     inter logsFrom(FV, TREE)))

MA2 = RevisionOf(max(logsFrom(RV, REF)
                     inter logsFrom(RV, FROM)))

LMIF = max({R in FV \ MA2 is in newPatches(R)})

If MA1 and MA2 are undefined, then FAIL.

Elif MA1 is undefined, then ANCESTOR := MA2.

Elif MA2 is undefined, then ANCESTOR := MA1.

Elif LMIF < MA1, then ANCESTOR := MA1.

Elif MA1 < LMIF, then ANCESTOR := MA2.

Apply delta(ANCESTOR, FROM) to TREE.


== Implemented Algorithm ==

This is a description of the star-merge algorithm, as understood from a
superficial examination of the implementation.

Naming:
    FV: FROM Version
    MA1: latest common from revision
    HMFTIF: highest merge from to into from
    MA2: highest common to revision
    LMIF: from rev of highest common to revision

Inputs:
    TREE: a WorkingTree
    FROM: a Revision
    REF: a Version

FV = versionOf(FROM)

MA1 = max(logsFrom(FV, FROM) inter logsFrom(FV, TREE))

HMFTIF = max(UNION of newPatchesFrom(REF, P)
                   for P in logsFrom(FV, FROM))

If HMFTIF is in logSet(TREE), MA2 = HMFTIF 

LMIF = choose({R in logsFrom(FV, FROM) \ MA2 in newPatches(R)})

The implementation does not seem to select one merge point explicitely.

If LMIF and MA1 are undefined, then FAIL.

Elif LMIF is undefined, then ANCESTOR := MA1.

Elif MA1 is undefined, then ANCESTOR := MA2.

Elif LMIF > LCRF, then ANCESTOR := MA2.

Else ANCESTOR := MA1.

Apply delta(ANCESTOR, FROM) to TREE.


= Conclusion =

There seem to be shortcomings in the documentation as well as in the
implementation.

Apparently, REF is not a Revision as the documentation suggests:

<ddaa> > MAYBE_ANCESTOR_1 is defined as the highest patch level of FROM
in REFERENCE for which both TREE and FROM have a patch log.
<ddaa> Actually, the first FROM designates "the version of FROM", the
second FROM designates "the revision FROM", and "REFERENCE" actually
designates "the latest revision of the REFERENCE version"!

Actually the "of FROM" part does not seem correspond to anything in the
implementation.

The selection of LMIF does seem to account for multiple merge point.
That may be a problem with versions containing a continuation after
base-0.

There seems to be an assumption that for every Revision R which is not a
continuation or an import, patchSet(R-1) is included in patchSet(R).

The computation of MA2 seems to be incorrect, the previous message I
posted pointed out the part of the implementation which is buggy or that
I misinterpreted. For the record:

  for (x = 0; !highest_common_to_revision && (x < rel_n_records
(from_merges_from_to)); ++x)
    {
      if (arch_tree_has_log (to_tree_root, to_archive,
from_merges_from_to[0][1]))
        {
          highest_common_to_revision = str_save (0, from_merges_from_to
[0][1]);
          from_rev_of_highest_common_to_revision = str_alloc_cat_many
(0, from_version, "--", from_merges_from_to[0][0], str_end);
        }
    }

-- 
                                                            -- ddaa





reply via email to

[Prev in Thread] Current Thread [Next in Thread]