monotone-devel

Re: [Monotone-devel] considering no merge-into-dir


From: graydon hoare
Subject: Re: [Monotone-devel] considering no merge-into-dir
Date: 12 Oct 2003 15:30:51 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

"Zack Weinberg" <address@hidden> writes:

> * A rename conflict could be reconceptualized as a text conflict of
>   the manifest file.

if there's actually a *conflict* in a rename, yeah, I'll probably just
invoke the text file conflict-resolution hook. currently it just
bails, but tightening up the rename and manifest-conflict handling is
high on my todo list.

note that *most* (non-conflicting) manifest activity is evaluated in
finite-map terms, rather than in lines. but yeah, I think the UI for
getting a user to intervene should just be "resolve these lines".
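
A minimal sketch of what "finite-map terms" could look like, assuming a
manifest is represented as a map from path to content hash. The function
name merge_manifests and the dict representation are illustrative
assumptions, not monotone's actual code. Each path is merged on its own
against the common ancestor, and only paths changed differently on both
sides are reported as conflicts.

```python
def merge_manifests(base, left, right):
    """Three-way merge of path -> hash maps (hypothetical representation).

    Returns (merged_map, conflicting_paths). A missing path is treated
    as None, so adds and deletes fall out of the same comparison.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(path), left.get(path), right.get(path)
        if l == r:        # both sides agree (including both deleting)
            pick = l
        elif l == b:      # only the right side changed this path
            pick = r
        elif r == b:      # only the left side changed this path
            pick = l
        else:             # both changed it, in different ways
            conflicts.append(path)
            continue
        if pick is not None:
            merged[path] = pick
    return merged, conflicts
```

Only the paths listed in the second return value would need to be shown
to a user as lines to resolve.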

> * Microbranches are really cool.  But, in my head, their primary
>   utility is, if you don't want to deal with a merge conflict, you can
>   rewind to a known good version and work from there.

what I was getting at with the term "microbranch" is: should monotone
*automatically* try to collect non-conflicting mergeable heads on
either side of a conflict? 

for example: if bob and alice are working on Major Changes which
conflict, within the same branch, and 100 small non-conflicting bug
fixes get committed by other users to the same branch, should
monotone:

  - synthesize a Bob-with-100-fixes and an Alice-with-100-fixes node? 
  - synthesize a Bob-with-100-fixes node, and skip Alice?
  - synthesize an Alice-with-100-fixes node, and skip Bob?
  - synthesize a 100-merged-fixes node, and skip Bob and Alice both?

in other words, I'm trying to get a picture for what monotone's "best
effort" should be, in the face of conflicts you're not willing to
resolve. it has to be deterministic, and produce something you might
like, and ought to aim to produce common ancestors when run by
different users (with different sets of nodes on-hand).

>   - CVS conflict markers are not very helpful and I would not miss
>     them if they went away.  Something feedable to Ediff is much
>     better.  However, to ease transition, I think a mode in which
>     similar conflict markers are generated would be useful.

agreed. tromey likes them, I know some other people will like them, so
we need to be able to synthesize them. but I don't want to make that
the "only" way of expressing a conflict to a user.

>   - CVS conflict detection is too conservative.  It is often the case
>     that revisions A and B with ancestor S conflict, but applying the
>     diffs S->A, S->B in succession succeeds with no complaints.

the meaning of "conflict" is vague, as is the meaning of "applying
diffs". I am using this algorithm (after locating S as the true least
common ancestor of A and B, not their first divergence point as in
CVS):

    1. calculate left = diff(S,A) as a line-based edit script
    2. calculate right = diff(S,B) as a line-based edit script
    3. calculate a map of how the edits in 'left' change each
       line number in S to a line number in A
    4. transform all the edits in 'right' under this map,  and apply
       them to A

an edit on 'right' in step #4 is said to conflict if:

    - it deletes a line which was inserted in 'left'
    - it inserts a line which was deleted in 'left'
    - it deletes a line with different text than was deleted in 'left'
    - it inserts a line with different text than was inserted in 'left'

this is not perfect, and doesn't work on binary files, but it's
reasonably strong and quite predictable. it does not do "fuzzy
matching" based on context similarity or anything. we can argue the
merits of that forever, but at this point I'm not going to support
such merging.
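
The four steps above can be sketched roughly as follows. This is a
simplified illustration, not monotone's implementation: it uses Python's
difflib to compute the edit scripts, the names edit_script, merge3 and
MergeConflict are hypothetical, and the conflict test is coarser than
the four rules listed (any 'right' edit that overlaps or touches a
different 'left' edit is rejected, and an identical edit on both sides
is accepted).

```python
from difflib import SequenceMatcher


class MergeConflict(Exception):
    """Raised when an edit from 'right' collides with an edit from 'left'."""


def edit_script(old, new):
    """Line-based edit script: each (start, end, lines) replaces old[start:end]."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def merge3(s, a, b):
    """Merge line lists a and b, which both descend from ancestor s."""
    left = edit_script(s, a)     # step 1
    right = edit_script(s, b)    # step 2
    merged = list(a)
    # Walk right's edits last-to-first so earlier positions in A stay valid.
    for start, end, lines in reversed(right):
        if (start, end, lines) in left:
            continue             # identical edit on both sides; A already has it
        shift = 0                # step 3: how far 'left' moved this position
        for l_start, l_end, l_lines in left:
            if l_start <= end and start <= l_end:
                raise MergeConflict(f"edits collide near line {start + 1} of S")
            if l_end < start:
                shift += len(l_lines) - (l_end - l_start)
        merged[start + shift:end + shift] = lines   # step 4
    return merged
```

For example, merging ["a", "B", "c", "d"] and ["a", "b", "c", "D"] over
the ancestor ["a", "b", "c", "d"] gives ["a", "B", "c", "D"], while two
different rewrites of the same line raise MergeConflict.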

>   - It is very useful to be able to rewind THE ENTIRE TREE to a
>     known-good state.  In fact, I want an 'update' mode in which if
>     any file version causes a conflict, the entire tree version
>     containing that file version is backed out.

I think the term "backed out" is wrong here: we're certainly not going
to remove a tree version from your database just because you get a
merge conflict (and you can't affect anyone else's db anyways). but
the current monotone 'update' command does, I think, what you mean: if
there's a conflict during the merge phase of an update, the update is
aborted and your tree is left in its pre-update state, as though you
did not run 'update' at all.

>   - I want pluggable conflict detection and resolution, so that
>     someone can innovate a merge tool that understands the syntax
>     of each programming language.

there is already pluggable resolution (there's a hook for it, right
now it runs ediff but you can override). adding one for detection is
easy.  I'm also intending to add a general "let me do this merge,
please" hook (for merging changelog entries and whatnot) using a
smarter algorithm than the line-merge routine I described above.

> * I'm not sure how useful partial merges would be.  I do take a huge
>   change sometimes and split it into chunks - but on logical
>   boundaries, not file boundaries, and I don't know how to automate
>   logical boundaries.
> 
>   Maybe: given a version graph like so
> 
> 
>     BASE --- MEL.1  --- MEL.2  --- MEL.3
>          \-- DAVE.1 --- DAVE.2 --- DAVE.3
> 
>   where it is now desired to merge the heads, try applying DAVE.1,
>   .2, .3 in succession to the MEL branch.  Or the other way round.
>   Or each .1 and then each .2 and so on.  This is something where
>   a spiffykeen user controllable merge tool would be nice.

this is interesting. I think the algorithm I'd try is: 
  - start with MEL.3 <-> DAVE.3, and if there's a failure:
    - try both MEL.2 <-> DAVE.3 and MEL.3 <-> DAVE.2
    - if those fail, try  
       MEL.1 <-> DAVE.3
       MEL.1 <-> DAVE.2
       MEL.3 <-> DAVE.1
       MEL.2 <-> DAVE.1
    - ... etc, doubling the number of attempts each cycle until you
      find a pair which merge.
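
One reading of that search order, as a sketch: in round k, one side
steps back exactly k versions while the other steps back fewer than k.
The names backoff_rounds and find_mergeable, and the merges(x, y)
callback that reports whether two versions merge cleanly, are
assumptions made for illustration.

```python
def backoff_rounds(left, right):
    """Yield rounds of candidate pairs; left and right are oldest-first chains."""
    yield [(left[-1], right[-1])]
    for k in range(1, max(len(left), len(right))):
        pairs = []
        if k < len(left):        # left steps back k, right steps back < k
            for j in range(min(k, len(right))):
                pairs.append((left[-1 - k], right[-1 - j]))
        if k < len(right):       # right steps back k, left steps back < k
            for i in range(min(k, len(left))):
                pairs.append((left[-1 - i], right[-1 - k]))
        if pairs:
            yield pairs


def find_mergeable(left, right, merges):
    """Return every pair that merges in the first round where any pair does."""
    for pairs in backoff_rounds(left, right):
        good = [pair for pair in pairs if merges(*pair)]
        if good:
            return good
    return []
```

With the chains MEL.1..MEL.3 and DAVE.1..DAVE.3 this produces the three
rounds listed above, in the same order. Trying a whole round before
stepping back further keeps the result independent of which pair
happens to be tested first.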

but this is only one case of deciding on partial merges. another one,
which I've been more concerned about, is what to do with:

           +- MEL
           +- DAVE
     BASE -+- NORA
           +- FRED
           +- SUZY
           +- LOU

and only *some* of them merge with one another, while others
conflict. do we commit "all pairwise merges which succeed", and all
succeeding pairwise merges of those results, ...? or do we try to
cluster them into cliques? if so, should the cliques share all
non-conflicting members? this is the "microbranch" issue I mentioned
above.
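
To make the "cliques" option concrete, here is a sketch that groups
heads into maximal sets whose members all merge pairwise, using the
Bron-Kerbosch algorithm (a standard technique chosen for this
illustration, not something the message specifies). The name
maximal_cliques and the compatible(x, y) callback are hypothetical, and
pairwise compatibility is only assumed, not guaranteed, to imply that
the whole group merges cleanly. Sorting at each step keeps the output
deterministic.

```python
def maximal_cliques(heads, compatible):
    """Return all maximal groups of heads that are pairwise compatible."""
    heads = sorted(heads)
    # A head is a neighbour of another if the pair merges in either order.
    nbrs = {h: {o for o in heads
                if o != h and compatible(h, o) and compatible(o, h)}
            for h in heads}
    found = []

    def expand(chosen, candidates, excluded):
        if not candidates and not excluded:
            found.append(sorted(chosen))    # nothing left to add: maximal
            return
        for v in sorted(candidates):
            expand(chosen | {v}, candidates & nbrs[v], excluded & nbrs[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(heads), set())
    return sorted(found)
```

If only MEL and DAVE conflict among the six heads above, this yields two
groups, one with MEL and one with DAVE, each also containing NORA, FRED,
SUZY and LOU. That corresponds to cliques which share all
non-conflicting members.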

-graydon




