[Monotone-devel] Merge3

monotone-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] Merge3

From:	Torsten Rueger
Subject:	[Monotone-devel] Merge3
Date:	Wed, 21 Apr 2004 13:28:56 +0300
User-agent:	Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7b) Gecko/20040316

Moi,

I've been researching merge algorithms for xml recently and togetherwith a colleague have found an algorithm that merges text with very goodresults.

I am afraid that I can't publish my code (it's ruby anyway), due tolicensing reasons. But I'm happy describing the algorithm and if someonewants to implement it in c++, I can help (offline). I just thinkmonotone is such a fresh wind to the topic of version control, i'd behappy to contribute at least a little to this merging topic.


In merging one can must take one of the two aproaches:
-- add/remove
-- add/copy

Quite seperate from that is the issue of line based versus byte based.The second is obviously more general (works on binary files) but moreimportantly also more acurate (produces less conflicts for changes onone line that can still be merged)

Add/remove is what unix diff/patch does (on a line basis). Add/copy iswhat more recent algorithms do, namely xdelta and vcdiff.

Now the problem with add/remove is that almost by definition is does notdetect copies or moves. So that when a part of text is moved in onebranch and edited in another, a conflict occurs. Often quite unneccessarily.

Xdelta and vcdiff not only work on binary data, but detect copies. Theydo this for efficiency reasons (the produced delta is smaller) and onlyfor diffing 2 files. Unfortunately, both can only apply diffs to theexact original, so no 3 way merging. So that's what I wrote and decribehere.

I'll make an example (if you're still reading, great, keep going), thenumbers are for explaining the algorithm later. The example is short,the strings are actually too short, but it's just an example.The merge is perfectly feasable and unambiguous. You can put each wordon a line and run it through diff3 to see it fail.


Original:  <html><body>Stuff</body><head>Merge Example</head></html>
             1     2     3     4      5    6     7      8       9
One:       <html><head>Merge Example</head><body>Stuff</body></html>
             1  <--5     6      7      8  <--2     3    4   <--9
Two:       <html><body>Algorithm</body><head>Example</head></html>
             1     2   <--10    <--4     5 <-- 7      8      9
Merge:     <html><head>Example</head><body>Algorithm</body></html>
             1     5     7       8      2     10       4     9

Matching phase: Find the string in One and Two to map to the original.Add added strings

         One:  1   5-8   2-4  9
         Two   1   5     7,8    2  add:10  4 9
Merging phase: go backwards through original following the change pattern:
      Start with 9 in either
      Go to 4, because of change in One
      Go to 10 because of change in Two
      Go to 2 because of change in Two
      Go to 8 because of change in One
      Go to 7 because no change in original order
      Go to 5 because of change in Two
      Go to 1 because of change in One

Output in reverse order and get Merge!

While going through the original matches of the matching phase, one hasto recognise the "changes" inside the matches. It's either that orsplitting all matches into the non overlapping pieces that are thenumbers. This second option has proven to be more complicated.

Matching is done al la xdelta, by splitting files into chunks,calculating hashes for each chunk. Then looking for equal hashes andexpanding the match as far as possible. At the end one can add thestrings in the "gaps" that have not been matched.

I hope the example makes it clear that the algorithm is relatively easyto understand (maybe more so than existing ones) and creates good mergesor avoids unneccessary conflicts.

So if you got this far, and are intested in implementing go for it. I'dbe happy to help.


Torsten

[Prev in Thread]

Current Thread

[Next in Thread]

[Monotone-devel] Merge3, Torsten Rueger <=

Prev by Date: [Monotone-devel] database schema questions
Next by Date: Re: [Monotone-devel] Warning: monotone can't use system wide LUA! ,was Re: Always uses bundled SQLite/LUA (error in autoconf.ac)
Previous by thread: [Monotone-devel] database schema questions
Next by thread: [Monotone-devel] Outstanding CVS import issues
Index(es):
- Date
- Thread