monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] diff bug 20447


From: Derek Scherger
Subject: [Monotone-devel] diff bug 20447
Date: Sun, 9 May 2010 22:06:46 -0600

So I spent quite a few hours on Saturday working on https://savannah.nongnu.org/bugs/?20447 which turned out to be somewhat more involved that I was expecting.

The basic problem is this: given a file f in a directory d

$ mtn mv d d2
$ edit f

$ mtn status
# rename d to d2
# patch d2/f from [1234] to [5678]

$ mtn status d2
# rename d to d2
# patch d2/f from [1234] to [5678]

$ mtn status d2/f
# patch d/f from [1234] to [5678]

Note that in the third status call the workspace changes are restricted to those on the file f and exclude the rename of the directory d to d2. The basic problem in bug 20447 is that when diff attempts to get the content for file f it uses the restricted name d/f rather than the unrestricted name d2/f. i.e. it doesn't know about the rename of d to d2 because that operation was excluded when the diff was limited to the single file.

After looking over the code for dump_diffs and a few other functions I had the feeling that the problem was that they were written using csets rather than roster comparisons, probably left over from before rosters existed and never fully converted. Before looking into this too far (or thinking too hard) I just jumped in and converted everything to use rosters so that the proper names are available and there is no need for reverse rename maps and such. Almost all such code has been replaced with roster based implementations which are generally much simpler and correct.

At the moment these changes result in slightly different diff output for additions and deletions which both now use /dev/null for one file (the source for adds and the target for deletes). There's a big comment in diff_output.cc that made this seem like the right approach. This new implementation currently does output a diff against /dev/null for a deletion where previously we simply didn't display anything. I'm not sure which is preferable but it's trivial to change, maybe we even want an option to control this? The new implementation also shows the both names involved when a rename occurs although I'm not sure if this is such a good idea as it may confuse patch and friends. Changing this is also trivial.

Sadly, these changes didn't actually fix the problem, but having done this work I now understand what the problem is and have some ideas on what we might do.

This problem is closely related to the problem of adding a directory d, then adding a file f to it and then restricting a commit to just the file d/f which fails because the directory addition is excluded. One fix for this would be to change the restrictions code so that including a file implicitly includes changes to all of its parent directories (non-recursively). So a commit of d/f would include d and d/f but not other children of d. I'm not sure if there are other ways we could deal with this.

In the case of mv d d2 above if we include parents of f committing d2/f would also commit the rename of d to d2. This would effectively rename all other children x of d to d2/x as well which seems somewhat questionable, although it might be better than failing with 'mtn: misuse: file d/f does not exist'.

Another way to possibly solve this might be to keep 3 rosters around (old, restricted and new) rather than 2 (old and restricted) and then when looking for files in the workspace use the unrestricted new roster (which represents the actual workspace content) to get the path and load the file to diff against. I think this could probably be made to work for diff but I'm not so sure about commit and I'm pretty sure that status, diff and commit must agree on what's changed when given the same set of paths to operate on. i.e. the point of status and diff is essentially to see what you are about to commit.

Trying to use this approach for commit leads to the case where a commit of d2/f would commit the content change to f but not the rename of d so you're really only committing d/f which also seems questionable and doesn't fix the problem with committing a restricted addition.

So, is there a general preference for (1) implicitly including changes (renames and attributes) to all parent directories of each included path or (2) being able to commit content changes to a file that is in a directory that has been renamed without including the directory rename?

At the moment I think I'm in favour of (1) but I'm sure there are cases I haven't thought of yet and maybe (2) is a better choice. I'll note that add a/b/c will add a, a/b and a/b/c if they don't already exist in the project which is essentially implicitly including the parent directories of the specified file.

Cheers,
Derek


reply via email to

[Prev in Thread] Current Thread [Next in Thread]