gnu-arch-users

Re: [Gnu-arch-users] Re: File-type plug-in architecture for Arch?


From: Tom Lord
Subject: Re: [Gnu-arch-users] Re: File-type plug-in architecture for Arch?
Date: Sun, 21 Dec 2003 08:57:51 -0800 (PST)

    > From: michael josenhans <address@hidden>

    >>    such that B.diffs is at least smaller than B, ideally a useful
    >>    (browsable) description of the changes, and critically such that:

    >>         % xpatch -o B_2 B.diffs A 
    >>         % cmp B B_2 && echo yes
    >>         yes

    > I am not sure that this is always needed.

    > In XML the following terms are declared as equivalent:

    > a) <nodename attribute='5656'></nodename>
    > b) <nodename attribute='5656'/>

    > Spaces outside the nodes are irrelevant. Thus, according to the
    > standard, after reading and saving an XML file, the file might
    > look different even if its content has not changed.

Yikes.

On the one hand, sure, you could abstract the `cmp' and conceptually
the world doesn't fall apart.

But on the other hand, that would mean (for example) that `get' would
sometimes return a tree whose source files are not byte-wise
equivalent to those that were passed to `commit'.   It's a pretty
big leap of faith to think that that's desirable.
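
To make the gap concrete, here is a minimal sketch in Python. It uses
the standard library's xml.etree.ElementTree as a stand-in for "some
XML-aware tool"; nothing here is part of arch or of any xdiff/xpatch,
and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The two spellings from the quoted message.
A = "<nodename attribute='5656'></nodename>"
B = "<nodename attribute='5656'/>"

def reserialize(text: str) -> str:
    """Parse XML and write it back out, as an XML-aware tool might."""
    return ET.tostring(ET.fromstring(text), encoding="unicode")

def same_bytes(x: str, y: str) -> bool:
    """The byte-wise comparison that `cmp` performs."""
    return x.encode("utf-8") == y.encode("utf-8")

# The two spellings are different byte strings...
print(same_bytes(A, B))
# ...but this parser writes both back out identically...
print(same_bytes(reserialize(A), reserialize(B)))
# ...and what it writes need not match what it was given.
print(same_bytes(reserialize(A), A))
```

A tool that stores only the parsed form can satisfy "same content"
while failing the `cmp' test above, which is exactly the case where
`get' and `commit' would disagree byte-wise.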



    >> b) Can you do inexact patching?

    > The tool (http://www.cs.wisc.edu/~yuanwang/xdiff.html) claims to achieve 
    > this by using hashes on XML nodes.

    > Alternatively, if we had the file format under control, we could
    > tag the XML nodes.
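
For readers unfamiliar with the idea of hashing XML nodes, a rough
sketch in Python follows. It is not the algorithm of the tool linked
above; it only illustrates hashing each node from its tag, attributes,
text and children so that unchanged subtrees can be recognized. The
function names and sample documents are hypothetical, and child order
is treated as significant here, which is an assumption.

```python
import hashlib
import xml.etree.ElementTree as ET

def node_hash(elem: ET.Element) -> str:
    """Hash a node from its tag, attributes, text and child hashes."""
    h = hashlib.sha256()
    h.update(elem.tag.encode("utf-8"))
    for name, value in sorted(elem.attrib.items()):
        h.update(f"\0{name}={value}".encode("utf-8"))
    # Surrounding whitespace is ignored; text after a child is not hashed.
    h.update(b"\0" + (elem.text or "").strip().encode("utf-8"))
    for child in elem:
        h.update(node_hash(child).encode("ascii"))
    return h.hexdigest()

def index_by_hash(root: ET.Element) -> dict:
    """Map subtree hash -> element for every node under root."""
    return {node_hash(e): e for e in root.iter()}

old = ET.fromstring("<doc><p id='1'>hello</p><p id='2'>world</p></doc>")
new = ET.fromstring("<doc>\n  <p id='2'>world</p>\n  <p id='1'>hello!</p>\n</doc>")

old_index = index_by_hash(old)
# Paragraphs of the new document whose subtree also occurs in the old one.
unchanged = [p.get("id") for p in new.iter("p") if node_hash(p) in old_index]
print(unchanged)
```

Matching nodes this way says which subtrees survived, moved or not; it
says nothing about whether the document format stays meaningful once
those subtrees are rearranged.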

I'm not at all convinced that generic XML-diff/patch tools are what
you ought to be looking at.   They only make sense if, either
deliberately or by accident, the document formats are robust and
meaningful under the transformations of a generic XML-diff/patch.

I think it unlikely that the document formats were designed with any
XML-diff/patch algorithm in mind.   I think it implausible that they
will be robust under those transforms "by accident".

In short, I think you really do want an OO-diff/patch, for which a
generic XML reader / traversal / transform / writer interface is handy
-- but I'm not so sure how much generic diff/patch will help.

    >> One idea is that the diffs for such files should include a checksum of
    >> the ORIG file ("A" in the examples above), apply themselves exactly to
    >> copies of that file, and otherwise invoke a configurable sub-program
    >> to just extract a copy of the MOD file ("B", in the examples above)
    >> so the tool would leave behind a conflict consisting of two files:
    >> A.orig and A.mod, leaving it to the user to merge them by hand.  In
    >> the context of arch, that configurable sub-program can be arch itself
    >> (roughly `tla file-find').

    >> Archives created using xdiff (and containing whatever special file
    >> types you want to handle) will be readable only by other people who
    >> have configured arch to use the corresponding xpatch.    So if there
    >> were very good progress on the x* programs, one way arch could help is
    >> to endorse them -- to say "use xdiff" rather than "use GNU diff".

    >> The important thing is that with the _possible_ exception of
    >> mechanisms for recording "file type" information, tla doesn't need to
    >> be changed at all to handle file types needing a special diff/patch
    >> algorithm.   If you want these kinds of features, you need to hack
    >> diff, patch, and diff3 -- not arch.

    > This makes sense to me.

The brute-force merge-by-not-merging-but-give-ORIG-and-MOD-copies
trick -- I think that may be the only hope for document formats that
weren't designed with merging in mind.   It doesn't much help users
merge -- but then it doesn't give them uselessly illegal, bogusly
merged documents either.
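
A minimal sketch of that trick in Python, under stated assumptions:
the patch format, the function names, and the `fetch_mod' callback
(standing in for the configurable sub-program) are all invented for
illustration, and A.orig is assumed to hold the file as found in the
tree. It is not how any existing xdiff/xpatch works.

```python
import difflib
import hashlib

def make_patch(orig: bytes, mod: bytes) -> dict:
    """Build an exact delta from ORIG to MOD, tagged with ORIG's checksum."""
    ops = []
    matcher = difflib.SequenceMatcher(None, orig, mod, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))      # reuse bytes from ORIG
        else:
            ops.append(("data", mod[j1:j2]))  # literal bytes from MOD
    return {"orig_sha256": hashlib.sha256(orig).hexdigest(), "ops": ops}

def apply_patch(patch: dict, name: str, current: bytes, fetch_mod) -> dict:
    """Return the files to write: the patched file, or a two-file conflict."""
    if hashlib.sha256(current).hexdigest() == patch["orig_sha256"]:
        out = bytearray()
        for op in patch["ops"]:
            if op[0] == "copy":
                out += current[op[1]:op[2]]
            else:
                out += op[1]
        return {name: bytes(out)}
    # Not an exact copy of ORIG: do not guess, leave both versions behind.
    return {name + ".orig": current, name + ".mod": fetch_mod()}

orig = b"hello world\n"
mod = b"hello, brave world\n"
patch = make_patch(orig, mod)

print(apply_patch(patch, "A", orig, lambda: mod))
print(apply_patch(patch, "A", b"HELLO world\n", lambda: mod))
```

The first call applies exactly and reproduces MOD byte for byte; the
second sees a file that is not ORIG and falls back to leaving A.orig
and A.mod for the user to merge by hand.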

    > Would modifications to arch be needed to enable working with
    > compressed files?

I doubt such modifications would really help you.  They are unlikely
to be seriously considered for arch.

    > Does diff in this case need to call Arch recursively?

It could -- although much as generic XML-diff/patch is unlikely to
produce useful results for these documents, I suspect that the generic
tree-diff/patch that arch does won't produce useful results either.

I think you really do need something like OO-diff/patch and that 
putting it together out of existing generic diff/patch-family tools 
is an unlikely prospect.

-t