Re: [Gnu-arch-users] File-type plug-in architecture for Arch?


From: Tom Lord
Subject: Re: [Gnu-arch-users] File-type plug-in architecture for Arch?
Date: Fri, 19 Dec 2003 08:46:04 -0800 (PST)


    >>= Thomas Zander

    >> Since OOo will save some 5 different XML streams inside a zip file its
    >> quite easy to do revision management on XML only streams once you have a
    >> XMLDiff application in place. (which should be first priority)

    >= Anselm Lingnau

    > I would like to see a diff/patch process optimized to ignore line
    > breaking in paragraphs. This would make it easier to use Arch for
    > LaTeX documents.

    [and others]


In general these sorts of things aren't really an arch issue.  The
thread should really be called "file-type plug-in architecture for
_diff_and_patch_?".

The problem breaks down into four parts:

a) Can you do exact diffing and patching?

   Given files A and B can you write:

        % xdiff [options] A B > B.diffs

   such that B.diffs is at least smaller than B, ideally a useful
   (browsable) description of the changes, and critically such that:

        % xpatch -o B_2 A B.diffs
        % cmp B B_2 && echo yes
        yes
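
The round-trip requirement in (a) can be sketched in a few lines. This is only an illustration, assuming a line-oriented format and using Python's standard difflib; the functions named xdiff and xpatch here are hypothetical stand-ins for the programs discussed, not existing tools.

```python
from difflib import SequenceMatcher

def xdiff(a_lines, b_lines):
    """Describe how to turn a_lines into b_lines as a list of edits.

    Each edit is (start, end, replacement): replace a_lines[start:end]
    with the replacement lines. Unchanged regions are not stored, which
    is what keeps the diffs smaller than B for similar files.
    """
    matcher = SequenceMatcher(a=a_lines, b=b_lines, autojunk=False)
    return [(i1, i2, b_lines[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

def xpatch(a_lines, edits):
    """Apply edits from xdiff to exactly the file they were made from."""
    out, pos = [], 0
    for start, end, replacement in edits:
        out.extend(a_lines[pos:start])   # copy the unchanged run
        out.extend(replacement)          # then the changed lines
        pos = end
    out.extend(a_lines[pos:])            # copy whatever follows the last edit
    return out

A = ["alpha\n", "beta\n", "gamma\n", "delta\n"]
B = ["alpha\n", "BETA\n", "gamma\n", "delta\n", "epsilon\n"]
B_2 = xpatch(A, xdiff(A, B))
print("yes" if B_2 == B else "no")   # plays the role of `cmp B B_2`
```

A type-specific xdiff would replace the line-based matcher with one that understands the format (XML nodes, paragraphs, and so on), but the same exactness test applies.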

b) Can you do inexact patching?

   Suppose that A is modified to produce A_changed.

   Will:

        % xpatch -o B_changed A_changed B.diffs 

   produce a correct B_changed?  If the merge can't be fully
   automated, will it at least produce useful output?

   (How are we doing, for example, on merge tools for word-processor
   documents?  Are there some around that reliably produce a valid
   output document using formatting and mark-up to present the merge
   conflicts to users in an easy-to-resolve format?)

   Extra credit if your xpatch can do something reasonable with a
   `--forward' option.
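
To make "inexact patching" concrete, here is a deliberately simplified sketch. It assumes a hunk is recorded as the old lines (including context), the new lines, and the line offset where the old lines were originally found; apply_hunk and its parameters are hypothetical names. Real patch programs are more forgiving than this (they can, for instance, tolerate mismatched context lines).

```python
def apply_hunk(lines, old_block, new_block, hint):
    """Apply one hunk to a file that may have drifted since the diff was made.

    Looks for old_block anywhere in lines and, if it occurs more than
    once, prefers the occurrence closest to the recorded offset `hint`.
    Returns the patched lines, or None if the hunk must be rejected.
    """
    n = len(old_block)
    matches = [i for i in range(len(lines) - n + 1)
               if lines[i:i + n] == old_block]
    if not matches:
        return None   # reject: a human (or a smarter tool) has to decide
    at = min(matches, key=lambda i: abs(i - hint))
    return lines[:at] + new_block + lines[at + n:]

# The hunk was made against A = [a, b, c, d]; A_changed has an extra first line.
A_changed = ["intro\n", "a\n", "b\n", "c\n", "d\n"]
old_block = ["b\n", "c\n", "d\n"]
new_block = ["b\n", "C\n", "d\n"]
B_changed = apply_hunk(A_changed, old_block, new_block, hint=1)
```

The interesting question for each file type is what plays the role of "context": for text it is neighbouring lines, while for a structured document it might be the enclosing element or paragraph.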



c) Can you do diff3-style merging?

   Will:

        % xdiff3 -o merged MINE OLDER HIS

   produce properly merged output, perhaps with useful conflict
   markers?
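
A minimal sketch of what a line-based three-way merge has to do, assuming Python's standard difflib. The name merge3 and the conflict-marker text are illustrative choices, and the rule that adjacent edits count as a conflict is a conservative simplification, not a description of GNU diff3's exact behaviour.

```python
from difflib import SequenceMatcher

def changes(older, new):
    """Edits (start, end, replacement) that turn older into new."""
    matcher = SequenceMatcher(a=older, b=new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]

def apply_range(older, edits, lo, hi):
    """Return older[lo:hi] with the given edits (all inside lo..hi) applied."""
    out, pos = [], lo
    for start, end, replacement in edits:
        out += older[pos:start] + replacement
        pos = end
    return out + older[pos:hi]

def merge3(mine, older, his):
    """Merge two descendants of older; returns (lines, conflict_count)."""
    ea, eb = changes(older, mine), changes(older, his)
    out, pos, conflicts = [], 0, 0
    while ea or eb:
        # Start a group at the earliest remaining edit, then absorb every
        # edit from either side that overlaps or touches the group.
        lo = hi = min(e[0][0] for e in (ea, eb) if e)
        ga, gb, grew = [], [], True
        while grew:
            grew = False
            for edits, group in ((ea, ga), (eb, gb)):
                while edits and edits[0][0] <= hi:
                    edit = edits.pop(0)
                    group.append(edit)
                    hi = max(hi, edit[1])
                    grew = True
        out += older[pos:lo]                      # untouched lines before the group
        base = older[lo:hi]
        va = apply_range(older, ga, lo, hi)       # this region as MINE has it
        vb = apply_range(older, gb, lo, hi)       # this region as HIS has it
        if va == vb or vb == base:
            out += va                             # same change, or only MINE changed
        elif va == base:
            out += vb                             # only HIS changed
        else:
            conflicts += 1
            out += ["<<<<<<< MINE\n"] + va + ["=======\n"] + vb + [">>>>>>> HIS\n"]
        pos = hi
    return out + older[pos:], conflicts

OLDER = ["a\n", "b\n", "c\n", "d\n", "e\n"]
MINE = ["A\n", "b\n", "c\n", "d\n", "e\n"]
HIS = ["a\n", "b\n", "c\n", "d\n", "E\n"]
merged, conflicts = merge3(MINE, OLDER, HIS)
```

For a format such as word-processor XML, the hard part is the last branch: the conflict has to be written out in a way that is still a valid document.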



d) How should "file type" be represented?

   Arch _might_ want to help with that -- but I'm not so sure
   it really should.  It might be better to make the "standard"
   for recording file type entirely separate from Arch so that
   xdiff, xpatch, and xdiff3 will work well "stand-alone", even
   when invoked from outside of arch trees.

   I understand that it's tempting to say "Well, arch already
   maintains a little database of `file properties' so file-type might
   as well go in that database."  Except that that wouldn't be true:
   arch maintains no such database -- only file-ids.  And anyway, that
   would make xdiff and friends and arch mutually dependent tools
   where currently there is just a one-way dependency of arch on diff.

   Ideally, xdiff, xpatch, and xdiff3 will work correctly (like
   diff, patch, and diff3) on regular text files and tla can
   then be configured to use the x* programs rather than
   ordinary GNU diff/patch/diff3.




Note that for some file types (e.g., images) fancy support for
"inexact merging" is unlikely anytime soon.  What should you do?

One idea is that the diffs for such files should include a checksum of
the ORIG file ("A" in the examples above) and apply themselves only to
exact copies of that file.  Otherwise, the tool would invoke a
configurable sub-program to just extract a copy of the MOD file ("B" in
the examples above) and leave behind a conflict consisting of two
files, A.orig and A.mod, leaving it to the user to merge them by hand.
In the context of arch, that configurable sub-program can be arch
itself (roughly `tla file-find').
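
The checksum idea might look roughly like this. It is a sketch under several assumptions: SHA-256 as the checksum, the exact-case delta and the "configurable sub-program" both modelled as plain callables, and A.orig taken to mean the user's current copy. The name apply_opaque and its parameters are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def apply_opaque(target: Path,
                 orig_sha256: str,
                 apply_delta: Callable[[bytes], bytes],
                 fetch_mod: Callable[[], bytes]) -> bool:
    """Patch a file type that has no inexact merge support.

    orig_sha256 is the checksum of ORIG recorded in the diff.
    apply_delta turns an exact copy of ORIG into MOD.
    fetch_mod stands in for the configurable sub-program that extracts
    a full copy of MOD (in arch, roughly `tla file-find`).
    Returns True if the patch applied exactly, False if a conflict
    was left behind for the user.
    """
    current = target.read_bytes()
    if sha256_hex(current) == orig_sha256:
        # Exact case: the file is byte-for-byte the ORIG the diff expects.
        target.write_bytes(apply_delta(current))
        return True
    # Inexact case: no automatic merge, so leave both versions side by side.
    target.with_name(target.name + ".orig").write_bytes(current)
    target.with_name(target.name + ".mod").write_bytes(fetch_mod())
    return False
```

In the conflict case the target file itself is left untouched, so nothing is lost if the user decides to ignore the incoming change.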

Archives created using xdiff (and containing whatever special file
types you want to handle) will be readable only by other people who
have configured arch to use the corresponding xpatch.    So if there
were very good progress on the x* programs, one way arch could help is
to endorse them -- to say "use xdiff" rather than "use GNU diff".

The important thing is that with the _possible_ exception of
mechanisms for recording "file type" information, tla doesn't need to
be changed at all to handle file types needing a special diff/patch
algorithm.   If you want these kinds of features, you need to hack
diff, patch, and diff3 -- not arch.

-t




