[Monotone-devel] [PATCH] mtn automate diff

From:	Thomas Keller
Subject:	[Monotone-devel] [PATCH] mtn automate diff
Date:	Fri, 12 Jan 2007 14:36:21 +0100
User-agent:	Thunderbird 1.5.0.9 (X11/20060911)

Hi all!

So I changed my mind about having separate revision_diff and
content_diff commands for automate and instead integrated the unified
diff output into basic_io for a more generic format. This patch drops
the content_diff command from the current mainline completely.

My "dream" is that this "new" format (which is basically a cset with
some minor modifications) evolves into a general patch interchange
format. Yeah, there is still no reader for it, I know, but I thought
having a defined format first might help get a reader
developed... =)
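
To give an idea of how small such a reader could be, here is a rough
sketch. It is not monotone code and does not use monotone's own basic_io
parser. It assumes that stanzas are separated by blank lines, that every
key carries exactly one value, and that a value is either [hex] or a
quoted string in which only backslash and double quote are escaped. The
names entry, stanza, parse_stanzas and diff_text are made up for this
example.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// One "key value" line of a stanza (hypothetical type, not monotone's own).
struct entry
{
  std::string key;
  std::string value;
  bool is_hex;   // true for [..] values, false for quoted strings
};

typedef std::vector<entry> stanza;

static bool
is_space(char c)
{
  return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Split basic_io-like text into stanzas of key/value entries.
std::vector<stanza>
parse_stanzas(std::string const & in)
{
  std::vector<stanza> out;
  stanza cur;
  std::string::size_type i = 0, n = in.size();

  while (i < n)
    {
      // Skip whitespace between entries; two or more newlines in a row
      // mean a blank line, which closes the current stanza.
      int newlines = 0;
      while (i < n && is_space(in[i]))
        if (in[i++] == '\n')
          ++newlines;
      if (newlines >= 2 && !cur.empty())
        {
          out.push_back(cur);
          cur.clear();
        }
      if (i >= n)
        break;

      entry e;
      while (i < n && !is_space(in[i]))
        e.key += in[i++];
      assert(!e.key.empty());   // in[i] was not whitespace on entry
      while (i < n && in[i] == ' ')
        ++i;

      if (i < n && in[i] == '[')
        {
          e.is_hex = true;
          for (++i; i < n && in[i] != ']'; ++i)
            e.value += in[i];
        }
      else if (i < n && in[i] == '"')
        {
          e.is_hex = false;
          for (++i; i < n && in[i] != '"'; ++i)
            {
              if (in[i] == '\\' && i + 1 < n)
                ++i;   // drop the backslash, keep the escaped character
              e.value += in[i];
            }
        }
      else
        throw std::runtime_error("expected [hex] or \"string\" after " + e.key);

      if (i >= n)
        throw std::runtime_error("unterminated value for " + e.key);
      ++i;   // skip the closing ']' or '"'
      cur.push_back(e);
    }

  if (!cur.empty())
    out.push_back(cur);
  return out;
}

// Return the unified diff carried by a 'diff' stanza, "" for anything else.
std::string
diff_text(stanza const & st)
{
  if (st.empty() || st[0].key != "diff")
    return "";
  for (stanza::size_type k = 0; k < st.size(); ++k)
    if (st[k].key == "data")
      return st[k].value;
  return "";
}
```

Feeding the sample output from the documentation part of the patch into
parse_stanzas() should yield the cset stanzas first and the diff stanzas
last; diff_text() then hands back the hunks without any header, ready to
be passed to whatever applies them.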

What this does not yet solve is how binary diffs are expressed. In the
current version (also in nvm.revision_diff) those diffs are simply
skipped, but one might think of putting the actual file delta in there
somehow (maybe with some base encoding to avoid messed-up consoles when
looking at those patches).
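
Just to illustrate the "base encoding" idea: the sketch below turns
arbitrary bytes into base64 so that a binary delta could sit inside a
quoted basic_io string without upsetting a terminal. Picking base64
(RFC 4648 alphabet with '=' padding) is only an assumption for the
example, nothing in the patch does this, and base64_encode is a made-up
helper name.

```cpp
#include <string>

// Encode arbitrary bytes as base64 (RFC 4648 alphabet, '=' padding).
std::string
base64_encode(std::string const & in)
{
  static char const alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  std::string::size_type n = in.size();

  for (std::string::size_type i = 0; i < n; i += 3)
    {
      // Pack up to three input bytes into a 24-bit group.
      unsigned long group = static_cast<unsigned char>(in[i]) << 16;
      if (i + 1 < n)
        group |= static_cast<unsigned char>(in[i + 1]) << 8;
      if (i + 2 < n)
        group |= static_cast<unsigned char>(in[i + 2]);

      // Emit four 6-bit digits, padding with '=' where the input ran out.
      out += alphabet[(group >> 18) & 63];
      out += alphabet[(group >> 12) & 63];
      out += (i + 1 < n) ? alphabet[(group >> 6) & 63] : '=';
      out += (i + 2 < n) ? alphabet[group & 63] : '=';
    }
  return out;
}
```

The encoded text only uses letters, digits, '+', '/' and '=', so it
would need no escaping inside a basic_io string, at the cost of growing
the data by about a third.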

The textual diff format is unified diff and this can't be changed by the
user. I have no idea what monotone's internal textual diff format
looks like (I guess mtn makes no difference in this regard anyway), but
I chose unified diff because the format is condensed yet still human
readable (for me a "killer" feature, because I'd rather check a patch
before applying it than apply one blindly).

Opinions?

Thomas.

-- 
ICQ: 85945241 | SIP: 1-747-027-0392 | http://www.thomaskeller.biz
> Guitone, a frontend for monotone: http://guitone.thomaskeller.biz
> Music lyrics and more: http://musicmademe.com

# 
# 
# patch "cmd_automate.cc"
#  from [cd359f474386a86ab988cfda2faa672806113bde]
#    to [087d698c66df08a1b69b3ee3041edf3ac9e3ebfe]
# 
# patch "cmd_diff_log.cc"
#  from [5db45374e646424aab11a5e822d204ad92eb3bca]
#    to [2ef2e7406ed92a1879d805d873821a41a8482735]
# 
# patch "diff_patch.cc"
#  from [a8b1d3481a4ff79239887399a16f224df14a408a]
#    to [b8987a29624554eb78cf21910612ed6dafdfde48]
# 
# patch "diff_patch.hh"
#  from [e1a61d6117d79bec3584a14d3994652911dd5ee7]
#    to [b57bd7626f520d040ca9a43cd23381c54f2cc1bb]
# 
# patch "monotone.texi"
#  from [0246c4a5e9d216ebc2b5b8d30ad411f8fdd44345]
#    to [f4b1a0d2127a416cb42a7321fdef5913e82d282d]
# 
============================================================
--- cmd_automate.cc     cd359f474386a86ab988cfda2faa672806113bde
+++ cmd_automate.cc     087d698c66df08a1b69b3ee3041edf3ac9e3ebfe
@@ -61,7 +61,7 @@ automate_command(utf8 cmd, vector<utf8> 
   find_automation(cmd, root_cmd_name).run(args, root_cmd_name, app, output);
 }
 
-static string const interface_version = "4.0";
+static string const interface_version = "5.0";
 
 // Name: interface_version
 // Arguments: none
============================================================
--- cmd_diff_log.cc     5db45374e646424aab11a5e822d204ad92eb3bca
+++ cmd_diff_log.cc     2ef2e7406ed92a1879d805d873821a41a8482735
@@ -13,6 +13,7 @@
 #include <sstream>
 #include <queue>
 
+#include "basic_io.hh"
 #include "cmd.hh"
 #include "diff_patch.hh"
 #include "localized_file_io.hh"
@@ -332,14 +333,14 @@ dump_diffs(cset const & cs,
   dump_diffs(cs, app, new_is_archived, output, dummy);
 }
 
-// common functionality for diff and automate content_diff to determine
+// common functionality for diff and automate diff to determine
 // revisions and rosters which should be diffed
 static void 
 prepare_diff(cset & included,
              app_state & app, 
              std::vector<utf8> args,
              bool & new_is_archived,
-             std::string & revheader)
+             revision_id & old_revision)
 {
   temp_node_id_source nis;
   ostringstream header;
@@ -358,10 +359,10 @@ prepare_diff(cset & included,
   if (app.opts.revision_selectors.size() == 0)
     {
       roster_t new_roster, old_roster;
-      revision_id old_rid;
+      revision_id r_old_id;
 
       app.work.get_base_and_current_roster_shape(old_roster, new_roster, nis);
-      app.work.get_revision_id(old_rid);
+      app.work.get_revision_id(r_old_id);
 
       node_restriction mask(args_to_paths(args),
                             args_to_paths(app.opts.exclude_patterns),
@@ -374,7 +375,7 @@ prepare_diff(cset & included,
       check_restricted_cset(old_roster, included);
 
       new_is_archived = false;
-      header << "# old_revision [" << old_rid << "]" << "\n";
+      old_revision = r_old_id;
     }
   else if (app.opts.revision_selectors.size() == 1)
     {
@@ -403,7 +404,7 @@ prepare_diff(cset & included,
       check_restricted_cset(old_roster, included);
 
       new_is_archived = false;
-      header << "# old_revision [" << r_old_id << "]" << "\n";
+      old_revision = r_old_id;
     }
   else if (app.opts.revision_selectors.size() == 2)
     {
@@ -453,13 +454,12 @@ prepare_diff(cset & included,
       check_restricted_cset(old_roster, included);
 
       new_is_archived = true;
+      old_revision = r_old_id;
     }
   else
     {
       I(false);
     }
-
-    revheader = header.str();
 }
 
 CMD(diff, N_("informative"), N_("[PATH]..."),
@@ -476,10 +476,10 @@ CMD(diff, N_("informative"), N_("[PATH].
         "try adding --external or removing --diff-args?"));
 
   cset included;
-  std::string revs;
+  revision_id old_rev;
   bool new_is_archived;
   
-  prepare_diff(included, app, args, new_is_archived, revs);
+  prepare_diff(included, app, args, new_is_archived, old_rev);
   
   data summary;
   write_cset(included, summary);
@@ -489,7 +489,7 @@ CMD(diff, N_("informative"), N_("[PATH].
   cout << "# " << "\n";
   if (summary().size() > 0)
     {
-      cout << revs << "# " << "\n";
+      cout << "# base_revision [" << old_rev << "]\n" << "#\n";
       for (vector<string>::iterator i = lines.begin(); 
            i != lines.end(); ++i)
         cout << "# " << *i << "\n";
@@ -506,30 +506,150 @@ CMD(diff, N_("informative"), N_("[PATH].
     dump_diffs(included, app, new_is_archived, cout);
 }
 
+void
+dump_diffs_basic_io(app_state & app, basic_io::printer & printer, const cset & cs, bool new_is_archived)
+{
+    for (map<split_path, file_id>::const_iterator
+         i = cs.files_added.begin(); i != cs.files_added.end(); ++i)
+    {
+      data unpacked;
+      vector<string> lines;
 
-// Name: content_diff
+      if (new_is_archived)
+        {
+          file_data dat;
+          app.db.get_file_version(i->second, dat);
+          unpacked = dat.inner();
+        }
+      else
+        {
+          read_localized_data(file_path(i->first),
+                              unpacked, app.lua);
+        }
+
+        // FIXME: if this should _ever_ become a transferrable format
+        // we need to express binary data here, otherwise we can't re-create
+        // a full and valid revision
+      if (guess_binary(unpacked()))
+        {
+          continue;
+        }
+
+      basic_io::stanza st;
+      st.push_hex_pair(symbol("diff"), i->second.inner());
+      st.push_hex_pair(symbol("to"), i->second.inner());
+      
+      std::stringstream diff_out;
+      std::string pattern("");
+      bool omit_header = true;
+      
+      make_diff(file_path(i->first).as_internal(),
+                file_path(i->first).as_internal(),
+                i->second,
+                i->second,
+                data(), unpacked,
+                diff_out, unified_diff, pattern, omit_header);
+      
+      st.push_str_pair(symbol("data"), diff_out.str());
+
+      printer.print_stanza(st);
+    }
+    
+    map<split_path, split_path> reverse_rename_map;
+
+    for (map<split_path, split_path>::const_iterator
+         i = cs.nodes_renamed.begin();
+       i != cs.nodes_renamed.end(); ++i)
+    {
+      reverse_rename_map.insert(make_pair(i->second, i->first));
+    }
+    
+    for (map<split_path, pair<file_id, file_id> >::const_iterator i = cs.deltas_applied.begin();
+       i != cs.deltas_applied.end(); ++i)
+    {
+     
+
+      file_data f_old;
+      data data_old, data_new;
+      app.db.get_file_version(i->second.first.inner(), f_old);
+      data_old = f_old.inner();
+
+      if (new_is_archived)
+        {
+          file_data f_new;
+          app.db.get_file_version(delta_entry_dst(i), f_new);
+          data_new = f_new.inner();
+        }
+      else
+        {
+          read_localized_data(file_path(delta_entry_path(i)),
+                              data_new, app.lua);
+        }   
+
+        // FIXME: if this should _ever_ become a transferrable format
+        // we need to express binary data here, otherwise we can't re-create
+        // a full and valid revision
+      if (guess_binary(data_old()) || guess_binary(data_new()))
+        {
+          continue;
+        }
+        
+      basic_io::stanza st;
+      st.push_hex_pair(symbol("diff"), i->second.first.inner());
+      st.push_hex_pair(symbol("to"), i->second.second.inner());
+      
+      split_path dst_path = delta_entry_path(i);
+      split_path src_path = dst_path;
+      map<split_path, split_path>::const_iterator re;
+      re = reverse_rename_map.find(dst_path);
+      if (re != reverse_rename_map.end())
+        src_path = re->second;
+    
+      std::stringstream diff_out;
+      std::string pattern("");
+      bool omit_header = true;
+      
+      make_diff(file_path(src_path).as_internal(),
+                file_path(dst_path).as_internal(),
+                delta_entry_src(i),
+                delta_entry_dst(i),
+                data_old, data_new,
+                diff_out, unified_diff, pattern, omit_header);
+      
+      st.push_str_pair(symbol("data"), diff_out.str());
+
+      printer.print_stanza(st);
+    }
+}
+
+// Name: diff
 // Arguments:
-//   (optional) one or more files to include
+//   (optional) one or more paths to restrict the output on
 // Added in: 4.0
 // Purpose: Availability of mtn diff as automate command.
 //
-// Output format: Like mtn diff, but with the header part omitted (as this is
-// doubles the output of automate get_revision). If no content changes happened,
-// the output is empty. All file operations beside mtn add are omitted,
-// as they don't change the content of the file.
-AUTOMATE(content_diff, N_("[FILE [...]]"),
-    options::opts::revision | options::opts::depth | options::opts::exclude)
+// Output format: basic_io changesets and diffs
+AUTOMATE(diff, N_("[FILE [...]]"), options::opts::revision)
 {
   cset included;
-  std::string dummy_header;
   bool new_is_archived;
+  revision_id old_rev;
   
-  prepare_diff(included, app, args, new_is_archived, dummy_header);
+  prepare_diff(included, app, args, new_is_archived, old_rev);
+
+  basic_io::printer pr;
   
-  dump_diffs(included, app, new_is_archived, output);
+  basic_io::stanza st;
+  st.push_hex_pair(symbol("base_revision"), old_rev.inner());
+  pr.print_stanza(st);
+  
+  print_cset(pr, included);
+  dump_diffs_basic_io(app, pr, included, new_is_archived);
+  
+  data dat = data(pr.buf); 
+  output << dat;
 }
 
-
 static void
 log_certs(app_state & app, revision_id id, cert_name name,
           string label, string separator,
============================================================
--- diff_patch.cc       a8b1d3481a4ff79239887399a16f224df14a408a
+++ diff_patch.cc       b8987a29624554eb78cf21910612ed6dafdfde48
@@ -1222,7 +1222,8 @@ make_diff(string const & filename1,
           data const & data2,
           ostream & ost,
           diff_type type,
-          string const & pattern)
+          string const & pattern,
+          bool omit_header)
 {
   if (guess_binary(data1()) || guess_binary(data2()))
     {
@@ -1317,18 +1318,24 @@ make_diff(string const & filename1,
     {
       case unified_diff:
       {
-        ost << "--- " << filename1 << "\t" << id1 << endl;
-        ost << "+++ " << filename2 << "\t" << id2 << endl;
+        if (!omit_header)
+          {
+            ost << "--- " << filename1 << "\t" << id1 << endl;
+            ost << "+++ " << filename2 << "\t" << id2 << endl;
+          }
 
-        unidiff_hunk_writer hunks(lines1, lines2, 3, ost, pattern);
+          unidiff_hunk_writer hunks(lines1, lines2, 3, ost, pattern);
         walk_hunk_consumer(lcs, left_interned, right_interned, hunks);
         break;
       }
       case context_diff:
       {
-        ost << "*** " << filename1 << "\t" << id1 << endl;
-        ost << "--- " << filename2 << "\t" << id2 << endl;
-
+        if (!omit_header)
+          {
+            ost << "*** " << filename1 << "\t" << id1 << endl;
+            ost << "--- " << filename2 << "\t" << id2 << endl;
+          }
+        
         cxtdiff_hunk_writer hunks(lines1, lines2, 3, ost, pattern);
         walk_hunk_consumer(lcs, left_interned, right_interned, hunks);
         break;
============================================================
--- diff_patch.hh       e1a61d6117d79bec3584a14d3994652911dd5ee7
+++ diff_patch.hh       b57bd7626f520d040ca9a43cd23381c54f2cc1bb
@@ -36,7 +36,8 @@ void make_diff(std::string const & filen
                data const & data2,
                std::ostream & ost,
                diff_type type,
-               std::string const & pattern);
+               std::string const & pattern,
+               bool omit_header = false);
 
 bool merge3(std::vector<std::string> const & ancestor,
             std::vector<std::string> const & left,
============================================================
--- monotone.texi       0246c4a5e9d216ebc2b5b8d30ad411f8fdd44345
+++ monotone.texi       f4b1a0d2127a416cb42a7321fdef5913e82d282d
@@ -1456,7 +1456,7 @@ @section Adding Files
 @group
 $ mtn diff
 # 
-# old_revision []
+# base_revision []
 # 
 # add_file "include/jb.h"
 #  content [3b12b2d0b31439bd50976633db1895cff8b19da0]
@@ -1792,7 +1792,7 @@ @section Making Changes
 @group
 $ mtn diff
 # 
-# old_revision [2e24d49a48adf9acf3a1b6391a080008cbef9c21]
+# base_revision [2e24d49a48adf9acf3a1b6391a080008cbef9c21]
 # 
 # patch "src/apple.c"
 #  from [2650ffc660dd00a08b659b883b65a060cac7e560]
@@ -2022,7 +2022,7 @@ @section Dealing with a Fork
 @group
 $ mtn diff
 # 
-# old_revision [80ef9c9d251d39074d37e72abf4897e0bbae1cfb]
+# base_revision [80ef9c9d251d39074d37e72abf4897e0bbae1cfb]
 #
 # patch "src/banana.c"
 #  from [7381d6b3adfddaf16dc0fdb05e0f2d1873e3132a]
@@ -6826,7 +6826,7 @@ @section Automation
 
 @end table
 
address@hidden mtn automate content_diff address@hidden address@hidden address@hidden ...]
address@hidden mtn automate diff address@hidden address@hidden address@hidden ...]
 
 @table @strong
 @item Arguments:
@@ -6851,36 +6851,48 @@ @section Automation
 
 @item Added in:
 
-4.0
+5.0
 
 @item Purpose:
 
-Prints the content changes between two revisions or a revision and the current
-workspace. This command differs from @command{mtn diff} in that way that it only
-outputs content changes and keeps quite on renames or drops, as the header of
-@command{mtn diff} is omitted (this is what @command{mtn automate get_revision} 
-already provides).
+Prints the changeset between two revisions or a revision and the current
+workspace. This command differs from @command{mtn diff} in that way that it 
+outputs the actual diff data in a complete basic_io format (including the cset)
+and omits diff stanzas for binary files.
 
 @item Sample output:
 
 @verbatim
-============================================================
---- guitone/res/i18n/guitone_de.ts      9857927823e1d6a0339b531c120dcaadd22d25e9
-+++ guitone/res/i18n/guitone_de.ts      0b4715dc296b1955b0707923d45d79ca7769dd3f
-@@ -1,6 +1,14 @@
- <?xml version="1.0" encoding="utf-8"?>
- <!DOCTYPE TS><TS version="1.1">
- <context>
-+    <name>AncestryGraph</name>
-+    <message>
-[...]
+base_revision []
+
+add_dir ""
+
+add_file "foo"
+ content [3432945fed45564841c38b75277b3205bbe96966]
+
+diff [3432945fed45564841c38b75277b3205bbe96966]
+  to [3432945fed45564841c38b75277b3205bbe96966]
+data "@@ -0,0 +1 @@
++Hello, I'm the foo file!
+"
 @end verbatim
 
 @item Output format:
 
-The GNU unified diff format. If there have been no content changes, the output
-is empty.
+The output format are basic_io stanzas, formatted just like the output of 
+@command{mtn automate get_revision} and ordered equally by type. However, the
+stanzas 'format_version' and 'new_manifest' are not (yet) part of automate diff.
 
+An additional stanza, named 'diff' is introduced. A 'diff' is identified by the
+content ids of the added or changed file. The format is always unified diff and
+does not contain the diff separator and header which are present in the normal 
+diff output. If either the old or the new contents of the file are recognized as
+binary, the diff stanza will be omitted for the patch.
+
+All diff stanzas are outputted at the very end of the cset.
+
+If there are no changes recorded, the output is empty.
+
 @item Error conditions:
 
 If more than two revisions are given or a workspace is required, but
