Re: [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup)

monotone-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup)

From:	Nathaniel Smith
Subject:	Re: [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup)
Date:	Wed, 23 Aug 2006 03:42:51 -0700
User-agent:	Mutt/1.5.12-2006-07-14
On Tue, Aug 22, 2006 at 02:24:00PM -0700, Graydon Hoare wrote:
> First off: fantastic work!
> 
> I only have minor nit-picky comments at this point; as far as "the plan" 
> we've been discussing, this looks basically perfect, I'm surprised you 
> got it done so fast.

Thanks for looking it over -- I know you don't have much time these
days :-).

> I haven't carefully gone over the transaction commit / writeback / flush 
> ordering in database.cc, which I think you probably want. I'll try 
> taking a whole-file read soon, but I'd be very surprised if any bugs in 
> there would avoid tripping a fatal assertion due to a missing db row.

Well, it's a tricky bit of state tracking.  For instance, at one
point, I had converted things so that rosters were indexed by
roster_id, but hadn't actually changed their storage format or
anything yet.  I was still caching them in the vcache.  I only
realized much later that this meant there was a major bug -- because
rollback() does not flush the vcache, I could have left stale rosters
that never existed in there... this doesn't apply to files in the
vcache, because they're indexed by content, but it did to rosters once
I switched them over to being indexed by...index.

That's the kind of bug I'm worried about -- somehow dropping a step in
the dance between roster_cache insertion, marking things clean/dirty,
and letting them eventually get written or not -- depending on how the
transaction resolves, and whether a delta gets written straight to the
db first.  I _think_ I got all this stuff right, but I would feel
better if someone independently convinced themself of the same thing.

The other place I'm worried for similar reasons is in the roster delta
code -- this is much simpler, but again, I'd appreciate it if someone
else independently read through and convinced themselves that we
aren't going to accidentally drop some field or whatever.

> >+ * This template creats a simple collection of key-value pairs that grows
> >+ * until the size specified at construction is reached and then begins
> >+ * discard the Least Recently Used element on each insertion.
> 
> Modify this comment to describe the dirty set.

Good point.  Done.

> >+  struct roster_delta_t
> >+  {
> >+    typedef std::set<node_id> nodes_deleted_t;
> >+    nodes_deleted_t nodes_deleted;
> >+    typedef std::map<std::pair<node_id, path_component>, node_id> 
> >dirs_added_t;
> >+    dirs_added_t dirs_added;
> >+    typedef std::map<std::pair<node_id, path_component>, 
> >std::pair<node_id, file_id> > files_added_t;
> >+    files_added_t files_added;
> >+    typedef std::map<node_id, std::pair<node_id, path_component> > 
> >nodes_renamed_t;
> >+    nodes_renamed_t nodes_renamed;
> >+    typedef std::map<node_id, file_id> deltas_applied_t;
> >+    deltas_applied_t deltas_applied;
> >+    typedef std::set<std::pair<node_id, attr_key> > attrs_cleared_t;
> >+    attrs_cleared_t attrs_cleared;
> >+    typedef std::set<std::pair<node_id, std::pair<attr_key, 
> >std::pair<bool, attr_value> > > > attrs_changed_t;
> >+    attrs_changed_t attrs_changed;
> 
> Stylistic: I think I'd prefer all the typedefs first, then the slot 
> defs. Also wrap at 80 cols please.

Okay, done.

> Semantic: I'm slightly concerned that this is the non-invertable 
> representation. The version I was working on (which, granted, is 
> horribly incomplete) was invertable. You can only reconstruct backwards. 
> Does this concern you at all? The only inversion cases I could imagine 
> were just that: imaginary. I guess it's private so we can always revisit.

The only one I can think of is restricted log with the --next switch
--- and even that one might go away, if we do further optimization for
log and friends.  So yeah, I'd prefer to defer this part for now.  (It
should be really easy to add later on top of the machinery this patch
adds.)

> >+    // attach everything
> >+    for (dirs_added_t::const_iterator i = dirs_added.begin(); i != 
> >dirs_added.end(); ++i)
> >+      roster.attach_node(i->second, i->first.first, i->first.second);
> >+    for (files_added_t::const_iterator i = files_added.begin(); i != 
> >files_added.end(); ++i)
> >+      roster.attach_node(i->second.first, i->first.first, 
> >i->first.second);
> >+    for (nodes_renamed_t::const_iterator i = nodes_renamed.begin(); i != 
> >nodes_renamed.end(); ++i)
> >+      roster.attach_node(i->first, i->second.first, i->second.second);
> 
> Stylistic: a bit more whitespace and wrapping, please.

Okay, done.

> Also I'm shifting 
> commenting style towards full sentences starting with a capital letter 
> and ending with a period. I know, ghastly. What's happening?!

Who are you, and what you have you done with graydon?!?

(Done.)

> >+
> >+    // roight, all that tricky tree-rearranging done, just have to do some
> >+    // individual node edits now
> 
> "roight"?

Err... roight.  Done.

> >+  void
> >+  delta_only_in_to(node_t new_n, roster_delta_t & d)
> >+  delta_in_both(node_t old_n, node_t new_n, roster_delta_t & d)
> 
> Smells like duplicated code from roster.cc. Is it refactorable?

Probably because it was copy-and-pasted :-).  I couldn't figure out
how to refactor it, though; make_cset and make_roster_delta_t have
basically the same control structure, but all the leaves of the
control graph are different.  I guess we could write some sort of
visitor strategy object thingie, but I doubt it would clarify much;
parallel::iter is basically that already...

> >+    for (roster_delta_t::nodes_deleted_t::const_iterator i = 
> >d.nodes_deleted.begin(); i != d.nodes_deleted.end(); ++i)
> >+    for (roster_delta_t::nodes_renamed_t::const_iterator i = 
> >d.nodes_renamed.begin(); i != d.nodes_renamed.end(); ++i)
> >+    for (roster_delta_t::dirs_added_t::const_iterator i = 
> >d.dirs_added.begin(); i != d.dirs_added.end(); ++i)
> 
> ... wrap.

Done.

> >--- roster_delta.hh  1939a0f9715abe7b3e01f3c6887fecd5fbb05323
> >+++ roster_delta.hh  1939a0f9715abe7b3e01f3c6887fecd5fbb05323
> >@@ -0,0 +1,30 @@
> >+#ifndef __ROSTER_DELTA_HH__
> >+
> >+//////////
> >+// Experimental roster delta stuff
> 
> Axe this comment.

Oops... already did, just forgot to commit it :-).

> >         lru_cache.h                                                 \
> >+        lru_writeback_cache.h                                        \
> 
> Do we really need lru_cache.h to exist? Why not use lru_writeback_cache 
> in all cases and ignore the dirty bits when irrelevant?

Good point.  Done.

> >-#undef _REENTRANT
> >-#include "lru_cache.h"
> >+#include "lru_cache.hh"
> 
> If you're using lru_cache.hh (not .h), make sure you change the makefile 
> and commit it. Or just toss lru_cache.{h,hh} as discussed above.

Right...

> >@@ -77,9 +78,10 @@ namespace
> > {
> >   struct query_param
> >   {
> >-    enum arg_type { text, blob };
> >+    enum arg_type { text, blob, s64 };
> >     arg_type type;
> >     string data;
> >+    ::s64 integer;
> >   };
> 
> Hm. I accept that sqlite prefers signed integers, but I sure don't. Can 
> we push the conversion to and from unsigned down inside this interface, 
> so it only happens in database.cc?
> 
> >@@ -748,6 +784,12 @@ database::fetch(results & res,
> >                               SQLITE_STATIC);
> >           }
> >           break;
> >+        case query_param::s64:
> >+          {
> >+            sqlite3_bind_int64(i->second.stmt(), param,
> >+                               idx(query.args, param - 1).integer);
> >+          }
> >+          break;
> >         default:
> >           I(false);
> >         }
> 
> I guess here, in particular. You could isolate it in one line here.

Hm.  I'm not sure.  Basically, IIUC, the difference between signed and
unsigned is in:
  1) comparison operators
  2) widening coercion
  3) IO/printing/parsing
and in particular, arithmetic couldn't care less _which_
representatives us silly humans think we're using for our 2^n modulo
equivalence classes, the computer'll happily use the same bit strings
either way?

(1) and (2) are irrelevant here, because these are just uninterpreted
blobs, we never _do_ anything with them (also, they're 64 bit, so not
much widening going on :-)).  (3) is, sadly, actually somewhat
relevant -- we do often convert these things to and from strings,
either to send to sqlite3 (because we are using code shared with file
etc. reconstruction that wants to use strings), or get back (because
results objects are always strings).

In fact this doesn't matter, because just like we know that a 64-bit
value will never overflow, we know a 63-bit value won't overflow
either, and that's what would have to happen for any of this to
matter.

Given that it's purely a question of style, though, I guess I have a
slight preference for sticking with signed integers, just because the
IO stuff makes this style seem cleaner to me :-).

However, I really do not care all that much, and am curious what your
reasons are.

> >@@ -783,10 +825,19 @@ database::fetch(results & res,
> > 
> >   i->second.count++;
> > 
> >-  E(want_rows == any_rows || want_rows == nrow,
> >-    F("wanted %d rows got %d in query: %s") % want_rows % nrow % 
> >query.sql_cmd);
> >+  I(want_rows == any_rows || want_rows == nrow);
> > }
> 
> Is there a reason for this hunk? I prefer the former message: more 
> detail for diagnostics.

Fair enough -- the point was just to force it to give me a dump of the
MM stack, which I needed for debugging.  (There are some other I()
insertions for this reason too.)  You can get a dump of the stack out
of E() though (if you use --debug), so changing it back seems fine.

Done.

> >+unsigned long
> >+database::roster_size_estimator::operator()(cached_roster const & cr)
> >+{
> >+  I(cr.first);
> >+  I(cr.second);
> >+  // do estimate using a totally made up multiplier, probably wildly off
> >+  return cr.first->all_nodes().size() * 175;
> >+}
> 
> Maybe pick this more precisely and move it to constants.{cc,hh} (more 
> generally, maybe those numbers should be come hooks). For example: 
> Assume dir nodes make up a trivial portion of the roster, and file nodes 
> generally have no attributes. Then:
> 
>        - 40 bytes of sha1
>        - average path component length, say 10 bytes
>        - 5 * sizeof(void*) for internal fields
>          (+/- a factor of 2 depending on how strings, shared_ptrs, maps
>           are implemented)
> 
> Should be more like 70-80 bytes.

Each node in the cache also has an associated marking_t.  Assuming
that most mark sets have exactly one entry (very reasonable), and
again that there are no attributes, that's another 3 sha1's, so
another 120 bytes.  (Plus internal fields, overhead for set<>, map<>,
etc., again.)  If anything, it's maybe a little low, I guess?

> >-database::remove_version(hexenc<id> const & target_id,
> >-                         string const & data_table,
> >-                         string const & delta_table)
> 
> I wonder if it's wise to remove this function. Isn't it useful?

Its only use was so that 'db kill_rev_locally' could remove a
killed revision's roster (and potentially files?).  I have a bit of a
passive aggressive relationship with this feature :-).

My memory of the history is: originally kill_rev_locally left the
revision's roster (well, manifest, back then) alone, on the theory
that trying to remove it was tricky and error-prone, and that leaving
it there could save potentially someone's butt in extremum.  Then some
other people went and added remove_version (though leaving the bit in
the manual where it claims that these things are left behind), and
after corrupting a few user databases, even got the bugs worked out.

So, err, I'm not very excited about trying to rewrite it.

> >--- database.hh      5fd8600ce422d017284baa8fa0e38b6899a32e02
> >+++ database.hh      441546750699828b5f3c25d69e27427009b29d6d
> >@@ -30,6 +30,9 @@ int sqlite3_finalize(sqlite3_stmt *);
> > #include "selectors.hh"
> > #include "vocab.hh"
> > 
> >+// hmm, would be better not to include this everywhere...
> >+#include "lru_writeback_cache.hh"
> >+
> 
> Indeed. Maybe move the caches to a pimpl (can be done later, I don't mind).

Yeah, I'll worry about that later.

> Also might want to take the opportunity to remove the hilarious comment 
> about reverse deltas being the most confusing part of the program. 
> Hohoho happy days.

Hmm, what else is more confusing?  roster.cc and roster_merge.cc seem
perfectly clear to _me_... ;-)

> >--- roster.cc        d03806d489e8b30fc4564b9a4f025669770403d6
> >+++ roster.cc        4996a8586b27a298548151e97c3b59e966bcb704
> >+void
> >+roster_t::set_delta(node_id nid, file_id const & new_id)
> 
> s/set_delta/set_content/

Good point.  Fixed.

> >@@ -2439,7 +2486,6 @@ parse_marking(basic_io::parser & pa,
> >           pa.str(k);
> >           pa.hex(rev);
> >           attr_key key = attr_key(k);
> >-          I(n->attrs.find(key) != n->attrs.end());
> >           safe_insert(marking.attrs[key], revision_id(rev));
> >         }
> 
> Slightly nervous about this. Why is it ok?

Because the only caller for ros.parse_from is read_roster_and_marking,
which immediately calls roster.check_sane_against(marking), which does
the same check.

(Possibly we should also call check_sane_against after reconstructing
a roster from roster_deltas.  Hmm.  Doesn't involve any memory
allocation, might even be fast...)

> >--- roster.hh        43a7100415b7fe11037a50fbd8cdd0cf260b926d
> >+++ roster.hh        2f33287325fd542a21217aeba673015b0c271912
> >@@ -14,6 +14,7 @@
> > 
> > #include <boost/shared_ptr.hpp>
> > 
> >+#include "basic_io.hh"
> 
> You can just forward-declare the namespace basic_io with incomplete 
> types parser and printer in it.

'Fair nuff.  Fixed.

-- Nathaniel

-- 
"But in Middle-earth, the distinct accusative case disappeared from
the speech of the Noldor (such things happen when you are busy
fighting Orcs, Balrogs, and Dragons)."
[Prev in Thread]
Current Thread
[Next in Thread]
[Monotone-devel] [patch] roster-deltas (40% pull speedup), Nathaniel Smith, 2006/08/21
- Re: [Monotone-devel] [patch] roster-deltas (40% pull speedup), Eric Anderson, 2006/08/21
  - Re: [Monotone-devel] [patch] roster-deltas (40% pull speedup), Nathaniel Smith, 2006/08/21
    - Re: [Monotone-devel] [patch] roster-deltas (40% pull speedup), Eric Anderson, 2006/08/22
- [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup), Graydon Hoare, 2006/08/22
  - Re: [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup), Nathaniel Smith <=
    - Re: [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup), Eric Anderson, 2006/08/23
Prev by Date: Re: [Monotone-devel] RFC: Unit test improvements
Next by Date: [Monotone-devel] [Program Options] cannot use Boost.Program_Options with an i18nized program
Previous by thread: [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup)
Next by thread: Re: [Monotone-devel] Re: [patch] roster-deltas (40% pull speedup)
Index(es):
- Date
- Thread