Re: [Monotone-devel] speed up the initial pull


From: Nathaniel Smith
Subject: Re: [Monotone-devel] speed up the initial pull
Date: Sat, 29 Apr 2006 06:35:51 -0700
User-agent: Mutt/1.5.11

On Sat, Apr 29, 2006 at 11:30:00AM +0200, Markus Schiltknecht wrote:
> > And, you know, if we try and fail, then of course we'll consider other
> > options.  But right now it feels like we're rejecting the best
> > solution, because we're worried it _might_ not work.
> 
> I've never argued to reject the best solution (tm). I even see my
> proposal more as a replacement for the 'download-the-db-via-http'
> workaround than as a competitor to netsync.

Oh, sure, definitely.  I'm not even saying that people _shouldn't_
work on workarounds.  Obviously I have no control over what people
work on anyway; but even if I did, I wouldn't use it.

I can, though, try to _convince_ everyone to get _excited_ about doing
something different ;-).  So I've been trying to explain the reasoning
that some of us have worked through previously (in a less public way),
that convinced us to take this frustrating and tiresome approach we've
been working on.

Well, basically what it comes down to is that I'm lazy, and maybe if I
can get people excited about all pulling together in the same
direction that a few of us have been going already, we will all have
more fun and it will be less tiresome ;-).  And then if our first
approach doesn't work out, we can all work together on figuring out
the best possible workarounds, too.

(But, seriously, I mean it: anyone who is still itching to get a
shorter-term solution and finds this other approach I like to be less
exciting, scratch your itch.  Rule 0: code and fun are _always_
better than no code and/or no fun.)

> As soon as netsync manages to transfer repositories in a reasonable
> amount of time, I'm more than willing to drop any workaround. But until
> we are there, we can either disappoint early adopters or accept some
> sort of workaround.
> 
> On Fri, 2006-04-28 at 19:19 +0100, Bruce Stephens wrote: 
> > I'd guess reconstructing the bytes to check (applying the xdelta
> > patches and things) is vastly more costly.
> 
> Up until now I thought 'consistency checking takes so much time' was the
> answer to the 'why does pulling it take so long?' question.
> 
> Honestly, I don't know and if you are right, then of course such a
> workaround is pretty useless.

With the old pre-roster code, consistency checking was _really_
expensive; we didn't have any systematic understanding of what could
go wrong, so we took the "throw ten dozen stupid brute-force tests at
everything and hope that'll be enough" approach.  (Of course, to add
injury to... injury, it _wasn't_ enough.  Hence the various
rename-related merge crashes people run into.  But I'm digressing.)
One of the major motivations of this weird and crazy "roster" stuff
was exactly to remove this checking bottleneck :-).  Getting rosters
ready soaked up a lot of work over a long time, so actually very
little work has been done -- yet -- on fixing the _new_ set of
bottlenecks.

(Actually, I wonder if this is one reason I'm not especially afraid of
tackling the remaining slowdowns head-on.  I don't know how to fix
them... but I vividly remember being even _less_ sure there was _any_
way to make renames and history tracking and merging make sense and be
fast, and, well, that seems to have worked out okay.  It might turn
out once we get into it to be even harder than rosters were, who
knows... but I definitely don't see any reason to be _daunted_ by
it :-).)

I should also say that in retrospect, I think I might have
overemphasized the cost of the consistency checking for the old code.
Checking did definitely set a hard lower bound, though (like, maybe we
could actually have gotten 2x faster by tweaking deltas and stuff, but
that's it), and definitely it was the overwhelming cause of the
slowness seen on large trees like OE.

> My conclusion: I want to better understand how netsync works and I need
> to do some profiling. I'll let you know...

Cool!  Make sure to keep in touch about what you're trying and finding
(IRC is especially good for this).  Let's kill this problem dead! :-)

Cheers,
-- Nathaniel

-- 
The best book on programming is still Strunk and White.



