[Monotone-devel] Re: results of mercurial user survey

From:	Graydon Hoare
Subject:	[Monotone-devel] Re: results of mercurial user survey
Date:	Sat, 29 Apr 2006 10:47:51 -0700
User-agent:	Thunderbird 1.5.0.2 (Windows/20060308)

Bruce Stephens wrote:

> I was just doing a quick estimate, and I think it's likely that the
> SHA1 and RSA cost for checking everything in the current venge.net
> repository is a minute or two rather than an hour or two.
>
> If monotone were to give up verification, then it would have to be
> because that would avoid some other aspects of work: reconstructing
> files, reversing deltas, or whatever.


Two points:

First, boring though it feels, please stick to using profiles; do not
make up performance stories. The profiles sometimes mention SHA1, but
they almost always mention things which account for a lot more than it
too. Inlining opportunities, combinatorial explosions, bad buffering,
pessimistic cache behavior, etc. Please stick to what the profiles tell you.

Second, there is no specific part of monotone which you can point to and
say "this is where we do verification"; the concept is spread all
through the program's design. And it's really not so much that we
"verify"; as Nathaniel pointed out, the things specifically marked as
"sanity checking" or "verifying" code rarely dominate any profiles.

However, there's a kernel of truth in here: the fact is that we "do
work" in between the network and the disk. What work?


  - Selecting the right information to send.
  - Transforming from the format we store in to the format we send.
  - Transforming back to the format to store in.
  - Integrating the received information into a uniform store.
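
As a rough illustration of the four steps above, here is a minimal
Python sketch. Everything in it is hypothetical: the `files` table, the
JSON wire format, and the function names are invented for illustration
and do not reflect monotone's actual schema or netsync protocol.

```python
import hashlib
import json
import sqlite3
import zlib


def open_store():
    # Hypothetical "uniform store": one in-memory sqlite table keyed by SHA1.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (id TEXT PRIMARY KEY, branch TEXT, data BLOB)")
    return db


def put(db, branch, content):
    # Storage format (assumed): zlib-compressed bytes keyed by content hash.
    ident = hashlib.sha1(content).hexdigest()
    db.execute("INSERT OR IGNORE INTO files VALUES (?, ?, ?)",
               (ident, branch, zlib.compress(content)))
    return ident


def select_for_send(db, branch, already_have):
    # Step 1: select only what the peer lacks on the requested branch.
    rows = db.execute("SELECT id, data FROM files WHERE branch = ?", (branch,))
    return [(i, d) for i, d in rows if i not in already_have]


def to_wire(rows):
    # Step 2: transform from the storage format to a (made-up) wire format.
    items = [{"id": i, "hex": zlib.decompress(d).hex()} for i, d in rows]
    return json.dumps(items).encode("ascii")


def from_wire(payload):
    # Step 3: transform back, checking each item against its claimed id.
    out = []
    for item in json.loads(payload.decode("ascii")):
        content = bytes.fromhex(item["hex"])
        if hashlib.sha1(content).hexdigest() != item["id"]:
            raise ValueError("hash mismatch for " + item["id"])
        out.append(content)
    return out


def integrate(db, branch, contents):
    # Step 4: integrate the received data into the receiver's own store.
    return [put(db, branch, c) for c in contents]


def sync(src, dst, branch):
    # Run all four steps; returns the ids newly written to dst.
    have = {row[0] for row in dst.execute("SELECT id FROM files")}
    payload = to_wire(select_for_send(src, branch, have))
    return integrate(dst, branch, from_wire(payload))
```

The only point of the sketch is that every step touches the data, and
none of them is labelled "verification" as such.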

These design decisions are deeply embedded in the program. The storage
format is intended not to leak out. I'm confident that we can make the
existing structure a fair bit faster -- there is still a lot to tune --
but without extensive redesign there will be a limit to the speed, and
it will be a lower limit than our competitors. The reason is simple: our
competitors decided to use the opposite design:


  - Their transmission format is identical to their storage format.
  - Their storage units are pre-separated into bundles representing the
    types of transmission you might like to make.

These decisions mean that their networking often reduces to something
like sendfile(). The decisions also imply some negatives:


  - They are forced to separate branches into separate locations, and
    cannot easily do fine-grained access control or mix branches the
    way we can.
  - By avoiding reconstruction of the storage format very often, they
    are more likely to let global or structural inconsistencies sit
    without noticing them.
  - By coupling the storage and transmission formats, they make it
    harder to adjust one without adjusting the other. We have more
    flexibility there.
  - Since we're synthesizing the storage format on the fly anyways,
    we can do things like repacking and rearranging the delta graph
    as we write.
  - Their repositories contain lots of files, typically, rather than
    our single sqlite file.

You might, by analogy, think of it as the difference between a
CGI-driven website and one serving static content. Which is better? The
CGI-driven site can do more stuff, and do more *detailed* stuff, because
it has more logic in it. The static site can serve the fixed set of
pages it has much faster. Can you make a slow CGI run faster? Often. But
seldom as fast as a static site. The logic of sendfile() is hard to beat.
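
To make the "static" side of the analogy concrete, here is a minimal
Python sketch, assuming a hypothetical pre-built bundle file on disk.
`socket.sendfile()` is a real standard-library method (it uses
`os.sendfile()` where the platform supports it); the helper names and
the bundle contents are illustrative assumptions, not a description of
how any particular competitor is implemented.

```python
import os
import socket
import tempfile


def serve_bundle(sock, path):
    # "Static" path: the stored bytes are the transmitted bytes, so the
    # server hands the file to the socket without interpreting it.
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    # Hypothetical bundle: any file whose on-disk form is also its wire form.
    payload = b"pretend this is a pre-packed branch bundle\n"
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        server, client = socket.socketpair()
        try:
            sent = serve_bundle(server, path)
            server.close()  # signal end of stream to the reader
            received = b""
            while True:
                chunk = client.recv(4096)
                if not chunk:
                    break
                received += chunk
        finally:
            client.close()
            server.close()
    finally:
        os.remove(path)
    return sent, received, payload
```

There is no per-item logic in `serve_bundle` to tune, which is the sense
in which a transforming server can be made faster but seldom this fast.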

There is some work -- called "monotone dumb" -- to make monotone have an
"externalization form" which can be retrieved at sendfile() speed. It
will carry some of the same limitations of our competitors, but maybe
those limitations will prove acceptable. The difficulty lies in the fact
that the monotone *client* will still need to integrate the externalized
information into its database. None of the normal monotone commands know
how to work with such externalized forms. They all expect there to be a
database. So the client will remain a bottleneck in such a situation,
though only "half a bottleneck" compared to today.


-graydon
