Re: [Monotone-devel] Re: results of mercurial user survey


From: Markus Schiltknecht
Subject: Re: [Monotone-devel] Re: results of mercurial user survey
Date: Fri, 28 Apr 2006 17:47:31 +0200

Hello Bruce,

thank you for your answer.

On Fri, 2006-04-28 at 15:49 +0100, Bruce Stephens wrote:
> Except if the sending end has bogus data (as happened with the
> venge.net server).

Even in that case, netsync (TCP, really) guarantees to correctly deliver
the bogus data ;-)

> Maybe.  On the other hand, the current symmetry is nice: that servers
> and clients really aren't *that* different.

I totally agree and I love monotone for exactly that reason (among
others of course).

> And it's a nice property
> that my monotone checks everything that it gets before the data goes
> into its database.  

Half ACK. In most cases, yes. But on an initial pull of a repository
larger than 200 MB, I don't want monotone to check everything right away.

> And would we really want to rely on a possibly damaged peer to
> reliably detect that it's damaged?

For a first impression of a project (repository), yes, why not. If I
know I'm going to hack on that project, I can still ask monotone to
run the check overnight.
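
To make the idea concrete, here is a rough sketch in Python. It is not
monotone code: the class DeferredStore, its methods, and the use of a plain
SHA-1 over the payload as the identifier are all assumptions made only for
illustration. Data is accepted without checking, remembered on a list of
unverified ids, and checked later in one batch.

```python
import hashlib


class DeferredStore:
    """Toy content-addressed store that can postpone hash verification.

    Hypothetical illustration only; not how monotone stores data.
    """

    def __init__(self):
        self.data = {}            # claimed id -> payload bytes
        self.unverified = set()   # ids received but not yet checked

    def receive(self, claimed_id, payload, verify_now=False):
        """Store a payload; check its hash immediately only if asked to."""
        if verify_now and hashlib.sha1(payload).hexdigest() != claimed_id:
            raise ValueError(f"hash mismatch for {claimed_id}")
        self.data[claimed_id] = payload
        if not verify_now:
            self.unverified.add(claimed_id)

    def verify_pending(self):
        """Check every unverified id; return the ids that failed.

        Ids that pass are dropped from the list, failures stay on it.
        """
        bad = [
            item_id
            for item_id in sorted(self.unverified)
            if hashlib.sha1(self.data[item_id]).hexdigest() != item_id
        ]
        self.unverified = set(bad)
        return bad
```

A fast initial pull would call receive() with verify_now left off, and the
overnight check would be a single verify_pending() call.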

Especially for open source projects, the first impression is very
important. If I have to wait several hours (with 100% CPU utilization)
before I can check out the source, I'll be long gone to another project.

Plus, there are often many more read-only users than active committers.
They all benefit from a faster checkout (besides, I don't want to
burden such users with consistency-checking my repository).

> (And that's not considering deliberately malign peers.)

Hm. That's a point. Though it is normally handled by granting or
revoking commit access, isn't it?

> [...]
> 
> > This would allow a user to pull a repository and have a look at it. And
> > since a lot of users only want an up to date read-only copy (i.e. they
> > don't commit anything) that's a huge gain, IMHO.
> 
> How much does verification cost?  IIRC, njs measured it and found that
> the answer's not that much.  It seems likely that if we assumed that
> we were always doing no verification, then one could change things in
> such a way that netsync was quite a bit faster.  

Depends on what you measure. I'm currently working with a repository
with 21'000 revisions. A simple 'mtn sync' takes some time (primarily
collecting data to sync). A pull takes hours of 100% CPU usage.

> I'd want to see some estimates before I found that convincing, though.
> For example, we know it's going to be slower than the raw network
> speed because we're taking and sticking data into a database in little
> chunks.  For the initial pull, we could just copy the whole database,

That's the 'download-the-db-via-http' approach, which simply subverts
any consistency check. If you install such a thing, you have to be aware
that your database is never checked (except for the most recent
revisions, which get checked during sync). If you don't want to give up
the safety of your archiving tool, that's simply not an option.

> but then that suggests that it might not be the verification as such
> that's the problem as much as using the database in this way.

Hm. Judging by CPU usage vs. hard drive noise, I don't think the database
was very busy during an initial pull. But the last time I tried was
before 0.26...

> > What do you think? Is it feasible to implement such a suspection list?
> 
> It sounds too complex to be worthwhile without being more sure about
> the benefits, IMHO.

Yes, I need to check again how much time could be saved. Does anybody
have some meaningful numbers on how much the consistency checking of an
initial import costs?

Regards

Markus





