monotone-devel
Re: [Monotone-devel] Re: [sqlite] disk locality (and delta storage)


From: Nathaniel Smith
Subject: Re: [Monotone-devel] Re: [sqlite] disk locality (and delta storage)
Date: Sat, 11 Feb 2006 02:11:10 -0800
User-agent: Mutt/1.5.11

Thanks for this detailed email!  I've been a bit crazy with codecon
the last week, and most of what I have to say is "hmm, lots to think
about here", so, yeah :-).

I did want to reply to your VCS comments, though:

On Tue, Feb 07, 2006 at 10:17:54AM -0500, address@hidden wrote:
> What I'm looking for is a VCS + bug-tracker + wiki that I can just
> drop into a cgi-bin of some low cost web hosting site (like Hurricane
> Electric, which I notice we both use) and have a ready-to-roll project
> management system with no setup or other details to worry about.  Kind
> of a sourceforge-in-a-box, only a lot better than sourceforge.  I'm

That sounds awesome.  I'm _really_ interested in this area myself,
actually, though the stuff I've been playing with is somewhat
orthogonal.  (Nothing really working yet, but playing with ways to
improve data integration -- by folding in even more, like IRC logs and
mailing list archives.  These kinds of things require a much heavier
server architecture, though, so sort of exploring different parts of
the design space.)

> looking for something like monotone with CVSTrac enhancements that
> works over HTTP.  Nothing like that currently exists, I'm afraid, 
> so I've been working on my own.
>
> Yes, it is a big job, though perhaps not any bigger than writing 
> an SQL database engine ;-)

I can't say monotone works with CVSTrac, but then, I assume your
system doesn't either off the bat :-).  And, monotone can work over
HTTP -- or at least, _I_ say it can :-).  There is a branch where I
worked out a proof-of-concept implementation of this:
  http://viewmtn.angrygoats.net/branch.psp?branch=net.venge.monotone.dumb
It's a Python script that does some rather Clever Things.  I don't
have time to explain the algorithm in detail now, but it basically
supports monotone's full push/pull/sync semantics, you can talk to one
friend's repo and then talk to another's, etc., that all just works;
and it should be transactional (except that in some rare cases someone
has to roll back by hand; in the dumb remote filesystem case, readers
can't in general do a rollback, and renames can't in general be
atomic, so if you're extremely unlucky you might get wedged).  And it
should be really speedy on top of some well done network backends (you
really want things like pipelining).  If you do take a look at it,
there's basically "dumb.py" doing monotone glue, "fs.py" giving the
generic filesystem interface that needs to be implemented for each
transport, and "merkle_dir.py" in between, where all the magic lives.
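
To make the three-layer split above a little more concrete, here is a
minimal sketch of the general idea: a tiny transport interface, and a
hash-indexed directory on top of it that lets a reader work out what it
is missing by comparing a few small index files.  Everything in it is an
assumption made for illustration (the names DumbFS, MemoryFS, HashedDir
and pull, the one-level bucket layout, the use of SHA-1); it is not the
actual fs.py or merkle_dir.py code, and it ignores pipelining and the
transactional/rollback concerns mentioned above.

```python
import hashlib
from abc import ABC, abstractmethod


class DumbFS(ABC):
    """Hypothetical transport interface: just whole-file read/write/exists."""

    @abstractmethod
    def read(self, name): ...

    @abstractmethod
    def write(self, name, data): ...

    @abstractmethod
    def exists(self, name): ...


class MemoryFS(DumbFS):
    """In-memory stand-in for a real transport (local disk, HTTP, ...)."""

    def __init__(self):
        self.files = {}

    def read(self, name):
        return self.files[name]

    def write(self, name, data):
        self.files[name] = data

    def exists(self, name):
        return name in self.files


class HashedDir:
    """Toy one-level hash tree: items are bucketed by first hex digit."""

    def __init__(self, fs):
        self.fs = fs

    def bucket_ids(self, bucket):
        # Set of item ids listed in one bucket's index file.
        name = "index/" + bucket
        if not self.fs.exists(name):
            return set()
        return set(self.fs.read(name).decode().split())

    def root(self):
        # Map of bucket -> hash of that bucket's index file.
        if not self.fs.exists("root"):
            return {}
        lines = self.fs.read("root").decode().splitlines()
        return dict(line.split() for line in lines)

    def add(self, data):
        # Store the item, then update its bucket index and the root file.
        ident = hashlib.sha1(data).hexdigest()
        bucket = ident[0]
        ids = self.bucket_ids(bucket) | {ident}
        self.fs.write("items/" + ident, data)
        listing = "\n".join(sorted(ids)).encode()
        self.fs.write("index/" + bucket, listing)
        root = self.root()
        root[bucket] = hashlib.sha1(listing).hexdigest()
        text = "\n".join(f"{b} {h}" for b, h in sorted(root.items()))
        self.fs.write("root", text.encode())
        return ident


def pull(src, dst):
    """Copy items present in src but missing from dst; return the count."""
    copied = 0
    dst_root = dst.root()
    for bucket, digest in src.root().items():
        if dst_root.get(bucket) == digest:
            continue  # identical bucket, nothing to fetch
        for ident in src.bucket_ids(bucket) - dst.bucket_ids(bucket):
            dst.add(src.fs.read("items/" + ident))
            copied += 1
    return copied
```

In this sketch a "sync" would just be pull(a, b) followed by pull(b, a),
and supporting a new transport means writing one more DumbFS subclass;
the hashing layer never needs to know what is underneath.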

I haven't actually convinced anyone yet to take care of it and polish
it up, though (or rewrite it more sanely, or whatever), so it's just sort
of been sitting there, waiting forlornly for someone who will give it
love...

> Monotone is my favorite of the current crop of VCSes by the way.
> It has 80% of what I am looking for.  What I'm really after is 
> a better monotone.  I have been greatly inspired by your work,
> as have others, I notice.

Thank you very much!  I would be interested in hearing what you think
of as the last 20%, beyond HTTP support.

I also innocently observe that if what you want is a better monotone,
the quickest route may be to... make monotone better ;-).

> > While size is definitely not everything -- I doubt anyone notices 10%
> > here or there -- a factor of 2-3x is probably going to be hard to
> > sell.  Unfortunately, since it's a nice scheme.
> 
> The CVS repository for SQLite is 15MiB.  Under a baseline+delta schema
> it would grow to (perhaps) 45MiB.  The cheapest account on Hurricane
> Electric includes 1GiB of storage.  Why should I care about the extra
> 30MiB?

Well, because there are enough people out there with gig-sized source
checkouts.  (Yeah, not even history, but... checkouts.)  Or hundreds
and hundreds of thousands of commits (gcc, mozilla, linux kernel...).
While there's definitely value to a tool that works well for the 95%
of projects that are not that big, we really would like to have a
single tool that 100% of projects can viably use.
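
The arithmetic behind this can be sketched quickly.  The 15 MiB
repository, the 3x factor, and the 1 GiB account are the figures quoted
above; the 1 GiB "large project" repository is an assumed round number
chosen purely for illustration, not a measurement of any real project.

```python
MIB_PER_GIB = 1024


def extra_space_mib(repo_mib, factor):
    """Extra storage needed if a repository grows by the given factor."""
    return repo_mib * factor - repo_mib


# Figures quoted in the thread: a 15 MiB repository growing 3x.
small_total = 15 * 3
small_extra = extra_space_mib(15, 3)
small_share = small_total / MIB_PER_GIB  # fraction of a 1 GiB account

# Assumed figure: a project whose history is already 1 GiB.
large_total = MIB_PER_GIB * 3
large_extra = extra_space_mib(MIB_PER_GIB, 3)

print(f"small: {small_total} MiB total, {small_extra} MiB extra, "
      f"{small_share:.1%} of 1 GiB")
print(f"large: {large_total} MiB total, {large_extra} MiB extra")
```

For the small repository the 3x factor costs 30 MiB and the result still
occupies under 5% of a 1 GiB account; under the assumed 1 GiB history the
same factor costs 2 GiB, which is where the overhead starts to matter.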

The extraordinarily intense competition in the VCS field is also a
factor; note that mercurial scales to those projects just fine,
without excessive space usage, and is experiencing great uptake on
small projects too -- so since we think we offer some other
compelling advantages, we really would like to compete directly across
the whole range :-).

-- Nathaniel

-- 
In mathematics, it's not enough to read the words
you have to hear the music



