monotone-devel

Re: [Monotone-devel] ideas to speed up monotone SQLite database access


From: Nathaniel Smith
Subject: Re: [Monotone-devel] ideas to speed up monotone SQLite database access
Date: Tue, 7 Feb 2006 03:32:22 -0800
User-agent: Mutt/1.5.11

On Mon, Feb 06, 2006 at 07:38:10PM -0800, Joe Wilson wrote:
> The monotone client in the initial pull was idle for minutes
> at a time waiting for netsync (no I/O, no CPU).  When the 
> netsync packets did arrive, the CPU hovered between 80% and 
> 99% for a minute or two and then resumed being idle again 
> waiting on the socket.

Hrm, really?  Odd.  Something funny going on at the server side, I
guess?  Or possibly bugs in the event loop; it has shown hints of
squirrelly behavior in 0.26pre1, but this may be jumping to
conclusions.  Hard to say without more data.

> What specific monotone operation do you feel is significantly 
> slowed by the SQLite database?

We know that initial pull is not bound by sqlite.  What our profiling
_has_ tended to show is that it is bound by our use of some stupidly
inefficient algorithms for dealing with deltas, so we've been
looking at various options to fix this.  These involve changing the
way we arrange deltas on disk, and the obvious approaches to making
netsync faster turn out to make things like 'checkout' potentially
much slower -- and this _does_ become bound by IO.

In particular, netsync wants to send forward deltas, but if we store
forward deltas on disk, we have to traverse (potentially long) delta
chains when reconstructing files at 'checkout' time.

Actually, using backwards deltas, we have the same problem right now
if someone checks out _old_ versions, but I don't think we've ever
gotten a complaint about this, so we haven't put it at the top of the
priority list.  If we switch to forward deltas, though, then suddenly
checking out head becomes slow, and people are... more likely to
notice that :-).
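
To make the tradeoff concrete, here is a toy sketch in Python.  It is
not monotone's actual storage code or delta format: the names
(make_delta, apply_delta, build_store, reconstruct) and the
line-replacement "delta" are invented purely for illustration.  It
only counts how many deltas have to be applied to rebuild a given
version under each layout.

```python
def make_delta(old, new):
    """Toy delta: the new length plus every line that differs from old."""
    changes = {i: line for i, line in enumerate(new)
               if i >= len(old) or old[i] != line}
    return len(new), changes


def apply_delta(old, delta):
    """Rebuild the target version from old and a toy delta."""
    length, changes = delta
    return [changes[i] if i in changes else old[i] for i in range(length)]


def build_store(versions, direction):
    """Keep one full text plus a chain of deltas between neighbours.

    "forward":  full text of the oldest version, deltas go old -> new.
    "backward": full text of the newest version, deltas go new -> old.
    """
    pairs = list(zip(versions, versions[1:]))
    if direction == "forward":
        return direction, list(versions[0]), [make_delta(a, b) for a, b in pairs]
    return direction, list(versions[-1]), [make_delta(b, a) for a, b in pairs]


def reconstruct(store, target):
    """Return (text, number_of_deltas_applied) for version index target."""
    direction, text, deltas = store
    if direction == "forward":
        chain = deltas[:target]            # walk up from the oldest version
    else:
        chain = reversed(deltas[target:])  # walk down from the newest version
    steps = 0
    for delta in chain:
        text = apply_delta(text, delta)
        steps += 1
    return text, steps


# Five hypothetical versions of one file; index 4 is "head".
versions = [["a"], ["a", "b"], ["a", "b", "c"], ["A", "b", "c"], ["A", "b"]]
head = len(versions) - 1

for direction in ("forward", "backward"):
    store = build_store(versions, direction)
    print(direction, "head:", reconstruct(store, head)[1], "deltas,",
          "oldest:", reconstruct(store, 0)[1], "deltas")
```

With forward deltas the head costs one delta application per
intervening version and the oldest version is free; with backward
deltas it is the other way around.  In a real history the chain is
potentially much longer than five entries, which is where the IO
comes from.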

> In order to keep the CPU sustained at 100% you might consider
> a multithreading and/or multiprocess strategy to deal
> with the latency of the database and socket communication. 
> If you're single threaded, then the latency really gets you.
> I'm guessing you should be able to schedule some CPU
> intensive compression, decryption or merge operation on one 
> thread while another thread is writing to the database and 
> yet another thread is waiting for netsync instructions.
> Perhaps the solution might be as simple as spawning a few
> concurrent monotone clients that operate on the same database
> doing non-overlapping work (operating on files beginning with
> a different letter, etc). That way you keep the CPU pipeline
> churning in one process while another might be waiting on a
> socket or a database. Works for grid computations, why not
> monotone?

This is so terrifying that I'm not even going to really think about it
until everything else fails to pan out.

(Consider for comparison that monotone almost entirely refuses to use
memory allocation functions because they are too dangerous.  We would
need a lot of prodding to convince us to touch concurrency.)

-- Nathaniel

-- 
When the flush of a new-born sun fell first on Eden's green and gold,
Our father Adam sat under the Tree and scratched with a stick in the mould;
And the first rude sketch that the world had seen was joy to his mighty heart,
Till the Devil whispered behind the leaves, "It's pretty, but is it Art?"
  -- The Conundrum of the Workshops, Rudyard Kipling



