Re: [Sks-devel] Memory Leak in recon server?


From: Phil Pennock
Subject: Re: [Sks-devel] Memory Leak in recon server?
Date: Tue, 2 Feb 2010 00:15:07 +0100

On 2010-02-01 at 16:25 -0500, Daniel Kahn Gillmor wrote:
> Are you suggesting that if a single peer was to, say, flush its DB and
> re-connect, it could trigger this memory consumption on all/any of its
> peers?  Would the memory consumption increase proportionally to the
> number of peers which did this?

I'm not in a position to discuss the performance of the SKS code; I'm
not sufficiently conversant in O'Caml to understand its performance
characteristics.

What I know is that I personally have seen problems in the past with
certain peers, discussed on this mailing-list.  A couple of peers with
an empty DB did not help, but one "strange" peer could totally kill
things.

I have a 64-bit platform and the db process is 47kB, 38kB resident while
the recon process is 29kB, 16kB resident.  And yet I've seen these
balloon to consume all the available RAM in this 2GB RAM system, in the
presence of errors.

So it appears that SKS is not yet as robust as it could be in the
presence of a non-optimal peer, but is very efficient *if* everyone is
behaving correctly.  In my day job, I'd class this as "not production
ready" but nobody is being paid to develop SKS and so the most
reasonable course of action is to provide patches, with scenarios and
tests to explain why the patches are needed.  But my O'Caml is lacking
and I don't have the time to dedicate to investigating this.

So in the finest tradition of gum and baling wire I use ulimits and a
daemon wrapper.  The current processes are the ones which have been
running since boot, 36 days ago.
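
A minimal sketch of what a ulimit-plus-wrapper arrangement of this kind
could look like follows. It is not the wrapper described in this
message; the function names, the restart policy and the choice of
RLIMIT_AS as the limit are all assumptions made for illustration. The
idea is only that a runaway process hits its memory cap, dies, and is
restarted instead of exhausting the machine's RAM.

```python
import resource
import subprocess
import time


def cap_address_space(max_bytes):
    """Return a function that lowers the soft RLIMIT_AS of the calling process.

    Hypothetical helper: intended to run in the child, just before exec.
    """
    def apply():
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = max_bytes
        # Never ask for more than the existing hard limit allows.
        if hard != resource.RLIM_INFINITY:
            soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
    return apply


def supervise(argv, max_bytes, max_restarts=3, delay=1.0):
    """Run argv under a memory cap, restarting it after an abnormal exit.

    Returns the list of exit codes observed, oldest first.
    """
    codes = []
    for _ in range(max_restarts + 1):
        proc = subprocess.Popen(argv, preexec_fn=cap_address_space(max_bytes))
        codes.append(proc.wait())
        if codes[-1] == 0:
            break  # clean exit: do not restart
        time.sleep(delay)
    return codes
```

Under these assumptions, something like
supervise(["sks", "recon"], 1 << 30) would keep the recon process under
roughly 1 GiB of address space; a real wrapper would also want logging
and a back-off between restarts.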

Looking at http://sks.spodhuis.org/sks-peers now, I see that peers of
keyserver.cais.rnp.br, pgp.acm.jhu.edu, pgpkeys.mallos.nl, keyserver.ws
and sks.ms.mff.cuni.cz might want to cough at those operators and nudge
them to take a look at their servers to see if they're healthy.
keyserver.novomundo.com.br is a little behind too, which shouldn't be
happening in a healthy network with decent peering.  So something is
amiss.

Any time I notice a new peering request and I'm adding peers, while
checking that the new peer is up-to-date, I glance over the list of my
current peers and drop a private mail to any which are "behind", asking
what's up.  I ignore values of 0 or -1; those just mean the data wasn't
collected, either because the sks server doesn't have "initial_stat:"
in its config or because I couldn't reach the server at the last scan.
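
The check described above can be sketched as a small filter. The data
shape (a mapping from hostname to reported key count), the function
name and the tolerance value are hypothetical; only the rule that 0 and
-1 mean "no data collected" comes from this message.

```python
def lagging_peers(key_counts, reference, tolerance=1000):
    """Return {host: deficit} for peers well behind the reference key count.

    Counts of 0 or -1 mean the stats were not collected, so those peers
    are skipped rather than reported as behind.
    """
    behind = {}
    for host, count in key_counts.items():
        if count <= 0:
            continue  # no data for this peer
        deficit = reference - count
        if deficit > tolerance:
            behind[host] = deficit
    return behind
```

For example, with a reference count of 2,700,000 keys, a peer reporting
2,650,000 would be listed with a deficit of 50,000, while peers
reporting 0 or -1 would be left out.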

Regards,
-Phil


