sks-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Sks-devel] SKS intermittently stalls with 100% CPU & rate-limiting


From: Phil Pennock
Subject: Re: [Sks-devel] SKS intermittently stalls with 100% CPU & rate-limiting
Date: Mon, 25 Jun 2018 17:46:37 -0400

On 2018-06-25 at 13:08 +0200, Paul Fontela wrote:
> I have tried almost everything, from downloading a dump and starting the
> server sks again to reinstall system and everything else, the result is
> always the same, it works well for a while, sometimes an hour sometimes
> a little more and suddenly it it freezes the key server, reaching 80%
> RAM, which makes it unstable and inoperable.

That sounds like recon gone wild, normally a sign that you're peering
with someone who is very much behind on keys.  The recon system only
works if your peers are "mostly up-to-date".

This is why we introduced the template for introducing yourself to the
community, in the Peering wiki page, showing how many keys you have
loaded.  It cut down on people joining with 0 keys, expecting recon to
do all the work, and new peers complaining that their SKS was hanging.

Per <https://sks-keyservers.net/status/> the lower bound of keys to be
included is:  5105570
You have:     5109664

Using <http://keyserver.ispfontela.es:11371/pks/lookup?op=stats> as a
starting point, and skipping your in-house 11380 peers, opening all the
others up in tabs and looking (I don't have this scripted) we see:

  5109604  keys.niif.hu
  5065412  keys.sbell.io
  5107576  sks.mbk-lab.ru
  5109585  pgp.neopost.com
  5108773  pgp.uni-mainz.de
  5109639  pgpkeys.urown.net
  4825075  pgp.key-server.io
  <can't connect>  sks.funkymonkey.org
  5084241  keyserver.iseclib.ru
  5109254  keyserver.swabian.net
  5109628  sks-cmh.semperen.com
  <sks down behind proxy>  keys-02.licoho.de
  5109629  keyserver.dobrev.eu
  5109121  sks.mirror.square-r00t.net
  5109629  keyserver.escomposlinux.org
  5108778  keyserver.lohn24-datenschutz.de

If your in-house peers are way behind, fix that.

Comment out all peers with fewer than 5_100_000 keys.  Restart sks and
sks-recon.

The 284,000 key difference is pretty severe.  Since that peer isn't
getting updates, they're probably hanging on peering and causing even
more problems for you.

Disable peering _at least_ with those three hosts.


Whenever SKS isn't performing right, the _first_ step after looking for
errors in logs should always be a Peering Hygiene Audit.  Find the peers
who are sufficiently behind that their keeping the peering up is
anti-social and likely causing _you_ problems, comment out the peering
entries, restart (for a completely clean slate) and then reach out to
those peers to ask "Hey, what's up?".

Regards,
-Phil



reply via email to

[Prev in Thread] Current Thread [Next in Thread]