sks-devel

Re: [Sks-devel] SKS intermittently stalls with 100% CPU & rate-limiting


From: Paul Fontela
Subject: Re: [Sks-devel] SKS intermittently stalls with 100% CPU & rate-limiting
Date: Tue, 26 Jun 2018 11:52:10 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

Hi Phil,

Thank you very much for your interest and your answer. The server keyserver.ispfontela.es has no problems; in fact, it was able to synchronize almost 200,000 keys in less than 2 hours. That machine is powerful, with a large processor and a lot of RAM. The one with a serious problem is a.0.na.ispfontela.es, a virtual host with only 1 GB of RAM. It had always worked well until a few days ago, when it suddenly began to suffer from what other colleagues have described: even with an up-to-date database of more than 5,100,000 keys, it got stuck and stopped. So I asked myself:
If nothing has been modified in the configuration of the server or of the SKS service, what has happened?
That's when I started a battery of tests:
1 - Changes to the Nginx configuration.
2 - Rebuilding the key database from scratch with a new dump.
3 - Re-installing the system (Ubuntu).
4 - Other modifications (adding swap to the Linux system, which it did not have before).

The result was always the same: shortly after SKS started, its RAM consumption rose to 80% and never decreased.

Could a system update have caused this?

Today it is synchronizing with only 2 peers, starting from 26,000 keys, until it reaches 5,100,000. With that I will know more or less what is happening.

I have seen that some other servers that are also hosted in Amazon datacenters are suffering from the same problem. Could it be Amazon? I do not know; I cannot answer that yet.

I will continue investigating, and if in the end it does not improve, I will eliminate that server and leave only keyserver.ispfontela.es running, which works well for the moment.


On 25/06/2018 at 23:46, Phil Pennock wrote:
That sounds like recon gone wild, normally a sign that you're peering
with someone who is very much behind on keys.  The recon system only
works if your peers are "mostly up-to-date".

This is why we introduced the template for introducing yourself to the
community, in the Peering wiki page, showing how many keys you have
loaded.  It cut down on people joining with 0 keys, expecting recon to
do all the work, and new peers complaining that their SKS was hanging.

Per <https://sks-keyservers.net/status/> the lower bound of keys to be
included is:  5105570
You have:     5109664

Using <http://keyserver.ispfontela.es:11371/pks/lookup?op=stats> as a
starting point, and skipping your in-house 11380 peers, opening all the
others up in tabs and looking (I don't have this scripted) we see:

  5109604  keys.niif.hu
  5065412  keys.sbell.io
  5107576  sks.mbk-lab.ru
  5109585  pgp.neopost.com
  5108773  pgp.uni-mainz.de
  5109639  pgpkeys.urown.net
  4825075  pgp.key-server.io
  <can't connect>  sks.funkymonkey.org
  5084241  keyserver.iseclib.ru
  5109254  keyserver.swabian.net
  5109628  sks-cmh.semperen.com
  <sks down behind proxy>  keys-02.licoho.de
  5109629  keyserver.dobrev.eu
  5109121  sks.mirror.square-r00t.net
  5109629  keyserver.escomposlinux.org
  5108778  keyserver.lohn24-datenschutz.de

If your in-house peers are way behind, fix that.

Comment out all peers with fewer than 5_100_000 keys.  Restart sks and
sks-recon.
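
The threshold check above is simple enough to script once the counts are in hand. What follows is only a rough sketch, not an existing SKS tool: the key counts are copied by hand from the list above, and the names PEER_KEY_COUNTS, THRESHOLD, peers_behind and peers_unreachable are made up for illustration. It does not fetch the stats pages itself.

```python
# Key counts copied from the list in this message.
# Peers that could not be queried are recorded as None.
PEER_KEY_COUNTS = {
    "keys.niif.hu": 5109604,
    "keys.sbell.io": 5065412,
    "sks.mbk-lab.ru": 5107576,
    "pgp.neopost.com": 5109585,
    "pgp.uni-mainz.de": 5108773,
    "pgpkeys.urown.net": 5109639,
    "pgp.key-server.io": 4825075,
    "sks.funkymonkey.org": None,   # can't connect
    "keyserver.iseclib.ru": 5084241,
    "keyserver.swabian.net": 5109254,
    "sks-cmh.semperen.com": 5109628,
    "keys-02.licoho.de": None,     # sks down behind proxy
    "keyserver.dobrev.eu": 5109629,
    "sks.mirror.square-r00t.net": 5109121,
    "keyserver.escomposlinux.org": 5109629,
    "keyserver.lohn24-datenschutz.de": 5108778,
}

# Cut-off suggested in this message (hypothetical constant name).
THRESHOLD = 5_100_000


def peers_behind(counts, threshold=THRESHOLD):
    """Return (host, count) pairs below the threshold, most-behind first."""
    behind = [(host, n) for host, n in counts.items()
              if n is not None and n < threshold]
    return sorted(behind, key=lambda item: item[1])


def peers_unreachable(counts):
    """Return the hosts whose key count could not be read, sorted by name."""
    return sorted(host for host, n in counts.items() if n is None)


if __name__ == "__main__":
    for host, n in peers_behind(PEER_KEY_COUNTS):
        print(f"{n:>9}  {host}")
    for host in peers_unreachable(PEER_KEY_COUNTS):
        print(f"{'?':>9}  {host}")
```

The unreachable peers are reported separately because a missing count says nothing about how far behind they are; whether to keep peering with them is a separate judgment call.
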

The 284,000 key difference is pretty severe.  Since that peer isn't
getting updates, they're probably hanging on peering and causing even
more problems for you.

Disable peering _at least_ with those three hosts.


Whenever SKS isn't performing right, the _first_ step after looking for
errors in logs should always be a Peering Hygiene Audit.  Find the peers
who are sufficiently behind that their keeping the peering up is
anti-social and likely causing _you_ problems, comment out the peering
entries, restart (for a completely clean slate) and then reach out to
those peers to ask "Hey, what's up?".
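
The "comment out the peering entries" step can be sketched the same way. This assumes the membership file uses the "hostname port # comment" layout seen in the signature at the end of this message; the function name comment_out_peers is hypothetical, and it only transforms text in memory. Writing the file back and restarting sks and sks-recon are left to the operator.

```python
def comment_out_peers(membership_text, hosts_to_disable):
    """Prefix '#' to every active membership line whose host is in hosts_to_disable.

    Blank lines and lines that are already comments are left untouched.
    """
    disabled = {host.lower() for host in hosts_to_disable}
    out = []
    for line in membership_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        is_active = bool(stripped) and not stripped.startswith("#")
        if is_active and fields[0].lower() in disabled:
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Illustrative input only; real entries would come from the membership file.
    sample = (
        "keys.niif.hu 11370 # example entry\n"
        "pgp.key-server.io 11370 # example entry\n"
    )
    print(comment_out_peers(sample, ["pgp.key-server.io"]), end="")
```

Matching on the first whitespace-separated field, rather than on a substring, keeps a host name that merely appears in another entry's comment from being disabled by accident.
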

Regards,
-Phil

-- 

Paul Fontela
keyserver.ispfontela.es 11370	# Paul Fontela <address@hidden> 0x31743FFC33E746C5
a.0.na.ispfontela.es	11370	# Paul Fontela Gmail <address@hidden> 0x3D7FCDA03AAD46F1
