savannah-hackers-public
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Savannah-hackers-public] Slow SQL Queries


From: Bob Proulx
Subject: [Savannah-hackers-public] Slow SQL Queries
Date: Fri, 8 Dec 2017 15:06:58 -0700
User-agent: Mutt/1.9.1 (2017-09-22)

FSF Nagios sent an alert on internal0 high load average earlier this
week.  Ruben notified me.  I jumped in but had already missed the high
load and things were in the after-event settle down return to normal
behavior.  But it had load spiked to 53+ within the previous few minutes.

I looked through the logs everywhere and determined that external
influences were hitting the web frontend on frontend0.  This was
flowing through to hitting internal0 for database queries.  I missed
being able to figure out exactly what was hitting the web page.  But
if it was hitting there once it will hit there again.  All we need to
do is wait.  The event did not cause a high load on the web frontend0
which handled it okay.  The slow queries caused the MySQL server to
have a high load as it forked to handle the task.

Afterward I turned on the slow query log on internal0.  This is
available moving forward.  Initially I had the time limit set at 2
seconds and it was streaming out logs of 2-second slow queries.
(Which is not a good sign.)  I reduced it to 4 seconds and then
finally back to the 10 second default value until the stream slowed to
a trickle.  At 10 seconds there are always at least one per day but at
least not streaming.  At the point that we have time to look and
improve things it needs to be reduced in steps lower and the
problematic queries found and fixed.

There appear to be many queries that search every row of the
database.  This may be unavoidable due to the web bug tracker query
possibilities.  But if a human is running the web interface the rate
of them will be slow enough that this isn't a problem.  It is only a
robot that would strike the web interface fast enough for this to be a
problem.

* Improve the robots.txt file to keep them out of wherever they were
hitting that was trouble.  But I think this was a probe attack and not
a well behaved robot.

* In other cases it appears that some columns should have an index but
do not.  Another opportunity for improvement.

* And of course many tables (most?) are of the old type and should be
migrated to an ACID3 compliant engine that also has better locking
behavior.

Time is what keeps everything from happening all at once. :-)

Bob



reply via email to

[Prev in Thread] Current Thread [Next in Thread]