monit-general

failed to get process data


From: Luca Cazzaniga
Subject: failed to get process data
Date: Sat, 20 Nov 2021 08:41:54 +0100

Hi there. After a recent server reboot I found the following message
in the monit log, repeated every cycle:

[CET Nov  6 03:56:07] error    : 'sistd' failed to get process data

It was related to a process check that uses a pid file. Monit reported
no problem in the output of monit summary. Unfortunately I can't
investigate any further, because I was off shift and monit has since
been restarted to clear the error.
I glanced over the source and found that the following call stack
leads to the error message:

#0  check_process (s=0x702b30) at src/validate.c:1326
#1  0x000000000043c320 in validate () at src/validate.c:1292
#2  0x000000000041bd72 in do_default () at src/monit.c:586
#3  0x000000000041b37b in do_action (argc=4, args=0x7fffffffe368) at src/monit.c:414
#4  0x000000000041ad4d in main (argc=4, argv=0x7fffffffe368) at src/monit.c:173

As far as I can tell, on every cycle monit reads the statistics of
every process from procfs and loads them into an array, ptree.
It then reads the pid from the pidfile and checks that the process
exists using the getpgid syscall.
Next it searches the ptree array for that pid in order to update the
service data in the list used for the checks (of type ServiceT), and
saves the statistics found in ptree to s->inf.process.
The error seems to occur when getpgid returns no error but the pid is
no longer in the ptree array. These steps are not atomic, so the file
content could be modified between the procfs load and the update.
However, I have not been able to reproduce the error.
Maybe I am missing something. Do you know of a workaround, or whether
this kind of error is caused by a bug? The monit release on the server
is 5.25.3.
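
To make the suspected sequence concrete, here is a minimal sketch of it
in C. It is not monit's actual code: SnapshotEntry, ProcStatus,
pid_is_alive, snapshot_find and classify are hypothetical names made up
for illustration, and the snapshot array only stands in for ptree. The
one real call is getpgid. The point is that the liveness probe and the
snapshot lookup are two separate steps, so a pid can pass the first and
still be missing from the second.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes getpgid() under strict -std modes */
#endif
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for one entry of the per-cycle process snapshot. */
typedef struct {
    pid_t pid;
} SnapshotEntry;

typedef enum {
    PROC_NOT_RUNNING,   /* liveness probe failed */
    PROC_OK,            /* alive and present in the snapshot */
    PROC_NO_DATA        /* alive, but missing from the snapshot */
} ProcStatus;

/* Liveness probe: getpgid() returns -1 when no such process exists. */
static bool pid_is_alive(pid_t pid) {
    return pid > 0 && getpgid(pid) != (pid_t)-1;
}

/* Linear search of the snapshot taken earlier in the cycle. */
static const SnapshotEntry *snapshot_find(const SnapshotEntry *snap,
                                          size_t n, pid_t pid) {
    for (size_t i = 0; i < n; i++) {
        if (snap[i].pid == pid)
            return &snap[i];
    }
    return NULL;
}

/* The probe and the lookup are not atomic: a pid that appeared (or a
 * pidfile that changed) after the snapshot was taken passes the probe
 * but is not found, which corresponds to the PROC_NO_DATA case. */
static ProcStatus classify(const SnapshotEntry *snap, size_t n, pid_t pid) {
    if (!pid_is_alive(pid))
        return PROC_NOT_RUNNING;
    return snapshot_find(snap, n, pid) ? PROC_OK : PROC_NO_DATA;
}
```

Calling classify with the current process's own pid and an empty
snapshot returns PROC_NO_DATA, which is the state I suspect monit was
in when it logged the error.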

Thanks

Luca Cazzaniga


