monit-dev
Re: monit deadlock


From: Martin Pala
Subject: Re: monit deadlock
Date: Wed, 17 Sep 2003 23:06:59 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030908 Debian/1.4-4

(gdb) set stopped=1
(gdb) print stopped
$3 = 1
(gdb) continue
Continuing.
Program exited normally.
(gdb)

Martin Pala wrote:

It seems that there is some sort of deadlock. To reproduce it, you can use the following command:

unicorn:~/cvs/monit# i=9; while [ $i -ge 0 ] ; do ./monit -c /etc/monitrc ; ./monit -c /etc/monitrc quit; i=$[i-1]; done

Console output:

Starting monit daemon
Starting httpd at [127.0.0.1:2812]
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
Starting monit daemon
Starting httpd at [127.0.0.1:2812]
monit daemon with pid [24775] killed
monit daemon at 24775 awakened
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed


Log output:

[CEST Sep 17 22:45:13] Starting monit daemon
[CEST Sep 17 22:45:13] Starting httpd at [127.0.0.1:2812]
[CEST Sep 17 22:45:13] Shutting down monit HTTP server
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] Awakened by User defined signal 1
[CEST Sep 17 22:45:14] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] Awakened by User defined signal 1
[CEST Sep 17 22:45:14] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] Awakened by User defined signal 1
[CEST Sep 17 22:45:14] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] monit HTTP server stopped
[CEST Sep 17 22:45:14] monit daemon with pid [24755] killed
[CEST Sep 17 22:45:14] Starting monit daemon
[CEST Sep 17 22:45:14] Starting httpd at [127.0.0.1:2812]
[CEST Sep 17 22:45:14] Shutting down monit HTTP server



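The monit process with pid 24775 is frozen: you can send as many "monit quit" commands as you want and it will not terminate. It seems that monit is blocked in a thread join. In the backtrace of thread 1, pthread_join() is called from do_destroy(), which is running inside the SIGTERM handler: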


(gdb) info threads
 3 Thread 16386 (LWP 24777)  0x402896e6 in poll () from /lib/libc.so.6
 2 Thread 32769 (LWP 24776)  0x402896e6 in poll () from /lib/libc.so.6
 1 Thread 16384 (LWP 24775)  0x4002a354 in __pthread_sigsuspend () from /lib/libpthread.so.0
(gdb) bt
#0  0x4002a354 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1  0x4002a118 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2  0x400274dc in pthread_join () from /lib/libpthread.so.0
#3  0x0804e7d7 in monit_http (action=-4) at monit_http.c:131
#4  0x0804f17e in do_destroy (sig=15) at monitor.c:512
#5  0x4002d685 in __pthread_sighandler () from /lib/libpthread.so.0
#6  <signal handler called>
#7  0x4002d91b in read () from /lib/libpthread.so.0
#8  0x00000006 in ?? ()
#9  0x00000038 in ?? ()
#10 0x0805c651 in read_proc_file (
   buf=0xbfffe880 "10233 (mozilla-bin) S 10232 690 690 0 -1 64 23 0 5 0 141 117 0 0 9 0 0 0 1843919 89362432 14890 4294967295 134512640 134720822 3221224560 3212834796 1080776422 0 0 4098 17453 3222590806 0 0 33 0\n0 0 0"..., buf_size=4096, name=0xc3 <Address 0xc3 out of bounds>,
   pid=10233) at process/common.c:98
#11 0x0805de2b in get_process_info_sysdep (p=0x8083f30) at process/sysdep_LINUX.c:134
#12 0x0805c780 in getdatafromproc (pid=10233, entry=0x80847e0) at process/common.c:185
#13 0x0805e047 in initprocesstree_sysdep (reference=0xbffff988) at process/sysdep_LINUX.c:265
#14 0x0804e985 in initprocesstree (reference=0xbffff988) at monit_process.c:178
#15 0x08054937 in validate () at validate.c:136
#16 0x0804f295 in do_default () at monitor.c:562
#17 0x0804f0c3 in do_action (args=0xfff) at monitor.c:348
#18 0x0804eb33 in main (argc=3, argv=0xbffffa24) at monitor.c:115
(gdb) thread 2
[Switching to thread 2 (Thread 32769 (LWP 24776))]#0 0x402896e6 in poll () from /lib/libc.so.6
(gdb) bt
#0  0x402896e6 in poll () from /lib/libc.so.6
#1  0x400278fe in __pthread_manager () from /lib/libpthread.so.0
#2  0x40291be7 in clone () from /lib/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 16386 (LWP 24777))]#0 0x402896e6 in poll () from /lib/libc.so.6
(gdb) bt
#0  0x402896e6 in poll () from /lib/libc.so.6
#1  0x0805ae7a in socket_producer (server=7) at http/engine.c:473
#2  0x0805aa8b in start_httpd (port=0, backlog=10, bindAddr=0x807f988 "127.0.0.1") at http/engine.c:194
#3  0x0804e86d in thread_wrapper (arg=0x0) at monit_http.c:167
#4  0x40027bf0 in pthread_start_thread () from /lib/libpthread.so.0
#5  0x40291be7 in clone () from /lib/libc.so.6
(gdb) print stopped
$1 = 0
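
The traces above show pthread_join() being reached from do_destroy() in signal context while the HTTP thread is still sitting in poll() with stopped equal to 0. As a minimal sketch of one way to keep the join out of the signal handler (this is not monit's code; stop_requested, on_term, worker and demo_shutdown are hypothetical names, and the 50 ms poll timeout is an arbitrary choice), the handler could only set a flag and the join could happen later in normal thread context:

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stop flag shared by the handler, the worker and the main thread. */
static atomic_int stop_requested = 0;

/* Signal handler: only records the request, never joins or locks anything. */
static void on_term(int sig) {
    (void)sig;
    atomic_store(&stop_requested, 1);
}

/* Stand-in for a server loop: wake up periodically and re-check the flag. */
static void *worker(void *arg) {
    (void)arg;
    while (!atomic_load(&stop_requested)) {
        poll(NULL, 0, 50);  /* no descriptors: just wait up to 50 ms */
    }
    return NULL;
}

/* Starts the worker, delivers SIGTERM to the calling thread, then joins
 * the worker from ordinary (non-signal) context. Returns 0 on success. */
int demo_shutdown(void) {
    pthread_t tid;
    struct sigaction sa;

    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGTERM, &sa, NULL) != 0) return -1;

    atomic_store(&stop_requested, 0);
    if (pthread_create(&tid, NULL, worker, NULL) != 0) return -1;

    raise(SIGTERM);  /* handler runs in this thread and sets the flag */

    while (!atomic_load(&stop_requested)) {
        poll(NULL, 0, 10);  /* main loop notices the flag outside the handler */
    }
    if (pthread_join(tid, NULL) != 0) return -1;
    return 0;
}
```

Built with something like cc -pthread, calling demo_shutdown() from main() should return 0 once the worker has seen the flag and has been joined. Whether the same split fits monit_http() and do_destroy() is an open question.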


Martin