monit-dev

Re: 4.0 showstopper?


From: Martin Pala
Subject: Re: 4.0 showstopper?
Date: Wed, 17 Sep 2003 16:38:29 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030908 Debian/1.4-4

Jan-Henrik Haukeland wrote:

Martin Pala <address@hidden> writes:

During last-minute tests I saw the following problem:

[CEST Sep 17 14:51:46] AssertException: at socket.c:333
aborting..

Oops!
I identified the problem. It is caused by a race condition during method execution:

1.) monit detects that the process is not running, and the main monit thread forks the start method (a separate process, which inherits all file descriptors).

2.) The monit main thread creates a new thread which waits for the service to start. If the service does not start (a timeout occurs), this thread posts a timeout event, which raises an alert and continues by connecting to the SMTP server and sending the message. The file descriptor of the socket opened to the SMTP server is shared between all threads (wait_start and main).

3.) While wait_start is waiting for the service to start, the monit main thread executes another validate cycle and detects that the process is not running (independently of the wait_start thread). As usual, the new process inherits all open file descriptors (including the SMTP server socket fd).


Solution:

- protect runtime file descriptors from being inherited by forked processes
- synchronize the main and wait_start threads so that the main thread does not check a service which is in the wait_start stage. This is a standalone problem: monit can try to start the service in parallel without really waiting for the service to start. In addition, it enables the fd race condition in the case of an alert message.
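
The second item could be sketched roughly as follows. This is only an illustration, not monit's actual code: the service_state structure and the function names are hypothetical, and it assumes POSIX threads. The idea is that whichever thread wants to start the service must first claim it, so the validate cycle skips a service that a wait_start thread is still handling.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-service state; not monit's actual structure. */
typedef struct {
    pthread_mutex_t lock;
    bool start_pending;   /* true while a wait_start thread owns the service */
} service_state;

void service_init(service_state *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->start_pending = false;
}

/* Called by the validate cycle before it forks the start method.
 * Returns true if the caller may start the service, false if a
 * start is already in progress and the service should be skipped. */
bool service_try_begin_start(service_state *s)
{
    pthread_mutex_lock(&s->lock);
    bool claimed = !s->start_pending;
    if (claimed)
        s->start_pending = true;
    pthread_mutex_unlock(&s->lock);
    return claimed;
}

/* Called by the wait_start thread once the service is up or has timed out. */
void service_end_start(service_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->start_pending = false;
    pthread_mutex_unlock(&s->lock);
}
```

With something like this, the main thread would call service_try_begin_start() in each validate cycle and leave the service alone when it returns false.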

In theory this race condition can affect all file descriptors. For example, if there is a connection to the monit httpd server, the file descriptor of the client connection will be inherited in the same way as described above, and the consequences will probably be the same: an assert exception on socket write will shut down monit.
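
One possible way to protect descriptors from inheritance is the close-on-exec flag. The sketch below is an assumption about how it could be done, not the fix that went into monit: the helper names are hypothetical, and it relies on the start method being run through fork() followed by exec(), since FD_CLOEXEC closes the descriptor only at exec time. If the child does not exec, it would have to close the descriptors itself after fork().

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from monit's source): mark fd close-on-exec
 * so that a program started via fork() + exec() does not inherit it.
 * Returns 0 on success, -1 on error. */
int set_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if it is not, -1 on error. */
int is_close_on_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

set_close_on_exec() would be called right after each socket() or accept(), for the SMTP connection as well as for httpd client connections.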


I'll look into it ...

Super! We'll wait until tomorrow then so we can add the "-i"
replacement doc as well to the release. I can add this doc, if it is
okay with you.

Thanks :)


Martin




