monit-dev

Re: Features...


From: Martin Pala
Subject: Re: Features...
Date: Fri, 26 Sep 2003 13:45:46 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030908 Debian/1.4-4

Christian Hopp wrote:

On Fri, 26 Sep 2003, Martin Pala wrote:

Christian Hopp wrote:

On Fri, 26 Sep 2003, Martin Pala wrote:



2) A watchdog thread... (I think I have mentioned this hidden
somewhere in a mail some days ago)

We detach the validation as one thread; we already have the http
thread, and whatever else might come.  The main thread becomes the
watchdog.  After each validation and after each accept (w/ or w/o
timeout), each thread sets a semaphore which is checked by the
watchdog.  If one of them has not done so, that thread is restarted
or monit is restarted.



I'm not sure whether a separate main thread is needed for the watchdog.
It could be useful to watch the httpd thread, but I think that could be
implemented inside the main thread. As a workaround, if availability of
the monit httpd is an issue, the present monit version can monitor it
from the main thread with:

check host monit-httpd with address 127.0.0.1
 if failed port 2812 protocol http then exec "/usr/bin/monit -c /etc/monitrc reload"

This is not exactly the same thing as an httpd thread watchdog, but it
can solve the issue too. If the monit httpd is not accessible, monit
will reload itself, which will cause the failed thread to start again.


But this only works if you start the http support.  If you think of
applications where you don't need/want the http support, you have no
way of finding out whether monit is running or not.



The main thread can be watched by init: if it dies, init will respawn
it => the critical functionality is kept regardless of http. If we
implement a watchdog thread, init will still be needed to watch this
watchdog thread to make sure that it works.

The watchdog serves more the purpose of finding deadlocks or other
situations where the threads might get stuck.  Init just protects
against unexpected QUIT/TERM/... situations.  And the watchdog is
nothing else but:

clean_sems()
start_validate()
start_httpd()

while TRUE
        sleep (cycletime)   (or select; cycletime > max_expected_cycle)
        if (! sem_validate )
                restart_validate()
        sem_validate = FALSE

        if (run_httpd && ! sem_httpd )
                restart_httpd()
        sem_httpd = FALSE
end

If it is as simple as this, it does not require additional supervision.
IMO it does not require LOCKs.  In any case, LOCKs could block the
watchdog itself, and that should not happen.
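
For illustration only, the loop above might be sketched in Python roughly as follows. This is not monit code: the Worker class, its field names and the bounded `rounds` parameter (in place of `while TRUE`) are assumptions made to keep the sketch short and testable. A threading.Event plays the role of sem_validate / sem_httpd, so the watchdog takes no locks of its own. Python cannot kill a stuck thread, so "restart" here means abandoning the old thread and starting a new one.

```python
import threading
import time


class Worker:
    """Restartable worker thread that raises a heartbeat flag after each cycle."""

    def __init__(self, work, interval):
        self.work = work                  # callable performing one cycle
        self.interval = interval          # pause between cycles, in seconds
        self.alive = threading.Event()    # stands in for sem_validate / sem_httpd
        self.restarts = 0
        self._stop = threading.Event()
        self._start()

    def _start(self):
        self._stop = threading.Event()
        thread = threading.Thread(target=self._run, args=(self._stop,), daemon=True)
        thread.start()

    def _run(self, stop):
        while not stop.is_set():
            self.work()
            self.alive.set()              # "I finished a cycle"
            stop.wait(self.interval)

    def restart(self):
        # Python cannot kill a stuck thread; the old one is told to stop
        # and abandoned, and a fresh thread takes over.
        self._stop.set()
        self.restarts += 1
        self._start()


def watchdog(workers, cycletime, rounds):
    """Run `rounds` watchdog cycles; cycletime must exceed the longest expected cycle."""
    for _ in range(rounds):
        time.sleep(cycletime)
        for worker in workers:
            if not worker.alive.is_set():
                worker.restart()
            worker.alive.clear()
```

A worker that blocks forever in its first cycle never sets its flag, so the next watchdog pass restarts it, while a healthy worker is left alone. As in the pseudocode, this only holds if cycletime is longer than the longest cycle a healthy worker can take.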

CHopp

Yes, in such a case it makes sense.

On the other hand, deadlocks and similar problems are bugs and they should not happen. If such a situation occurs, we will fix the problem so that it does not happen again. Implementing a sanitizer for unknown bugs could itself create unknown issues (according to Murphy's laws), and I think it is better to keep the code simple. In addition, if the watchdog is part of the program, it can be affected by an unknown bug too => it should probably be an independent program, for security and high availability reasons.
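
To make the "independent program" idea concrete, here is a rough Python sketch of one possible approach, under stated assumptions: the supervised daemon touches a heartbeat file after each cycle, and a separate process checks how old that file is. The heartbeat file, the function names and the `restart` callback are all hypothetical; nothing like this exists in monit as discussed here.

```python
import os
import time


def touch_heartbeat(path):
    """Hypothetical hook: the supervised daemon calls this after each cycle."""
    with open(path, "a"):
        os.utime(path, None)   # set the file's mtime to the current time


def is_stale(path, max_age, now=None):
    """True if the heartbeat file is missing or older than max_age seconds."""
    now = time.time() if now is None else now
    try:
        return now - os.stat(path).st_mtime > max_age
    except FileNotFoundError:
        return True


def check_once(path, max_age, restart, now=None):
    """One pass of the external watchdog; calls restart() on a stale heartbeat."""
    if is_stale(path, max_age, now):
        restart()
        return True
    return False
```

Such a checker could be run periodically from outside the daemon, so a bug that wedges the daemon does not wedge the checker along with it.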

Martin






