Re: [monit-dev] "Execution failed"


From: Brian Candler
Subject: Re: [monit-dev] "Execution failed"
Date: Wed, 28 May 2008 12:45:13 +0100
User-agent: Mutt/1.5.11

On Tue, May 27, 2008 at 08:42:56PM +0200, Martin Pala wrote:
> Can you run monit in verbose mode and send output when monit tries to 
> restart the service and fails with 'execution failed' error?

OK, I'll try to reproduce (see below)

> It's possible that the services are taking longer than 30s to start ... since 
> monit-5.0, it's possible to customize how long monit should wait for a 
> service to start. By default it's 30s - if the service is not up within 
> 30s, monit will report execution failed. To override the default use the 
> timeout option, for example:
> 
>  start program = "/bin/foo start" with timeout 60 seconds
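
For concreteness, a complete service entry using that option might look
roughly like the sketch below. The service name, pidfile and init script
paths are assumptions for illustration only, not taken from my actual
configuration:

  # hypothetical entry; adjust the name and paths to the real service
  check process apache with pidfile /var/run/httpd.pid
    # allow up to 60 seconds for the start to complete
    start program = "/etc/init.d/httpd start" with timeout 60 seconds
    stop program = "/etc/init.d/httpd stop"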

I'm monitoring about 15 processes, and monit starts them all up at system
bootup time. In turn, monit itself is started from /etc/inittab, run with
-I.

No single process should take anything like 30 seconds to start. However,
does monit start all 15 processes simultaneously and then monitor them, or
does it start one, wait for that to start, start the next, and so on?

As far as I can tell from looking at the logs, monit starts them in turn.

I guess it's conceivable that if all 15 were started simultaneously, the
system would be so busy that it could take 30 seconds for one of them to
start. But even then I don't think that's particularly likely. It would also
suggest that the 'big' (Rails) processes would get into this state more
often, but actually the stuck processes I saw were relatively lightweight
(e.g. Apache).

Conceptually: even if the process did fail to start within 30 seconds,
should the status really be shown as a red "Execution failed" indefinitely,
even if it does manage to start subsequently? This isn't a major problem,
though - I can clear the state by selecting unmonitor and then monitor -
but it _is_ a spurious alarm.

I'm currently trying to reproduce this on a test system while logging with
-v -l file. Unfortunately it hasn't happened yet... I will let you know if I
manage to reproduce it.

Regards,

Brian.




