monit-dev
Re: check device core dump


From: Martin Pala
Subject: Re: check device core dump
Date: Sun, 07 Sep 2003 22:35:40 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030827 Debian/1.4-3

Thanks for the report - the bug is fixed.


The extra DeviceInfo_Usage call should not be needed, because that check is already done by:

 if(stat(s->path, &stat_buf) != 0) {
   Event_post(s, EVENT_START, "Event: device '%s' doesn't exist\n", s->name);
   return FALSE;
 }

at the beginning of check_device(), so it runs for every device regardless of whether a space or inode test is defined.


Martin

Jan-Henrik Haukeland wrote:

Martin, I just ran a test (as a non-root user) using the following
monitrc:

check device cdrom with path /dev/cdrom
  start "/bin/mount /dev/cdrom"
  stop  "/bin/umount /dev/cdrom"
  if space > 1000 MB then alert
  alert address@hidden

check device disk1 with path /
  alert address@hidden

Running monit: monit -Iv

Gives the following core dump:

 The service list contains the following entries:
Device Name = cdrom
 Group                = (not defined)
 Path                 = /dev/cdrom
 Monitoring mode      = active
 Start program        = /bin/mount /dev/cdrom
 Stop program         = /bin/umount /dev/cdrom
Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 1024 (LWP 14146)]
 0x42080c63 in strlen () from /lib/i686/libc.so.6
(gdb)
The problematic code is in util.c:659:


   } else if(dl->resource == RESOURCE_ID_SPACE) {

=>   printf(" %-20s = if %s %ld %s then %s\n",
          "Space usage limit",
          operatornames[dl->operator],
          (dl->limit_absolute > -1)?dl->limit_absolute:dl->limit_percent,
          (dl->limit_absolute > -1)?"blocks":"%",
          actionnames[dl->action]);

   }

(gdb) p *dl
$2 = {resource = 11, operator = 0, limit_absolute = -2147483648, limit_percent = -1, action = 1000, next = 0x0}


The core dump happens because actionnames[1000] points far outside the
actionnames array. The limit_absolute value looks rotten too. Can you
please take a look at it?
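
One way to keep a bad index like the action = 1000 shown by gdb from crashing the printout is to bounds-check before indexing the name table. The following is only a minimal sketch, not monit's actual code: example_actionnames, ARRAY_LEN and action_name are hypothetical names, and the table entries are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical name table for illustration; the real actionnames[]
 * array in monit may have different entries and a different order. */
static const char *const example_actionnames[] = {
    "ignore", "alert", "restart", "stop"
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Return a printable name for any index without ever reading
 * past the end of the table. */
static const char *action_name(int action)
{
    if (action < 0 || (size_t)action >= ARRAY_LEN(example_actionnames))
        return "(invalid action)";
    return example_actionnames[action];
}
```

With a helper like this, the printf() in question would print "(invalid action)" instead of passing a wild pointer to strlen(). It would only hide the symptom, though; the uninitialized-looking values in *dl would still need to be tracked down.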

------------------------------------------------------------------------------

Also I think that using simply

check device disk1 with path /dev/hda5
  alert address@hidden

should lead to a call to DeviceInfo_Usage in validate.c:check_device,
because only after this call, when you try to read info from the
device, will you know whether the device is mounted or not. The other
tests in validate.c:check_device will almost always be true, since they
simply test whether the /dev/device exists, which it usually will.

With the parser as it is now, this function will not be called unless an
IF-THEN test has been added to the entry.
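
To illustrate the point about the existence test: a successful stat() only shows that the path resolves, and a node such as /dev/cdrom usually exists whether or not anything is mounted. This is a minimal sketch assuming a POSIX system; path_exists is a hypothetical helper, not a monit function.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Returns true if the path can be stat()ed. This says nothing about
 * whether a filesystem on that device is mounted or readable. */
static bool path_exists(const char *path)
{
    struct stat stat_buf;
    return stat(path, &stat_buf) == 0;
}
```

A check of this kind passes for any existing device node, which is why it cannot on its own tell a mounted device from an unmounted one.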

BTW, I have changed the event to START if monit cannot read from the
device, so monit will call the entry's start method, and not UNMONITOR
as I wrongly added before.






