help-cfengine

timeouts, zombies, and shellcommands


From: Darrell Fuhriman
Subject: timeouts, zombies, and shellcommands
Date: 11 Oct 2001 14:10:08 -0700
User-agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.1 (Canyonlands)

[this is a slightly more detailed copy of a post to gnu.cfengine.help]


So, I'm executing a shell command using 1.6.3; in this case,
a very basic one:

shellcommands:
        redhat::
                "/sbin/chkconfig ntpd on" useshell=false
                timeout=30 

It sometimes hangs when cfengine is being run by kickstart
as part of the post-install scripts.  Note there are *no* daemons
being started by this program, so I don't think it's the
not-closing-descriptors problem.  All the program does is create
a couple of symlinks.

Here's some strace output from the cfengine process:

rt_sigaction(SIGALRM, {0x4002c8c0, [ALRM], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0
alarm(30)                      = 0
umask(077)                     = 022
pipe([22, 23])                 = 0
fork()                         = 1370
close(23)                      = 0
fcntl64(22, F_GETFL)           = 0 (flags O_RDONLY)
fstat64(22, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x402e9000
_llseek(22, 0, 0xbfff3940, SEEK_CUR) = -1 ESPIPE (Illegal seek)
munmap(0x402e9000, 4096)       = 0
wait4(89, 

The child process looks like this:
getpid()                       = 1370
[deleted]
execve("/sbin/chkconfig", ["/sbin/chkconfig", "ntpd", "on"], [/* 21 vars */]) = 0
[deleted]
close(22)                      = 0
unlink("/etc/rc5.d/K74ntpd")   = 0
unlink("/etc/rc5.d/S26ntpd")   = -1 ENOENT (No such file or directory)
symlink("../init.d/ntpd", "/etc/rc5.d/S26ntpd") = 0
_exit(0)                       = ?


One thing that has me confused is why it seems to be waiting on
PID 89, instead of '-1', or the actual PID of the child (1370).
I smell a bug of some sort...  especially in light of the fact
that it does correctly wait for the previous command's PID.

Anyway, that's where it hangs.  Also, I notice that it never
seems to receive the ALRM signal.  Is that some strange signal
interaction I don't understand?

To make things worse, it works correctly when I run cfengine by
hand instead of from kickstart.

As an aside, it seems that the alarm handler doesn't actually
*do* anything, especially anything useful like attempt to kill
the shellcommand.

net.c:41

void TimeOut()
 
{
alarm(0);
Verbose("%s: Time out\n",VPREFIX);
}

Is this, in fact, an unimplemented feature?

Darrell


