bug-bash

[contrib]: setpgrp + killpg builtins


From: Jason Vas Dias
Subject: [contrib]: setpgrp + killpg builtins
Date: Sun, 1 Feb 2015 01:13:06 +0000

Dear bash developers -

It is very difficult to overcome the problems caused by
the scenario described in this email without something like the
enclosed "setpgrp <pid> <pgrp>" and "killpg <pgrp> <sig>"
bash loadable builtins.

Without them, or options to change signal handling for simple
commands, it is too easy to create orphan processes,
and too difficult to find a workaround that prevents
orphan processes from being created, as in the following
scenario:

1. An "invoker.sh" process runs a "job.sh" bash script in a separate
   process which runs a long-running (or non-terminating!)
   'Simple Command' (not a shell "Job") (call it "nterm.sh").

2. After a while, the invoker decides that the job has timed out,
   kills its process (the instance of bash running job.sh), and
   then exits.

3. The "long-command" nterm.sh process is left still running as an orphan,
   and would become a zombie if it tries to exit.
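
The three steps above can be reproduced in a single file. This is only a rough sketch: the script bodies below are hypothetical stand-ins written for illustration, not the contents of the attached archive, and "sleep 30" plays the role of the non-terminating command so that the demo cleans up after itself.

```shell
#!/usr/bin/env bash
# Sketch of the orphaning scenario. The script bodies are hypothetical
# stand-ins, not the scripts shipped in the attached archive.
demo_dir=$(mktemp -d)

# nterm.sh: records its pid, then "never" terminates (bounded to 30 s here).
cat > "$demo_dir/nterm.sh" <<'EOF'
#!/usr/bin/env bash
echo $$ > "$(dirname "$0")/nterm.pid"
exec sleep 30
EOF

# job.sh: runs nterm.sh as a simple command (foreground, no '&').
cat > "$demo_dir/job.sh" <<'EOF'
#!/usr/bin/env bash
"$(dirname "$0")/nterm.sh"
echo "job.sh: nterm.sh returned $?"
EOF
chmod +x "$demo_dir/nterm.sh" "$demo_dir/job.sh"

# Step 1: the invoker starts job.sh in a separate process.
"$demo_dir/job.sh" &
job_pid=$!
until [ -s "$demo_dir/nterm.pid" ]; do sleep 0.1; done
nterm_pid=$(cat "$demo_dir/nterm.pid")

# Step 2: the invoker decides the job has timed out and kills job.sh only.
kill -TERM "$job_pid"
wait "$job_pid" 2>/dev/null
job_status=$?

# Step 3: check whether the simple command outlived its parent.
if kill -0 "$nterm_pid" 2>/dev/null; then orphan_alive=yes; else orphan_alive=no; fi
echo "job.sh exit status: $job_status"
echo "nterm.sh ($nterm_pid) still running: $orphan_alive"

# Clean up the orphan and the temporary files.
kill -TERM "$nterm_pid" 2>/dev/null
rm -rf "$demo_dir"
```

The final check is made before the cleanup, so it reports the state the invoker would leave behind if it simply exited.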

I tested this with the latest bash-4.3.33 and with bash-4.2.

The problem is that most shell scripts use only simple commands, not
background jobs. Changing a large number of scripts to use
asynchronous background jobs for every simple command that might
not terminate (due to NFS hangs, for example) is not
an option. Simple commands run in their
own process groups in interactive mode, or in that of the parent
in non-interactive mode, and are not killed when
their parent job.sh exits, because the parent has no
background pid to wait for and so cannot wait for them.
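
The process-group behaviour described above can be observed directly. The sketch below assumes a POSIX-style "ps" that accepts "-o pgid=" and "-p"; the variable names are only illustrative. It compares the process group of a non-interactive script with that of a simple command it runs, and then shows that enabling job control with "set -m" (which interactive shells do by default) puts a background job into a group of its own.

```shell
#!/usr/bin/env bash
# Sketch: process groups of commands started from a non-interactive script.
# Assumes a ps that supports "-o pgid=" and "-p".

# Process group of this script.
own_pgid=$(ps -o pgid= -p $$ | tr -d ' ')

# A simple command run without job control stays in the script's group.
child_pgid=$(bash -c 'ps -o pgid= -p $$' | tr -d ' ')

echo "script pgid:         $own_pgid"
echo "simple command pgid: $child_pgid"

# With job control enabled, a background job becomes the leader of a
# new process group whose id equals the job's pid.
set -m
sleep 5 &
bg_pid=$!
set +m
bg_pgid=$(ps -o pgid= -p "$bg_pid" | tr -d ' ')
echo "background job pid:  $bg_pid  pgid: $bg_pgid"

kill -TERM "$bg_pid" 2>/dev/null
wait "$bg_pid" 2>/dev/null
```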

This is demonstrated by the attached shell scripts in the
 "nterm_demo.tar" file (nterm_demo/*) :
 invoker.sh: forks off "job.sh", waits for it to timeout, and kills it
 job.sh    : runs "nterm.sh" as a simple command
 nterm.sh  : a non-terminating process
 killpg.c  : killpg built-in

To demonstrate:

$ tar -xpf nterm_demo.tar
$ cd nterm_demo
$ BASH_BUILD_DIR=...  BASH_SOURCE_DIR=... make

Example output is :
gcc  -fPIC -O3 -g -I. -I/home/jvasdias/src/3P/bash
-I/home/jvasdias/src/3P/bash/lib -I/home/jvasdias/src/3P/bash/builtins
-I/home/jvasdias/src/3P/bash/include
-I/home/jvasdias/src/3P/bash-4.30-ubuntu
-I/home/jvasdias/src/3P/bash-4.30-ubuntu/lib
-I/home/jvasdias/src/3P/bash-4.30-ubuntu/builtins  -c -o setpgid.o
setpgid.c
gcc  -shared -Wl,-soname,$@  setpgid.o   -o setpgid
gcc  -fPIC -O3 -g -I. -I/home/jvasdias/src/3P/bash
-I/home/jvasdias/src/3P/bash/lib -I/home/jvasdias/src/3P/bash/builtins
-I/home/jvasdias/src/3P/bash/include
-I/home/jvasdias/src/3P/bash-4.30-ubuntu
-I/home/jvasdias/src/3P/bash-4.30-ubuntu/lib
-I/home/jvasdias/src/3P/bash-4.30-ubuntu/builtins  -c -o killpg.o
killpg.c
...
gcc  -shared -Wl,-soname,$@  killpg.o   -o killpg
bash -c ./invoker.sh 0<&- 2>&1 | tee
./invoker.sh: hB : 11524
JOB: 11528
./job.sh: 11528: pgid : 11510
./job.sh: 11528: pgid now : 11528
./nterm.sh: hB: 11535 : pgid: 11528
non-terminating command 11535 (11528) still running.
./invoker.sh: timeout - killing job: 11528
Terminated
./nterm.sh: 11535: exits 143
./job.sh: 11528: exits 143
./invoker.sh: 11524: exiting.
$

To demonstrate the problem, make the built-ins not be found:
$ make show_the_bug
unset BASH_LOADABLES_DIR; ./invoker.sh  0<&- 2>&1 | tee
./invoker.sh: hB : 11670
Demonstrating the bug. Please kill the nterm.sh process manually.
JOB: 11672
./job.sh: 11672: pgid : 11668
job.sh will be killed, but nterm.sh will not.
./nterm.sh: hB: 11676 : pgid: 11668
non-terminating command 11676 (11668) still running.
./invoker.sh: timeout - killing job: 11672
non-terminating command 11676 (11668) still running.
non-terminating command 11676 (11668) still running.
^Cmake: *** [show_the_bug] Interrupt

Fortunately, make carefully cleans up and kills 11676 silently.
If one instead types at the command line or in a shell script:
 $ BASH_LOADABLES_DIR='' ./invoker.sh
then it is really hard to kill the resulting nterm.sh process -
one has to use kill -9 $nterm_pid .

So, please give scripts some means of saying
"if I am killed, kill my current simple command",
even in interactive mode, with some new shopt option,
or provide something like the killpg / setpgid built-ins attached.
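
For comparison, the effect such built-ins aim for can be approximated with stock bash in the narrower case where the invoker is itself a bash script that can be edited. The sketch below makes that assumption and again uses hypothetical stand-in scripts: "set -m" places the job in its own process group, and "kill" with a negative pid signals the whole group, much like killpg(2). It does not help a job.sh that is killed by some other process that only knows its pid.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the invoker is a bash script that may be changed.
# Script bodies are hypothetical stand-ins for the attached demo.
demo_dir=$(mktemp -d)

cat > "$demo_dir/nterm.sh" <<'EOF'
#!/usr/bin/env bash
echo $$ > "$(dirname "$0")/nterm.pid"
exec sleep 30
EOF

cat > "$demo_dir/job.sh" <<'EOF'
#!/usr/bin/env bash
"$(dirname "$0")/nterm.sh"
echo "job.sh: nterm.sh returned $?"
EOF
chmod +x "$demo_dir/nterm.sh" "$demo_dir/job.sh"

# True if the pid exists and is not merely an unreaped zombie.
is_running() {
    local stat
    stat=$(ps -o stat= -p "$1" 2>/dev/null | tr -d ' ')
    [ -n "$stat" ] && [ "${stat#Z}" = "$stat" ]
}

set -m                       # job control: the job gets its own process group
"$demo_dir/job.sh" &
job_pid=$!                   # pid of the job, and also its process group id
set +m

until [ -s "$demo_dir/nterm.pid" ]; do sleep 0.1; done
nterm_pid=$(cat "$demo_dir/nterm.pid")

kill -TERM -- -"$job_pid"    # negative pid: signal every process in the group
wait "$job_pid" 2>/dev/null

sleep 0.5                    # give nterm.sh a moment to die
if is_running "$nterm_pid"; then nterm_state=running; else nterm_state=terminated; fi
echo "nterm.sh ($nterm_pid) after group kill: $nterm_state"

rm -rf "$demo_dir"
```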

Thanks & Regards,
Jason

Attachment: nterm_demo.tar
Description: Unix tar archive

