[50 character or so descriptive subject here (for reference)]
From: Junjiro Okajima
Subject: [50 character or so descriptive subject here (for reference)]
Date: Wed, 13 Dec 2000 20:28:21 +0900
Configuration Information [Automatically generated, do not change]:
Machine: i386
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='i386'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i386-pc-linux-gnu'
-DCONF_VENDOR='pc' -DSHELL -DHAVE_CONFIG_H -D_FILE_OFFSET_BITS=64 -I.
-I/home/swt/doko/export/packages/bash/bash-2.03
-I/home/swt/doko/export/packages/bash/bash-2.03/lib -I/usr/include -g -O2
uname output: Linux nskli014 2.2.16 #5 SMP Tue Sep 19 23:42:25 JST 2000 i686
unknown
Machine Type: i386-pc-linux-gnu
Bash Version: 2.03
Patch Level: 0
Release Status: release
Description:
I found a problem in bash-2.03. From reading the source of 2.04, it
seems to be the same in bash-2.04.
The problem is that a script like the following waits for the
background "long_task" to finish.
----------------------------------------
#!/bin/bash -
# this command takes a long time,
# so running in background.
long_task &
# this command exits quickly.
short_task
# the return value is that of "short_task".
exit $?
----------------------------------------
I read the source code of bash 2.03, and my understanding is:
o execute_disk_command("long_task &") does,
- fork() and unblock SIGCHLD by make_child(),
- execve() by shell_execve().
and does NOT wait, of course.
o execute_disk_command("short_task") does,
- fork() and unblock SIGCHLD by make_child(),
- execve() by shell_execve().
and
- waitpid/wait4() for any child process by waitchld() in wait_for().
o SIGCHLD handler sigchld_handler() calls waitchld().
I guess there is a race condition in the waiting. On a system with a
very high load average, bash can receive SIGCHLD before it calls
wait_for(). After sigchld_handler() has reaped "short_task",
bash calls wait_for(), which then waits for "long_task".
There is a second symptom as well. If "long_task" exits before
"short_task", the return value $? of "short_task" will be that of
"long_task".
Is my understanding right? And will you fix this problem?
I hope so.
Thank you.
----------------------------------------------------------------------
o the case where bash waits for the background job
<bash>
  |
  |
fork()-------------------------------+
  |                                  |
  |                           exec("long_task")
fork()-------------------+           |
  |                      |           |
  |                      |           |
  |               exec("short_task") |
  |                      |           |
  |                      |           |
  |<----------SIGCHLD-- exit()       |
  +-------->+            :           |
    sigchld_handler()    :           |
            |            :           |
            |            :           |
            |            :           |
          wait()-------->X           |
            |                        |
            |                        |
  +<--------+                        |
  |                                  |
  |                                  |
  |                                  |
  |                                  |
wait()..............................>|
                                     |
                                     |
                                     |
                                     |
                                     V
o the case where "long_task" exits first
<bash>
  |
  |
fork()-------------------------------+
  |                                  |
  |                           exec("long_task")
fork()-------------------+           |
  |                      |           |
  |                      |           |
  |               exec("short_task") |
  |                      |           |
  |                      |           |
wait()                   |           |
  :                      |           |
  :                      |           |
  :                      |           |
  :<----------SIGCHLD-------------- exit(1)
  |                      |
  |                      |
  |                      |
exit()                   |
                         |
                         |
                         |
/etc/init <--SIGCHLD--exit(0)
----------------------------------------------------------------------
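The second case can be modelled outside of bash with a few lines of C.
This is only a sketch of the behaviour hypothesized above, not bash's
own code: status_from_wait_any() is a made-up name, the two children
merely stand in for "long_task" and "short_task", and a pipe is used to
keep the foreground child alive so that the ordering is deterministic.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical model: return the exit status handed back by a
 * "wait for any child" call when the background child exits first. */
int status_from_wait_any(void)
{
    int gate[2];
    if (pipe(gate) < 0)
        return -1;

    pid_t long_pid = fork();            /* stands in for "long_task &" */
    if (long_pid < 0)
        return -1;
    if (long_pid == 0)
        _exit(1);                       /* background job exits early, status 1 */

    pid_t short_pid = fork();           /* stands in for "short_task" */
    if (short_pid < 0)
        return -1;
    if (short_pid == 0) {
        char c;
        close(gate[1]);
        /* Block until the parent closes its write end (read sees EOF). */
        if (read(gate[0], &c, 1) < 0)
            _exit(2);
        _exit(0);                       /* foreground job would report 0 */
    }

    int status = 0;
    pid_t got = wait(&status);          /* any child: only long_pid has exited */

    close(gate[0]);
    close(gate[1]);                     /* release the foreground child */
    waitpid(short_pid, NULL, 0);        /* reap it so no zombie is left */

    if (got != long_pid)
        return -1;
    return WEXITSTATUS(status);         /* 1, although "short_task" exits 0 */
}
```

A caller that takes this value as the status of the foreground command
sees 1 here, which is the symptom described for $?.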
Repeat-By:
It is very difficult to reproduce, since this race occurs only
under a high load average. A task switch/dispatch to "short_task"
has to happen between the execve() and the wait in bash.
Fix:
How about:
- Only sigchld_handler() does the wait.
- The wait is done by specifying the child pid.
- The pids of the children are to be managed.
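
The three points of the proposed fix could look roughly like the sketch
below. It is an illustration under assumptions, not a patch against
bash: the child table, spawn(), wait_for_slot() and demo_wait_by_pid()
are all hypothetical names, the table has a fixed size, and the
children simply exit with a given code instead of running a command.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 8

/* Hypothetical table of managed children (not bash's data structures). */
struct child {
    pid_t pid;
    volatile sig_atomic_t done;
    volatile sig_atomic_t status;
};
static struct child children[MAX_CHILDREN];
static int nchildren;               /* changed only while SIGCHLD is blocked */

/* The only place that reaps, and always by an explicit pid. */
static void on_sigchld(int sig)
{
    int saved_errno = errno;
    (void)sig;
    for (int i = 0; i < nchildren; i++) {
        int st;
        if (!children[i].done &&
            waitpid(children[i].pid, &st, WNOHANG) == children[i].pid) {
            children[i].status = st;
            children[i].done = 1;
        }
    }
    errno = saved_errno;
}

/* Fork a child that exits with `code`; record its pid before the
 * handler can run. Returns the table slot, or -1 on failure. */
static int spawn(int code)
{
    sigset_t block, old;
    int slot = -1;
    if (nchildren >= MAX_CHILDREN)
        return -1;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    if (pid > 0) {
        slot = nchildren++;
        children[slot].pid = pid;
        children[slot].done = 0;
        children[slot].status = 0;
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return slot;
}

/* Sleep until the handler has recorded this particular child's status.
 * Assumes SIGCHLD was not blocked by the caller. */
static int wait_for_slot(int slot)
{
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    while (!children[slot].done)
        sigsuspend(&old);           /* atomically unblock SIGCHLD and sleep */
    sigprocmask(SIG_SETMASK, &old, NULL);
    int st = children[slot].status;
    return WEXITSTATUS(st);
}

/* "long_task" exits with 1, "short_task" with 0; returns short's status. */
int demo_wait_by_pid(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    int long_slot = spawn(1);
    int short_slot = spawn(0);
    if (long_slot < 0 || short_slot < 0)
        return -1;

    int short_status = wait_for_slot(short_slot);
    wait_for_slot(long_slot);       /* the demo also reaps the background job */
    return short_status;
}
```

Because every status is stored under the pid it belongs to, the value
returned for "short_task" does not depend on which child exits first.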