bug-bash
wait race


From: Junjiro Okajima
Subject: wait race
Date: Wed, 13 Dec 2000 20:51:06 +0900

Configuration Information [Automatically generated, do not change]:
Machine: i386
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i386' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i386-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DSHELL -DHAVE_CONFIG_H  -D_FILE_OFFSET_BITS=64  -I.  
-I/home/swt/doko/export/packages/bash/bash-2.03 
-I/home/swt/doko/export/packages/bash/bash-2.03/lib -I/usr/include -g -O2
uname output: Linux nskli014 2.2.16 #5 SMP Tue Sep 19 23:42:25 JST 2000 i686 
unknown
Machine Type: i386-pc-linux-gnu

Bash Version: 2.03, 2.04
Patch Level: 0
Release Status: release

Description:

I found a problem in bash-2.03. Having read the source of 2.04, I
believe bash-2.04 has the same problem.

The problem is that a script such as the following waits for the
background "long_task" to finish before it exits.
----------------------------------------
#!/bin/bash -

# this command takes a long time,
# so run it in the background.
long_task &

# this command finishes quickly.
short_task

# the exit status is that of "short_task".
exit $?
----------------------------------------

I read the source code of bash 2.03 and understand it as follows:

o execute_disk_command("long_task &") does
- fork() and unblock SIGCHLD in make_child(),
- execve() in shell_execve(),
and does NOT wait, of course.

o execute_disk_command("short_task") does
- fork() and unblock SIGCHLD in make_child(),
- execve() in shell_execve(),
and
- waitpid()/wait4() for any child process, via waitchld() in wait_for().

o The SIGCHLD handler sigchld_handler() also calls waitchld().

I suspect there is a race condition in waiting. On a system with a very
high load average, bash can receive SIGCHLD for "short_task" before it
reaches wait_for(). sigchld_handler() then reaps "short_task", and when
bash afterwards calls wait_for(), the only child left is "long_task",
so bash waits for it.

There is a second symptom. If "long_task" exits before "short_task",
the $? reported for "short_task" will be the exit status of
"long_task".
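
The second symptom can be illustrated outside bash with a small C
sketch. This is only an illustration under stated assumptions, not
bash's code: demo_wait_any() and struct wait_any_result are hypothetical
names, the two children are stand-ins for "long_task" and "short_task",
and a pipe keeps the "short_task" stand-in alive so that the ordering is
fixed instead of depending on load.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result record for this sketch; not a bash structure. */
struct wait_any_result {
    pid_t background_pid;  /* stand-in for "long_task &" */
    pid_t foreground_pid;  /* stand-in for "short_task" */
    pid_t reaped_pid;      /* child that wait() actually returned */
    int reaped_status;     /* exit status of that child, or -1 */
};

int demo_wait_any(struct wait_any_result *r)
{
    int gate[2];  /* pipe that keeps the foreground child alive */
    int status;
    char c;

    assert(r != NULL);
    if (pipe(gate) < 0)
        return -1;

    r->background_pid = fork();
    if (r->background_pid < 0)
        return -1;
    if (r->background_pid == 0)
        _exit(1);  /* background child exits first, with status 1 */

    r->foreground_pid = fork();
    if (r->foreground_pid < 0)
        return -1;
    if (r->foreground_pid == 0) {
        close(gate[1]);
        /* Blocks until the parent closes its write end (read returns 0). */
        if (read(gate[0], &c, 1) < 0)
            _exit(2);
        _exit(0);
    }
    close(gate[0]);

    /* The parent means to wait for the foreground child, but wait()
       accepts any child, so it returns the background child here. */
    r->reaped_pid = wait(&status);
    r->reaped_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;

    close(gate[1]);                       /* release the foreground child */
    waitpid(r->foreground_pid, NULL, 0);  /* reap it so no zombie is left */
    return 0;
}
```

Called from a main(), demo_wait_any() should report reaped_pid equal to
background_pid and reaped_status equal to 1, although the caller meant
to collect the foreground child.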


Is my understanding right? If so, will you fix this problem?
I hope so.
Thank you.

----------------------------------------------------------------------
o the case where bash waits for the background job

 <bash>
   |
   |
  fork()-------------------------------+
   |                                   |
   |                          exec("long_task")
  fork()-------------------+           |
   |                       |           |
   |                       |           |
   |               exec("short_task")  |
   |                       |           |
   |                       |           |
   |<----------SIGCHLD-- exit()        |
   +-------->+             :           |
       sigchld_handler()   :           |
             |             :           |
             |             :           |
             |             :           |
           wait()--------->X           |
             |                         |
             |                         |
   +<--------+                         |
   |                                   |
   |                                   |
   |                                   |
   |                                   |
  wait()..............................>|
                                       |
                                       |
                                       |
                                       |
                                       V

o the case where "long_task" exits first

 <bash>
   |
   |
  fork()-------------------------------+
   |                                   |
   |                          exec("long_task")
  fork()-------------------+           |
   |                       |           |
   |                       |           |
   |               exec("short_task")  |
   |                       |           |
   |                       |           |
  wait()                   |           |
   :                       |           |
   :                       |           |
   :                       |           |
   :<----------SIGCHLD-------------- exit(1)
   |                       |
   |                       |
   |                       |
  exit()                   |
                           |
                           |
                           |
    /etc/init <--SIGCHLD--exit(0)
----------------------------------------------------------------------

PS.
A few months ago I reported a problem with errno in a signal handler,
and I found that it is fixed in bash 2.04. Thank you very much.

Repeat-By:
        It is very difficult to reproduce, since this race occurs only
        under a high load average. A task switch (dispatch) to
        "short_task" has to happen between bash's execve() and its
        wait.

Fix:
        How about the following?
        - Only sigchld_handler() calls wait.
        - It waits by specifying the child's pid.
        - The pids of the children are tracked by bash.
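
A rough sketch of that idea in C, assuming a small fixed-size table. It
is not a patch against bash: child_slot, on_sigchld(), spawn_child(),
wait_for_slot() and install_reaper() are hypothetical names, and the
children are stand-ins that only sleep and exit. The handler is the only
code that reaps, and it always passes an explicit pid to waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_CHILDREN 16

/* Hypothetical child table for this sketch; not bash's job table. */
struct child_slot {
    pid_t pid;
    int status;                  /* raw status from waitpid() */
    volatile sig_atomic_t done;  /* set by the handler once reaped */
};
static struct child_slot table[MAX_CHILDREN];
static volatile sig_atomic_t used;

/* The only place that reaps: one waitpid() per known pid, never "any". */
static void on_sigchld(int sig)
{
    int saved_errno = errno, st, i;
    (void)sig;
    for (i = 0; i < used; i++) {
        if (!table[i].done &&
            waitpid(table[i].pid, &st, WNOHANG) == table[i].pid) {
            table[i].status = st;
            table[i].done = 1;
        }
    }
    errno = saved_errno;
}

static void block_sigchld(sigset_t *old)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    sigprocmask(SIG_BLOCK, &set, old);
}

/* Fork a stand-in child; SIGCHLD stays blocked until its pid is recorded. */
int spawn_child(int exit_code, long delay_ms)
{
    struct timespec ts;
    sigset_t old;
    pid_t pid;
    int slot = -1;

    assert(used < MAX_CHILDREN);
    block_sigchld(&old);
    pid = fork();
    if (pid == 0) {
        ts.tv_sec = delay_ms / 1000;
        ts.tv_nsec = (delay_ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
        _exit(exit_code);
    }
    if (pid > 0) {
        slot = used;
        table[slot].pid = pid;
        table[slot].done = 0;
        used = slot + 1;
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return slot;
}

/* Sleep until the handler has reaped this slot; return its exit status. */
int wait_for_slot(int slot)
{
    sigset_t old, unblocked;
    int st;

    block_sigchld(&old);
    unblocked = old;
    sigdelset(&unblocked, SIGCHLD);
    while (!table[slot].done)
        sigsuspend(&unblocked);  /* atomically unblock SIGCHLD and sleep */
    st = table[slot].status;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return WIFEXITED(st) ? WEXITSTATUS(st) : -1;
}

/* Install the handler before any child is spawned. */
int install_reaper(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

For example, after install_reaper(), spawning a background stand-in with
spawn_child(1, 0) and a foreground one with spawn_child(0, 100), then
calling wait_for_slot() on the foreground slot, should return 0 even
though the background child exits first; its status 1 stays in its own
slot.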


