Re: [50 character or so descriptive subject here (for reference)]
From: Chet Ramey
Subject: Re: [50 character or so descriptive subject here (for reference)]
Date: Wed, 13 Dec 2000 16:17:06 -0500

> Machine Type: i386-pc-linux-gnu
>
> Bash Version: 2.03
> Patch Level: 0
> Release Status: release
>
> Description:
>
> I found a problem in bash-2.03. Having read the 2.04 source, I
> believe bash-2.04 has the same problem.
>
> The problem is that a script like the following waits for the
> background "long_task" to finish.
Thanks for your analysis.
> ----------------------------------------
> #!/bin/bash -
>
> # this command takes a long time,
> # so running in background.
> long_task &
>
> # this command can exit as soon as possible.
> short_task
>
> # the return value is "short_task"s.
> exit $?
> ----------------------------------------
>
> I read the bash 2.03 source and understand the following:
>
> o execute_disk_command("long_task &") does
>     - fork() and unblock SIGCHLD, via make_child(),
>     - execve(), via shell_execve(),
>   and does NOT wait, of course.
>
> o execute_disk_command("short_task") does
>     - fork() and unblock SIGCHLD, via make_child(),
>     - execve(), via shell_execve(),
>   and then
>     - waitpid()/wait4() for any child process, via waitchld() in wait_for().
In this case, execute_simple_command calls execute_disk_command,
which calls make_child to create the child process. The pid of the
new process is saved in last_made_pid. execute_simple_command calls
wait_for with this PID as its argument. It's clumsier than it needs
to be, to be sure.
wait_for finds the job and waits for all of its processes to complete.
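
As a rough illustration at the script level (not of the internals), a
foreground pipeline is a single job with several processes, and bash
collects a status for each of them before moving on. This is only a
sketch: the three subshells are hypothetical stand-ins, and it assumes
a bash that provides PIPESTATUS.

```shell
#!/bin/bash
# One foreground job made of three hypothetical processes.
# The first one is deliberately the slowest to exit.
( sleep 1; exit 2 ) | ( exit 3 ) | ( exit 0 )

# Copy PIPESTATUS right away; the next command overwrites it.
statuses=("${PIPESTATUS[@]}")

# The slow first process is included, so the shell waited for the
# whole job and not just for its last process.
echo "per-process statuses: ${statuses[*]}"
```

The status of the slow first process only shows up because the shell
did not return from the pipeline until every process in it had exited.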
> I guess there is a race condition around the wait. On a system with
> a very high load average, bash may receive SIGCHLD before it calls
> wait_for(). After sigchld_handler() has reaped "short_task", bash
> will call wait_for() and end up waiting for "long_task".
waitchld() is responsible for setting the status of the job. It marks
the status of the process it reaps as not running (saving the exit
status), and, if all processes in the job have exited, it marks the
job as dead.
If the SIGCHLD occurs before wait_for is called, waitchld() will already
have set the job's status.
The first thing wait_for does is check that the child is running. If
it's not, then it simply collects its exit status. That takes care of
one race condition.
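
The same behavior can be observed from a script, as a user-level analogue
rather than a test of wait_for itself. In this sketch the child (a
hypothetical subshell that exits with status 5) is given time to exit
before the script waits for it, and the saved status is still reported.

```shell
#!/bin/bash
# Hypothetical child that exits immediately with a recognizable status.
( exit 5 ) &
pid=$!

# Give the child time to exit, so the shell has normally seen it
# terminate before the wait below is reached.
sleep 1

# The child is no longer running; wait reports the status saved earlier.
wait "$pid"
status=$?
echo "saved status: $status"
```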
If the SIGCHLD happens between the time that wait_for finds that the
child process is running and the time it calls waitchld(), there might
be a problem. There's a very small window, before wait_for sets
waiting_for_job to 1 and calls waitchld(), in which a SIGCHLD may arrive.
The SIGCHLD handler won't call waitchld() if wait_for has already called
it (they rendezvous on `waiting_for_job').
(Bash-2.05 has some minor changes here to avoid calling waitpid() when
the kernel doesn't think there are any running child processes, even
if bash does.)
> There will be another phenomenon as well. If "long_task" exits before
> "short_task", the $? reported for "short_task" will be the exit
> status of "long_task".
I'm not sure that this is true.
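
One way to check is a small script along these lines. It is a sketch
under stated assumptions: long_task and short_task here are hypothetical
stand-ins with made-up exit statuses (7 and 3), arranged so that the
background command exits well before the foreground one.

```shell
#!/bin/bash
# Hypothetical stand-ins for the commands in the report. Both bodies
# are subshells, so each one runs in a forked child.
long_task()  ( exit 7 )            # background job, exits almost at once
short_task() ( sleep 1; exit 3 )   # foreground command, exits later

run_script() {
    long_task &      # started in the background, not waited for
    short_task       # foreground; the shell waits for this child
    return $?        # the script in the report does "exit $?" here
}

run_script
status=$?
wait                 # reap the background job before leaving
echo "status seen after short_task: $status"
```

If the status printed is 3, the background job's earlier exit did not
replace the foreground command's status. A single run says nothing
about a narrow timing window, so this would need to be repeated under
load to be meaningful.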
Thanks for the message; there are definitely things to look at for
the next release after bash-2.05.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
( ``Discere est Dolere'' -- chet)
Chet Ramey, CWRU chet@po.CWRU.Edu http://cnswww.cns.cwru.edu/~chet/