potential bash bug, weird script behavior, Linux, SIGCHLD
From: Ingo Molnar
Subject: potential bash bug, weird script behavior, Linux, SIGCHLD
Date: Wed, 5 Dec 2007 21:56:24 +0100
User-agent: Mutt/1.5.17 (2007-11-01)
Oleg Nesterov has distilled a very simple (and reproducible) testcase
below for what appears to be a potential long-existing bash bug. This is
a problem that triggers on Linux quite frequently. (i can also send the
configs.tar.bz2 testcase i made - but i think Oleg's is far simpler) I
used bash-3.2-19.fc8 for my tests, on Linux 2.6.24-0.39.rc3.git1.fc9.
Ingo
----- Forwarded message from Oleg Nesterov <oleg@tv-sign.ru> -----
Date: Mon, 3 Dec 2007 18:42:51 +0300
From: Oleg Nesterov <oleg@tv-sign.ru>
To: Ingo Molnar <mingo@elte.hu>
Subject: Re: weird script behavior, signals?
Cc: Jan Kratochvil <jkratoch@redhat.com>,
Roland McGrath <roland@redhat.com>
On 12/03, Ingo Molnar wrote:
>
> here's a fresh incident that is 100% reproducible. I constructed the
> following simple oneliner script to analyze saved kernel config files:
>
> for N in `grep 'is not set' config* | cut -d\# -f2- | cut -d' ' -f2 |
> sort | uniq`; do printf "%10d %s\n" `grep "$N=y" config* | wc -l` $N; done
>
> the script starts printing results like this:
>
> [...]
> 30 CONFIG_B43LEGACY_DEBUG
> 15 CONFIG_B43LEGACY_DMA_AND_PIO_MODE
> 18 CONFIG_B43LEGACY_DMA_MODE
> 19 CONFIG_B43LEGACY_PIO_MODE
> 21 CONFIG_B43_DEBUG
> 15 CONFIG_B43_DMA_AND_PIO_MODE
> 17 CONFIG_B43_DMA_MODE
> 6 CONFIG_B43_PCMCIA
> [...]
>
> now if i Ctrl-C the script, i get:
>
> -bash: printf: CONFIG_AFS_FS: invalid number
>
> if i Ctrl-Z the script, i get hung output, due to:
>
> |-login(2068)---bash(2306)---bash(10838)-+-grep(10839)
> | `-wc(10840)
>
> both grep and wc are in T+ state:
>
> mingo 10839 0.0 0.0 6088 676 tty2 T+ 06:14
> mingo 10840 0.0 0.0 3800 428 tty2 T+ 06:14 0:00 wc -l
>
> is this signal behavior really expected? I cannot kill the script - i
I assume you can still kill it by running "kill" from another console, yes?
> have to manually kill the wc and grep tasks and then have to wait until
> it's finished. Is this normal?
Looks like a bash bug to me.
$ echo `echo >&2 XXX; sleep 10000`
$ ps ax
...
2549 tty1 S 0:00 -bash
2550 tty1 S+ 0:00 sleep 10000
...
Small note: the job control rules are black magic to me, so I assume it
is correct that "sleep" is in the foreground process group but "bash" is not.
This -bash, btw, is a child of the login shell; it executes the `...`.
$ cat /proc/2549/status
...
ShdPnd: 0000000000000000
SigBlk: 0000000000010000
...
No pending signals, but SIGCHLD is blocked (0x10000 is bit 16, i.e. signal
17). I think this is the reason.
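
For readers who want to check the mask by hand, here is a minimal sketch that lists the signal numbers set in a hex mask such as the SigBlk or ShdPnd values from /proc/PID/status. The helper name decode_sigmask is made up for this illustration. It assumes bit N of the mask corresponds to signal N+1, and that SIGCHLD is signal 17, which holds on x86 Linux but not on every architecture.

```sh
# decode_sigmask HEXMASK
# Print, one per line, the signal numbers whose bits are set in HEXMASK
# (a value such as the SigBlk or ShdPnd field of /proc/PID/status).
# Hypothetical helper; assumes bit N of the mask means signal N+1.
# The body runs in a subshell so its variables do not leak to the caller.
decode_sigmask() (
    mask=$((0x$1))
    bit=0
    while [ "$bit" -lt 64 ]; do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            echo $((bit + 1))
        fi
        bit=$((bit + 1))
    done
)

# The SigBlk value shown above:
decode_sigmask 0000000000010000
```

On a system where "kill -l 17" reports CHLD, the number printed for this mask is SIGCHLD.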
$ cat /proc/2549/wchan; echo
do_wait
Now I press Ctrl-Z, SIGTSTP goes to "sleep" and stops it.
$ cat /proc/2550/status
...
State: T (stopped)
...
"sleep" notifies the parent,
$ cat /proc/2549/status
...
ShdPnd: 0000000000010000
SigBlk: 0000000000010000
...
Note the pending SIGCHLD. But it is blocked, so signal_pending() is not true.
do_notify_parent_cldstop() does __wake_up_parent() anyway, but this doesn't
help because, according to strace, "bash" does waitpid(-1, 0xafd37628, 0).
So do_wait() was called with options == WEXITED only (no WUNTRACED): a
stopped child is not reported, and it blocks again after the wakeup.
This is correct because !signal_pending().
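
The "pending but blocked" state described above can be tested from the two hex fields. This is only a sketch: sig_in_mask and pending_but_blocked are hypothetical helper names, and the signal number 17 for SIGCHLD is an assumption that holds on x86 Linux.

```sh
# sig_in_mask HEXMASK SIGNO
# Succeeds (exit status 0) if signal number SIGNO is set in HEXMASK.
# Hypothetical helper; assumes bit N of the mask means signal N+1.
sig_in_mask() {
    [ $(( (0x$1 >> ($2 - 1)) & 1 )) -eq 1 ]
}

# pending_but_blocked SHDPND SIGBLK SIGNO
# Succeeds if SIGNO is set in both the pending mask and the blocked mask.
pending_but_blocked() {
    sig_in_mask "$1" "$3" && sig_in_mask "$2" "$3"
}

# ShdPnd and SigBlk as shown for pid 2549 after Ctrl-Z; 17 is assumed
# to be SIGCHLD.
if pending_but_blocked 0000000000010000 0000000000010000 17; then
    echo "signal 17 is pending but blocked"
fi
```

With the values from before Ctrl-Z (ShdPnd all zeros) the same check fails, which matches the "no pending signals" observation earlier.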
Unless I missed something, perhaps this should be reported to bash developers?
Oleg.
----- End forwarded message -----