bug-bash

echo interrupted by SIGCHLD from a dying coprocess


From: Tomáš Trnka
Subject: echo interrupted by SIGCHLD from a dying coprocess
Date: Wed, 24 Mar 2010 20:30:00 +0100
User-agent: KMail/1.13.1 (Linux/2.6.32.9-70.fc12.x86_64; KDE/4.4.1; x86_64; ; )

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu'
-DCONF_VENDOR='unknown' -DLOCALEDIR='/home/trnka/opt/share/locale'
-DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -g -O2
uname output: Linux a324-2 2.6.24.2 #1 SMP Wed Feb 20 12:36:17 CET 2008 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.1
Patch Level: 2
Release Status: release

Description:
I've started using coprocesses heavily and have found a nasty problem related 
(but not limited) to them: after the coprocess finishes its job, the resulting 
SIGCHLD is not properly blocked by bash's signal-processing logic and interferes 
with script I/O. In my case, I've been using something like:

read var1 var2 < <( a | long | pipeline | here)
echo "var1=$var1"
echo "var2=$var2"

Sometimes the SIGCHLD arrived just as one of the echo commands was writing its 
output, and the result was:
echo: write error: Interrupted system call
As this is a race, it occurs only when the stars are right, i.e. during normal 
usage the probability of the SIGCHLD hitting exactly the echo is quite low. 
However, as soon as anything makes the I/O take significantly longer, the bug 
appears. I've been hitting it quite often (30%?) when running the script over 
SSH.
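
A possible script-level workaround, sketched here under assumptions and not 
taken from the original report: capture the pipeline output with command 
substitution instead of process substitution. Command substitution does not 
return until the pipeline has exited, so the children should already be reaped 
before any later echo runs. The function name read_counts is hypothetical, and 
this has not been tested against the affected bash versions.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original report.
# Runs the same pipeline as the reproduction script, but synchronously:
# $(...) only returns once the whole pipeline has finished, so no child
# is left to die (and raise SIGCHLD) while a later echo is writing.
read_counts() {
    local out
    out=$(echo "blabla" | wc | tr -s " " "\n" | tail -n 2 | tr "\n" " ")
    # Feed the captured text to read; sets the globals tmp and tmp2.
    read -r tmp tmp2 <<EOF
$out
EOF
}
```

After calling read_counts, echo "$tmp" and echo "$tmp2" can be used as before. 
This only sidesteps the race for this usage pattern; it does not address the 
underlying signal handling in the echo builtin.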

This bug was probably reported years ago here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=382798

Repeat-By:
I've reduced one of my scripts to this (nothing exceptionally intelligent, but 
it does the job):

#!/bin/bash

while [[ 1 ]]; do
set +e
read tmp tmp2 < <( echo "blabla" | wc | tr -s " " "\n" | tail -n 2 | tr "\n" " ")
set -e
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
echo $tmp
echo $tmp2
done

Using this script I can reliably reproduce the bug (i.e. get an "Interrupted 
system call" error) with bash 4.1.2 (compiled myself from the vanilla tarball) 
and 3.1.17 (Debian lenny) over SSH, and with 4.0.35 (stock Fedora 12) under 
strace.
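
To put a number on how often the race is hit, a small harness along these 
lines could be used. It is only a sketch: count_eintr_runs is a hypothetical 
helper, and it assumes the command under test terminates on its own (the loop 
above would need an iteration bound for that).

```shell
#!/bin/bash
# Hypothetical harness, not part of the original report.
# Usage: count_eintr_runs RUNS CMD [ARGS...]
# Runs CMD RUNS times and prints how many of those runs reported the
# "Interrupted system call" write error on stderr.
count_eintr_runs() {
    local runs=$1 hits=0 i=0
    shift
    while [ "$i" -lt "$runs" ]; do
        # Route stderr into the pipe, discard stdout, and look for the message.
        if "$@" 2>&1 >/dev/null | grep -q 'Interrupted system call'; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
    done
    echo "$hits"
}
```

For example, count_eintr_runs 100 ssh somehost ./repro-bounded.sh would print 
the number of failing runs out of 100, where somehost and repro-bounded.sh are 
placeholders for a remote machine and a bounded copy of the script.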

Fix:
Applying the following simple patch (against 4.1.2) fixes the bug:

--- builtins/echo.def.orig      2010-03-24 19:40:54.000000000 +0100
+++ builtins/echo.def   2010-03-24 19:47:07.000000000 +0100
@@ -27,6 +27,7 @@
 
 #include "../bashansi.h"
 
+#include <signal.h>
 #include <stdio.h>
 #include "../shell.h"
 
@@ -108,6 +109,7 @@
 {
   int display_return, do_v9, i, len;
   char *temp, *s;
+  sigset_t nmask, omask;
 
   do_v9 = xpg_echo;
   display_return = 1;
@@ -159,6 +161,10 @@
 
   clearerr (stdout);   /* clear error before writing and testing success */
 
+  sigemptyset(&nmask);
+  sigaddset(&nmask, SIGCHLD);
+  sigprocmask(SIG_BLOCK, &nmask, &omask);
+
   terminate_immediately++;
   while (list)
     {
@@ -193,6 +199,8 @@
   if (display_return)
     putchar ('\n');
 
+  sigprocmask(SIG_SETMASK, &omask, NULL);
+
   terminate_immediately--;
   return (sh_chkwrite (EXECUTION_SUCCESS));
 }




