lost output from asynchronous lists
From: Ralf Wildenhues
Subject: lost output from asynchronous lists
Date: Mon, 27 Oct 2008 23:12:24 +0100
User-agent: Mutt/1.5.18 (2008-05-17)
Hello bug-bash readers,
in Autoconf, we've stumbled over a problem that has us a bit lost.
More precisely, it's output from asynchronous lists that gets lost
somewhere. I can't say whether it's due to the shell (it happens
with bash, pdksh, zsh, dash), due to the operating system (seen on
GNU/Linux, Solaris, and others), or maybe even a simple programming
error in our script. But it's annoying as it prevents simple
parallelism in a shell-based testsuite framework (Autotest) from
working nicely.
If you have an idea where to start looking, that would be great!
Below are a couple of scripts that, for me, seem to provoke the same
failure on a GNU/Linux system (uniprocessor as well as multiprocessor).
A script output of the original failure can be found here:
<http://buildbot.proulx.com:9003/amd64-gnu-linux/builds/961/step-test/0>
If you would like to see the original code in action, grab current git
Autoconf (you need Autoconf 2.60 preinstalled in order to bootstrap):
  git clone git://git.savannah.gnu.org/autoconf.git
  cd autoconf
  autoreconf -vi
  ./configure
  make
and run the respective test until it fails:
  while make check TESTSUITEFLAGS='-k "parallel test execution"';
  do :; done
then find in tests/testsuite.dir/XXX the remains (XXX being somewhere
close to 143). The above seems to fail roughly one out of 10 times or
more often.
Now, here's the small example that seems to reproduce the original
failure (but fails considerably less often):
--- foo.sh ---
#! /bin/sh
do_work ()
{
  sleep 1
  echo "work $i is done"
}
for i in 1 2 3 4 5 6 7 8 9 10
do
  (
    do_work $i
  ) &
done
wait
--- bar.sh ---
#! /bin/sh
./foo.sh > stdout 2>stderr
echo stdout:; cat stdout
test `grep -c done stdout` -eq 10 || { echo "FAILED"; exit 1; }
---
Run
while ./bar.sh; do :; done
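To get a feel for how often the failure shows up, the loop above can be
wrapped in a small counting helper. This is only a sketch:
run_until_failure and its iteration limit are hypothetical names that are
not part of the original scripts, and the command's output is discarded
on the assumption that bar.sh leaves its stdout file behind for
inspection after the failing run.

```shell
#! /bin/sh
# Hypothetical helper (not part of the original report).
# run_until_failure MAX COMMAND [ARGS...]
# Run COMMAND repeatedly until it fails or MAX runs have succeeded,
# then print how many runs succeeded before the loop stopped.
run_until_failure ()
{
  max=$1; shift
  n=0
  while test "$n" -lt "$max"
  do
    # Discard the command's own output; only its exit status matters here.
    "$@" >/dev/null 2>&1 || break
    n=`expr $n + 1`
  done
  echo "$n successful runs"
}
```

For example, "run_until_failure 1000 ./bar.sh" would stop at the first
failing iteration, or after 1000 clean ones.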
You may have to add some load to your system to provoke failure.
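One possible way to add such load, sketched under the assumption that a
few CPU-bound busy loops are enough (other kinds of load may provoke the
failure more or less readily). The names start_load, stop_load, and
load_pids are hypothetical and not part of the original scripts.

```shell
#! /bin/sh
# Hypothetical helpers (not part of the original report).

# start_load N: start N busy-looping background subshells and
# remember their process IDs in $load_pids.
start_load ()
{
  load_pids=
  j=0
  while test "$j" -lt "$1"
  do
    ( while :; do :; done ) &
    load_pids="$load_pids $!"
    j=`expr $j + 1`
  done
}

# stop_load: terminate the busy loops started by start_load.
stop_load ()
{
  for pid in $load_pids
  do
    kill "$pid" 2>/dev/null
  done
  # Reap the killed subshells; their nonzero exit status is expected.
  wait $load_pids 2>/dev/null || :
  load_pids=
}
```

A run under load could then look like "start_load 4", followed by the
while loop above, followed by "stop_load".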
For example, one failure looked like this:
[... lots of output ...]
stdout:
work 1 is done
work 2 is done
work 3 is done
work 7 is done
work 4 is done
work 5 is done
work 8 is done
work 9 is done
work 10 is done
FAILED
Another one, after dozens of iterations, dropped a lot more:
stdout:
work 2 is done
work 10 is done
FAILED
Cheers, and thanks for maintaining bash!
Ralf