testsuite failure - 193 parallel execution

bug-autoconf

From:	Eric Blake
Subject:	testsuite failure - 193 parallel execution
Date:	Tue, 20 Jul 2010 11:20:52 -0600
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.10) Gecko/20100621 Fedora/3.0.5-1.fc13 Lightning/1.0b2pre Mnenhy/0.8.3 Thunderbird/3.0.5

Test 193 fails about 3 runs out of 10 on my moderately loaded
machine; in every case, the failure is due to unexpected output on
stderr, such as:

$ sh ./micro-suite -j4
## -------------------------------------------------------------- ##
## GNU Nonsense 1.0 test suite: suite to test parallel execution. ##
## -------------------------------------------------------------- ##

  1: test number 1                                   ok
  2: test number 2                                   ok
  3: test number 3                                   ok
  4: test number 4                                   ok
  7: test number 7                                   ok
  8: test number 8                                   ok
  5: test number 5                                   ok
  6: test number 6                                   ok
./micro-suite: line 1726: echo: write error: Broken pipe
./micro-suite: line 4: echo: write error: Broken pipe

## ------------- ##
## Test results. ##
## ------------- ##

All 8 tests were successful.


Looking closer, both line numbers correspond to
 echo token >&6
lines.  One of them is inside the trap that starts at line 1711; bash
reports $LINENO in a trap relative to the start of the trap rather than
to the overall script, which is why that message says line 4.

This looks like a race in the parallel tests: we close fd 6 before the
last few subshells have finished writing 'token' into it.  In that case
bash warns about the failed write (EPIPE), and the PIPE trap is then
triggered, where the second echo is attempted and also fails with
EPIPE, producing the second warning.
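
The failure mode can be reproduced outside of Autotest.  The following
is only a minimal sketch, not the generated testsuite code: it assumes a
writer that traps PIPE and a reader that goes away before the write
happens.  The file names under $dir and the one-second delay are made up
for illustration.

```shell
#!/bin/sh
# Sketch: a writer with a PIPE trap writes after the reader has exited.
# Because the signal is trapped, the shell survives SIGPIPE, and the
# echo builtin itself sees the failed write and returns nonzero.
dir=$(mktemp -d) || exit 1

(
  trap 'echo "PIPE trap ran" >&2' PIPE
  sleep 1                          # give the reader time to exit
  if echo token; then              # nobody reads the pipe any more
    echo 0 > "$dir/status"
  else
    echo 1 > "$dir/status"         # the write failed
  fi
) 2> "$dir/stderr" | true          # the reader exits immediately

writer_status=$(cat "$dir/status")
echo "writer saw a failed write: $writer_status"
cat "$dir/stderr"                  # diagnostics differ between shells
rm -rf "$dir"
```

The exact wording of the diagnostic depends on the shell; the point is
only that the write fails once the reading side is gone.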

I'm thinking about the following patch, but am not comfortable pushing
it without some review.  The idea is that we should not close the token
collector fd until we know that no subshells will try to write into the fd.

diff --git i/lib/autotest/general.m4 w/lib/autotest/general.m4
index e27d601..0b17e79 100644
--- i/lib/autotest/general.m4
+++ w/lib/autotest/general.m4
@@ -1425,8 +1425,8 @@ dnl           kill -13 $$
       read at_token
     done <&AT_JOB_FIFO_FD
   fi
-  exec AT_JOB_FIFO_FD<&-
   wait
+  exec AT_JOB_FIFO_FD<&-
 else
   # Run serially, avoid forks and other potential surprises.
   for at_group in $at_groups; do
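
To illustrate the ordering the patch aims for, here is a standalone
sketch rather than the code in general.m4.  It assumes children that
hold only a write end of the FIFO; the names fifo and ok.N and the job
count of 3 are hypothetical.  The parent waits for all children before
it closes fd 6, so every 'echo token >&6' still has a reader.

```shell
#!/bin/sh
# Sketch: close the token fd only after every child has finished writing.
dir=$(mktemp -d) || exit 1
fifo=$dir/fifo
mkfifo "$fifo" || exit 1

# Open read-write so that the open does not block waiting for a peer.
exec 6<>"$fifo"

for job in 1 2 3; do
  (
    exec 6>"$fifo"                 # child keeps a write-only end
    # ... a test group would run here ...
    echo token >&6 && : > "$dir/ok.$job"
  ) &
done

wait                               # first: let every child write its token
tokens=0
for job in 1 2 3; do               # the tokens are buffered in the FIFO
  read token <&6 && tokens=$((tokens + 1))
done
exec 6<&-                          # only now close the collector fd

successes=$(ls "$dir" | grep -c '^ok\.')
echo "tokens=$tokens successes=$successes"
rm -rf "$dir"
```

If the 'exec 6<&-' were moved above the 'wait', a child that has not yet
written its token could lose its only reader, which is the situation
described above.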



-- 
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
