bug-autoconf

Re: testsuite failure - 193 parallel execution


From: Eric Blake
Subject: Re: testsuite failure - 193 parallel execution
Date: Tue, 20 Jul 2010 12:01:30 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.10) Gecko/20100621 Fedora/3.0.5-1.fc13 Lightning/1.0b2pre Mnenhy/0.8.3 Thunderbird/3.0.5

On 07/20/2010 11:55 AM, Ralf Wildenhues wrote:
> Hi Eric,
> 
> thanks for analyzing and tracking this down!
> 
>> It seems like a race in parallel tests - we are closing fd 6 prior to
>> the last few subshells being permitted to finish writing 'token' into fd
>> 6, and bash warns about the EPIPE failure to write in that case,
>> followed by triggering the PIPE trap, where the second echo is attempted
>> and also warns about the EPIPE failure to write.
> 
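
For reference, here is a minimal sketch of the ordinary way such a write failure arises: a write to a fifo whose last reader has already gone away. It is an illustration only, not taken from the testsuite; fd 7, the temporary fifo, and the `status` variable are all made up for the example, and SIGPIPE is deliberately ignored so the failure shows up as a failed write rather than a killed shell.

```sh
#!/bin/sh
# Illustration only: a write to a fifo that has no reader left.
# fd 7 and the temporary fifo are hypothetical, not from the testsuite.
trap '' PIPE                 # ignore SIGPIPE so the write fails (EPIPE) instead of killing us

tmp=$(mktemp -d) || exit 1
mkfifo "$tmp/fifo" || exit 1

( exec <"$tmp/fifo" ) &      # reader: open the fifo, then exit at once
exec 7>"$tmp/fifo"           # writer: this open blocks until the reader has opened
wait                         # the only reader is now gone

if echo token >&7 2>/dev/null; then
  status=written
else
  status=failed              # the write was rejected because nobody can read it
fi
exec 7>&-
echo "$status"
rm -rf "$tmp"
```

Without the `trap '' PIPE` line the shell would normally be killed by SIGPIPE at the `echo`; a shell that traps PIPE instead runs its handler, which matches the sequence described above.
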
> I'm not sure I follow this reasoning completely.  At the time the master
> closes the fd, it should have read back all tokens.  Why would any of
> the workers try to write to the fd after that?  And if they don't need
> to write any more data, why should close generate a SIGPIPE?

In looking at it further, I'm lost as to where the SIGPIPE is coming
from; maybe we're facing a bash bug?  Your point, that we should have
read back all the tokens already, seems valid, particularly since fifo
writes of a short token (at most PIPE_BUF bytes) should be atomic.
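
To make the expectation concrete, here is a rough sketch of a token-passing job limiter over a fifo. It is a hypothetical stand-in, not the actual Autotest code: the fd number 6 matches the discussion, but `jobs_max`, `total`, the log file, and the loop structure are assumptions made for illustration. The point is that the master reads back exactly one token per worker and reaps the workers before it closes the fd.

```sh
#!/bin/sh
# Hypothetical token-passing job limiter; an illustration, not Autotest's code.
tmp=$(mktemp -d) || exit 1
fifo=$tmp/tokens
mkfifo "$fifo" || exit 1

# Open the fifo read-write so the open does not block waiting for a peer.
# (This works on Linux; POSIX does not define O_RDWR on a fifo.)
exec 6<>"$fifo"

jobs_max=4     # assumed parallelism, as with -j4
total=16       # assumed number of test groups
running=0

i=1
while [ "$i" -le "$total" ]; do
  if [ "$running" -ge "$jobs_max" ]; then
    read token <&6                   # block until a worker hands its slot back
    running=$((running - 1))
  fi
  (
    echo "test $i ok" >>"$tmp/log"   # stand-in for running one test group
    echo token >&6                   # return the slot to the master
  ) &
  running=$((running + 1))
  i=$((i + 1))
done

# Drain one token per outstanding worker, reap them, and only then close.
while [ "$running" -gt 0 ]; do
  read token <&6
  running=$((running - 1))
done
wait
exec 6>&-

count=$(grep -c '^test' "$tmp/log")
echo "ran $count of $total"
rm -rf "$tmp"
```

In this sketch every worker writes its token before the master can finish draining, so nothing should be left to write by the time fd 6 is closed, and the final line should report all 16 groups.
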

> 
>> I'm thinking about the following patch, but am not comfortable pushing
>> it without some review.  The idea is that we should not close the token
>> collector fd until we know that no subshells will try to write into the fd.
> 
> The patch seems fairly safe in that it shouldn't hurt.  How do multiple
> runs of the test fare on your moderately-loaded system with it?

Unfortunately, swapping those two lines didn't seem to help:

$ for i in $(seq 10); do sh ./micro-suite -j4 > stdout || { echo fail on $i; break; }; grep 'All 16' stdout || { echo fail2 on $i; break; }; done
All 16 tests were successful.
All 16 tests were successful.
All 16 tests were successful.
All 16 tests were successful.
All 16 tests were successful.
All 16 tests were successful.
All 16 tests were successful.
All 16 tests were successful.
All 16 tests were successful.
fail2 on 10
$ cat stdout
## -------------------------------------------------------------- ##
## GNU Nonsense 1.0 test suite: suite to test parallel execution. ##
## -------------------------------------------------------------- ##

  1: test number 1                                   ok
  3: test number 3                                   ok
  4: test number 4                                   ok
  2: test number 2                                   ok
  6: test number 6                                   ok
  7: test number 7                                   ok
  5: test number 5                                   ok
  8: test number 8                                   ok

## ------------- ##
## Test results. ##
## ------------- ##

All 8 tests were successful.

Somehow, we're still getting short-changed: only 8 of the 16 tests
were run, although this time no Pipe failed message appeared.

-- 
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org


