Re: testsuite failure - 193 parallel execution

From: Paul Eggert
Subject: Re: testsuite failure - 193 parallel execution
Date: Tue, 20 Jul 2010 14:05:26 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20100527 Thunderbird/3.0.5

On 07/20/10 12:27, Ralf Wildenhues wrote:

> Good point, but wouldn't that be at least a QoI issue for the shell?

I'd think so, yes.

Staring at the code some more, I see another race condition
that could explain the problem.  Suppose the parent, just
after the last fork, races ahead of the child and jumps
ahead to the "read at_token" cleanup loop.  The parent then
executes the last "read at_token" cleanup at a point
where the second-to-the-last child has already output
its token, but before the second-to-the-last child has
closed the fifo.  The "read at_token" will then return 0 (because
it sees end-of-file), but the parent incorrectly thinks
that it has seen a token and then closes down the fifo
before the last child gets a chance to write its token.
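
The end-of-file behavior this guess relies on can be reproduced in isolation. The following is a minimal sketch, not Autotest code: the function name count_tokens, the use of fd 6, and the temporary paths are assumptions made for illustration. A single writer puts one token into a fifo and exits; the reader then attempts two reads. The second read hits end-of-file and fails, which a loop that ignores the status of "read" would silently count as a token.

```shell
#!/bin/sh
# Hypothetical demo (not Autotest code): one writer, two read attempts.
count_tokens () {
  dir=$(mktemp -d) || return 1
  fifo=$dir/fifo
  mkfifo "$fifo" || return 1

  # The only writer: opens the fifo, writes one token, closes it on exit.
  (echo token >"$fifo") &

  # Opening for reading blocks until the writer above has opened its end.
  exec 6<"$fifo"

  # Wait for the writer to exit so that no process holds the write end.
  wait

  seen=0 eof=0
  for attempt in 1 2; do
    if read at_token <&6; then
      seen=$((seen + 1))   # a real token arrived
    else
      eof=$((eof + 1))     # end-of-file: no writer left, nothing buffered
    fi
  done

  exec 6<&-
  rm -rf "$dir"
  echo "seen=$seen eof=$eof"
}
```

Calling count_tokens prints "seen=1 eof=1". A cleanup loop that runs "read at_token" without checking its status would treat both iterations as tokens received.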

If this guess is right, the following (untested) patch
might fix the problem.  The basic idea is to open the
fifo just once for reading and once for writing in the
parent, so that no child needs to open a fifo and no
child is left behind.
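
Outside of Autotest, the basic idea might look like the sketch below. It is an illustration under stated assumptions, not the patch itself: run_jobs, the fd numbers 6 and 7, and the temporary paths are hypothetical names. One deliberate difference from the patch is that the write end is opened first with "<>" (read-write), because a plain read-only open of a fifo blocks until some writer exists. POSIX leaves read-write opens of a fifo undefined, so this assumes a system such as Linux where they succeed without blocking.

```shell
#!/bin/sh
# Hypothetical sketch: the parent opens the fifo once for each direction,
# and the children only inherit the already-open write descriptor.
run_jobs () {
  njobs=$1
  dir=$(mktemp -d) || return 1
  fifo=$dir/fifo
  mkfifo "$fifo" || return 1

  # Assumption: a read-write open of a fifo does not block (true on Linux).
  exec 7<>"$fifo"
  # A writer (fd 7) now exists, so this read-only open returns at once.
  exec 6<"$fifo"

  i=0
  while test "$i" -lt "$njobs"; do
    (
      exec 6<&-     # children never read tokens
      # ... the real work of the job would go here ...
      echo >&7      # hand back one token (an empty line)
    ) &
    i=$((i + 1))
  done

  # Drop the parent's write end; each child still holds its own copy.
  exec 7>&-

  # Every read either gets a token or, once all children are gone, EOF.
  tokens=0
  while read at_token <&6; do
    tokens=$((tokens + 1))
  done
  exec 6<&-

  wait
  rm -rf "$dir"
  echo "$tokens"
}
```

For example, "run_jobs 3" prints 3. Because no child opens the fifo itself, end-of-file can only appear after every child has exited, so the parent cannot mistake an early close for a token.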

--- general.m4  2010-07-20 11:12:58.055141603 -0700
+++ /tmp/general.m4     2010-07-20 13:59:28.607141344 -0700
@@ -959,7 +959,8 @@ export PATH
 # Setting up the FDs.
 m4_define([AS_MESSAGE_LOG_FD], [5])
-m4_define([AT_JOB_FIFO_FD], [6])
+m4_define([AT_JOB_INFIFO_FD], [6])
+m4_define([AT_JOB_OUTFIFO_FD], [7])
 [#] AS_MESSAGE_LOG_FD is the log file.  Not to be overwritten if `-d'.
 if $at_debug_p; then
@@ -1366,6 +1367,9 @@ dnl cause changed test semantics; e.g.,
   at_joblist=`AS_ECHO([" $at_groups_all "]) | \
     sed 's/\( '$at_jobs'\) .*/\1/'`
+  exec AT_JOB_INFIFO_FD<"$at_job_fifo"
+  exec AT_JOB_OUTFIFO_FD>"$at_job_fifo"
   set X $at_joblist
   for at_group in $at_groups; do
@@ -1376,7 +1380,7 @@ dnl avoid all the status output by the s
       # Start one test group.
-      exec AT_JOB_FIFO_FD>"$at_job_fifo"
+      exec AT_JOB_INFIFO_FD<&-
 dnl When a child receives PIPE, be sure to write back the token,
 dnl so the master does not hang waiting for it.
 dnl errexit and xtrace should not be set in this shell instance,
@@ -1386,7 +1390,7 @@ dnl optimize away the _AT_CHECK subshell
 dnl Ignore PIPE signals that stem from writing back the token.
            trap "" PIPE
            echo stop > "$at_stop_file"
-           echo token >&AT_JOB_FIFO_FD
+           echo >&AT_JOB_OUTFIFO_FD
 dnl Do not reraise the default PIPE handler.
 dnl It wreaks havoc with ksh, see above.
 dnl        trap - 13
@@ -1395,26 +1399,24 @@ dnl         kill -13 $$
       if cd "$at_group_dir" &&
         at_fn_test $at_group &&
-        . "$at_test_source" # AT_JOB_FIFO_FD>&-
+        . "$at_test_source" # AT_JOB_OUTFIFO_FD>&-
       then :; else
        AS_WARN([unable to parse test group: $at_group])
-      echo token >&AT_JOB_FIFO_FD
+      echo >&AT_JOB_OUTFIFO_FD
     ) &
-    if $at_first; then
-      at_first=false
-      exec AT_JOB_FIFO_FD<"$at_job_fifo"
-    fi
+    at_first=false
     shift # Consume one token.
    if test $[@%:@] -gt 0; then :; else
-      read at_token <&AT_JOB_FIFO_FD || break
+      read at_token <&AT_JOB_INFIFO_FD || break
       set x $[*]
     test -f "$at_stop_file" && break
+  exec AT_JOB_OUTFIFO_FD>&-
   # Read back the remaining ($at_jobs - 1) tokens.
   set X $at_joblist
@@ -1423,9 +1425,9 @@ dnl           kill -13 $$
     for at_job
       read at_token
-    done <&AT_JOB_FIFO_FD
+    done <&AT_JOB_INFIFO_FD
-  exec AT_JOB_FIFO_FD<&-
+  exec AT_JOB_INFIFO_FD<&-
   # Run serially, avoid forks and other potential surprises.
