[PATCH 1/2] remote: Use `catch' in killing pending force-kills
From: Maciej W. Rozycki
Subject: [PATCH 1/2] remote: Use `catch' in killing pending force-kills
Date: Wed, 20 May 2020 22:22:05 +0100 (BST)
User-agent: Alpine 2.21 (LFD 202 2017-01-01)
Address an execution race in `close_wait_program' by using `catch' when
killing the pending force-kills issued there to recover from a stuck test
case.  If the force-kill sequence has already completed before the command
to kill it gets a chance to run, `kill' fails; with `catch' no error is
thrown and the testsuite run is not interrupted early, as it is here:
PASS: gcc.c-torture/execute/postmod-1.c -O0 (test for excess errors)
Executing on remote-localhost: .../gcc/testsuite/gcc/postmod-1.exe (timeout = 15)
spawn [open ...]
WARNING: program timed out
ERROR: tcl error sourcing .../gcc/testsuite/gcc.c-torture/execute/execute.exp.
ERROR: child process exited abnormally
while executing
"exec sh -c "exec > /dev/null 2>&1 && kill -9 $exec_pid""
(procedure "close_wait_program" line 57)
invoked from within
"close_wait_program $spawn_id $pid wres"
(procedure "local_exec" line 104)
[...]
"uplevel #0 source .../gcc/testsuite/gcc.c-torture/execute/execute.exp"
invoked from within
"catch "uplevel #0 source $test_file_name""
testcase .../gcc/testsuite/gcc.c-torture/execute/execute.exp completed in 196 seconds
=== gcc Summary ===
# of expected passes 1
-- therefore not letting `execute.exp' continue (here with the GCC `c'
testsuite invoked with `execute.exp=postmod-1.c' for 8 compilation and 8
execution tests).
For the race to trigger, the force-kill sequence would have to complete
in the window between the return of the `wait' command, which at worst
happens as a result of the final `kill -9' command in the sequence, and
the `kill -9 $exec_pid' command issued here.  The `sleep 5' command
issued at the end of the force-kill sequence makes such a scenario
unlikely, but it might still happen with a loaded host system, and there
is no drawback to using `catch' here, so let's do it.
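
The failing step can be reproduced outside DejaGnu with a minimal shell
sketch.  It is only an illustration under stated assumptions: `sleep 0 &'
stands in for a force-kill sequence that has already finished (the real
sequence is not reproduced here), and the names `watchdog_pid' and
`status' are hypothetical.  It shows that the same `sh -c' command exits
with a non-zero status once its target is gone, which is the condition
Tcl's `exec' reports as an error unless wrapped in `catch'.

```shell
#!/bin/sh
# Stand-in for the force-kill sequence: a background job that exits at once.
sleep 0 &
watchdog_pid=$!

# Reap it, so that the PID no longer refers to a live process.
wait "$watchdog_pid"

# Same command shape as in close_wait_program.  There is nothing left to
# signal, so `kill' fails and `sh -c' exits with a non-zero status.
if sh -c "exec > /dev/null 2>&1 && kill -9 $watchdog_pid"; then
    status=0
else
    status=$?
fi

echo "kill exit status: $status"
```

A non-zero status here is harmless in itself; it only becomes fatal when
the caller treats it as an error, which is what `catch' prevents.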
* lib/remote.exp (close_wait_program): Use `catch' in killing
pending force-kills.
Signed-off-by: Maciej W. Rozycki <address@hidden>
---
Hi,
I have only observed it in a debug scenario, where an artificial delay
was inserted before the `wait' command referred to in the change
description, while tracking down a testsuite hang with a stuck test case.
As noted, however, the use of `catch' here is otherwise harmless, and
while the likelihood of the scenario where the race triggers might be
epsilon, it is not nil.
Therefore, please apply. FAOD this has been formatted for `git am' use.
Maciej
---
lib/remote.exp | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
dejagnu-remote-close-wait-kill-catch.diff
Index: dejagnu/lib/remote.exp
===================================================================
--- dejagnu.orig/lib/remote.exp
+++ dejagnu/lib/remote.exp
@@ -113,7 +113,10 @@ proc close_wait_program { program_id pid
# We reaped the process, so cancel the pending force-kills, as
# otherwise if the PID is reused for some other unrelated
# process, we'd kill the wrong process.
- exec sh -c "exec > /dev/null 2>&1 && kill -9 $exec_pid"
+ #
+ # Use `catch' in case the force-kills have completed, so as not
+ # to cause TCL to choke if `kill' returns a failure.
+ catch "exec sh -c \"exec > /dev/null 2>&1 && kill -9 $exec_pid\""
}
return $res