[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#40981: Graphical installer tests sometimes hang.
From: |
Mathieu Othacehe |
Subject: |
bug#40981: Graphical installer tests sometimes hang. |
Date: |
Sun, 10 May 2020 13:19:18 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) |
Hey Ludo,
> Woow, good catch!
Thanks :)
>> The work-around here is to save the installed SIGTERM handler and reset
>> it. Then, after forking, the parent can restore the SIGTERM handler. The
>> child will use the default SIGTERM handler that terminates the process.
>
> OK, makes sense. (Another occasion to see how something like
> ‘posix_spawn’ would be more appropriate than fork + exec…)
Didn't know about that function, but it seems way easier to manipulate
and less error prone indeed!
> Shouldn’t it be:
>
> (let ((pgid (getpgid pid)))
> (if (= (getpgid 0) pgid)
> (kill pid signal) ;don't kill ourself
> (kill (-p pgid) signal)))
Yes, I changed it.
> Note: Use the most specific comparison procedure, ‘=’ in this case,
> because we know we’re dealing with numbers (it enables proper type
> checking, better compiler optimizations, etc.). More importantly, ‘eq?’
> performs pointer comparison, so it shouldn’t be used with numbers (in
> practice it works with “fixnums” but not with bignums).
Ok, noted!
>> +# Try to trigger eventual race conditions, when killing a process between
>> fork
>> +# and execv calls.
>> +for i in {1..50}
>> +do
>> + $herd restart test3
>> +done
>
> Would it reproduce the problem well enough?
On a slow machine one time out of two, and on a faster machine,
never. The 'reproducer' I used, was to add a 'sleep' between fork and
exec, it works way better!
Tell me if you think its better to drop it.
> Could you send an updated patch?
Here it is!
> Thanks for the bug hunting and for the patch!
Thanks for the fast review :)
Mathieu
>From 79d3603bf15b8f815136178be8c8a236734a7596 Mon Sep 17 00:00:00 2001
From: Mathieu Othacehe <address@hidden>
Date: Thu, 7 May 2020 18:39:41 +0200
Subject: [PATCH] service: Fix 'make-kill-destructor' when PGID is zero.
When a process is forked, and before its GID is changed in "exec-command",
it will share the parent GID, which is 0 for Shepherd. In that case, use
the PID instead of the PGID.
Also make sure to reset the SIGTERM handler before forking a process. Failing
to do so, will result in stopping Shepherd if a SIGTERM is received between
fork and execv calls. Restore the SIGTERM handler once the process has been
forked.
* modules/shepherd/service.scm (fork+exec-command): Save the current SIGTERM
handler and reset it before forking. Then, restore it on the parent after
forking.
(make-kill-destructor): Handle the case when PGID is zero, between the process
fork and exec.
---
modules/shepherd/service.scm | 33 ++++++++++++++++++++++++++-------
tests/forking-service.sh | 18 ++++++++++++++++++
2 files changed, 44 insertions(+), 7 deletions(-)
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 8604d2f..45fcf32 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -5,6 +5,7 @@
;; Copyright (C) 2016 Alex Kost <address@hidden>
;; Copyright (C) 2018 Carlo Zancanaro <address@hidden>
;; Copyright (C) 2019 Ricardo Wurmus <address@hidden>
+;; Copyright (C) 2020 Mathieu Othacehe <address@hidden>
;;
;; This file is part of the GNU Shepherd.
;;
@@ -884,7 +885,18 @@ its PID."
(unless %sigchld-handler-installed?
(sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP)
(set! %sigchld-handler-installed? #t))
- (let ((pid (primitive-fork)))
+
+ ;; When forking a process, the signal handlers are inherited, until it
+ ;; forks. If SIGTERM is received by the forked process, before it calls
+ ;; execv, the installed SIGTERM handler, stopping Shepherd will be called.
+ ;; To avoid this, save the SIGTERM handler, disable it, and restore it once,
+ ;; the process has been forked. This way, the forked process will use the
+ ;; default SIGTERM handler stopping the process.
+ (let ((term-handler (match (sigaction SIGTERM)
+ ((proc . _)
+ proc)))
+ (pid (and (sigaction SIGTERM SIG_DFL)
+ (primitive-fork))))
(if (zero? pid)
(exec-command command
#:user user
@@ -893,7 +905,10 @@ its PID."
#:directory directory
#:file-creation-mask file-creation-mask
#:environment-variables environment-variables)
- pid)))
+ (begin
+ ;; Restore the initial SIGTERM handler.
+ (sigaction SIGTERM term-handler)
+ pid))))
(define* (make-forkexec-constructor command
#:key
@@ -957,11 +972,15 @@ start."
"Return a procedure that sends SIGNAL to the process group of the PID given
as argument, where SIGNAL defaults to `SIGTERM'."
(lambda (pid . args)
- ;; Kill the whole process group PID belongs to. Don't assume that PID
- ;; is a process group ID: that's not the case when using #:pid-file,
- ;; where the process group ID is the PID of the process that
- ;; "daemonized".
- (kill (- (getpgid pid)) signal)
+ ;; Kill the whole process group PID belongs to. Don't assume that PID is
+ ;; a process group ID: that's not the case when using #:pid-file, where
+ ;; the process group ID is the PID of the process that "daemonized". If
+ ;; this procedure is called, between the process fork and exec, the PGID
+ ;; will still be zero (the Shepherd PGID). In that case, use the PID.
+ (let ((pgid (getpgid pid)))
+ (if (= (getpgid 0) pgid)
+ (kill pid signal) ;don't kill ourself
+ (kill (- pgid) signal)))
#f))
;; Produce a constructor that executes a command.
diff --git a/tests/forking-service.sh b/tests/forking-service.sh
index 7126c3d..bd9aac9 100644
--- a/tests/forking-service.sh
+++ b/tests/forking-service.sh
@@ -71,6 +71,17 @@ cat > "$conf"<<EOF
#:pid-file "$PWD/$service2_pid")
#:stop (make-kill-destructor)
#:respawn? #t))
+
+(define %command3
+ '("$SHELL" "-c" "sleep 600"))
+
+(register-services
+ (make <service>
+ ;; A service that forks into a different process.
+ #:provides '(test3)
+ #:start (make-forkexec-constructor %command3)
+ #:stop (make-kill-destructor)
+ #:respawn? #t))
EOF
cat $conf
@@ -113,3 +124,10 @@ sleep 1;
$herd status test2 | grep started
test "`cat $PWD/$service2_started`" = "started
started"
+
+# Try to trigger eventual race conditions, when killing a process between fork
+# and execv calls.
+for i in `seq 1 50`
+do
+ $herd restart test3
+done
--
2.26.2
- bug#40981: Graphical installer tests sometimes hang., Mathieu Othacehe, 2020/05/04
- bug#40981: Graphical installer tests sometimes hang., Mathieu Othacehe, 2020/05/04
- bug#40981: Graphical installer tests sometimes hang., Ludovic Courtès, 2020/05/05
- bug#40981: Graphical installer tests sometimes hang., Mathieu Othacehe, 2020/05/05
- bug#40981: Graphical installer tests sometimes hang., Mathieu Othacehe, 2020/05/06
- bug#40981: Graphical installer tests sometimes hang., Mathieu Othacehe, 2020/05/07
- bug#40981: Graphical installer tests sometimes hang., Ludovic Courtès, 2020/05/10
- bug#40981: Graphical installer tests sometimes hang.,
Mathieu Othacehe <=
- bug#40981: Graphical installer tests sometimes hang., Ludovic Courtès, 2020/05/11