[shepherd] branch master updated: service: Process monitor does not wait for pseudo-process termination.

From: Ludovic Courtès
Subject: [shepherd] branch master updated: service: Process monitor does not wait for pseudo-process termination.
Date: Mon, 20 Nov 2023 17:13:08 -0500

This is an automated email from the git hooks/post-receive script.

civodul pushed a commit to branch master
in repository shepherd.

The following commit(s) were added to refs/heads/master by this push:
     new cc9c5c0  service: Process monitor does not wait for pseudo-process termination.
cc9c5c0 is described below

commit cc9c5c029534458ae547d78200b6b51f729654e3
Author: Ludovic Courtès <>
AuthorDate: Mon Nov 20 22:46:29 2023 +0100

    service: Process monitor does not wait for pseudo-process termination.
    Fixes <>.
    * modules/shepherd/service.scm (PF_KTHREAD): New variable.
    (linux-process-flags, linux-kernel-thread?, pseudo-process?): New
    procedures.
    (process-monitor): ‘await’ does not wait when PID denotes a
    pseudo-process.
    * NEWS: Update.
 NEWS                         |  8 ++++++++
 modules/shepherd/service.scm | 43 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 6e05177..c0eb647 100644
--- a/NEWS
+++ b/NEWS
@@ -20,6 +20,14 @@ Until now, user code could call (@ (guile) sleep), the core Guile binding for
 caused ‘shepherd’ to actually sleep for that time, instead of performing other
 on-going tasks.  ‘sleep’ is now replaced by (@ (fibers) sleep) to avoid that.
+** Do not accidentally wait for Linux kernel thread completion
+   (<>)
+In cases a PID file contained a bogus PID or one that’s only valid in a
+separate PID namespace, shepherd could end up waiting for the termination of
+what’s actually a Linux kernel thread, such as PID 2 (“kthreadd”).  This
+situation is now recognized and avoided.
 * Changes in 0.10.2
 ** ‘shepherd’ loads configuration file asynchronously
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 039d155..16f5868 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -2329,6 +2329,44 @@ otherwise by updating its state."
     (('exception args)
      (apply throw args))))
+(define (linux-process-flags pid)
+  "Return the process flags of @var{pid} (or'd @code{PF_} constants), assuming
+the Linux /proc file system is mounted; raise a @code{system-error} exception
+otherwise."
+  (call-with-input-file (string-append "/proc/" (number->string pid)
+                                       "/stat")
+    (lambda (port)
+      (define line
+        (get-string-all port))
+      ;; Parse like systemd's 'is_kernel_thread' function.
+      (let ((offset (string-index line #\))))     ;offset past 'tcomm' field
+        (match (and offset
+                    (string-tokenize (string-drop line (+ offset 1))))
+          ((state ppid pgrp sid tty-nr tty-pgrp flags . _)
+           (or (string->number flags) 0))
+          (_
+           0))))))
+;; Per-process flag defined in <linux/sched.h>.
+(define PF_KTHREAD #x00200000)                    ;I am a kernel thread
+(define (linux-kernel-thread? pid)
+  "Return true if @var{pid} is a Linux kernel thread."
+  (= PF_KTHREAD (logand (linux-process-flags pid) PF_KTHREAD)))
+(define pseudo-process?
+  (if (string-contains %host-type "linux")
+      (lambda (pid)
+        "Return true if @var{pid} denotes a \"pseudo-process\" such as a Linux
+kernel thread rather than a \"regular\" process.  A pseudo-process is one that
+may never terminate, even after sending it SIGKILL---e.g., kthreadd on Linux."
+        (catch 'system-error
+          (lambda ()
+            (linux-kernel-thread? pid))
+          (const #f)))
+      (const #f)))
 (define (process-monitor channel)
   "Run a process monitor that handles requests received over @var{channel}."
   (let loop ((waiters vlist-null))
@@ -2365,9 +2403,10 @@ otherwise by updating its state."
       (('await pid reply)
        ;; Await the termination of PID and send its status on REPLY.
-       (if (catch-system-error (kill pid 0))
+       (if (and (catch-system-error (kill pid 0))
+                (not (pseudo-process? pid)))
            (loop (vhash-consv pid reply waiters))
-           (begin                                 ;PID is gone
+           (begin                             ;PID is gone or a pseudo-process
              (put-message reply 0)
              (loop waiters)))))))
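
For readers who want to try the parsing step of 'linux-process-flags' without
reading a live /proc entry, here is a minimal sketch in Guile that applies the
same field matching to a string.  It is not part of the commit: the names
'stat-line-flags' and 'kernel-thread-flags?' and the two sample lines are
made up for illustration.  One deliberate difference, stated as an assumption:
the sketch looks for the last closing parenthesis with 'string-rindex', so that
a command name which itself contains ")" would not shift the fields.

```scheme
(use-modules (ice-9 match))

;; Per-process flag defined in <linux/sched.h>, same value as in the commit.
(define PF_KTHREAD #x00200000)

;; Hypothetical helper: extract the 'flags' field from LINE, the contents of
;; a /proc/PID/stat file.  Return 0 when LINE cannot be parsed.
(define (stat-line-flags line)
  ;; Assumption: use the *last* ')' so that a 'comm' field containing
  ;; parentheses does not confuse the field positions.
  (let ((offset (string-rindex line #\))))
    (match (and offset
                (string-tokenize (string-drop line (+ offset 1))))
      ((state ppid pgrp sid tty-nr tty-pgrp flags . _)
       (or (string->number flags) 0))
      (_
       0))))

;; Hypothetical helper: true if FLAGS has the PF_KTHREAD bit set.
(define (kernel-thread-flags? flags)
  (= PF_KTHREAD (logand flags PF_KTHREAD)))

;; Made-up, shortened sample lines in /proc/PID/stat format.
(define kthreadd-line "2 (kthreadd) S 0 0 0 0 -1 2129984 0 0 0 0")
(define shell-line "4321 (bash) S 1 4321 4321 34816 4321 4194304 0 0 0 0")

(display (kernel-thread-flags? (stat-line-flags kthreadd-line)))  ; #t
(newline)
(display (kernel-thread-flags? (stat-line-flags shell-line)))     ; #f
(newline)
```

In the first sample the flags value 2129984 is #x208040, which includes the
PF_KTHREAD bit; in the second, 4194304 is #x400000, which does not.  A line
with no closing parenthesis yields 0 and is therefore treated as a regular
process, the same fallback as in the diff.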
