guix-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[shepherd] 02/03: service: Handle service state in a monitoring agent.


From: Ludovic Courtès
Subject: [shepherd] 02/03: service: Handle service state in a monitoring agent.
Date: Sat, 12 Nov 2022 16:44:26 -0500 (EST)

civodul pushed a commit to branch wip-service-monitor
in repository shepherd.

commit c06fa2f3418671779816de43459b7e6ef17f70c4
Author: Ludovic Courtès <ludo@gnu.org>
AuthorDate: Sat Jun 25 15:27:34 2022 +0200

    service: Handle service state in a monitoring agent.
    
    * modules/shepherd/service.scm (default-service-termination-handler):
    Add 'pid' parameter.  Remove uses of 'service-running-value'.
    (<service>)[running]: Remove slot.
    (service-running-value): Rewrite as an RPC to (current-monitor-channel).
    (start): Replace accesses to the 'running' slot with messages
    to (current-monitor-channel).
    (replace-service): Replace references to the '%services' hash table with
    messages to (current-monitor-channel).
    (stop): Likewise, and remove 'call-with-blocked-asyncs' call.
    (service-monitor, spawn-service-monitor): New procedures.
    (current-monitor-channel): New variable.
    (with-service-monitor): New macro.
    (make-inetd-constructor)[handle-child-termination]: Add 'pid'  parameter
    to 'handle-child-termination'.
    (%services): Remove.
    (fold-services, find-service, lookup-services): Rewrite as an RPC
    to (current-monitor-channel).
    (for-each-service, service-list): Rewrite in terms of 'fold-services'.
    (handle-service-termination): Send message to (current-monitor-channel).
    (respawn-service): Remove references to the 'running' slot and to
    '%services' by messages to (current-monitor-channel).
    (deregister-service): Likewise.
    <top level>: Remove (register-services root-service) call.
    * modules/shepherd/support.scm (primitive-load*): New procedure.
    (load-in-user-module): Use it instead of 'load-in-user-module'.
    (eval-in-user-module): Use 'primitive-eval' instead of 'eval'.
    * tests/inetd.sh: Add "herd eval root '(gc)'" before assignment of
    'initial_fd_count'.
    * modules/shepherd.scm (main): Remove (start root-service) call.
    Wrap body in 'with-service-monitor'.
    * doc/shepherd.texi (Slots of services): Remove 'running' slot.  Adjust
    wording for 'handle-termination'.
    (Methods of services): Adjust wording that referred to the 'running' slot.
    (Service Convenience, Service De- and Constructors): Likewise.
---
 .dir-locals.el               |   3 +-
 doc/shepherd.texi            |  30 +--
 modules/shepherd.scm         |  35 ++--
 modules/shepherd/service.scm | 432 +++++++++++++++++++++++++------------------
 modules/shepherd/support.scm |  19 +-
 5 files changed, 298 insertions(+), 221 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index 3e64a3e..6ffae4c 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -11,6 +11,7 @@
       . "<https?://\\(debbugs\\|bugs\\)\\.gnu\\.org/\\([0-9]+\\)>")))
  (scheme-mode
   . ((indent-tabs-mode . nil)
-     (eval . (put 'with-blocked-signals 'scheme-indent-function 1))))
+     (eval . (put 'with-blocked-signals 'scheme-indent-function 1))
+     (eval . (put 'with-service-monitor 'scheme-indent-function 0))))
  (texinfo-mode    . ((indent-tabs-mode . nil)
                      (fill-column . 72))))
diff --git a/doc/shepherd.texi b/doc/shepherd.texi
index bec87e5..f9f0b79 100644
--- a/doc/shepherd.texi
+++ b/doc/shepherd.texi
@@ -648,19 +648,6 @@ those symbols must be started.  If a required symbol is 
provided by
 several services, one will be started.  By default, this slot
 contains the empty list.
 
-@item
-@vindex running (slot of <service>)
-@cindex Hook for individual services
-@code{running} is a hook that can be used by each service in its own
-way.  The default value is @code{#f}, which indicates that the service
-is not running. When an attempt is made to start the service, it will
-be set to the return value of the procedure in the @code{start} slot.
-It will also be passed as an argument to the procedure in the
-@code{stop} slot.  If it is set a value that is an integer, it is
-assumed to be a process id, and shepherd will monitor the process for
-unexpected exits.  If it is a procedure, that procedure is called to get
-at the underlying value.  This slot cannot be initialized with a keyword.
-
 @item
 @vindex respawn? (slot of <service>)
 @cindex Respawning services
@@ -725,8 +712,8 @@ returns @code{#f} and performs no further actions.
 @vindex handle-termination (slot of <service>)
 @cindex Termination of a service's process.
 The @code{handle-termination} slot contains the procedure to call when
-the process associated with a service---the process whose PID appears in
-the @code{running} slot---terminates.  It is passed the service and its
+the process associated with a service terminates.  It is passed the
+service, the PID of the terminating process, and its
 exit status, an integer as returned by @code{waitpid} (@pxref{Processes,
 @code{waitpid},, guile, GNU Guile Reference Manual}).
 
@@ -783,9 +770,8 @@ required symbol can not be started, it will look for 
another service
 that also provides this symbol, until starting one such service
 succeeds.  There is some room for theoretical improvement here, of
 course, but in practice the current strategy already works very well.
-This method returns the new value of the @code{running} slot
-@ref{Slots of services}, which is @code{#f} if the service could not
-be started.
+This method returns the new ``running value'' of the service,
+@code{#f} if the service could not be started.
 @end deffn
 
 @deffn {method} stop (obj <service>)
@@ -793,7 +779,7 @@ This will stop the service @var{obj}, trying to stop 
services that
 depend in it first, so they can be shutdown cleanly.  If this will
 fail, it will continue anyway.  Stopping of services should usually
 succeed, though.  Otherwise, the behaviour is very similar to the
-@code{start} method.  The return value is also the new @code{running}
+@code{start} method.  The return value is also the new running
 value, thus @code{#f} if the service was stopped.
 @end deffn
 
@@ -865,7 +851,7 @@ This macro is used to create a value for the @code{actions} 
slot of a
 service object @ref{Slots of services}.  Each @var{name} is a symbol
 and each @var{proc} the corresponding procedure that will be called to
 perform the action.  A @var{proc} has one argument, which will be the
-current value of the @code{running} slot of the service.
+running value of the service.
 @end deffn
 
 @deffn {method} start (obj <symbol>)
@@ -980,9 +966,7 @@ PID given as argument, where @var{signal} defaults to 
@code{SIGTERM}.
 
 This @emph{does} work together with respawning services,
 because in that case the @code{stop} method of the @code{<service>}
-class sets the @code{running} slot to @code{#f} before actually
-calling the destructor; if it would not do that, killing the process
-in the destructor would immediately respawn the service.
+arranges so that the service is not respawned.
 @end deffn
 
 The @code{make-forkexec-constructor} procedure builds upon the following
diff --git a/modules/shepherd.scm b/modules/shepherd.scm
index 91f3318..701e3f3 100644
--- a/modules/shepherd.scm
+++ b/modules/shepherd.scm
@@ -380,9 +380,6 @@ already ~a threads running, disabling 'signalfd' support")
 
       (set-port-encoding! (log-output-port) "UTF-8")
 
-      ;; Start the 'root' service.
-      (start root-service)
-
       (when (= 1 (getpid))
         ;; When running as PID 1, disable hard reboots upon ctrl-alt-del.
         ;; Instead, the kernel will send us SIGINT so that we can gracefully
@@ -413,19 +410,25 @@ already ~a threads running, disabling 'signalfd' support")
       ;; because POSIX threads and 'fork' cannot be used together.
       (run-fibers
        (lambda ()
-         (catch 'quit
-           (lambda ()
-             (run-daemon #:socket-file socket-file
-                         #:config-file config-file
-                         #:pid-file pid-file
-                         #:signal-port signal-port
-                         #:poll-services? poll-services?
-                         #:persistency persistency))
-           (case-lambda
-             ((key value . _)
-              (primitive-exit value))
-             ((key)
-              (primitive-exit 0)))))
+         (with-service-monitor
+
+          ;; Register and start the 'root' service.
+          (register-services root-service)
+          (start root-service)
+
+          (catch 'quit
+            (lambda ()
+              (run-daemon #:socket-file socket-file
+                          #:config-file config-file
+                          #:pid-file pid-file
+                          #:signal-port signal-port
+                          #:poll-services? poll-services?
+                          #:persistency persistency))
+            (case-lambda
+              ((key value . _)
+               (primitive-exit value))
+              ((key)
+               (primitive-exit 0))))))
        #:parallelism 1  ;don't create POSIX threads
        #:hz 0))))       ;disable preemption, which would require POSIX threads
 
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 6f1d13d..9b326ce 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -26,6 +26,7 @@
 (define-module (shepherd service)
   #:use-module ((fibers)
                 #:hide (sleep))
+  #:use-module (fibers channels)
   #:use-module (fibers scheduler)
   #:use-module (oop goops)
   #:use-module (srfi srfi-1)
@@ -39,6 +40,7 @@
   #:use-module (ice-9 format)
   #:autoload   (ice-9 ports internal) (port-read-wait-fd)
   #:autoload   (ice-9 pretty-print) (truncated-print)
+  #:use-module (ice-9 vlist)
   #:use-module (shepherd support)
   #:use-module (shepherd comm)
   #:use-module (shepherd config)
@@ -52,6 +54,7 @@
             action-list
             lookup-action
             defines-action?
+            with-service-monitor
 
             action?
 
@@ -182,7 +185,7 @@ respawned, shows that it has been respawned more than TIMES 
in SECONDS."
            (and (> (+ last-respawn seconds) now)
                 (loop (- times 1) rest)))))))
 
-(define (default-service-termination-handler service status)
+(define (default-service-termination-handler service pid status)
   "Handle the termination of @var{service} by respawning it if applicable.
 Log abnormal termination reported by @var{status}."
   (unless (zero? status)
@@ -191,20 +194,18 @@ Log abnormal termination reported by @var{status}."
            =>
            (lambda (code)
              (local-output (l10n "Service ~a (PID ~a) exited with ~a.")
-                           (canonical-name service)
-                           (service-running-value service) code)))
+                           (canonical-name service) pid code)))
           ((status:term-sig status)
            =>
            (lambda (signal)
              (local-output (l10n "Service ~a (PID ~a) terminated with signal 
~a.")
-                           (canonical-name service)
-                           (service-running-value service) signal)))
+                           (canonical-name service) pid signal)))
           ((status:stop-sig status)
            =>
            (lambda (signal)
              (local-output (l10n "Service ~a (PID ~a) stopped with signal ~a.")
                            (canonical-name service)
-                           (service-running-value service) signal)))))
+                           pid signal)))))
 
   (respawn-service service))
 
@@ -253,12 +254,8 @@ Log abnormal termination reported by @var{status}."
   ;; on this.
   (actions #:init-keyword #:actions
           #:init-form (make-actions))
-  ;; If this is `#f', it means that the service is not running
-  ;; currently.  Otherwise, it is the value that was returned by the
-  ;; procedure in the `start' slot when the service was started.
-  (running #:init-value #f)
   ;; Procedure called to notify that the process associated with this service
-  ;; (whose PID is in the 'running' slot) has terminated.
+  ;; has terminated.
   (handle-termination #:init-keyword #:handle-termination
                       #:init-value default-service-termination-handler)
   ;; A description of the service.
@@ -333,11 +330,15 @@ wire."
 (define-method (canonical-name (obj <service>))
   (car (provided-by obj)))
 
-;; Return the "running value" of OBJ.
-(define-method (service-running-value (obj <service>))
-  (match (slot-ref obj 'running)
-    ((? procedure? proc) (proc))
-    (value value)))
+(define service-running-value
+  (let ((reply (make-channel)))
+    (lambda (service)
+      "Return the \"running value\" of SERVICE."
+      (put-message (current-monitor-channel)
+                   `(running ,service ,reply))
+      (match (get-message reply)
+        ((? procedure? proc) (proc))
+        (value value)))))
 
 ;; Return whether the service is currently running.
 (define-method (running? (obj <service>))
@@ -396,18 +397,19 @@ wire."
                             (canonical-name obj)
                             problem)
                ;; Start the service itself.
-               (slot-set! obj 'running (catch #t
-                                         (lambda ()
-                                           (apply (slot-ref obj 'start)
-                                                  args))
-                                         (lambda (key . args)
-                                           (report-exception 'start obj
-                                                             key args)))))
+               (let ((running (catch #t
+                                (lambda ()
+                                  (apply (slot-ref obj 'start) args))
+                                (lambda (key . args)
+                                  (report-exception 'start obj key args)))))
+                 (put-message (current-monitor-channel)
+                              `(set-running ,obj ,running))))
 
           ;; Status message.
            (let ((running (service-running-value obj)))
              (when (one-shot? obj)
-               (slot-set! obj 'running #f))
+               (put-message (current-monitor-channel)
+                            `(notify-termination ,obj)))
              (local-output (if running
                               (l10n "Service ~a has been started.")
                                (l10n "Service ~a could not be started."))
@@ -419,14 +421,9 @@ wire."
   "Replace OLD-SERVICE with NEW-SERVICE in the services registry.  This
 completely removes all references to OLD-SERVICE before registering
 NEW-SERVICE."
-  (define (remove-service name)
-    (let* ((old (hashq-ref %services name))
-           (new (delete old-service old)))
-      (if (null? new)
-          (hashq-remove! %services name)
-          (hashq-set! %services name new))))
   (when new-service
-    (for-each remove-service (provided-by old-service))
+    (put-message (current-monitor-channel)
+                 `(unregister ,(list old-service)))
     (register-services new-service)))
 
 (define (required-by? service dependent)
@@ -444,60 +441,61 @@ NEW-SERVICE."
 canonical names for all of the services which have been stopped (including
 transitive dependent services).  This method will print a warning if SERVICE
 is not already running, and will return SERVICE's canonical name in a list."
-  ;; Block asyncs so the SIGCHLD handler doesn't execute concurrently.
-  ;; Notably, that makes sure the handler processes the SIGCHLD for SERVICE's
-  ;; process once we're done; otherwise, it could end up respawning SERVICE.
-  (call-with-blocked-asyncs
-   (lambda ()
-     (if (not (running? service))
-         (begin
-           (local-output (l10n "Service ~a is not running.")
-                         (canonical-name service))
-           (list (canonical-name service)))
-         (let ((name (canonical-name service))
-               (stopped-dependents (fold-services (lambda (other acc)
-                                                    (if (and (running? other)
-                                                             (required-by? 
service other))
-                                                        (append (stop other) 
acc)
-                                                        acc))
-                                                  '())))
-           ;; Stop the service itself.
-           (catch #t
-             (lambda ()
-               (apply (slot-ref service 'stop)
-                      (service-running-value service)
-                      args))
-             (lambda (key . args)
-               ;; Special case: 'root' may quit.
-               (and (eq? root-service service)
-                    (eq? key 'quit)
-                    (apply quit args))
-               (caught-error key args)))
-
-           ;; SERVICE is no longer running.
-           (slot-set! service 'running #f)
-
-           ;; Reset the list of respawns.
-           (slot-set! service 'last-respawns '())
-
-           ;; Replace the service with its replacement, if it has one
-           (let ((replacement (slot-ref service 'replacement)))
-             (when replacement
-               (replace-service service replacement)))
-
-           ;; Status message.
-           (if (running? service)
-               (local-output (l10n "Service ~a could not be stopped.")
-                             name)
-               (local-output (l10n "Service ~a has been stopped.")
-                             name))
-
-           (when (transient? service)
-             (hashq-remove! %services (canonical-name service))
-             (local-output (l10n "Transient service ~a unregistered.")
-                           (canonical-name service)))
-
-           (cons name stopped-dependents))))))
+  ;; XXX: Assume that SIGCHLD resulting from calling SERVICE's 'stop' method
+  ;; won't be handled by the time we're done (in which case we'd end up
+  ;; respawning the service we're trying to stop).  FIXME: This may not be the
+  ;; case when not using a signal file descriptor.
+  (if (not (running? service))
+      (begin
+        (local-output (l10n "Service ~a is not running.")
+                      (canonical-name service))
+        (list (canonical-name service)))
+      (let ((name (canonical-name service))
+            (stopped-dependents (fold-services (lambda (other acc)
+                                                 (if (and (running? other)
+                                                          (required-by? 
service other))
+                                                     (append (stop other) acc)
+                                                     acc))
+                                               '())))
+        ;; Stop the service itself.
+        (catch #t
+          (lambda ()
+            (apply (slot-ref service 'stop)
+                   (service-running-value service)
+                   args))
+          (lambda (key . args)
+            ;; Special case: 'root' may quit.
+            (and (eq? root-service service)
+                 (eq? key 'quit)
+                 (apply quit args))
+            (caught-error key args)))
+
+        ;; SERVICE is no longer running.
+        (put-message (current-monitor-channel)
+                     `(notify-termination ,service))
+
+        ;; Reset the list of respawns.
+        (slot-set! service 'last-respawns '())
+
+        ;; Replace the service with its replacement, if it has one
+        (let ((replacement (slot-ref service 'replacement)))
+          (when replacement
+            (replace-service service replacement)))
+
+        ;; Status message.
+        (if (running? service)
+            (local-output (l10n "Service ~a could not be stopped.")
+                          name)
+            (local-output (l10n "Service ~a has been stopped.")
+                          name))
+
+        (when (transient? service)
+          (put-message (current-monitor-channel)
+                       `(unregister ,(list service)))
+          (local-output (l10n "Transient service ~a unregistered.")
+                        (canonical-name service)))
+
+        (cons name stopped-dependents))))
 
 ;; Call action THE-ACTION with ARGS.
 (define-method (action (obj <service>) the-action . args)
@@ -643,6 +641,121 @@ clients."
   (every lookup-running (required-by obj)))
 
 
+;;;
+;;; Service monitor.
+;;;
+
+(define (service-monitor channel)
+  "Encapsulate shepherd state (registered and running services) and serve
+requests arriving on @var{channel}."
+  (let loop ((registered vlist-null)
+             (running vlist-null))
+    (define (unregister services)
+      ;; Return REGISTERED minus SERVICE.
+      (vhash-fold (lambda (name service result)
+                    (if (memq service services)
+                        result
+                        (fold (cut vhash-consq <> service <>)
+                              result
+                              (provided-by service))))
+                  vlist-null
+                  registered))
+
+    (define* (register service #:optional (registered registered))
+      ;; Add SERVICE to REGISTER and return it.
+      (fold (cut vhash-consq <> service <>)
+            registered
+            (provided-by service)))
+
+    (match (get-message channel)
+      (('register service)                        ;no reply
+       (let ((name (canonical-name service)))
+         (match (vhash-assq name registered)
+           (#f
+            (loop (register service) running))
+           ((_ . old)
+            (if (vhash-assq old running)
+                (begin
+                  (slot-set! old 'replacement service)
+                  (loop registered running))
+                (loop (register service (unregister (list old)))
+                      running))))))
+      (('unregister services)                     ;no reply
+       (match (filter (cut vhash-assq <> running) services)
+         (()
+          (loop (unregister services) running))
+         (lst                                     ;
+          (local-output
+           (l10n "Cannot unregister service ~a, which is still running"
+                 "Cannot unregister services~{ ~a,~} which are still running"
+                 (length lst))
+           (map canonical-name lst))
+          (loop registered running))))
+      (('unregister-all)                          ;no reply
+       (let ((root (cdr (vhash-assq 'root registered))))
+         (loop (fold (cut vhash-consq <> root <>)
+                     vlist-null
+                     (provided-by root))
+               (vhash-consq root #t running))))
+      (('lookup name reply)
+       (put-message reply
+                    (vhash-foldq* cons '() name registered))
+       (loop registered running))
+      (('service-list reply)
+       (let ((names (delete-duplicates
+                     (vhash-fold (lambda (key _ result)
+                                   (cons key result))
+                                 '()
+                                 registered)
+                     eq?)))
+         (put-message reply
+                      (fold (lambda (name result)
+                              (alist-cons name
+                                          (vhash-foldq* cons '() name
+                                                        registered)
+                                          result))
+                            '()
+                            names))
+         (loop registered running)))
+      (('running service reply)
+       (put-message reply
+                    (match (vhash-assq service running)
+                      (#f #f)
+                      ((_ . value) value)))
+       (loop registered running))
+      (('set-running service value)               ;no reply
+       (loop registered
+             (vhash-consq service value running)))
+      (('notify-termination service)              ;no reply
+       (loop registered
+             (vhash-delq service running))))))    ;XXX: complexity
+
+(define (spawn-service-monitor)
+  "Spawn a new service monitor fiber and return a channel to send it requests."
+  (define channel
+    (make-channel))
+
+  (spawn-fiber
+   (lambda ()
+     (service-monitor channel)))
+
+  channel)
+
+(define current-monitor-channel
+  ;; The channel to communicate with the current service monitor.
+  (make-parameter #f))
+
+(define (call-with-service-monitor thunk)
+  (parameterize ((current-monitor-channel (spawn-service-monitor)))
+    (thunk)))
+
+(define-syntax-rule (with-service-monitor exp ...)
+  "Spawn a new service monitor and evaluate @var{exp}... within that dynamic 
extent.
+This allows @var{exp}... and their callees to send requests to delegate
+service state and to send requests to the service monitor."
+  (call-with-service-monitor (lambda () exp ...)))
+
+
 
 (define (launch-service name proc args)
   "Try to start (with PROC) a service providing NAME; return #f on failure.
@@ -1495,13 +1608,13 @@ The remaining arguments are as for 
@code{make-forkexec-constructor}."
     ;; Number of active connections.
     0)
 
-  (define (handle-child-termination service status)
+  (define (handle-child-termination service pid status)
     (set! connection-count (- connection-count 1))
     (local-output (l10n "~a connection still in use after ~a termination."
                         "~a connections still in use after ~a termination."
                         connection-count)
                   connection-count (canonical-name service))
-    (default-service-termination-handler service status))
+    (default-service-termination-handler service pid status))
 
   (define (spawn-child-service connection server-address client-address)
     (let* ((name    (child-service-name))
@@ -1695,9 +1808,6 @@ This must be paired with @code{make-systemd-destructor}."
 
 ;;; Registered services.
 
-;; All registered services.
-(define %services (make-hash-table 75))
-
 ;;; Perform actions with services:
 
 (define (lookup-canonical-service name services)
@@ -1707,48 +1817,51 @@ Return #f if service is not found."
           (eq? name (canonical-name service)))
         services))
 
-(define (fold-services proc init)
-  "Apply PROC to the registered services to build a result, and return that
+(define fold-services
+  (let ((reply (make-channel)))
+    (lambda (proc init)
+      "Apply PROC to the registered services to build a result, and return that
 result.  Works in a manner akin to `fold' from SRFI-1."
-  (hash-fold (lambda (name services acc)
+      (put-message (current-monitor-channel)
+                   `(service-list ,reply))
+      (fold (match-lambda*
+              (((name . services) result)
                (let ((service (lookup-canonical-service name services)))
                  (if service
-                     (proc service acc)
-                     acc)))
-             init %services))
+                     (proc service result)
+                     result))))
+            init
+            (get-message reply)))))
 
 (define (for-each-service proc)
   "Call PROC for each registered service."
-  (hash-for-each (lambda (name services)
-                   (and=> (lookup-canonical-service name services)
-                          proc))
-                 %services))
+  (fold-services (lambda (service _)
+                   (proc service)
+                   *unspecified*)
+                 *unspecified*))
 
 (define (service-list)
   "Return the list of services currently defined.  Note: The order of the list
 returned in unspecified."
-  (hash-fold (lambda (name services result)
-               (let ((service (lookup-canonical-service name services)))
-                 (if service
-                     (cons service result)
-                     result)))
-             '()
-             %services))
-
-(define (find-service pred)
-  "Return the first service that matches PRED, or #f if none was found."
-  (call/ec
-   (lambda (return)
-     (hash-fold (lambda (name services _)
-                  (and=> (find pred services)
-                         return))
-                #f
-                %services)
-     #f)))
-
-(define (lookup-services name)
-  "Return a (possibly empty) list of services that provide NAME."
-  (hashq-ref %services name '()))
+  (fold-services cons '()))
+
+(define find-service
+  (let ((reply (make-channel)))
+    (lambda (pred)
+      "Return the first service that matches PRED, or #f if none was found."
+      (call/ec
+       (lambda (return)
+         (fold-services (lambda (service _)
+                          (and (pred service)
+                               (return service)))
+                        #f))))))
+
+(define lookup-services
+  (let ((reply (make-channel)))
+    (lambda (name)
+      "Return a (possibly empty) list of services that provide NAME."
+      (put-message (current-monitor-channel) `(lookup ,name ,reply))
+      (get-message reply))))
 
 (define waitpid*
   (lambda (what flags)
@@ -1791,13 +1904,14 @@ otherwise by updating its state."
 PID is in its @code{running} slot; @var{status} is the process's exit status
 as returned by @code{waitpid}.  This procedure is called right after the
 process has terminated."
-  ((slot-ref service 'handle-termination) service status))
+  (let ((running (service-running-value service)))
+    (put-message (current-monitor-channel) `(notify-termination ,service))
+    ((slot-ref service 'handle-termination) service running status)))
 
 (define (respawn-service serv)
   "Respawn a service that has stopped running unexpectedly. If we have
 attempted to respawn the service a number of times already and it keeps dying,
 then disable it."
-  (slot-set! serv 'running #f)
   (if (and (respawn? serv)
            (not (respawn-limit-hit? (slot-ref serv 'last-respawns)
                                     (car respawn-limit)
@@ -1818,7 +1932,7 @@ then disable it."
         (slot-set! serv 'enabled? #f)
 
         (when (transient? serv)
-          (hashq-remove! %services (canonical-name serv))
+          (put-message (current-monitor-channel) `(unregister (,serv)))
           (local-output (l10n "Transient service ~a terminated, now 
unregistered.")
                         (canonical-name serv))))))
 
@@ -1833,22 +1947,7 @@ is currently stopped, replace it immediately."
     (assert (list-of-symbols? (required-by new)))
     (assert (boolean? (respawn? new)))
 
-    ;; FIXME: Just because we have a unique canonical name now doesn't mean it
-    ;; will remain unique as other services are added. Whenever a service is
-    ;; added it should check that it's not conflicting with any already
-    ;; registered canonical names.
-    (match (lookup-services (canonical-name new))
-      (() ;; empty, so we can safely add ourselves
-       (for-each (lambda (name)
-                  (let ((old (lookup-services name)))
-                    (hashq-set! %services name (cons new old))))
-                (provided-by new)))
-      ((old . rest) ;; one service registered, it may be an old version of us
-       (assert (null? rest))
-       (assert (eq? (canonical-name new) (canonical-name old)))
-       (if (running? old)
-           (slot-set! old 'replacement new)
-           (replace-service old new)))))
+    (put-message (current-monitor-channel) `(register ,new)))
 
   (for-each register-single-service new-services))
 
@@ -1861,50 +1960,23 @@ This will remove a service either if it is identified 
by its canonical
 name, or if it is the only service providing the service that is
 requested to be removed."
   (define (deregister service)
-    (if (running? service)
-        (stop service))
+    (when (running? service)
+      (stop service))
     ;; Remove services provided by service from the hash table.
-    (for-each
-     (lambda (name)
-       (let ((old (lookup-services name)))
-         (if (= 1 (length old))
-             ;; Only service provides this service; remove it.
-             (hashq-remove! %services name)
-             ;; ELSE: remove service from providing services.
-             (hashq-set! %services name
-                         (remove
-                          (lambda (lk-service)
-                            (eq? (canonical-name service)
-                                 (canonical-name lk-service)))
-                          old)))))
-     (provided-by service)))
-  (define (service-pairs)
-    "Return '(name . service) of all user-registered services."
-    (filter identity
-            (hash-map->list
-             (lambda (key value)
-               (match value
-                 ((service)     ; only one service associated with KEY
-                  (and (eq? key (canonical-name service))
-                       (not (memq key '(root shepherd)))
-                       (cons key service)))
-                 (_ #f)))               ; all other cases: #f.
-             %services)))
+    (put-message (current-monitor-channel)
+                 `(unregister ,(list service))))
 
   (let ((name (string->symbol service-name)))
     (cond ((eq? name 'all)
            ;; Special 'remove all' case.
-           (let ((pairs (service-pairs)))
-             (local-output (l10n "Unloading all optional services: '~a'...")
-                           (map car pairs))
-             (for-each deregister (map cdr pairs))
-             (local-output (l10n "Done."))))
+           (put-message (current-monitor-channel) `(unregister-all))
+           #t)
           (else
            ;; Removing only one service.
            (match (lookup-services name)
-             (()                        ; unknown service
+             (()                                  ; unknown service
               (raise (condition (&missing-service-error (name name)))))
-             ((service)             ; only SERVICE provides NAME
+             ((service)                     ; only SERVICE provides NAME
               ;; Are we removing a user service…
               (if (eq? (canonical-name service) name)
                   (local-output (l10n "Removing service '~a'...") name)
@@ -1914,7 +1986,7 @@ requested to be removed."
                    (canonical-name service) name))
               (deregister service)
               (local-output (l10n "Done.")))
-             ((services ...)            ; ambiguous NAME
+             ((services ...)                      ; ambiguous NAME
               (local-output
                "Not unloading: '~a' names several services: '~a'."
                name (map canonical-name services))))))))
@@ -1984,8 +2056,11 @@ where prctl/PR_SET_CHILD_SUBREAPER is unsupported."
              #t)
     #:stop (lambda (unused . args)
             (local-output (l10n "Exiting shepherd..."))
+
             ;; Prevent that we try to stop ourself again.
-            (slot-set! root-service 'running #f)
+            (put-message (current-monitor-channel)
+                          `(notify-termination ,root-service))
+
              (shutdown-services)
             (quit))
     ;; All actions here need to take care that they do not invoke any
@@ -2112,4 +2187,3 @@ when in interactive mode, i.e. with `--socket=none'."
       (lambda (running)
        (local-output (l10n "You must be kidding.")))))))
 
-(register-services root-service)
diff --git a/modules/shepherd/support.scm b/modules/shepherd/support.scm
index a68bbd4..ed02d89 100644
--- a/modules/shepherd/support.scm
+++ b/modules/shepherd/support.scm
@@ -361,20 +361,35 @@ which has essential bindings pulled in."
     (module-use! m (resolve-interface '(shepherd service)))
     m))
 
+(define (primitive-load* file)
+  ;; Like 'primitive-load', but in Scheme, so that it does not introduce a
+  ;; continuation barrier that would prevent code in FILE from suspending.
+  (call-with-input-file file
+    (lambda (port)
+      (let loop ((result *unspecified*))
+       (match (read port)
+         ((? eof-object?)
+          result)
+         (exp
+          (loop (primitive-eval exp))))))))
+
 (define (load-in-user-module file)
   "Load FILE in a fresh user module that has essential bindings pulled in."
   (let ((user-module (make-user-module)))
     (save-module-excursion
      (lambda ()
        (set-current-module user-module)
-       (primitive-load file)))))
+       (primitive-load* file)))))
 
 (define (eval-in-user-module exp)
   "Eval EXP in a fresh user module that has essential bindings pulled in."
   (let ((user-module (make-user-module)))
     (save-module-excursion
      (lambda ()
-       (eval exp user-module)))))
+       ;; Note: As of Guile 3.0.8, 'eval' is written in C, and is thus a
+       ;; continuation barrier.  Use 'primitive-eval' to avoid that.
+       (set-current-module user-module)
+       (primitive-eval exp)))))
 
 (define* (verify-dir dir #:key (secure? #t))
   "Check if the directory DIR exists and create it if it is the default



reply via email to

[Prev in Thread] Current Thread [Next in Thread]