bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#55444: elogind startup race between shepherd and dbus-daemon


From: Ludovic Courtès
Subject: bug#55444: elogind startup race between shepherd and dbus-daemon
Date: Fri, 27 May 2022 22:54:49 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hey there,

Ludovic Courtès <ludo@gnu.org> skribis:

> So it would seem that the solution to this is to prevent dbus-daemon
> from starting elogind.  We can do that by changing
> org.freedesktop.login1.service so that it has “Exec=true” instead of
> “Exec=elogind --daemon”.
>
> “Exec=true” is a bit crude because it doesn’t guarantee that elogind is
> really started; if that isn’t good enough, we could instead wait for the
> PID file or something (as of Shepherd 0.9.0, invoking ‘herd start
> elogind’ potentially leads shepherd to start a second instance if the
> first one is still being started, so we can’t really do that).

The patch below address that: it changes the “Exec=” line of
‘org.freedesktop.login1’ to refer to a wrapper.  That wrapper connects
to shepherd and waits until ‘elogind’ is started.

That way, if dbus-daemon comes first, it won’t actually launch anything
and instead wait for the Shepherd ‘elogind’ service to be up.  (And if
it comes second, dbus-daemon won’t try to launch anything, so no
spurious “already running” messages.)

I tested it in a ‘desktop.tmpl’ VM, quickly logging in on tty1.  On
/var/log/messages, you can see the “Activating ….login1” message from
dbus-daemon, followed by “Service elogind started” from shepherd,
followed by “Successfully activated ….login1” from dbus-daemon.

The “elogind” system test passes too.

Thoughts?  Objections?

Ludo’.

>From 7ef63d7426677961afd2bd937af19b08209c5b70 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>
Date: Fri, 27 May 2022 22:41:55 +0200
Subject: [PATCH] services: elogind: When started by dbus-daemon, wait for the
 Shepherd service.

Fixes <https://issues.guix.gnu.org/55444>.

Previously shepherd and dbus-daemon would race to start elogind.  In
some cases (for instance if one logs in quickly enough on the tty),
dbus-daemon would "win" and start elogind before shepherd has had a
chance to do it.  Consequently, shepherd would fail to start elogind and
mark it as stopped and disabled, in turn preventing services that depend
on it such as 'xorg-server' from starting.

* gnu/services/desktop.scm (elogind-dbus-service): Rewrite to refer to a
wrapper that waits for the 'elogind' Shepherd service.
---
 gnu/services/desktop.scm | 79 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 75 insertions(+), 4 deletions(-)

diff --git a/gnu/services/desktop.scm b/gnu/services/desktop.scm
index 24fd43a207..318107a2ca 100644
--- a/gnu/services/desktop.scm
+++ b/gnu/services/desktop.scm
@@ -1075,10 +1075,81 @@ (define-syntax-rule (ini-file config file clause ...)
    ("HybridSleepMode" (sleep-list elogind-hybrid-sleep-mode))))
 
 (define (elogind-dbus-service config)
-  (list (wrapped-dbus-service (elogind-package config)
-                              "libexec/elogind/elogind"
-                              `(("ELOGIND_CONF_FILE"
-                                 ,(elogind-configuration-file config))))))
+  "Return a @file{org.freedesktop.login1.service} file that tells D-Bus how to
+\"start\" elogind.  In practice though, our elogind is started when booting by
+shepherd.  Thus, the @code{Exec} line of this @file{.service} file does not
+explain how to start elogind; instead, it spawns a wrapper that waits for the
+@code{elogind} shepherd service.  This avoids a race condition where both
+@command{shepherd} and @command{dbus-daemon} would attempt to start elogind."
+  ;; For more info on the elogind startup race, see
+  ;; <https://issues.guix.gnu.org/55444>.
+
+  (define elogind
+    (elogind-package config))
+
+  (define wrapper
+    (program-file "elogind-dbus-shepherd-sync"
+                  (with-imported-modules '((gnu services herd))
+                    #~(begin
+                        (use-modules (gnu services herd)
+                                     (srfi srfi-1)
+                                     (ice-9 match))
+
+                        (define (elogind-service? service)
+                          (memq 'elogind (live-service-provision service)))
+
+                        (define max-attempts
+                          ;; Number of attempts before assuming elogind failed
+                          ;; to start.
+                          20)
+
+                        ;; Repeatedly check whether the 'elogind' shepherd
+                        ;; service is up and running.  (As of Shepherd 0.9.1,
+                        ;; we cannot just call the 'start' method and wait for
+                        ;; it: it would spawn an additional elogind process.)
+                        (let loop ((attempts 0))
+                          (define services
+                            (current-services))
+
+                          (when (>= attempts max-attempts)
+                            (format (current-error-port)
+                                    "elogind shepherd service not started~%")
+                            (exit 2))
+
+                          (match (find elogind-service? services)
+                            (#f
+                             (format (current-error-port)
+                                     "no elogind shepherd service~%")
+                             (exit 1))
+                            (service
+                             (unless (live-service-running service)
+                               (sleep 1)
+                               (loop (+ attempts 1))))))))))
+
+  (define build
+    (with-imported-modules '((guix build utils))
+      #~(begin
+          (use-modules (guix build utils)
+                       (ice-9 match))
+
+          (define service-directory
+            "/share/dbus-1/system-services")
+
+          (mkdir-p (dirname (string-append #$output service-directory)))
+          (copy-recursively (string-append #$elogind service-directory)
+                            (string-append #$output service-directory))
+          (symlink (string-append #$elogind "/etc") ;for etc/dbus-1
+                   (string-append #$output "/etc"))
+
+          ;; Replace the "Exec=" line of the 'org.freedesktop.login1.service'
+          ;; file with one that refers to WRAPPER instead of elogind.
+          (match (find-files #$output "\\.service$")
+            ((file)
+             (substitute* file
+               (("Exec[[:blank:]]*=.*" _)
+                (string-append "Exec=" #$wrapper "\n"))))))))
+
+  (list (computed-file "elogind-dbus-service-wrapper" build)))
 
 (define (pam-extension-procedure config)
   "Return an extension for PAM-ROOT-SERVICE-TYPE that ensures that all the PAM
-- 
2.36.0


reply via email to

[Prev in Thread] Current Thread [Next in Thread]