bug#57922: Shepherd doesn't seem to correctly handle waitpid itself


From: Maxim Cournoyer
Subject: bug#57922: Shepherd doesn't seem to correctly handle waitpid itself
Date: Fri, 23 Sep 2022 13:49:26 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)

tags 57922 +notabug
thanks

Hi Ludo!

Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> What I don't understand that well is that this signal handler could be
>> installed only once when shepherd starts, right?  That way, it wouldn't
>> need to depend on specific start actions being chosen.
>
> The SIGCHLD handler is installed lazily since
> f776de04e6702e18d95152072e78c43441d3ccc3.  The rationale was discussed
> here:
>
>   https://issues.guix.gnu.org/27553
>
> That said, on GNU/Linux, SIGCHLD is actually blocked and instead we rely
> on signalfd(2).  It’s from the main event loop in shepherd.scm that the
> signal handler is called.

I had missed that, thanks for explaining.

>>> Here's a small reproducer to apply on our code base:
>>>
>>> --8<---------------cut here---------------start------------->8---
>>> modified   gnu/services/telephony.scm
>>> @@ -685,13 +685,7 @@ (define (archive-name->username archive)
>>>
>>>                      ;; Finally, return the PID of the daemon process.
>>>                      daemon-pid))
>>> -               (stop
>>> -                #~(lambda (pid . args)
>>> -                    (kill pid SIGKILL)
>>> -                    ;; Wait for the process to exit; this prevents overlapping
>>> -                    ;; processes when issuing 'herd restart'.
>>> -                    (waitpid pid)
>>> -                    #f))))))))
>>> +               (stop #~(make-kill-destructor))))))))
>
> I think the main difference between these two is that the first one uses
> SIGKILL while the second one uses SIGTERM.
>
> You could try #~(make-kill-destructor SIGKILL) to get the same effect.

You are right, the important difference was SIGTERM vs SIGKILL.  I
thought I had tried that.  The problem only shows itself in the
'jami-provisioning' system test, not the 'jami' one.
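
For reference, a sketch of the 'stop' field from the reproducer above
with your suggestion applied.  It is only a fragment of the service
definition, not a complete service, and it assumes nothing beyond the
optional signal argument of 'make-kill-destructor' that you mention:

--8<---------------cut here---------------start------------->8---
;; Sketch only: replaces the hand-written kill + waitpid lambda.
;; Without an argument, make-kill-destructor sends SIGTERM; passing
;; SIGKILL keeps the signal the original 'stop' procedure used.
(stop #~(make-kill-destructor SIGKILL))
--8<---------------cut here---------------end--------------->8---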

Marking this one as notabug and closing.

Thanks again!

Maxim




