qemu-devel

Re: [PATCH v3 1/1] os-posix: asynchronous teardown for shutdown on Linux


From: Claudio Imbrenda
Subject: Re: [PATCH v3 1/1] os-posix: asynchronous teardown for shutdown on Linux
Date: Fri, 12 Aug 2022 09:26:23 +0200

On Thu, 11 Aug 2022 23:05:52 -0300
Murilo Opsfelder Araújo <muriloo@linux.ibm.com> wrote:

> On 8/11/22 11:02, Daniel P. Berrangé wrote:
> [...]
> >>> Hmm, I was hoping you could just use SIGKILL to guarantee that this
> >>> gets killed off.  Is SIGKILL delivered too soon to allow for the
> >>> main QEMU process to have exited quickly ?  
> >>
> >> yes, I tried. qemu has not finished exiting when the signal is
> >> delivered, the cleanup process dies before qemu, which defeats the
> >> purpose  
> >
> > Ok, too bad.
> >  
> >>> If so I wonder what happens when systemd just delivers SIGKILL to
> >>> all processes in the cgroup - I'm not sure there's a guarantee it
> >>> will SIGKILL the main qemu before it SIGKILLs this helper  
> >>
> >> I'm afraid in that case there is no guarantee.
> >>
> >> for what it's worth, both virsh shutdown and destroy seem to do things
> >> properly.  
> >
> > Hmm, probably because libvirt tells QEMU to exit before systemd comes
> > along and tells everything in the cgroup to die with SIGKILL.  
> 
> It seems Libvirt sends SIGKILL if qemu process doesn't terminate within 10
> seconds after Libvirt sent SIGTERM:
> 
> https://gitlab.com/libvirt/libvirt/-/blob/0615df084ec9996b5df88d6a1b59c557e22f3a12/src/util/virprocess.c#L375

But this is fine.

With asynchronous teardown, qemu will exit almost immediately after
receiving SIGTERM, and the cleanup process will then start cleaning up.
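
For readers unfamiliar with the pattern being discussed (SIGTERM first,
SIGKILL only if the process is still around after a grace period), here
is a minimal sketch. It is not libvirt's code: the function name
terminate_with_grace(), the 10 ms polling interval and the return
convention are assumptions made for illustration, and it only works on
direct children because it relies on waitpid().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a libvirt or QEMU API).
 * Sends SIGTERM to a direct child, polls for up to grace_ms
 * milliseconds, and falls back to SIGKILL if the child is still alive.
 * Returns 0 if the child was reaped during the grace period,
 * 1 if SIGKILL had to be sent, -1 on error.
 */
int terminate_with_grace(pid_t pid, int grace_ms)
{
    struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int waited_ms = 0;
    pid_t r;

    if (kill(pid, SIGTERM) < 0) {
        return -1;
    }

    while (waited_ms < grace_ms) {
        r = waitpid(pid, NULL, WNOHANG);
        if (r == pid) {
            return 0;               /* exited within the grace period */
        }
        if (r < 0 && errno != EINTR) {
            return -1;
        }
        nanosleep(&step, NULL);
        waited_ms += 10;
    }

    /* Grace period expired: SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) < 0) {
        return -1;
    }
    do {
        r = waitpid(pid, NULL, 0);
    } while (r < 0 && errno == EINTR);

    return r == pid ? 1 : -1;
}
```

With a process that exits promptly on SIGTERM, as qemu is expected to
with asynchronous teardown, the SIGKILL branch is never reached.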

> 
> So I guess this patch happened to work with Libvirt because the main qemu
> process terminated before the timeout and before SIGKILL was delivered.

It seems so.

> 
> The cleanup process is trying to solve the problem where the main qemu process
> takes too long to terminate. However, if the cleanup process itself takes too
> long, SIGKILL will be sent by Libvirt anyway.

But that is not a problem: the sole purpose of the cleanup process is
to terminate _after_ qemu. It doesn't matter what happens after qemu
has terminated. If you look at the patch, after going to great lengths
to ensure that qemu has terminated, all the child process does is
_exit(0).
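
To illustrate the ordering described above, here is a minimal sketch of
a helper process that does nothing except wait until its parent is gone
and then call _exit(0). It is not the code from the patch: the function
names and the pipe used for reporting are hypothetical, and it uses a
plain fork(), so it shows only the "terminate after the parent"
ordering, not the teardown of a shared address space. It relies on the
Linux-specific prctl(PR_SET_PDEATHSIG).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper body: block until the parent has terminated,
 * report through report_fd, then exit. Never returns.
 */
static void wait_for_parent_exit(pid_t parent, int report_fd)
{
    sigset_t set;
    int sig;

    /* Block SIGHUP so it stays pending until sigwait() collects it. */
    sigemptyset(&set);
    sigaddset(&set, SIGHUP);
    sigprocmask(SIG_BLOCK, &set, NULL);

    /* Ask the kernel to send SIGHUP when the parent dies. */
    prctl(PR_SET_PDEATHSIG, SIGHUP);

    /*
     * If the parent died before prctl() took effect, getppid() has
     * already changed and the loop is skipped. A stray SIGHUP from
     * elsewhere just causes another iteration.
     */
    while (getppid() == parent) {
        sigwait(&set, &sig);
    }

    if (write(report_fd, "x", 1) != 1) {
        _exit(1);
    }
    _exit(0);
}

/*
 * Demo: a stand-in "main" process spawns the helper and exits at once.
 * Returns 0 if the helper reported that it saw the parent terminate,
 * -1 otherwise.
 */
int demo_helper_outlives_parent(void)
{
    int pipefd[2];
    int status;
    char c;
    ssize_t n;
    pid_t main_pid, helper;

    if (pipe(pipefd) < 0) {
        return -1;
    }

    main_pid = fork();              /* stands in for the main process */
    if (main_pid < 0) {
        return -1;
    }
    if (main_pid == 0) {
        pid_t self = getpid();

        helper = fork();
        if (helper == 0) {
            close(pipefd[0]);
            wait_for_parent_exit(self, pipefd[1]);
        }
        _exit(helper < 0 ? 1 : 0);  /* the "main" process exits right away */
    }

    close(pipefd[1]);
    if (waitpid(main_pid, &status, 0) != main_pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(pipefd[0]);
        return -1;
    }

    /* Blocks until the helper writes, or returns 0 if it died silently. */
    n = read(pipefd[0], &c, 1);
    close(pipefd[0]);
    return n == 1 ? 0 : -1;
}
```

Calling demo_helper_outlives_parent() from a small main() is enough to
try it. If the helper is killed before the parent exits (for example by
a SIGKILL sent to the whole cgroup), it never reports, which is the
failure case discussed in this thread.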

> 
> Perhaps we can describe this situation in the parameter help, e.g.: If
> management layer decides to send SIGKILL (e.g.: due to timeout or deliberate
> decision), the cleanup process can exit before the main process, deceiving its
> purpose.

If the management layer (or the user) decides to send SIGKILL
immediately to the whole cgroup without sending SIGTERM first, then
this whole asynchronous teardown mechanism is defeated, yes.



