qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V5 02/12] cpus: stop vm in suspended state


From: Peter Xu
Subject: Re: [PATCH V5 02/12] cpus: stop vm in suspended state
Date: Wed, 22 Nov 2023 11:21:42 -0500

On Wed, Nov 22, 2023 at 09:38:06AM +0000, Daniel P. Berrangé wrote:
> On Mon, Nov 20, 2023 at 04:44:50PM -0500, Peter Xu wrote:
> > On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
> > > If we drop force, then all calls to vm_stop will completely stop the
> > > suspended state, eg an hmp "stop" command. This causes two problems.
> > > First, that is a change in user-visible behavior for something that
> > > currently works,
> > 
> > IMHO it depends on what should be the correct behavior.  IOW, when VM is in
> > SUSPENDED state and then the user sends "stop" QMP command, what should we
> > expect?
> 
> I would say that from a mgmtm app POV "stop" is initiating a state
> transition, from RUN_STATE_RUNNING to RUN_STATE_PAUSED and "cont"
> is doing the reverse from PAUSED to RUNNING.
> 
> It is a little more complicated than that as there are some other
> states like INMIGRATE that are conceptually equiv to RUNNING,
> and states where the transition simply doesn't make sense.

In the qemu impl, INMIGRATE is, imho, more equiv of PAUSED - do_vm_stop()
mostly ignores every state except RUNNING (putting bdrv operations aside).
IOW, anything besides "running" is treated as "not running".

But then Paolo fixed that in 1e9981465f ("qmp: handle stop/cont in
INMIGRATE state"), wiring that to autostart.

Now we seem to find that "suspended" should actually fall within (where
"vm" is running, but "vcpu" is not), and it seems we should treat "vm" and
"vcpu" differently.

> 
> So my question is if we're in "SUSPENDED" and someone issues "stop",
> what state do we go into, and perhaps more importantly what state
> do we go to in a subsequent "cont".

I think we must stop the "vm", not only the "vcpu".  I discussed this bit
in the other thread more or less: it's because qemu_system_wakeup_request()
can be called in many places, e.g. acpi_pm_tmr_timer().

It means even after the VM is "stop"ped by the admin QMP (where qmp_stop()
ignores SUSPENDED, keep the "vm" running), it can silently got waken up
without admin even noticing it.  I'm not sure what Libvirt will behave if
it suddenly receives a QAPI_EVENT_WAKEUP randomly after a "stop".

> 
> If you say SUSPENDED ---(stop)---> PAUSED ---(cont)---> SUSPENDED
> then we create a problem, because the decision for the transition
> out of PAUSED needs memory of the previous state.

Right, that's why I think we at least need one more boolean to remember the
suspended state, then when we switch from any STOP states into any RUN
states, we know where to go.  Here STOP states I meant anything except
RUNNING and SUSPENDED, while RUN -> RUNNING or SUSPENDED.

> 
> > My understanding is we should expect to fully stop the VM, including the
> > ticks, for example.  Keeping the ticks running even after QMP "stop"
> > doesn't sound right, isn't it?
> 
> The "stop" QMP command is documented as
> 
>     "Stop all guest VCPU execution"
> 
> the devil is in the detail though, and we've not documented any detail.
> 
> Whether or not timers keep running across stop/cont I think can be
> argued to be an impl detail, as long as the headline goal "vcpus
> don't execute" is satisfied.

"stop" was since qemu v0.14, so I guess we can't blame the missing of
details or any form of inaccuracy..  Obviously we do more than "stop vCPU
executions" in the current implementation.

But after we reach a consensus on how we should fix the current suspended
problem, we may want to update the documentation to start containing more
information.

> 
> > > vs the migration code where we are fixing brokenness.
> > 
> > This is not a migration-only bug if above holds, IMO.
> > 
> > > Second, it does not quite work, because the state becomes
> > > RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
> > > will try to set the running state.  I could fix that by introducing a new
> > > state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
> > > in existing behavior.  (I even implemented that while developing, then I
> > > realized it was not needed to fix the migration bugs.)
> > 
> > Good point.
> 
> We have added new guest states periodically. It is a user visible
> change, but you could argue that it is implementing a new feature
> ie the ability to "stop" a "suspended" guest, and so is justified.
> 
> S3 is so little used in virt, so I'm not surprised we're finding
> long standing edge cases that have never been thought about before.
> 
> > Now with above comments, what's your thoughts on whether we should change
> > the user behavior?  My answer is still a yes.
> > 
> > Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
> > behavior, while something like QMP "stop" is not guest visible.  Maybe we
> > should remember it separately?
> 
> Yes, every time I look at this area I come away thinking that
> the RunState enum is a mess, overloading too many different
> concepts onto the same single field.
> 
> Specifically "SUSPENDED" vs "RUNNING" is a reflection of guest
> state (ie whether or not the VM is in S3), but pretty much all
> the others are a reflection of QEMU host state. I kind of feel
> that SUSPENDED (S3) probably shouldn't have been a RunState at
> all. I'd probably put guest-panicked into a separate thing too.
> 
> But we're stuck with what we have.

IMO compatibility is only necessary if at least the existing code is
running well.  But now I see it a major flaw in suspended state and I can't
see how it can go right if with current code base..  My concern is instead
that after suspended will be used more (e.g., assuming CPR will rely on it)
we can have more chance to confuse/oops a mgmt app like Libvirt, like I
described above.

In summary, I think a current solution to me is only to fix at least
suspended state for good, by:

  - adding vm_suspended boolean to remember machine RUNNING / SUSPENDED
    state.  When "cont" we need to switch to either RUNNING / SUSPENDED
    depending on the boolean.

  - keeping SUSPENDED state as RunState (for compatibility, otherwise we'll
    need another interface to fetch that boolean anyway), even though not
    query-able during any !RUNNING && !SUSPENDED state.. hopefully not a
    big deal

  - enrich ducumentation of qmp_stop/qmp_cont to describe what they really do

  - (with suspended working all right...) fix migration of SUSPENDED state

I don't expect a lot of code changes is needed, maybe even less than the
current series (because we don't need special knob to differenciate
migration or non-migration callers of do_vm_stop()). IMHO this series is
already doing some of that but just decided to ignore outside-migration
states for suspeneded.

We may want to add some test cases though to verify the suspended state
transitions (maybe easier to put into migration-test with the new ACPI
guest code), but optional.

Thanks,

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]