qemu-devel

Re: [PATCH v2 06/29] migration: Add auto-pause capability


From: Daniel P . Berrangé
Subject: Re: [PATCH v2 06/29] migration: Add auto-pause capability
Date: Wed, 25 Oct 2023 18:31:53 +0100
User-agent: Mutt/2.2.9 (2022-11-12)

On Wed, Oct 25, 2023 at 01:20:52PM -0400, Peter Xu wrote:
> On Wed, Oct 25, 2023 at 04:40:52PM +0100, Daniel P. Berrangé wrote:
> > On Wed, Oct 25, 2023 at 11:36:27AM -0400, Peter Xu wrote:
> > > On Wed, Oct 25, 2023 at 04:25:23PM +0100, Daniel P. Berrangé wrote:
> > > > > Libvirt will still use fixed-ram for live snapshot purpose, especially for
> > > > > Windows?  Then auto-pause may still be useful to identify that from what
> > > > > Fabiano wants to achieve here (which is in reality, non-live)?
> > > > > 
> > > > > IIRC of previous discussion that was the major point that libvirt can still
> > > > > leverage fixed-ram for a live case - since Windows lacks efficient live
> > > > > snapshot (background-snapshot feature).
> > > > 
> > > > Libvirt will use fixed-ram for all APIs it has that involve saving to
> > > > disk, with CPUs both running and paused.
> > > 
> > > There are still two scenarios.  How should we identify them, then?  For
> > > sure we can always make it live, but QEMU needs that information to make it
> > > efficient for non-live.
> > > 
> > > Considering when there's no auto-pause, then Libvirt will still need to
> > > know the scenario first then to decide whether pausing VM before migration
> > > or do nothing, am I right?
> > 
> > libvirt will issue a 'stop' before invoking 'migrate' if it
> > needs to. QEMU should be able to optimize that scenario if
> > it sees CPUs already stopped when migrate is started ?
> > 
> > > If so, can Libvirt replace that "pause VM" operation with setting
> > > auto-pause=on here?  Again, the benefit is QEMU can benefit from it.
> > > 
> > > I think when pausing Libvirt can still receive an event, then it can
> > > cooperate with state changes?  Meanwhile auto-pause=on will be set by
> > > Libvirt too, so Libvirt will even have that expectation that QMP migrate
> > > later on will pause the VM.
> > > 
> > > > 
> > > > > From that POV it sounds like auto-pause is a good knob for that.
> > > > 
> > > > From libvirt's POV auto-pause will create extra work for integration
> > > > for no gain.
> > > 
> > > Yes, I agree for Libvirt there's no gain, as the gain is on QEMU's side.
> > > Could you elaborate what is the complexity for Libvirt to support it?
> > 
> > It increases the code paths because we will have to support
> > and test different behaviour wrt CPU state for fixed-ram
> > vs non-fixed ram usage.
> 
> To me if the user scenario is different, it makes sense to have a flag
> showing what the user wants to do.
> 
> Guessing that from "whether VM is running or not" could work in many cases
> but not all.
> 
> It means at least for dirty tracking, we only have one option to make it
> fully transparent, starting dirty tracking when VM starts during such
> migration.  The complexity moves from Libvirt into migration / kvm from
> this aspect.

Even with auto-pause we can't skip dirty tracking, as we don't
guarantee the app won't run 'cont' at some point.

We could have an explicit capability 'dirty-tracking' which an app
could set as an explicit "promise" that it won't ever need to
(re)start CPUs while migration is running.  If dirty-tracking==no,
then any attempt to run 'cont' should return a hard error while
migration is running.
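
A minimal sketch of those semantics, purely illustrative: the struct,
the field names and vm_cont() are hypothetical and are not existing
QEMU code or API. It only models the rule that 'cont' is refused while
migration is running if the app promised dirty-tracking=no, which is
what would make it safe to skip dirty tracking.

```c
#include <stdbool.h>

/* Hypothetical model of the state involved; not QEMU's real types. */
struct vm_state {
    bool cpus_running;      /* are vCPUs currently running? */
    bool migration_active;  /* is a migration in progress? */
    bool dirty_tracking;    /* the proposed capability (assumed name) */
};

/*
 * Hypothetical handler for 'cont'.
 * Returns 0 on success, -1 for the hard error described above.
 */
static int vm_cont(struct vm_state *s)
{
    if (s->migration_active && !s->dirty_tracking) {
        /* The app promised not to (re)start CPUs during migration. */
        return -1;
    }
    s->cpus_running = true;
    return 0;
}

/*
 * Dirty tracking can only be skipped when the capability is off,
 * because only then is 'cont' guaranteed to fail during migration.
 */
static bool need_dirty_tracking(const struct vm_state *s)
{
    return s->dirty_tracking;
}
```

With dirty-tracking left on, 'cont' keeps working during migration and
tracking stays enabled, i.e. today's behaviour is unchanged.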

> Meanwhile we lose some other potential optimizations for good, early
> releasing of resources will never be possible anymore because they need to
> be prepared to be reused very soon, even if we know they will never.  But
> maybe that's not a major concern.

What resources can we release early, without harming our ability to
restart the current QEMU on failure?

> No strong opinion from my side.  I'll leave it to Fabiano.  I didn't see
> any further optimization yet with the new cap in this series.  I think the
> trick is current extra overheads are just not high enough for us to
> care.. even if we know some work is pure overhead.  Then indeed we can also
> postpone the optimizations until justified worthwhile.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



