RE: [PATCH 07/10] Disable auto-coverge before entering COLO mode.
From: Rao, Lei
Subject: RE: [PATCH 07/10] Disable auto-coverge before entering COLO mode.
Date: Thu, 14 Jan 2021 03:21:51 +0000
I think there is a difference between doing checkpoints in COLO and doing
live migration.
The purpose of auto-converge is to ensure that live migration succeeds even
when the guest dirties pages faster than the data can be transferred.
For COLO, however, we force the VM to stop while a checkpoint is being taken.
That alone ensures the checkpoint succeeds, and it has nothing to do with
auto-converge.
Thanks,
Lei.
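
To make the distinction above concrete, here is a minimal sketch in C. It is a toy model, not QEMU code: the function names, the linear throttling model, and the page counts are all assumptions made for illustration. It shows that a pre-copy pass only shrinks the pending set when the dirty rate is below the transfer rate, while a stopped guest (the COLO checkpoint case) leaves nothing pending after one pass.

```c
#include <assert.h>

/*
 * Toy model (hypothetical, not QEMU code).
 * Sending `pending` pages takes pending / bandwidth seconds; during that
 * time a running guest dirties dirty_rate pages per second.
 */
double next_pending(double pending, double dirty_rate, double bandwidth)
{
    assert(bandwidth > 0.0);
    return pending / bandwidth * dirty_rate;
}

/*
 * Assumed linear model of CPU throttling: a guest throttled by
 * throttle_pct percent dirties pages proportionally more slowly.
 */
double throttled_dirty_rate(double dirty_rate, int throttle_pct)
{
    assert(throttle_pct >= 0 && throttle_pct <= 100);
    return dirty_rate * (100 - throttle_pct) / 100.0;
}

/* COLO checkpoint case: the guest is stopped, so it dirties nothing. */
double next_pending_vm_stopped(double pending, double bandwidth)
{
    return next_pending(pending, 0.0, bandwidth);
}
```

In this model, live migration with a dirty rate of 200 pages/s against a bandwidth of 100 pages/s doubles the pending set on every pass unless throttling lowers the dirty rate, whereas the stopped-VM case always finishes in a single pass.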
-----Original Message-----
From: Dr. David Alan Gilbert <dgilbert@redhat.com>
Sent: Wednesday, January 13, 2021 7:32 PM
To: Rao, Lei <lei.rao@intel.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com;
jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com;
qemu-devel@nongnu.org
Subject: Re: [PATCH 07/10] Disable auto-coverge before entering COLO mode.
* leirao (lei.rao@intel.com) wrote:
> From: "Rao, Lei" <lei.rao@intel.com>
>
> If we don't disable the feature of auto-converge for live migration
> before entering COLO mode, it will continue to run with COLO running,
> and eventually the system will hang due to the CPU throttle reaching
> DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
>
> Signed-off-by: Lei Rao <lei.rao@intel.com>
I don't think that's the right answer, because it would seem reasonable to use
auto-converge to ensure that a COLO snapshot succeeded by limiting guest CPU
time. Is the right fix here to reset the state of the auto-converge counters
at the start of each COLO snapshot?
Dave
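
A rough sketch of the two behaviours being discussed, in C. This is a hypothetical model, not the QEMU implementation: the struct, the function names, and the percentages (initial 20, increment 10, cap 99) are assumptions chosen for illustration. It contrasts a throttle that keeps escalating across checkpoints until it hits the cap with one whose state is reset at the start of each checkpoint.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; assumed here, not taken from the patch. */
enum {
    THROTTLE_INITIAL_PCT   = 20,
    THROTTLE_INCREMENT_PCT = 10,
    THROTTLE_MAX_PCT       = 99,
};

/* Hypothetical stand-in for the auto-converge throttle state. */
typedef struct {
    int throttle_pct;   /* 0 means throttling is off */
} ThrottleModel;

/* One escalation step: start at the initial value, then climb to the cap. */
static void throttle_escalate(ThrottleModel *t)
{
    assert(t != NULL);
    if (t->throttle_pct == 0) {
        t->throttle_pct = THROTTLE_INITIAL_PCT;
        return;
    }
    t->throttle_pct += THROTTLE_INCREMENT_PCT;
    if (t->throttle_pct > THROTTLE_MAX_PCT) {
        t->throttle_pct = THROTTLE_MAX_PCT;
    }
}

/* Reset the throttle state, as suggested for the start of a checkpoint. */
static void throttle_reset(ThrottleModel *t)
{
    assert(t != NULL);
    t->throttle_pct = 0;
}

/*
 * Run `checkpoints` rounds with `steps` escalations each and return the
 * final throttle percentage. If reset_each_time is non-zero, the state is
 * cleared at the start of every round.
 */
int run_checkpoints(int checkpoints, int steps, int reset_each_time)
{
    ThrottleModel t = { 0 };

    for (int c = 0; c < checkpoints; c++) {
        if (reset_each_time) {
            throttle_reset(&t);
        }
        for (int s = 0; s < steps; s++) {
            throttle_escalate(&t);
        }
    }
    return t.throttle_pct;
}
```

With five rounds of three escalations each, the model without a reset ends pinned at the cap, while the model with a per-checkpoint reset never climbs past what a single round can reach.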
> ---
> migration/migration.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 31417ce..6ab37e5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error **errp)
>      qapi_free_MigrationCapabilityStatusList(cap);
>  }
>
> +static void colo_auto_converge_enabled(bool value, Error **errp)
> +{
> +    MigrationCapabilityStatusList *cap = NULL;
> +
> +    if (migrate_colo_enabled() && migrate_auto_converge()) {
> +        QAPI_LIST_PREPEND(cap,
> +                          migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
> +                                          value));
> +        qmp_migrate_set_capabilities(cap, errp);
> +        qapi_free_MigrationCapabilityStatusList(cap);
> +    }
> +    cpu_throttle_stop();
> +}
> +
>  static void migrate_set_block_incremental(MigrationState *s, bool value)
>  {
>      s->parameters.block_incremental = value;
> @@ -3401,7 +3415,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
>  static void migration_iteration_finish(MigrationState *s)
>  {
>      /* If we enabled cpu throttling for auto-converge, turn it off. */
> -    cpu_throttle_stop();
> +    colo_auto_converge_enabled(false, &error_abort);
>
>      qemu_mutex_lock_iothread();
>      switch (s->state) {
> --
> 1.8.3.1
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK