qemu-devel
Re: [PATCH] vfio/migrate: Move switch of dirty tracking into vfio_memory_listener


From: Kirti Wankhede
Subject: Re: [PATCH] vfio/migrate: Move switch of dirty tracking into vfio_memory_listener
Date: Thu, 28 Jan 2021 02:33:55 +0530
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.12.1



On 1/11/2021 1:04 PM, Keqian Zhu wrote:
> For now the switch of vfio dirty page tracking is integrated into
> the vfio_save_handler, which causes some problems [1].


Sorry, I missed the [1] mail; somehow it didn't land in my inbox.

> The object of dirty tracking is guest memory, but the object of
> the vfio_save_handler is device state. This mixed logic produces
> unnecessary coupling and conflicts:
>
> 1. Coupling: Their saving granularity is different (per-VM vs per-device).
>    vfio will enable dirty_page_tracking for each device, when actually
>    once is enough.

That's correct, enabling dirty page tracking once is enough. But log_start and log_stop get called on address space update transactions, i.e. region_add() or region_del(), and at that point migration may not be active. We don't want to allocate bitmap memory in the kernel for the lifetime of the VM without knowing whether migration will happen or not. The vfio_iommu_type1 module should allocate bitmap memory only while migration is active.

Paolo's suggestion to use the log_global_start and log_global_stop callbacks seems correct here. But at that point the vfio device state has not yet changed to _SAVING, as you had identified in [1]. Maybe we can start tracking the bitmap in the iommu_type1 module while the device is not yet _SAVING, but getting the dirty bitmap while the device is not yet in the _SAVING | _RUNNING state doesn't seem like an optimal solution.
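
As a minimal sketch (not the posted patch; the callback names here are illustrative), wiring those global hooks to the container-wide helper introduced in the diff below could look like this:

static void vfio_listener_log_global_start(MemoryListener *listener)
{
    VFIOContainer *container = container_of(listener, VFIOContainer, listener);

    /* Turn on kernel-side dirty tracking once for the whole container. */
    vfio_set_dirty_page_tracking(container, true);
}

static void vfio_listener_log_global_stop(MemoryListener *listener)
{
    VFIOContainer *container = container_of(listener, VFIOContainer, listener);

    /* Turn it off again when RAM dirty logging stops globally. */
    vfio_set_dirty_page_tracking(container, false);
}

/* Registered in vfio_memory_listener as:
 *     .log_global_start = vfio_listener_log_global_start,
 *     .log_global_stop  = vfio_listener_log_global_stop,
 */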

Pasting your question from [1] here:

> Before start dirty tracking, we will check and ensure that the device
>  is at _SAVING state and return error otherwise.  But the question is
>  that what is the rationale?  Why does the VFIO_IOMMU_DIRTY_PAGES
> ioctl have something to do with the device state?

Let's walk through the types of devices we are supporting:
1. mdev device without an IOMMU-backed device
The vendor driver pins pages as and when required during runtime. We can say the vendor driver is smart, since it identifies which pages to pin. We are good here.

2. mdev device with an IOMMU-backed device
This is similar to vfio-pci, a directly assigned device, where all pages are pinned at VM boot-up. The vendor driver is not smart, so the bitmap query will always report all pages as dirty. If --auto-converge is not set, the VM gets stuck in the pre-copy phase indefinitely. This is known to us.

3. mdev device with an IOMMU-backed device and a smart vendor driver
In this case as well, all pages are pinned at VM boot-up, but the vendor driver is smart enough to identify the pages and pin them explicitly. Pages can be pinned at any time, i.e. during normal VM runtime, on setting the _SAVING flag (entering the pre-copy phase), or while in the iterative pre-copy phase. There is no restriction on calling vfio_pin_pages() based on these phases. The vendor driver can start pinning pages based on its device state when the _SAVING flag is set. In that case, if the dirty bitmap is queried before that, it will report all of system memory as dirty, causing an unnecessary copy of system memory. As an optimal solution, I think it's better to query the bitmap only after all vfio devices are in the pre-copy phase, i.e. after the _SAVING flag is set, as illustrated by the sketch below.
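
As a rough illustration of that ordering (the function name here is hypothetical; it is essentially the check the existing vfio_devices_all_saving() performs), the bitmap query would be gated on every device in the container having entered the pre-copy phase:

static bool vfio_devices_all_in_precopy(VFIOContainer *container)
{
    VFIOGroup *group;
    VFIODevice *vbasedev;

    QLIST_FOREACH(group, &container->group_list, container_next) {
        QLIST_FOREACH(vbasedev, &group->device_list, next) {
            VFIOMigration *migration = vbasedev->migration;

            /* Only report the dirty bitmap once every device has _SAVING
             * set, i.e. all devices are in the pre-copy phase. */
            if (!migration ||
                !(migration->device_state & VFIO_DEVICE_STATE_SAVING)) {
                return false;
            }
        }
    }
    return true;
}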

> 2. Conflicts: The ram_save_setup() traverses all memory_listeners
>    to execute their log_start() and log_sync() hooks to get the
>    first-round dirty bitmap, which is used by the bulk stage of
>    ram saving. However, it can't get the dirty bitmap from vfio, as
>    @savevm_ram_handlers is registered before @vfio_save_handler.

Right, but it can get the dirty bitmap from the vfio device in its iterative callback:
ram_save_pending ->
        migration_bitmap_sync_precopy() .. ->
                 vfio_listerner_log_sync

Thanks,
Kirti

Moving the switch of vfio dirty_page_tracking into vfio_memory_listener
can solve the above problems. Besides, do not require devices to be in the
SAVING state for vfio_sync_dirty_bitmap().

[1] https://www.spinics.net/lists/kvm/msg229967.html

Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
  hw/vfio/common.c    | 53 +++++++++++++++++++++++++++++++++++++--------
  hw/vfio/migration.c | 35 ------------------------------
  2 files changed, 44 insertions(+), 44 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6ff1daa763..9128cd7ee1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -311,7 +311,7 @@ bool vfio_mig_active(void)
      return true;
  }
-static bool vfio_devices_all_saving(VFIOContainer *container)
+static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
  {
      VFIOGroup *group;
      VFIODevice *vbasedev;
@@ -329,13 +329,8 @@ static bool vfio_devices_all_saving(VFIOContainer *container)
                  return false;
              }
-            if (migration->device_state & VFIO_DEVICE_STATE_SAVING) {
-                if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF)
-                    && (migration->device_state & VFIO_DEVICE_STATE_RUNNING)) {
-                        return false;
-                }
-                continue;
-            } else {
+            if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF)
+                && (migration->device_state & VFIO_DEVICE_STATE_RUNNING)) {
                  return false;
              }
          }
@@ -987,6 +982,44 @@ static void vfio_listener_region_del(MemoryListener *listener,
      }
  }
+static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
+{
+    int ret;
+    struct vfio_iommu_type1_dirty_bitmap dirty = {
+        .argsz = sizeof(dirty),
+    };
+
+    if (start) {
+        dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_START;
+    } else {
+        dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP;
+    }
+
+    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
+    if (ret) {
+        error_report("Failed to set dirty tracking flag 0x%x errno: %d",
+                     dirty.flags, errno);
+    }
+}
+
+static void vfio_listener_log_start(MemoryListener *listener,
+                                    MemoryRegionSection *section,
+                                    int old, int new)
+{
+    VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+
+    vfio_set_dirty_page_tracking(container, true);
+}
+
+static void vfio_listener_log_stop(MemoryListener *listener,
+                                   MemoryRegionSection *section,
+                                   int old, int new)
+{
+    VFIOContainer *container = container_of(listener, VFIOContainer, listener);
+
+    vfio_set_dirty_page_tracking(container, false);
+}
+
  static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova,
                                   uint64_t size, ram_addr_t ram_addr)
  {
@@ -1128,7 +1161,7 @@ static void vfio_listerner_log_sync(MemoryListener *listener,
          return;
      }
-    if (vfio_devices_all_saving(container)) {
+    if (vfio_devices_all_dirty_tracking(container)) {
          vfio_sync_dirty_bitmap(container, section);
      }
  }
@@ -1136,6 +1169,8 @@ static void vfio_listerner_log_sync(MemoryListener *listener,
  static const MemoryListener vfio_memory_listener = {
      .region_add = vfio_listener_region_add,
      .region_del = vfio_listener_region_del,
+    .log_start = vfio_listener_log_start,
+    .log_stop = vfio_listener_log_stop,
      .log_sync = vfio_listerner_log_sync,
  };
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 00daa50ed8..c0f646823a 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -395,40 +395,10 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque)
      return qemu_file_get_error(f);
  }
-static int vfio_set_dirty_page_tracking(VFIODevice *vbasedev, bool start)
-{
-    int ret;
-    VFIOMigration *migration = vbasedev->migration;
-    VFIOContainer *container = vbasedev->group->container;
-    struct vfio_iommu_type1_dirty_bitmap dirty = {
-        .argsz = sizeof(dirty),
-    };
-
-    if (start) {
-        if (migration->device_state & VFIO_DEVICE_STATE_SAVING) {
-            dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_START;
-        } else {
-            return -EINVAL;
-        }
-    } else {
-            dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP;
-    }
-
-    ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty);
-    if (ret) {
-        error_report("Failed to set dirty tracking flag 0x%x errno: %d",
-                     dirty.flags, errno);
-        return -errno;
-    }
-    return ret;
-}
-
  static void vfio_migration_cleanup(VFIODevice *vbasedev)
  {
      VFIOMigration *migration = vbasedev->migration;
-    vfio_set_dirty_page_tracking(vbasedev, false);
-
      if (migration->region.mmaps) {
          vfio_region_unmap(&migration->region);
      }
@@ -469,11 +439,6 @@ static int vfio_save_setup(QEMUFile *f, void *opaque)
          return ret;
      }
-    ret = vfio_set_dirty_page_tracking(vbasedev, true);
-    if (ret) {
-        return ret;
-    }
-
      qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
      ret = qemu_file_get_error(f);



