Re: [RFC PATCH 0/5] virtio-net: Introduce LM early load


From: Yajun Wu
Subject: Re: [RFC PATCH 0/5] virtio-net: Introduce LM early load
Date: Wed, 18 Oct 2023 14:40:57 +0800


On 10/18/2023 12:47 AM, Eugenio Perez Martin wrote:


On Mon, Sep 18, 2023 at 6:51 AM Yajun Wu <yajunw@nvidia.com> wrote:
This series of patches aims to minimize the downtime during live migration of a
virtio-net device with a vhost-user backend. In the case of a hardware virtual
Data Path Acceleration (vDPA) implementation, the hardware configuration, which
includes tasks like VQ creation and RSS setting, may take over 200 ms. This
significantly increases the downtime of the VM, particularly in terms of
networking.

Hi!

Sorry, I totally missed this email. Please CC me on future versions.

Just for completion, there is an ongoing plan to reduce the downtime
in vhost-vdpa. You can find more details at [1].

Sending the state periodically is on the roadmap, but some
benchmarking showed that memory pinning and unpinning affects
downtime even more. I'll send an RFC about this soon. The plan was to
continue with iterative state restoring, so I'm happy to know more
people are looking into it!

In the case of vhost-vdpa, it already restores the state by not
enabling the dataplane until migration completes. All the load is
performed using CVQ, as you can see in
net/vhost-vdpa.c:vhost_vdpa_net_load. After that, the dataplane is
started again.
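
For illustration, a minimal sketch of that flow: the dataplane stays off
while the state is replayed through the control virtqueue (CVQ), and the
rings only start afterwards. The helper names (cvq_send_mac, cvq_send_mq,
dataplane_start) are hypothetical stand-ins, not the actual
net/vhost-vdpa.c API:

#include <stdint.h>

/* Hypothetical CVQ helpers: each issues one control command
 * (e.g. VIRTIO_NET_CTRL_MAC_ADDR_SET, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET)
 * and waits for the device's VIRTIO_NET_OK/ERR ack. */
extern int cvq_send_mac(const uint8_t mac[6]);
extern int cvq_send_mq(uint16_t pairs);
extern void dataplane_start(void);

static int net_load_then_start(const uint8_t mac[6], uint16_t pairs)
{
    int r;

    /* The dataplane is still disabled here, so these commands cannot
     * race with guest traffic. */
    r = cvq_send_mac(mac);
    if (r < 0) {
        return r;
    }
    r = cvq_send_mq(pairs);
    if (r < 0) {
        return r;
    }

    /* Only after the whole state is restored do the data rings start. */
    dataplane_start();
    return 0;
}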

My idea is to start vhost-vdpa (by calling vhost_vdpa_dev_start) at
the destination at the moment migration starts, as it will not have
the dataplane enabled. After that, the source should send the
virtio-net vmstate every time it changes. vhost-vdpa net is able to
send and receive through CVQ, so it should be able to modify the net
device configuration as many times as needed. I guess that could be
done by calling something along the lines of your
vhost_user_set_presetup_state.
This is a very good approach. How do you know when the virtio-net vmstate changes? vhost-user and vhost-vdpa should share the same code for early virtio-net vmstate sync.

This can be improved in vhost-vdpa by being able to send only the new state.

When the migration completes, the vhost-vdpa net dataplane should
start as it does now.

If you are interested in avoiding changes to the vhost-user protocol,
maybe QEMU could just disable the dataplane too with
VHOST_USER_SET_VRING_ENABLE? If not, I think both approaches have a
lot in common, so I'm sure we can develop one backend on top of the
other.
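
A minimal sketch of this alternative, assuming the existing backend op
is reused; the wrapper itself (vhost_dev_set_dataplane) is hypothetical,
not an existing helper:

#include "qemu/osdep.h"
#include "hw/virtio/vhost.h"
#include "hw/virtio/vhost-backend.h"

/* Disable (enable == 0) or re-enable (enable == 1) the rings of one
 * vhost device, so state can be loaded while the dataplane is quiet.
 * For vhost-user backends this is carried by the existing
 * VHOST_USER_SET_VRING_ENABLE message. */
static int vhost_dev_set_dataplane(struct vhost_dev *dev, int enable)
{
    if (!dev->vhost_ops->vhost_set_vring_enable) {
        return -ENOTSUP;
    }
    return dev->vhost_ops->vhost_set_vring_enable(dev, enable);
}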

Thanks!

[1] https://lists.gnu.org/archive/html/qemu-devel/2023-04/msg00659.html

I'm afraid it is just like DRIVER_OK, which serves as a hint for vhost-user vDPA to apply all the configuration to HW. vhost-user also needs the same kind of hint at the end of each round of vmstate sync to apply the configuration to HW. That's why I need to define a new protocol message.

Because MQ can also change, VQ enable is a valid parameter to HW. HW will create only the enabled queues, and the number of enabled queues affects the RSS setting.
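
To make the shape of such a hint concrete, here is a sketch of what the
start/end notification could look like. The message name comes from this
series, but the numeric values and encoding below are illustrative
assumptions, not the wire format from the patches:

/* A new vhost-user message carrying a presetup phase marker.  The END
 * phase plays the role DRIVER_OK plays today: it tells the backend to
 * apply the accumulated configuration to HW. */
typedef enum VhostUserPresetupPhase {
    VHOST_USER_PRESETUP_START = 1, /* begin a round of early config */
    VHOST_USER_PRESETUP_END   = 2, /* apply the round to HW         */
} VhostUserPresetupPhase;

Each round of vmstate sync would then be bracketed by a START/END pair,
and the backend only touches HW on END.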



To reduce the VM downtime, the proposed approach captures the basic device
state/configuration during the VM's running stage and performs the initial
device configuration (presetup). During the normal configuration process,
when the VM is in a stopped state, the second configuration is compared to
the first one and only the differences are applied, reducing downtime.
Ideally, only the vring available index needs to be changed while the VM is
stopped.
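
A minimal sketch of that diff-apply step, under the assumption that the
per-vring configuration is compared field by field; all names below are
illustrative, not from the patches:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vring_cfg {
    uint64_t desc, avail, used; /* ring addresses       */
    uint16_t num;               /* ring size            */
    uint16_t avail_idx;         /* last available index */
};

/* Hypothetical HW hooks. */
extern void hw_program_vring(int vq, const struct vring_cfg *c);
extern void hw_set_avail_idx(int vq, uint16_t idx);

static void apply_final_config(int vq, const struct vring_cfg *presetup,
                               const struct vring_cfg *final)
{
    /* Compare everything up to (but not including) the avail index. */
    if (memcmp(presetup, final, offsetof(struct vring_cfg, avail_idx)) != 0) {
        /* Something besides the avail index changed: reprogram fully. */
        hw_program_vring(vq, final);
    } else if (presetup->avail_idx != final->avail_idx) {
        /* Common case: only the avail index moved while the VM ran. */
        hw_set_avail_idx(vq, final->avail_idx);
    }
}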

This feature is disabled by default, because backends like DPDK also need to
add support for the new vhost message. The new device property
"x-early-migration" enables this feature.

1. Register a new vmstate for virtio-net with an early_setup flag to send the
    device state during migration setup (a sketch follows this list).
2. After the device state is loaded on the destination VM, the device status
    needs to be sent to the vhost backend in a new way. Introduce a new
    vhost-user message, VHOST_USER_PRESETUP, to notify the backend of presetup.
3. Let virtio-net, vhost-net, and vhost-dev support presetup. Main flow:
    a. vhost-dev sends presetup start.
    b. virtio-net sets the MTU.
    c. vhost-dev sends the vring configuration and sets dummy call/kick fds.
    d. vhost-net sends vring enable.
    e. vhost-dev sends presetup end.
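
Step 1 can be illustrated with a sketch modeled on how virtio-mem's
existing "x-early-migration" property is wired up in QEMU: a
VMStateDescription marked with .early_setup = true is transmitted during
migration setup, while the source VM is still running. Everything here
other than the .early_setup flag is an illustrative placeholder:

#include "qemu/osdep.h"
#include "migration/vmstate.h"

static const VMStateDescription vmstate_virtio_net_early = {
    .name = "virtio-net-early",
    .version_id = 1,
    .minimum_version_id = 1,
    .early_setup = true,          /* sent during migration setup */
    .fields = (VMStateField[]) {
        /* The basic device state captured while the VM runs (MTU,
         * vring layout, MQ setting, ...) would be listed here. */
        VMSTATE_END_OF_LIST()
    },
};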


TODOs:
======
   - No vhost-vdpa/kernel support. A new kernel interface needs to be
     discussed/designed if vhost-vdpa has the same requirement.

   - No vIOMMU support so far. If there is a need for vIOMMU support, it is
     planned to be addressed in a follow-up patchset.


Test:
=====
   - Live migrating a VM with 2 virtio-net devices; ping recovers.
     Tested together with the DPDK patch [1].
   - The time consumed by the DPDK function dev_conf is reduced from 191.4 ms
     to 6.6 ms.


References:
===========

[1] https://github.com/Mellanox/dpdk-vhost-vfe/pull/37

Any comments or feedback are highly appreciated.

Thanks,
Yajun


Yajun Wu (5):
   vhost-user: Add presetup protocol feature and op
   vhost: Add support for presetup
   vhost-net: Add support for presetup
   virtio: Add VMState for early load
   virtio-net: Introduce LM early load

  docs/interop/vhost-user.rst       |  10 ++
  hw/net/trace-events               |   1 +
  hw/net/vhost_net.c                |  40 +++++++
  hw/net/virtio-net.c               | 100 ++++++++++++++++++
  hw/virtio/vhost-user.c            |  30 ++++++
  hw/virtio/vhost.c                 | 166 +++++++++++++++++++++++++-----
  hw/virtio/virtio.c                | 152 ++++++++++++++++-----------
  include/hw/virtio/vhost-backend.h |   3 +
  include/hw/virtio/vhost.h         |  12 +++
  include/hw/virtio/virtio-net.h    |   1 +
  include/hw/virtio/virtio.h        |  10 +-
  include/net/vhost_net.h           |   3 +
  12 files changed, 443 insertions(+), 85 deletions(-)

--
2.27.0




