
Re: [RFC 5/8] vdpa: Add vdpa memory listener


From: Eugenio Perez Martin
Subject: Re: [RFC 5/8] vdpa: Add vdpa memory listener
Date: Fri, 19 Aug 2022 12:34:38 +0200

On Fri, Aug 19, 2022 at 11:01 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Aug 19, 2022 at 4:30 PM Eugenio Perez Martin
> <eperezma@redhat.com> wrote:
> >
> > On Fri, Aug 19, 2022 at 8:29 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Thu, Aug 11, 2022 at 2:42 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> > > >
> > > > This enables net/vdpa to restart the full device when a migration is
> > > > started or stopped.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >
> > > I have the following questions
> > >
> > > 1) any reason that we need to make this net specific? The dirty page
> > > tracking via shadow virtqueue is pretty general. And the net specific
> > > part was done via NetClientInfo anyhow.
> >
> > The listener is only used to know when migration is started / stopped,
> > no need for actual memory tracking. Maybe there is a better way to do
> > so?
>
> Not sure, SaveVMHandlers?
>

I'm fine with investigating this, but the only mention in the docs
describes it as the "legacy way". I assume the "modern way" is through
VMStateDescription, which already lives in virtio-net.

The "pre_save" member already assumes the vhost backend is stopped, so
I'm not sure that approach is valid here.
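
To make that concrete, here is a rough sketch (hypothetical, nothing
like it exists in the tree, names are made up) of the shape such a hook
would take, and why it comes too late for us:

#include "qemu/osdep.h"
#include "migration/vmstate.h"

static int vhost_vdpa_net_pre_save(void *opaque)
{
    /*
     * Too late for our purposes: pre_save runs while the device state is
     * being saved, i.e. after the vhost backend has already been stopped,
     * so there is no running datapath left to switch to SVQ.
     */
    return 0;
}

/* Hypothetical description, only to illustrate where the hook would live */
static const VMStateDescription vmstate_vhost_vdpa_net_hook = {
    .name = "vhost-vdpa-net-migration-hook",
    .version_id = 1,
    .minimum_version_id = 1,
    .pre_save = vhost_vdpa_net_pre_save,
    .fields = (VMStateField[]) {
        VMSTATE_END_OF_LIST()
    },
};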

> >
> > It's net specific because we are restarting the whole vhost_vdpa
> > backend. We could do it inside hw/virtio/vhost_vdpa.c (previous POCs
> > did it that way), but that is a little more complicated in my opinion.
> > Done that way, the setting of _F_LOG had to be detected there, and the
> > device reset and set up again from that point.
>
> Can we still have a general vhost-vdpa one and introduce net specific
> callbacks? Otherwise the block may have its own listener.
>

If we reuse the vhost-vdpa listener, we cannot unregister it and
register it again from within its own log_global_start / stop
callbacks. We would need to keep tracking all of the memory at all
times, so we can map it again using QEMU's IOVA when switching to SVQ.
We would also need to duplicate vhost_vdpa_dev_start / stop without the
memory-registration calls.

On the other hand, it would move the SVQ solution farther away from
being "a start parameter of the vhost-vdpa backend". Instead of being
an argument of vhost_vdpa initialization, it becomes a variable that
can switch state at any moment. That implies extra code in vhost-vdpa
too.
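
To illustrate the first point, here is a rough sketch (not part of this
series) of the remap work that the generic listener's log_global_start /
stop would have to trigger. The helper and the saved mappings array are
hypothetical; vhost_vdpa_dma_map() / vhost_vdpa_dma_unmap() are real, and
the allocation of the new SVQ IOVA is elided:

#include "qemu/osdep.h"
#include "qemu/iova-tree.h"
#include "hw/virtio/vhost-vdpa.h"

static void vhost_vdpa_remap_all_for_svq(struct vhost_vdpa *v,
                                         const DMAMap *mappings, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const DMAMap *m = &mappings[i];

        /* Drop the GPA-based mapping installed by the listener... */
        vhost_vdpa_dma_unmap(v, m->iova, m->size + 1);

        /*
         * ...and map the same host memory again. A real implementation
         * would allocate a fresh IOVA from the SVQ iova tree here instead
         * of reusing m->iova.
         */
        vhost_vdpa_dma_map(v, m->iova, m->size + 1,
                           (void *)(uintptr_t)m->translated_addr, false);
    }
}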

> >
> >
> > > 2) any reason we can't re-use the vhost-vdpa listener?
> > >
> >
> > At this moment, the whole vhost_vdpa backend is restarted. That implies
> > that the listener will be unregistered and registered again.
> >
> > If we use that listener, it needs either support for unregistering
> > itself, or to store all entries in the iova tree so we can iterate
> > them, unmapping and mapping them again.
>
> Ok, but let's double check whether or not we have another choice.

Sure!

> Using a dedicated listener to know if migration is started or not
> seems too heavyweight.
>

Take into account that memory.c does not call callbacks that are not
set. Although this new memory listener is registered per vdpa device
(not per queue pair), it is skipped entirely when memory is added or
removed.

In that regard, registering the hdev->memory_listener of every queue
pair + cvq would be more expensive than adding this one, and those
listeners have no use in vdpa.
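
As a minimal sketch of that (hypothetical example_* names, essentially
mirroring what the patch does): a listener that only fills in the two
global dirty-log callbacks is never invoked for region add/remove or
log_sync, because memory.c skips callbacks that are left NULL.

#include "qemu/osdep.h"
#include "exec/memory.h"
#include "exec/address-spaces.h"

static void example_log_global_start(MemoryListener *listener)
{
    /* Migration started: this is where the datapath switches to SVQ */
}

static void example_log_global_stop(MemoryListener *listener)
{
    /* Migration finished or cancelled: switch back to passthrough */
}

/* region_add, region_del, log_sync, ... stay NULL and are never called */
static MemoryListener example_listener = {
    .log_global_start = example_log_global_start,
    .log_global_stop = example_log_global_stop,
    /* the patch also sets a priority below the vhost listener's */
};

static void example_register(void)
{
    memory_listener_register(&example_listener, &address_space_memory);
}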

Thanks!

> Thanks
>
> >
> > > (Anyway, it's better to explain the reason in detail in the changelog.)
> > >
> >
> > Sure, I can expand with this.
> >
> > Thanks!
> >
> > > Thanks
> > >
> > > > ---
> > > >  net/vhost-vdpa.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 87 insertions(+)
> > > >
> > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > index a035c89c34..4c6947feb8 100644
> > > > --- a/net/vhost-vdpa.c
> > > > +++ b/net/vhost-vdpa.c
> > > > @@ -21,6 +21,7 @@
> > > >  #include "qemu/memalign.h"
> > > >  #include "qemu/option.h"
> > > >  #include "qapi/error.h"
> > > > +#include "exec/address-spaces.h"
> > > >  #include <linux/vhost.h>
> > > >  #include <sys/ioctl.h>
> > > >  #include <err.h>
> > > > @@ -32,6 +33,8 @@
> > > >  typedef struct VhostVDPAState {
> > > >      NetClientState nc;
> > > >      struct vhost_vdpa vhost_vdpa;
> > > > +    MemoryListener memory_listener;
> > > > +
> > > >      VHostNetState *vhost_net;
> > > >
> > > >      /* Control commands shadow buffers */
> > > > @@ -110,6 +113,16 @@ static const uint64_t vdpa_svq_device_features =
> > > >  #define VHOST_VDPA_NET_CVQ_PASSTHROUGH 0
> > > >  #define VHOST_VDPA_NET_CVQ_ASID 1
> > > >
> > > > +/*
> > > > + * Vdpa memory listener must run before vhost one, so vhost_vdpa does not get
> > > > + * _F_LOG_ALL without SVQ.
> > > > + */
> > > > +#define VHOST_VDPA_NET_MEMORY_LISTENER_PRIORITY \
> > > > +                                       (VHOST_DEV_MEMORY_LISTENER_PRIORITY - 1)
> > > > +/* Check for underflow */
> > > > +QEMU_BUILD_BUG_ON(VHOST_DEV_MEMORY_LISTENER_PRIORITY <
> > > > +                  VHOST_VDPA_NET_MEMORY_LISTENER_PRIORITY);
> > > > +
> > > >  VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
> > > >  {
> > > >      VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > > > @@ -172,6 +185,9 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
> > > >
> > > >      qemu_vfree(s->cvq_cmd_out_buffer);
> > > >      qemu_vfree(s->cvq_cmd_in_buffer);
> > > > +    if (dev->vq_index == 0) {
> > > > +        memory_listener_unregister(&s->memory_listener);
> > > > +    }
> > > >      if (dev->vq_index + dev->nvqs == dev->vq_index_end) {
> > > >          g_clear_pointer(&s->vhost_vdpa.iova_tree, vhost_iova_tree_delete);
> > > >      }
> > > > @@ -224,6 +240,69 @@ static ssize_t vhost_vdpa_receive(NetClientState *nc, const uint8_t *buf,
> > > >      return 0;
> > > >  }
> > > >
> > > > +static void vhost_vdpa_net_log_global_enable(MemoryListener *listener,
> > > > +                                             bool enable)
> > > > +{
> > > > +    VhostVDPAState *s = container_of(listener, VhostVDPAState,
> > > > +                                     memory_listener);
> > > > +    struct vhost_vdpa *v = &s->vhost_vdpa;
> > > > +    VirtIONet *n;
> > > > +    VirtIODevice *vdev;
> > > > +    int data_queue_pairs, cvq, r;
> > > > +    NetClientState *peer;
> > > > +
> > > > +    if (s->always_svq || s->log_enabled == enable) {
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    s->log_enabled = enable;
> > > > +    vdev = v->dev->vdev;
> > > > +    n = VIRTIO_NET(vdev);
> > > > +    if (!n->vhost_started) {
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    if (enable) {
> > > > +        ioctl(v->device_fd, VHOST_VDPA_SUSPEND);
> > > > +    }
> > > > +    data_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> > > > +    cvq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ) ?
> > > > +                                  n->max_ncs - n->max_queue_pairs : 0;
> > > > +    vhost_net_stop(vdev, n->nic->ncs, data_queue_pairs, cvq);
> > > > +
> > > > +    peer = s->nc.peer;
> > > > +    for (int i = 0; i < data_queue_pairs + cvq; i++) {
> > > > +        VhostVDPAState *vdpa_state;
> > > > +        NetClientState *nc;
> > > > +
> > > > +        if (i < data_queue_pairs) {
> > > > +            nc = qemu_get_peer(peer, i);
> > > > +        } else {
> > > > +            nc = qemu_get_peer(peer, n->max_queue_pairs);
> > > > +        }
> > > > +
> > > > +        vdpa_state = DO_UPCAST(VhostVDPAState, nc, nc);
> > > > +        vdpa_state->vhost_vdpa.listener_shadow_vq = enable;
> > > > +        vdpa_state->vhost_vdpa.shadow_vqs_enabled = enable;
> > > > +        vdpa_state->log_enabled = enable;
> > > > +    }
> > > > +
> > > > +    r = vhost_net_start(vdev, n->nic->ncs, data_queue_pairs, cvq);
> > > > +    if (unlikely(r < 0)) {
> > > > +        error_report("unable to start vhost net: %s(%d)", g_strerror(-r), -r);
> > > > +    }
> > > > +}
> > > > +
> > > > +static void vhost_vdpa_net_log_global_start(MemoryListener *listener)
> > > > +{
> > > > +    vhost_vdpa_net_log_global_enable(listener, true);
> > > > +}
> > > > +
> > > > +static void vhost_vdpa_net_log_global_stop(MemoryListener *listener)
> > > > +{
> > > > +    vhost_vdpa_net_log_global_enable(listener, false);
> > > > +}
> > > > +
> > > >  static NetClientInfo net_vhost_vdpa_info = {
> > > >          .type = NET_CLIENT_DRIVER_VHOST_VDPA,
> > > >          .size = sizeof(VhostVDPAState),
> > > > @@ -413,6 +492,7 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
> > > >
> > > >      assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > > >
> > > > +    memory_listener_unregister(&s->memory_listener);
> > > >      if (s->vhost_vdpa.shadow_vqs_enabled) {
> > > >          vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
> > > >          vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_in_buffer);
> > > > @@ -671,6 +751,13 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
> > > >      s->vhost_vdpa.shadow_vqs_enabled = svq;
> > > >      s->vhost_vdpa.listener_shadow_vq = svq;
> > > >      s->vhost_vdpa.iova_tree = iova_tree;
> > > > +    if (queue_pair_index == 0) {
> > > > +        s->memory_listener = (MemoryListener) {
> > > > +            .log_global_start = vhost_vdpa_net_log_global_start,
> > > > +            .log_global_stop = vhost_vdpa_net_log_global_stop,
> > > > +        };
> > > > +        memory_listener_register(&s->memory_listener, &address_space_memory);
> > > > +    }
> > > >      if (!is_datapath) {
> > > >          s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
> > > >                                              vhost_vdpa_net_cvq_cmd_page_len());
> > > > --
> > > > 2.31.1
> > > >
> > >
> >
>



