
Re: [PATCH 0/7] vhost-vdpa multiqueue fixes


From: Jason Wang
Subject: Re: [PATCH 0/7] vhost-vdpa multiqueue fixes
Date: Thu, 5 May 2022 16:40:39 +0800

On Sat, Apr 30, 2022 at 10:07 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 4/28/2022 7:30 PM, Jason Wang wrote:
> > On Wed, Apr 27, 2022 at 5:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>
> >>
> >> On 4/27/2022 1:38 AM, Jason Wang wrote:
> >>> On Wed, Apr 27, 2022 at 4:30 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>>>
> >>>> On 4/26/2022 9:28 PM, Jason Wang wrote:
> >>>>> On 2022/3/30 14:33, Si-Wei Liu wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> This patch series attempts to fix a few issues in vhost-vdpa
> >>>>>> multiqueue functionality.
> >>>>>>
> >>>>>> Patch #1 is the formal submission of the RFC patch in:
> >>>>>> https://lore.kernel.org/qemu-devel/c3e931ee-1a1b-9c2f-2f59-cb4395c230f9@oracle.com/
> >>>>>>
> >>>>>> Patches #2 and #3 were taken from a previous patchset posted on
> >>>>>> qemu-devel:
> >>>>>> https://lore.kernel.org/qemu-devel/20211117192851.65529-1-eperezma@redhat.com/
> >>>>>>
> >>>>>> Although that set was abandoned, two of its patches turn out to be
> >>>>>> useful for patch #4, which fixes a QEMU crash due to a race condition.
> >>>>>>
> >>>>>> Patches #5 through #7 are small, straightforward bug fixes. Please find
> >>>>>> the description of each in the commit log.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> -Siwei
> >>>>> Hi Si-Wei:
> >>>>>
> >>>>> I think we need another version of this series?
> >>>> Hi Jason,
> >>>>
> >>>> Apologies for the long delay. I was in the middle of reworking the patch
> >>>> "virtio: don't read pending event on host notifier if disabled", but
> >>>> found that it would need quite a few code changes for the userspace
> >>>> fallback handler to work properly (specifically for the queue number
> >>>> change case).
> >>> We probably need this fix for -stable, so I wonder if we can have a
> >>> workaround first and do refactoring on top?
> >> Hmmm, a nasty fix that may well address the segfault is something like this:
> >>
> >> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >> index 8ca0b80..3ac93a4 100644
> >> --- a/hw/net/virtio-net.c
> >> +++ b/hw/net/virtio-net.c
> >> @@ -368,6 +368,10 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
> >>        int i;
> >>        uint8_t queue_status;
> >>
> >> +    if (n->status_pending)
> >> +        return;
> >> +
> >> +    n->status_pending = true;
> >>        virtio_net_vnet_endian_status(n, status);
> >>        virtio_net_vhost_status(n, status);
> >>
> >> @@ -416,6 +420,7 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
> >>                }
> >>            }
> >>        }
> >> +    n->status_pending = false;
> >>    }
> >>
> >>    static void virtio_net_set_link_status(NetClientState *nc)
> >> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> >> index eb87032..95efea8 100644
> >> --- a/include/hw/virtio/virtio-net.h
> >> +++ b/include/hw/virtio/virtio-net.h
> >> @@ -216,6 +216,7 @@ struct VirtIONet {
> >>        VirtioNetRssData rss_data;
> >>        struct NetRxPkt *rx_pkt;
> >>        struct EBPFRSSContext ebpf_rss;
> >> +    bool status_pending;
> >>    };
> >>
> >>    void virtio_net_set_netclient_name(VirtIONet *n, const char *name,
> >>
> >> To be honest, I am not sure if this is worth a full-blown fix to make it
> >> completely work. Without applying the vq suspend patch (the one I posted in
> >> https://lore.kernel.org/qemu-devel/df7c9a87-b2bd-7758-a6b6-bd834a7336fe@oracle.com/),
> >> it's very hard for me to verify my code change effectively - the guest
> >> vq index easily gets out of sync if the vq is not stopped once the vhost
> >> is up and running (I tested it by repeatedly toggling set_link off and on).
> > Can we test via vmstop?
> Yes, of course, that's where the segfault happened. The tight loop of
> set_link on/off doesn't even work for the single queue case, which is
> why I doubted it ever worked for vhost-vdpa.

Right, this is something we need to check. set_link should stop the
vhost device anyhow; otherwise it's a bug.
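
As a side note, both tests - the tight set_link toggle loop and vmstop -
can be driven over QMP. The following minimal Python sketch is illustrative
only (not from the original patches); the socket path /tmp/qmp.sock and the
net client name "net0" are assumptions, so adjust them to match how QEMU was
started (e.g. -qmp unix:/tmp/qmp.sock,server=on,wait=off):

    import json
    import socket

    sock = socket.socket(socket.AF_UNIX)
    sock.connect("/tmp/qmp.sock")
    rfile = sock.makefile("r")

    def qmp(cmd, **args):
        sock.sendall((json.dumps({"execute": cmd, "arguments": args}) + "\n").encode())
        # QEMU emits one JSON object per line; skip async events until the reply.
        while True:
            msg = json.loads(rfile.readline())
            if "return" in msg or "error" in msg:
                return msg

    rfile.readline()           # discard the QMP greeting banner
    qmp("qmp_capabilities")    # leave capabilities negotiation mode

    for _ in range(1000):      # tight set_link off/on loop, as described above
        qmp("set_link", name="net0", up=False)
        qmp("set_link", name="net0", up=True)

    qmp("stop")                # the vmstop path can be poked the same way
    qmp("cont")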

>
> >
> >> I am not sure if there's a real chance we can run into this issue in
> >> practice due to the incomplete fix, if we don't fix the vq stop/suspend
> >> issue first. Anyway I will try, as other use cases, e.g. live migration,
> >> are likely to stumble on it, too.
> > Ok, so I think we probably don't need the "nasty" fix above. Let's fix
> > it together with the stop/resume issue.
> Ok, then does the tentative code change below suffice? I.e. it would
> fail requests to change the number of queue pairs when the vhost-vdpa
> backend falls back to the userspace handler, but it's probably the
> easiest way to fix the vmstop segfault.

Probably, let's see.

Thanks

>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index ed231f9..8ba9f09 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1177,6 +1177,7 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
>       struct virtio_net_ctrl_mq mq;
>       size_t s;
>       uint16_t queue_pairs;
> +    NetClientState *nc = qemu_get_queue(n->nic);
>
>       s = iov_to_buf(iov, iov_cnt, 0, &mq, sizeof(mq));
>       if (s != sizeof(mq)) {
> @@ -1196,6 +1197,13 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
>           return VIRTIO_NET_ERR;
>       }
>
> +    /* avoid changing the number of queue_pairs for a vdpa device in
> +     * the userspace handler.
> +     * TODO: get the userspace fallback to work with a future patch */
> +    if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> +        return VIRTIO_NET_ERR;
> +    }
> +
>       n->curr_queue_pairs = queue_pairs;
>       /* stop the backend before changing the number of queue_pairs to avoid handling a
>        * disabled queue */
>
> Thanks,
> -Siwei
> >
> > Thanks
> >
> >> -Siwei
> >>
> >>
> >>>> I have to drop it from the series and post it later when ready.
> >>>> Will post a v2 with the relevant patches removed.
> >>> Thanks
> >>>
> >>>> Regards,
> >>>> -Siwei
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>>
> >>>>>> ---
> >>>>>>
> >>>>>> Eugenio Pérez (2):
> >>>>>>      virtio-net: Fix indentation
> >>>>>>      virtio-net: Only enable userland vq if using tap backend
> >>>>>>
> >>>>>> Si-Wei Liu (5):
> >>>>>>      virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa
> >>>>>>      virtio: don't read pending event on host notifier if disabled
> >>>>>>      vhost-vdpa: fix improper cleanup in net_init_vhost_vdpa
> >>>>>>      vhost-net: fix improper cleanup in vhost_net_start
> >>>>>>      vhost-vdpa: backend feature should set only once
> >>>>>>
> >>>>>>     hw/net/vhost_net.c         |  4 +++-
> >>>>>>     hw/net/virtio-net.c        | 25 +++++++++++++++++++++----
> >>>>>>     hw/virtio/vhost-vdpa.c     |  2 +-
> >>>>>>     hw/virtio/virtio-bus.c     |  3 ++-
> >>>>>>     hw/virtio/virtio.c         | 21 +++++++++++++--------
> >>>>>>     include/hw/virtio/virtio.h |  2 ++
> >>>>>>     net/vhost-vdpa.c           |  4 +++-
> >>>>>>     7 files changed, 45 insertions(+), 16 deletions(-)
> >>>>>>
>



