[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] vhost: Ignore vrings in dirty log when using a vIOMMU
From: |
Greg Kurz |
Subject: |
Re: [PATCH] vhost: Ignore vrings in dirty log when using a vIOMMU |
Date: |
Mon, 28 Sep 2020 09:37:18 +0200 |
On Mon, 28 Sep 2020 16:23:43 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Fri, Sep 25, 2020 at 07:29:43PM +0200, Greg Kurz wrote:
> > When a vIOMMU is present, any address comming from the guest is an IO
> > virtual address, including those of the vrings. The backend's accesses
> > to the vrings happen through vIOMMU translation : the backend hence
> > only logs the final guest physical address, not the IO virtual one.
> > It thus doesn't make sense to make room for the vring addresses in the
> > dirty log in this case.
> >
> > This fixes a crash of the source when migrating a guest using in-kernel
> > vhost-net and iommu_platform=on on POWER, because DMA regions are put
> > at very high addresses and the resulting log size is likely to cause
> > g_malloc0() to abort.
> >
> > BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1879349
> > Signed-off-by: Greg Kurz <groug@kaod.org>
>
> I'm a little confused as to what's going on here. Obviously
> allocating dirty bitmaps in IOVA space doesn't make much sense.
> But.. in all cases isn't the ring ending up in guest memory, whether
> translated or not. So why do specific addresses of the ring make a
> difference in *any* case.
>
I admit I'm a bit surprised as well... I can't think of a scenario
where the address of the used ring would be higher than the guest
memory... Maybe MST can shed some light here ?
> > ---
> > hw/virtio/vhost.c | 38 ++++++++++++++++++++++++--------------
> > 1 file changed, 24 insertions(+), 14 deletions(-)
> >
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 1a1384e7a642..0b83d6b8e65e 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -106,6 +106,20 @@ static void vhost_dev_sync_region(struct vhost_dev
> > *dev,
> > }
> > }
> >
> > +static int vhost_dev_has_iommu(struct vhost_dev *dev)
> > +{
> > + VirtIODevice *vdev = dev->vdev;
> > +
> > + /*
> > + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
> > + * incremental memory mapping API via IOTLB API. For platform that
> > + * does not have IOMMU, there's no need to enable this feature
> > + * which may cause unnecessary IOTLB miss/update trnasactions.
> > + */
> > + return vdev->dma_as != &address_space_memory &&
> > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > +}
> > +
> > static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
> > MemoryRegionSection *section,
> > hwaddr first,
> > @@ -130,6 +144,11 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev
> > *dev,
> > range_get_last(reg->guest_phys_addr,
> > reg->memory_size));
> > }
> > +
> > + if (vhost_dev_has_iommu(dev)) {
> > + return 0;
> > + }
> > +
> > for (i = 0; i < dev->nvqs; ++i) {
> > struct vhost_virtqueue *vq = dev->vqs + i;
> >
> > @@ -172,6 +191,11 @@ static uint64_t vhost_get_log_size(struct vhost_dev
> > *dev)
> > reg->memory_size);
> > log_size = MAX(log_size, last / VHOST_LOG_CHUNK + 1);
> > }
> > +
> > + if (vhost_dev_has_iommu(dev)) {
> > + return log_size;
> > + }
> > +
> > for (i = 0; i < dev->nvqs; ++i) {
> > struct vhost_virtqueue *vq = dev->vqs + i;
> >
> > @@ -287,20 +311,6 @@ static inline void vhost_dev_log_resize(struct
> > vhost_dev *dev, uint64_t size)
> > dev->log_size = size;
> > }
> >
> > -static int vhost_dev_has_iommu(struct vhost_dev *dev)
> > -{
> > - VirtIODevice *vdev = dev->vdev;
> > -
> > - /*
> > - * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support
> > - * incremental memory mapping API via IOTLB API. For platform that
> > - * does not have IOMMU, there's no need to enable this feature
> > - * which may cause unnecessary IOTLB miss/update trnasactions.
> > - */
> > - return vdev->dma_as != &address_space_memory &&
> > - virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
> > -}
> > -
> > static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
> > hwaddr *plen, bool is_write)
> > {
> >
> >
>
pgpYPeWaQQSaW.pgp
Description: OpenPGP digital signature