From: David Hildenbrand
Subject: Re: [PATCH v1 1/2] vhost: Defer filtering memory sections until building the vhost memory structure
Date: Wed, 8 Mar 2023 16:21:49 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0


So we tricked used_memslots into being smaller than it actually has to be, because we're ignoring the memslots filtered out by the vhost-user device.
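For context, roughly how the accounting works today (a simplified sketch against QEMU's internal vhost types, not the literal hw/virtio/vhost.c code): a section is only considered at all if the backend filter accepts it, so used_memslots ends up reflecting the filtered view instead of all RAM sections mapped into the guest.

/*
 * Simplified sketch, not the literal hw/virtio/vhost.c code: sections
 * rejected by the backend filter never reach dev->mem_sections, so they
 * are never counted as used memslots.
 */
static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section)
{
    MemoryRegion *mr = section->mr;

    if (!memory_region_is_ram(mr) || memory_region_is_rom(mr)) {
        return false;
    }
    if (dev->vhost_ops->vhost_backend_mem_section_filter &&
        !dev->vhost_ops->vhost_backend_mem_section_filter(dev, section)) {
        return false;   /* filtered out: never accounted as a used memslot */
    }
    return true;
}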


Now, this is all far from relevant in practice as of now, I think, and would usually indicate user errors already (memory that's not shared with vhost-user?).

Well, vhost-user device_add should fail if it can't get its hands on all RAM (if it doesn't, we have a bug somewhere else).

There are some corner cases already. For example, R/O NVDIMMs are completely ignored -- I assume because we wouldn't be able to use them for virtio queues either way, so it kind of makes sense, I guess.

I have not yet figured out *why* 988a27754bbb ("vhost: allow backends to filter memory sections") was included at all. Why should we have memory-backend-ram mapped into guest address space that vhost-user cannot handle?
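For reference, the filter that commit added for vhost-user boils down to roughly the following: a section is only accepted if its memory region is backed by an fd that can be passed over the vhost-user socket, which is what excludes plain memory-backend-ram.

/* Roughly what hw/virtio/vhost-user.c registers as the backend filter:
 * vhost-user has to hand the region to the backend process as an fd, so
 * fd-less regions (e.g. anonymous memory-backend-ram) are skipped. */
static bool vhost_user_mem_section_filter(struct vhost_dev *dev,
                                          MemoryRegionSection *section)
{
    return memory_region_get_fd(section->mr) >= 0;
}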

Take my NVDIMM example: if we'd use that NVDIMM memory in the guest as ordinary RAM, it could eventually be used for virtio queues ... and we don't even warn the user.

So I agree that hotplugging that device should fail. But it could also break some existing setups (we could work around it using compat machines I guess).

But we also have to fail hotplugging such a vhost-user device, ... and I am not sure where to even put such checks.




It might get more relevant when virtio-mem dynamically adds/removes memslots and relies on precise tracking of used vs. free memslots.


But maybe I should just ignore that case and live a happy life instead; it's certainly hard to even trigger right now :)
Further, it will be helpful in the memory device context in the near future to know that a RAM memory region section will consume a memslot and be accounted for in the used vs. free memslot tracking, such that we can implement reservation of memslots for memory devices properly. Whether a device filters this out and would theoretically still have a free memslot is then hidden internally, making the overall vhost memslot accounting easier.


Let's filter the memslots when creating the vhost memory array, accounting all RAM && !ROM memory regions as "used_memslots", even if vhost-user isn't interested in anonymous RAM regions (because it needs an fd).
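A rough sketch of that idea, with illustrative names rather than the actual diff: every RAM && !ROM section stays tracked and counts as a used memslot, and the backend filter is only applied while (re)building the vhost memory array that gets handed to the backend.

/*
 * Illustrative sketch only -- function and helper names are made up, not
 * taken from the actual patch.  All RAM && !ROM sections remain tracked
 * (and accounted as used_memslots); the backend filter is applied only
 * when building the vhost memory array.
 */
static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section)
{
    /* No backend filter here anymore: every RAM && !ROM section counts. */
    return memory_region_is_ram(section->mr) &&
           !memory_region_is_rom(section->mr);
}

static void vhost_rebuild_memory_array(struct vhost_dev *dev)
{
    int i;

    dev->mem->nregions = 0;
    for (i = 0; i < dev->n_mem_sections; i++) {
        MemoryRegionSection *section = &dev->mem_sections[i];

        if (dev->vhost_ops->vhost_backend_mem_section_filter &&
            !dev->vhost_ops->vhost_backend_mem_section_filter(dev, section)) {
            continue;   /* invisible to this backend, but still accounted */
        }
        vhost_dev_append_region(dev, section);   /* hypothetical helper */
    }
}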

That would regress existing setups: it was possible to start with N DIMMs, and after this patch the same VM could fail to start if N was close to vhost's limit in an otherwise perfectly working configuration. So this approach doesn't seem right.

As already discussed with MST, this was the state before that change, so I strongly doubt that we would break anything; using memory-backend-ram in such a setup would already be suspicious.

Again, I did not figure out *why* that change was even included and which setups even care about that.

Maybe Tiwei can comment.


Perhaps redoing vhost's used_memslots accounting would be
a better approach, right down to introducing reservations you'd
like to have eventually.

The question of what to do with memory-backend-ram for vhost-user still remains. It's independent of the "used_memslots" tracking, because used vs. reserved would depend on the vhost backend's filtering demands ... and I really wanted to avoid that with this commit. It just makes the tracking much harder.


Something like:
   1. s/vhost_has_free_slot/vhost_memory_region_limit/ (and maybe the same
      for kvm_has_free_slot),
   2. then rewrite memory_device_check_addable(), moving all used_memslots
      accounting into memory_device core.
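If I read that right, a rough sketch of the direction; vhost_memory_region_limit(), kvm_memory_region_limit() and the used_memslots bookkeeping in memory-device core are hypothetical, only memory_device_check_addable() exists today and its signature may differ:

/*
 * Hypothetical sketch of the suggested direction; none of the limit
 * helpers or the ms->device_memory->used_memslots bookkeeping exist in
 * this form today.
 */
static void memory_device_check_addable(MachineState *ms, uint64_t size,
                                        Error **errp)
{
    const unsigned int limit = MIN(kvm_memory_region_limit(),
                                   vhost_memory_region_limit());

    if (ms->device_memory->used_memslots >= limit) {
        error_setg(errp,
                   "would exceed the supported number of memslots (%u)",
                   limit);
        return;
    }
    /* ... existing address space / size checks ... */
}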

I already do something similar, however: querying the free memslots from kvm and vhost, not the limits (limits are not very expressive).
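i.e., something along these lines; the query helpers are placeholders for whatever the series ends up calling them:

/* Placeholder helper names -- the point is to reserve against the free
 * memslots each consumer reports, rather than against a static limit. */
static unsigned int memory_devices_free_memslots(void)
{
    return MIN(kvm_get_free_memslots(), vhost_get_free_memslots());
}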

Thanks!

--
Thanks,

David / dhildenb



