From: David Hildenbrand
Subject: Re: [PATCH RFC 0/7] hostmem: NUMA-aware memory preallocation using ThreadContext
Date: Fri, 5 Aug 2022 17:47:22 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0

> 
> I've timed 'virsh start' with a guest that has 47GB worth of 1GB
> hugepages and seen the startup time basically halved (from 10.5s to
> 5.6s). The host has 4 NUMA nodes and I'm pinning the guest onto two nodes.
> 
> I've written the libvirt counterpart (which I'll post as soon as these are
> merged). The way it works is that whenever .prealloc-threads= is to be
> used AND qemu is capable of thread-context, a thread-context object is
> generated before every memory-backend-*, like this:

One interesting corner case might be CPU-less NUMA nodes. Setting
the node-affinity would fail because there are no CPUs. Libvirt could
figure that out by testing whether the selected node(s) have CPUs.
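
For illustration, one way to probe that from the host is sysfs: a CPU-less
(memory-only) node has an empty cpulist. A minimal sketch of the check, not
necessarily what the libvirt patches will end up doing:

  # Does host node 2 have any CPUs? If not, "node-affinity":[2] cannot work.
  if [ -n "$(cat /sys/devices/system/node/node2/cpulist)" ]; then
      echo "node 2 has CPUs, node-affinity is usable"
  else
      echo "node 2 is CPU-less, skip node-affinity"
  fi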

> 
> -object '{"qom-type":"thread-context","id":"tc-ram-node0","node-affinity":[2]}' \
> -object '{"qom-type":"memory-backend-memfd","id":"ram-node0","hugetlb":true,"hugetlbsize":1073741824,"share":true,"prealloc":true,"prealloc-threads":16,"size":21474836480,"host-nodes":[2],"policy":"bind","prealloc-context":"tc-ram-node0"}' \
> -numa node,nodeid=0,cpus=0,cpus=2,memdev=ram-node0 \
> -object '{"qom-type":"thread-context","id":"tc-ram-node1","node-affinity":[3]}' \
> -object '{"qom-type":"memory-backend-memfd","id":"ram-node1","hugetlb":true,"hugetlbsize":1073741824,"share":true,"prealloc":true,"prealloc-threads":16,"size":28991029248,"host-nodes":[3],"policy":"bind","prealloc-context":"tc-ram-node1"}' \
> 
> 
> Now, it's not visible in this snippet, but my code does not reuse
> thread-context objects. So if there's another memfd, it'll get its own TC:
> 
> -object '{"qom-type":"thread-context","id":"tc-memdimm0","node-affinity":[1]}' \
> -object '{"qom-type":"memory-backend-memfd","id":"memdimm0","hugetlb":true,"hugetlbsize":1073741824,"share":true,"prealloc":true,"prealloc-threads":16,"size":1073741824,"host-nodes":[1],"policy":"bind","prealloc-context":"tc-memdimm0"}' \
> 
> The reason is that the logic generating memory-backends is very complex
> and separating out parts of it so that thread-context objects can be
> generated first and reused by those backends would inevitably lead to
Sounds like something we can work on later.

> regressions. I guess my question is whether it's a problem that libvirt
> would leave one additional thread, sleeping on a semaphore, for each
> memory-backend (iff prealloc-threads are used).

I guess in most setups we just don't care. Of course, with 256 DIMMs or an
endless number of nodes, we *might* care.


One optimization for some ordinary setups (not caring about NUMA-aware
preallocation during DIMM hotplug) would be to assign some dummy thread
context once prealloc has finished (e.g., once QEMU has initialized after
prealloc) and delete the original thread context along with its thread.
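
Roughly, on the QMP level that could look like the sketch below. It assumes
the backend's prealloc-context link can still be rewritten via qom-set once
preallocation is done, which is an open question; "VM" and the object ids
are just placeholders matching the example above:

  # add a dummy thread-context without any node-affinity
  virsh qemu-monitor-command VM '{"execute":"object-add","arguments":{"qom-type":"thread-context","id":"tc-dummy"}}'
  # point the backend at the dummy context (assumes the link is still writable)
  virsh qemu-monitor-command VM '{"execute":"qom-set","arguments":{"path":"/objects/ram-node0","property":"prealloc-context","value":"tc-dummy"}}'
  # ... and delete the original context together with its thread
  virsh qemu-monitor-command VM '{"execute":"object-del","arguments":{"id":"tc-ram-node0"}}'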

> 
> Although, if I read the code correctly, a thread-context object can be
> specified AFTER the memory backends, because they are parsed and created
> before the backends anyway. Well, something to think about over the weekend.

Yes, the command line order does not matter.
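
For instance, flipping the order from your snippet above would work just the
same:

  -object '{"qom-type":"memory-backend-memfd","id":"ram-node0","hugetlb":true,"hugetlbsize":1073741824,"share":true,"prealloc":true,"prealloc-threads":16,"size":21474836480,"host-nodes":[2],"policy":"bind","prealloc-context":"tc-ram-node0"}' \
  -object '{"qom-type":"thread-context","id":"tc-ram-node0","node-affinity":[2]}' \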

[...]

> 
> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

Thanks!

-- 
Thanks,

David / dhildenb



