
Re: [PATCH] util: NUMA aware memory preallocation


From: Daniel P. Berrangé
Subject: Re: [PATCH] util: NUMA aware memory preallocation
Date: Wed, 11 May 2022 11:10:03 +0100
User-agent: Mutt/2.1.5 (2021-12-30)

On Wed, May 11, 2022 at 12:03:24PM +0200, David Hildenbrand wrote:
> On 11.05.22 11:34, Daniel P. Berrangé wrote:
> > On Wed, May 11, 2022 at 11:31:23AM +0200, David Hildenbrand wrote:
> >>>> Long story short, management application has no way of learning
> >>>> TIDs of allocator threads so it can't make them run NUMA aware.
> >>>
> >>> This feels like the key issue. The preallocation threads are
> >>> invisible to libvirt, regardless of whether we're doing coldplug
> >>> or hotplug of memory-backends. Indeed the threads are invisible
> >>> to all of QEMU, except the memory backend code.
> >>>
> >>> Conceptually we need 1 or more explicit worker threads, that we
> >>> can assign CPU affinity to, and then QEMU can place jobs on them.
> >>> I/O threads serve this role, but limited to blockdev work. We
> >>> need a generalization of I/O threads, for arbitrary jobs that
> >>> QEMU might want to farm out to specific numa nodes.
> >>
> >> At least the "-object iothread" thingy can already be used for actions
> >> outside of blockdev. virtio-balloon uses one for free page hinting.
> > 
> > Ah that's good to know, so my idea probably isn't so much work as
> > I thought it might be.
> 
> I guess we'd have to create a bunch of iothreads on the command line and
> then feed them as an array to the memory backend we want to create. We
> could then forward the threads to a new variant of os_mem_prealloc().
> 
> We could
> 
> a) Allocate new iothreads for each memory backend we create. Hm, that
> might be suboptimal, we could end up with many iothreads.
> 
> b) Reuse iothreads and have separate sets of iothreads per NUMA node.
> Assign them to a node once.
> 
> c) Reuse iothreads and reassign them to NUMA nodes on demand.

If all we need is NUMA affinity, not CPU affinity, then it would
be sufficient to create 1 I/O thread per host NUMA node that the
VM needs to use. The job running in the I/O thread can spawn
further threads, which inherit the NUMA affinity.  This might be
more clever than is needed though.

I expect creating/deleting I/O threads is cheap in comparison to
the work done for preallocation. If libvirt is using -preconfig
and object-add to create the memory backend, then we could have
the option of creating the I/O threads dynamically in -preconfig
mode, creating the memory backend, and then deleting the I/O
threads again.
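That flow might look like the QMP sketch below. The iothread object type and
the object-add/object-del commands are real; the "prealloc-iothreads" property
tying the threads to the backend is purely hypothetical here, since no such
link exists yet:

```
# In -preconfig mode, before machine creation:
{"execute": "object-add",
 "arguments": {"qom-type": "iothread", "id": "prealloc0"}}

# Create the backend, referencing the thread(s)
# ("prealloc-iothreads" is a hypothetical property):
{"execute": "object-add",
 "arguments": {"qom-type": "memory-backend-file", "id": "mem0",
               "size": 17179869184, "mem-path": "/dev/hugepages",
               "prealloc": true, "host-nodes": [0], "policy": "bind",
               "prealloc-iothreads": ["prealloc0"]}}

# Preallocation done; throw the thread away again:
{"execute": "object-del", "arguments": {"id": "prealloc0"}}
```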

> However, I'm not sure what the semantics are when having multiple
> backends referencing the iothreads ...

Yep, we don't especially need an "ownership" relationship for what
we want to do with preallocation, especially because it is a
one-off point-in-time usage, not continuous usage as with block
devices.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
