From: Dr. David Alan Gilbert
Subject: Re: tools/virtiofs: Multi threading seems to hurt performance
Date: Tue, 22 Sep 2020 12:09:46 +0100
User-agent: Mutt/1.14.6 (2020-07-11)
* Vivek Goyal (vgoyal@redhat.com) wrote:
> On Fri, Sep 18, 2020 at 05:34:36PM -0400, Vivek Goyal wrote:
> > Hi All,
> >
> > virtiofsd's default thread pool size is 64. To me it feels that in most
> > cases a thread pool size of 1 performs better than a size of 64.
> >
> > I ran virtiofs-tests.
> >
> > https://github.com/rhvgoyal/virtiofs-tests
>
> I spent more time debugging this. The first thing I noticed is that we
> are using an "exclusive" glib thread pool.
>
> https://developer.gnome.org/glib/stable/glib-Thread-Pools.html#g-thread-pool-new
>
> An exclusive pool runs a pre-determined number of threads dedicated to
> that pool. A little instrumentation of the code revealed that every new
> request gets assigned to a new thread (despite the fact that a previous
> thread had already finished its job), so internally there might be some
> kind of round-robin policy for choosing the next thread to run a job.
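
(For anyone following along: the "exclusive" behaviour is selected by a
flag to g_thread_pool_new(). A minimal sketch against the public glib
API; the worker function here is hypothetical, this is not the actual
virtiofsd code:)

  #include <glib.h>

  /* Hypothetical worker: processes one queued request. */
  static void worker (gpointer data, gpointer user_data)
  {
      /* ... handle the request pointed to by 'data' ... */
  }

  static GThreadPool *make_exclusive_pool (GError **err)
  {
      /* exclusive=TRUE: all max_threads threads are started immediately
       * and run exclusively for this pool until it is freed. */
      return g_thread_pool_new (worker, NULL /* user_data */,
                                64   /* max_threads */,
                                TRUE /* exclusive */,
                                err);
  }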
>
> I decided to switch to a "shared" pool instead, which seemed to spin up
> new threads only when there is enough work, and whose threads can be
> shared between pools.
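
(The switch is essentially flipping that one flag; again a sketch, using
the same hypothetical worker as above:)

  static GThreadPool *make_shared_pool (void)
  {
      /* exclusive=FALSE: threads are created only when work is queued
       * and are shared between all non-exclusive thread pools. */
      GThreadPool *pool = g_thread_pool_new (worker, NULL,
                                             64    /* max_threads */,
                                             FALSE /* exclusive */,
                                             NULL);
      /* Queue one (dummy) item of work. */
      g_thread_pool_push (pool, GINT_TO_POINTER (1), NULL);
      return pool;
  }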
>
> And it looks like the test results are way better with "shared" pools, so
> maybe we should switch to shared pools by default (until somebody shows
> a case where exclusive pools are better).
>
> The second thought that came to mind was: what's the impact of NUMA? If
> the qemu and virtiofsd processes/threads are running on separate NUMA
> nodes, that should increase memory access latency and hence overhead.
> So I used "numactl --cpubind=0" to bind both qemu and virtiofsd to node
> 0. My machine has two NUMA nodes, each with 32 logical processors.
> Keeping both qemu and virtiofsd on the same node improves throughput
> further.
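
(Concretely, the binding amounts to launching both processes under
numactl. A sketch; the virtiofsd socket and share paths are placeholders,
only "--cpubind=0" is taken from the test above:)

  # Pin both processes to the CPUs of NUMA node 0
  numactl --cpubind=0 ./virtiofsd --socket-path=/tmp/vhostqemu \
      -o source=/path/to/share -o cache=none
  numactl --cpubind=0 qemu-system-x86_64 ...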
>
> So here are the results.
>
> vtfs-none-epool --> cache=none, exclusive thread pool.
> vtfs-none-spool --> cache=none, shared thread pool.
> vtfs-none-spool-numa --> cache=none, shared thread pool, same numa node
Do you have the numbers for:
  epool
  epool thread-pool-size=1
  spool
?
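
(For reference, with the C virtiofsd the size-1 run would be launched
with something like the following; the socket and share paths are
placeholders:)

  ./virtiofsd --socket-path=/tmp/vhostqemu -o source=/path/to/share \
      -o cache=none --thread-pool-size=1
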
Dave
>
> NAME                    WORKLOAD                  Bandwidth    IOPS
>
> vtfs-none-epool         seqread-psync             36(MiB/s)    9392
> vtfs-none-spool         seqread-psync             68(MiB/s)    17k
> vtfs-none-spool-numa    seqread-psync             73(MiB/s)    18k
>
> vtfs-none-epool         seqread-psync-multi       210(MiB/s)   52k
> vtfs-none-spool         seqread-psync-multi       260(MiB/s)   65k
> vtfs-none-spool-numa    seqread-psync-multi       309(MiB/s)   77k
>
> vtfs-none-epool         seqread-libaio            286(MiB/s)   71k
> vtfs-none-spool         seqread-libaio            328(MiB/s)   82k
> vtfs-none-spool-numa    seqread-libaio            332(MiB/s)   83k
>
> vtfs-none-epool         seqread-libaio-multi      201(MiB/s)   50k
> vtfs-none-spool         seqread-libaio-multi      254(MiB/s)   63k
> vtfs-none-spool-numa    seqread-libaio-multi      276(MiB/s)   69k
>
> vtfs-none-epool         randread-psync            40(MiB/s)    10k
> vtfs-none-spool         randread-psync            64(MiB/s)    16k
> vtfs-none-spool-numa    randread-psync            72(MiB/s)    18k
>
> vtfs-none-epool         randread-psync-multi      211(MiB/s)   52k
> vtfs-none-spool         randread-psync-multi      252(MiB/s)   63k
> vtfs-none-spool-numa    randread-psync-multi      297(MiB/s)   74k
>
> vtfs-none-epool         randread-libaio           313(MiB/s)   78k
> vtfs-none-spool         randread-libaio           320(MiB/s)   80k
> vtfs-none-spool-numa    randread-libaio           330(MiB/s)   82k
>
> vtfs-none-epool         randread-libaio-multi     257(MiB/s)   64k
> vtfs-none-spool         randread-libaio-multi     274(MiB/s)   68k
> vtfs-none-spool-numa    randread-libaio-multi     319(MiB/s)   79k
>
> vtfs-none-epool         seqwrite-psync            34(MiB/s)    8926
> vtfs-none-spool         seqwrite-psync            55(MiB/s)    13k
> vtfs-none-spool-numa    seqwrite-psync            66(MiB/s)    16k
>
> vtfs-none-epool         seqwrite-psync-multi      196(MiB/s)   49k
> vtfs-none-spool         seqwrite-psync-multi      225(MiB/s)   56k
> vtfs-none-spool-numa    seqwrite-psync-multi      270(MiB/s)   67k
>
> vtfs-none-epool         seqwrite-libaio           257(MiB/s)   64k
> vtfs-none-spool         seqwrite-libaio           304(MiB/s)   76k
> vtfs-none-spool-numa    seqwrite-libaio           267(MiB/s)   66k
>
> vtfs-none-epool         seqwrite-libaio-multi     312(MiB/s)   78k
> vtfs-none-spool         seqwrite-libaio-multi     366(MiB/s)   91k
> vtfs-none-spool-numa    seqwrite-libaio-multi     381(MiB/s)   95k
>
> vtfs-none-epool         randwrite-psync           38(MiB/s)    9745
> vtfs-none-spool         randwrite-psync           55(MiB/s)    13k
> vtfs-none-spool-numa    randwrite-psync           67(MiB/s)    16k
>
> vtfs-none-epool         randwrite-psync-multi     186(MiB/s)   46k
> vtfs-none-spool         randwrite-psync-multi     240(MiB/s)   60k
> vtfs-none-spool-numa    randwrite-psync-multi     271(MiB/s)   67k
>
> vtfs-none-epool         randwrite-libaio          224(MiB/s)   56k
> vtfs-none-spool         randwrite-libaio          296(MiB/s)   74k
> vtfs-none-spool-numa    randwrite-libaio          290(MiB/s)   72k
>
> vtfs-none-epool         randwrite-libaio-multi    300(MiB/s)   75k
> vtfs-none-spool         randwrite-libaio-multi    350(MiB/s)   87k
> vtfs-none-spool-numa    randwrite-libaio-multi    383(MiB/s)   95k
>
> Thanks
> Vivek
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK