qemu-ppc
Re: [PATCH 0/3] Reorg ppc64 pmu insn counting


From: Daniel Henrique Barboza
Subject: Re: [PATCH 0/3] Reorg ppc64 pmu insn counting
Date: Thu, 23 Dec 2021 17:36:30 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.4.0



On 12/23/21 00:01, Richard Henderson wrote:
In contrast to Daniel's version, the code stays in power8-pmu.c,
but is better organized to not take so much overhead.

Before:

     32.97%  qemu-system-ppc  qemu-system-ppc64   [.] pmc_get_event
     20.22%  qemu-system-ppc  qemu-system-ppc64   [.] helper_insns_inc
      4.52%  qemu-system-ppc  qemu-system-ppc64   [.] hreg_compute_hflags_value
      3.30%  qemu-system-ppc  qemu-system-ppc64   [.] helper_lookup_tb_ptr
      2.68%  qemu-system-ppc  qemu-system-ppc64   [.] tcg_gen_code
      2.28%  qemu-system-ppc  qemu-system-ppc64   [.] cpu_exec
      1.84%  qemu-system-ppc  qemu-system-ppc64   [.] pmu_insn_cnt_enabled

After:

      8.42%  qemu-system-ppc  qemu-system-ppc64   [.] hreg_compute_hflags_value
      6.65%  qemu-system-ppc  qemu-system-ppc64   [.] cpu_exec
      6.63%  qemu-system-ppc  qemu-system-ppc64   [.] helper_insns_inc
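
For readers following along, the shift in the profiles above (pmc_get_event dropping out of the list) is what one would expect from caching the per-PMC event decoding instead of redoing it on every increment. The following is only a minimal sketch of that general idea, not the code from this series: PmuState, EventKind, insn_mask and all function names are hypothetical stand-ins, and real POWER8 event selection (MMCR1 decoding, freeze bits, overflow handling) is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PMCS 6

/* Hypothetical event kinds; real POWER8 event codes are not modeled. */
typedef enum { EV_INACTIVE, EV_CYCLES, EV_INSNS } EventKind;

typedef struct {
    EventKind event[NUM_PMCS]; /* stand-in for the per-PMC event selection */
    uint64_t pmc[NUM_PMCS];    /* counter values */
    uint8_t insn_mask;         /* cached: bit i set if PMC i counts insns */
} PmuState;

/* Slow path: re-decode every PMC's event on every increment. */
void insns_inc_slow(PmuState *s, uint32_t n)
{
    for (int i = 0; i < NUM_PMCS; i++) {
        if (s->event[i] == EV_INSNS) {
            s->pmc[i] += n;
        }
    }
}

/* Recompute the cache; meant to run only when the event selection changes. */
void pmu_update_cache(PmuState *s)
{
    uint8_t mask = 0;

    for (int i = 0; i < NUM_PMCS; i++) {
        if (s->event[i] == EV_INSNS) {
            mask |= (uint8_t)(1u << i);
        }
    }
    assert((mask >> NUM_PMCS) == 0); /* only the low NUM_PMCS bits are used */
    s->insn_mask = mask;
}

/* Fast path: walk the cached mask, no per-PMC event decoding. */
void insns_inc_cached(PmuState *s, uint32_t n)
{
    uint8_t m = s->insn_mask;

    for (int i = 0; m != 0; i++, m >>= 1) {
        if (m & 1) {
            s->pmc[i] += n;
        }
    }
}
```

In this sketch the hot path touches only the cached mask, so the decoding cost is paid once per configuration change rather than once per executed translation block. The cache is only correct if pmu_update_cache() is called on every change to the event selection.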


Thanks for looking this up. I had no idea the original C code was that slow.

This reorg is breaking PMU-EBB tests, unfortunately. These tests are run from the kernel
tree [1] and I test them inside a pSeries TCG guest. You'll need to apply patches 9 and
10 of [2] beforehand (they apply cleanly on current master) because they aren't upstream
yet and EBB needs them.

The tests that are breaking consistently with this reorg are:

back_to_back_ebbs_test.c
cpu_event_pinned_vs_ebb_test.c
cycles_test.c
task_event_pinned_vs_ebb_test.c


The issue here is that these tests exercise different Perf events and aspects of branching
(e.g. how quickly we detect a counter overflow, how many times, etc.) and I haven't been
able to find a fix using your C reorg yet.

With that in mind I decided to post a new version of my TCG rework, more concise and with
less repetition, to have an alternative that can be used upstream to fix the Avocado tests.
Meanwhile I'll see if I can get your reorg working with all the EBB tests we need. All things
being equal (similar performance, all EBB tests passing), I'd rather stay with your C code
than my TCG rework, since yours doesn't require knowledge of TCG ops to maintain.


Thanks,


Daniel


[1] https://github.com/torvalds/linux/tree/master/tools/testing/selftests/powerpc/pmu/ebb
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-12/msg00073.html


r~


Richard Henderson (3):
   target/ppc: Cache per-pmc insn and cycle count settings
   target/ppc: Rewrite pmu_increment_insns
   target/ppc: Use env->pnc_cyc_cnt

  target/ppc/cpu.h         |   3 +
  target/ppc/power8-pmu.h  |  14 +--
  target/ppc/cpu_init.c    |   1 +
  target/ppc/helper_regs.c |   2 +-
  target/ppc/machine.c     |   2 +
  target/ppc/power8-pmu.c  | 230 ++++++++++++++++-----------------------
  6 files changed, 108 insertions(+), 144 deletions(-)



