qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 20/21] accel/tcg: allow plugin instrumentation to be disa


From: Aaron Lindsay
Subject: Re: [PATCH v2 20/21] accel/tcg: allow plugin instrumentation to be disable via cflags
Date: Thu, 11 Feb 2021 19:53:56 -0500

On Feb 10 22:10, Alex Bennée wrote:
> When icount is enabled and we recompile an MMIO access we end up
> double counting the instruction execution. To avoid this we introduce
> the CF_NOINSTR cflag which disables instrumentation for the next TB.
> As this is part of the hashed compile flags we will only execute the
> generated TB while coming out of a cpu_io_recompile.

Unfortunately this patch works a little too well!

With this change, the memory access callbacks registered via
`qemu_plugin_register_vcpu_mem_cb()` are never called for the
re-translated instruction making the IO access, since we've disabled all
instrumentation.

Is it possible to selectively disable only instruction callbacks using
this mechanism, while still allowing others that would not yet have been
called for the re-translated instruction?

-Aaron

> While we are at it delete the old TODO. We might as well keep the
> translation handy as it's likely you will repeatedly hit it on each
> MMIO access.
> 
> Reported-by: Aaron Lindsay <aaron@os.amperecomputing.com>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Message-Id: <20210209182749.31323-12-alex.bennee@linaro.org>
> 
> ---
> v2
>   - squashed CH_HASHMASK to ~CF_INVALID
> ---
>  include/exec/exec-all.h   |  6 +++---
>  accel/tcg/translate-all.c | 17 ++++++++---------
>  accel/tcg/translator.c    |  2 +-
>  3 files changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index e08179de34..299282cc59 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -454,14 +454,14 @@ struct TranslationBlock {
>      uint32_t cflags;    /* compile flags */
>  #define CF_COUNT_MASK  0x00007fff
>  #define CF_LAST_IO     0x00008000 /* Last insn may be an IO access.  */
> +#define CF_NOINSTR     0x00010000 /* Disable instrumentation of TB */
>  #define CF_USE_ICOUNT  0x00020000
>  #define CF_INVALID     0x00040000 /* TB is stale. Set with @jmp_lock held */
>  #define CF_PARALLEL    0x00080000 /* Generate code for a parallel context */
>  #define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
>  #define CF_CLUSTER_SHIFT 24
> -/* cflags' mask for hashing/comparison */
> -#define CF_HASH_MASK   \
> -    (CF_COUNT_MASK | CF_LAST_IO | CF_USE_ICOUNT | CF_PARALLEL | 
> CF_CLUSTER_MASK)
> +/* cflags' mask for hashing/comparison, basically ignore CF_INVALID */
> +#define CF_HASH_MASK   (~CF_INVALID)
>  
>      /* Per-vCPU dynamic tracing state used to generate this TB */
>      uint32_t trace_vcpu_dstate;
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 0666f9ef14..32a3d8fe24 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -2399,7 +2399,8 @@ void tb_check_watchpoint(CPUState *cpu, uintptr_t 
> retaddr)
>  }
>  
>  #ifndef CONFIG_USER_ONLY
> -/* in deterministic execution mode, instructions doing device I/Os
> +/*
> + * In deterministic execution mode, instructions doing device I/Os
>   * must be at the end of the TB.
>   *
>   * Called by softmmu_template.h, with iothread mutex not held.
> @@ -2430,19 +2431,17 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t 
> retaddr)
>          n = 2;
>      }
>  
> -    /* Generate a new TB executing the I/O insn.  */
> -    cpu->cflags_next_tb = curr_cflags() | CF_LAST_IO | n;
> +    /*
> +     * Exit the loop and potentially generate a new TB executing the
> +     * just the I/O insns. We also disable instrumentation so we don't
> +     * double count the instruction.
> +     */
> +    cpu->cflags_next_tb = curr_cflags() | CF_NOINSTR | CF_LAST_IO | n;
>  
>      qemu_log_mask_and_addr(CPU_LOG_EXEC, tb->pc,
>                             "cpu_io_recompile: rewound execution of TB to "
>                             TARGET_FMT_lx "\n", tb->pc);
>  
> -    /* TODO: If env->pc != tb->pc (i.e. the faulting instruction was not
> -     * the first in the TB) then we end up generating a whole new TB and
> -     *  repeating the fault, which is horribly inefficient.
> -     *  Better would be to execute just this insn uncached, or generate a
> -     *  second new TB.
> -     */
>      cpu_loop_exit_noexc(cpu);
>  }
>  
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index a49a794065..14d1ea795d 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -58,7 +58,7 @@ void translator_loop(const TranslatorOps *ops, 
> DisasContextBase *db,
>      ops->tb_start(db, cpu);
>      tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
>  
> -    plugin_enabled = plugin_gen_tb_start(cpu, tb);
> +    plugin_enabled = !(tb_cflags(db->tb) & CF_NOINSTR) && 
> plugin_gen_tb_start(cpu, tb);
>  
>      while (true) {
>          db->num_insns++;
> -- 
> 2.20.1
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]