qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/4] target/arm: Add TB flag for "MVE insns not predicated"


From: Peter Maydell
Subject: Re: [PATCH 1/4] target/arm: Add TB flag for "MVE insns not predicated"
Date: Fri, 3 Sep 2021 15:17:45 +0100

On Fri, 3 Sept 2021 at 14:58, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/2/21 5:09 PM, Peter Maydell wrote:
> > Our current codegen for MVE always calls out to helper functions,
> > because some byte lanes might be predicated. The common case is
> > that in fact there is no predication active and all lanes should
> > be updated together, so we can produce better code by detecting
> > that and using the TCG generic vector infrastructure.
> >
> > Add a TB flag that is set when we can guarantee that there
> > is no active MVE predication, and a bool in the DisasContext.
> > Subsequent patches will use this flag to generate improved
> > code for some instructions.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> > ---
> > Another of those awkward-to-review patches where the interesting
> > question is "did I miss anywhere where we set v7m.vpr to non-0
> > or v7m.ltpsize to non-4 while in the middle of a TB ?" ...
> >
> > I've left the TB flag as non-cached because there seemed to
> > be an irritatingly large set of places where we'd have to do
> > an hflags recomputation.
>
> Indeed, non-cached seems like the right option.
>
> > +static bool mve_no_predication(CPUARMState *env)
> > +{
> > +    /*
> > +     * Return true if there is definitely no predication of MVE
> > +     * instructions for any reason. (Returning false even if there
> > +     * isn't any predication is OK; generated code will just be
> > +     * a little worse.)
> > +     * We do not account here for partial insn execution due to
> > +     * ECI bits as those are already in the TB flags elsewhere.
> > +     */
>
> I think you should go ahead and include that here, as it makes the test 
> self-contained,
> and avoids you having to do this:
>
> > +        dc->mve_no_pred = EX_TBFLAG_M32(tb_flags, MVE_NO_PRED) && dc->eci 
> > == 0;

Yeah. I think I was initially thinking that I wanted the MVE_NO_PRED
flag to be cached in hflags, in which case you have to do the combination
with ECI here because the ECI/condexec bits are not cached. But if we're
happy for MVE_NO_PRED to be not cached then we can happily combine with
ECI directly in mve_no_predication().

> > +    /* VLLDM or VLSTM helpers might have updated vpr or ltpsize */
> > +    s->mve_no_pred = false;
>
> For a moment I was thinking that this was ok, just finish the rest of the TB 
> and resync
> naturally, letting the end of the TB use the helpers.  But then I thought 
> about goto_tb.
> If you have this potentially variable condition, you can't chain the links 
> between this TB
> and the next.

I don't really understand what you mean here. What's the difference
between ending the TB now and translating a few more insns in
this TB before we end it?

thanks
-- PMM



reply via email to

[Prev in Thread] Current Thread [Next in Thread]