qemu-arm
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/4] target/arm: Add TB flag for "MVE insns not predicated"


From: Peter Maydell
Subject: Re: [PATCH 1/4] target/arm: Add TB flag for "MVE insns not predicated"
Date: Thu, 9 Sep 2021 15:53:37 +0100

On Fri, 3 Sept 2021 at 14:58, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/2/21 5:09 PM, Peter Maydell wrote:
> > Our current codegen for MVE always calls out to helper functions,
> > because some byte lanes might be predicated. The common case is
> > that in fact there is no predication active and all lanes should
> > be updated together, so we can produce better code by detecting
> > that and using the TCG generic vector infrastructure.
> >
> > Add a TB flag that is set when we can guarantee that there
> > is no active MVE predication, and a bool in the DisasContext.
> > Subsequent patches will use this flag to generate improved
> > code for some instructions.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> > ---
> > Another of those awkward-to-review patches where the interesting
> > question is "did I miss anywhere where we set v7m.vpr to non-0
> > or v7m.ltpsize to non-4 while in the middle of a TB ?" ...
> >
> > I've left the TB flag as non-cached because there seemed to
> > be an irritatingly large set of places where we'd have to do
> > an hflags recomputation.
>
> Indeed, non-cached seems like the right option.
>
> > +static bool mve_no_predication(CPUARMState *env)
> > +{
> > +    /*
> > +     * Return true if there is definitely no predication of MVE
> > +     * instructions for any reason. (Returning false even if there
> > +     * isn't any predication is OK; generated code will just be
> > +     * a little worse.)
> > +     * We do not account here for partial insn execution due to
> > +     * ECI bits as those are already in the TB flags elsewhere.
> > +     */
>
> I think you should go ahead and include that here, as it makes the test 
> self-contained,
> and avoids you having to do this:
>
> > +        dc->mve_no_pred = EX_TBFLAG_M32(tb_flags, MVE_NO_PRED) && dc->eci 
> > == 0;

I found a reason why these should stay split, and it's related to the
discussion in this thread about TB flags whose state changes and whether
that means you need to end the TB. The TB flag state in the CONDEXEC bits
(ie including the ECI bits) changes for a lot of instructions, but we don't
need to end the TB because it always changes in the exact same way
regardless. A TB flag encoding just "MVE predication might happen
due to LTPSIZE and VPR" changes in only a few places, and we can end
the TB in those places. A single TB flag that mixes together all of
LTPSIZE, VPR and ECI can change state in a hard-to-disentangle way,
because a non-zero ECI can change to zero for just about any MVE insn.

If we keep MVE_NO_PRED and ECI distinct, then we don't need to consider
ECI updates when figuring out if we need to end the TB, which works out neater.

thanks
-- PMM



reply via email to

[Prev in Thread] Current Thread [Next in Thread]