qemu-riscv
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 3/3] target/riscv/translate.c: set vstart_eq_zero in mark_


From: Daniel Henrique Barboza
Subject: Re: [PATCH v2 3/3] target/riscv/translate.c: set vstart_eq_zero in mark_vs_dirty()
Date: Fri, 16 Feb 2024 23:36:46 -0300
User-agent: Mozilla Thunderbird



On 2/16/24 20:41, Richard Henderson wrote:
On 2/16/24 12:40, Daniel Henrique Barboza wrote:
After reading the reviews of patches 1 and 3 what I'm considering here is:

1 - drop patch 1;

Ok.

2 - there's a patch from Ivan Klokov sent 2 months ago:

"[PATCH 1/1] target/riscv: Clear vstart_qe_zero flag"
https://lore.kernel.org/qemu-riscv/20231214111851.142532-1-ivan.klokov@syntacore.com/

His patch is closer to what you suggested than mine. He already renamed 
mark_vs_dirty()
to finalize_rvv_inst() and made it set start_eq_zero unconditionally. It needs a
little work (i.e. remove the ifds from the function) that I'll do myself.

3 - I'll keep patch 2 to reduce the redundant calls to the now 
finalize_rvv_inst();

Ok.

4 - Add another patch to through all "gen_set_label(over)" cond branches and set
vstart = 0 && vstart_eq_zero manually when we're doing the jump.

In fact, shouldn't we return earlier if we're not taking the branch? Otherwise
we'll set vstart twice in case we didn't get the branch. E.g:

       TCGLabel *over = gen_new_label();
       tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
       (...)
       finalize_rvv_insn();
       return true;

       gen_set_label(over);
       /* some TCG ops to set cpu_vstart to zero. Perhaps a helper?  */
       s->vstart_eq_zero = true;
       return true;

That will break, of course, because you wouldn't emit 'over'.
You really need to get translation-time and run-time separate in your head.

That said, I think these brcond(vstart >= vl) are a mistake.
The loops within the helpers are generally of the form

     for (i = env->vstart; i < evl; i++, env->vstart++) {

which will operate just fine with vstart >= vl, iterating zero times.
We will then fall through to the post-insn cleanup,

     env->vstart = 0;
     vext_set_tail_elems_1s(evl, vd, desc, nf, esz, max_elems);

or whatever.

I would expect the condition vstart >= vl to never happen in practice.  I believe the 
only way to induce it is an explicit write to vstart.  Therefore I think you should not 
attempt to "optimize away" the call to the helper.

Of course you will want to double-check all of the loop iterations in the 
associated helpers when removing the branches.

Ok!



r~


PS: I believe a better form for the ldst loops is

     for (i = env->vstart; i < evl; env->vstart = ++i)

to avoid re-reading from vstart each iteration.


I'll add a patch to convert these loops. Thanks,


Daniel




reply via email to

[Prev in Thread] Current Thread [Next in Thread]