qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation


From: Taylor Simpson
Subject: RE: [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation
Date: Sun, 30 Aug 2020 19:37:52 +0000


> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, August 28, 2020 8:50 PM
> To: Taylor Simpson <tsimpson@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@redhat.com; laurent@vivier.eu; riku.voipio@iki.fi;
> aleksandar.m.mail@gmail.com; ale@rev.ng
> Subject: Re: [RFC PATCH v3 31/34] Hexagon (target/hexagon) translation
>
> > +static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
> > +{
> > +#if HEX_DEBUG
> > +    int i;
> > +    for (i = 0; i < ctx->reg_log_idx; i++) {
> > +        if (ctx->reg_log[i] == rnum) {
> > +            HEX_DEBUG_LOG("WARNING: Multiple writes to r%d\n", rnum);
> > +        }
> > +    }
> > +#endif
> > +    ctx->reg_log[ctx->reg_log_idx] = rnum;
> > +    ctx->reg_log_idx++;
> > +}
>
> Why not just keep a bitmask of the rnum written?
> Does the order of this log really matter?

OK

> > +static inline void gen_slot_cancelled_check(TCGv check, int slot_num)
> > +{
> > +    TCGv mask = tcg_const_tl(1 << slot_num);
> > +    TCGv one = tcg_const_tl(1);
> > +    TCGv zero = tcg_const_tl(0);
> > +
> > +    tcg_gen_and_tl(mask, hex_slot_cancelled, mask);
> > +    tcg_gen_movcond_tl(TCG_COND_NE, check, mask, zero, one, zero);
>
> This is a bit silly.  Better as
>
>     tcg_gen_extract_i32(check, hex_slot_cancelled, slot_num, 1);

OK

> > +static int read_packet_words(CPUHexagonState *env, DisasContext *ctx,
> > +                             uint32_t words[])
> > +{
> > +    bool found_end = false;
> > +    int max_words;
> > +    int nwords;
> > +    int i;
> > +
> > +    /* Make sure we don't cross a page boundary */
> > +    max_words = -(ctx->base.pc_next | TARGET_PAGE_MASK) /
> sizeof(uint32_t);
> > +    if (max_words < PACKET_WORDS_MAX) {
> > +        /* Might cross a page boundary */
> > +        if (ctx->base.num_insns == 1) {
> > +            /* OK if it's the first packet in the TB */
> > +            max_words = PACKET_WORDS_MAX;
> > +        }
> > +    } else {
> > +        max_words = PACKET_WORDS_MAX;
> > +    }
> > +
> > +    memset(words, 0, PACKET_WORDS_MAX * sizeof(uint32_t));
> > +    for (nwords = 0; !found_end && nwords < max_words; nwords++) {
> > +        words[nwords] = cpu_ldl_code(env,
> > +                                ctx->base.pc_next + nwords * 
> > sizeof(uint32_t));
> > +        found_end = is_packet_end(words[nwords]);
> > +    }
> > +    if (!found_end) {
> > +        if (nwords == PACKET_WORDS_MAX) {
> > +            /* Read too many words without finding the end */
> > +            gen_exception(HEX_EXCP_INVALID_PACKET);
> > +            ctx->base.is_jmp = DISAS_NORETURN;
> > +            return 0;
> > +        }
> > +        /* Crosses page boundary - defer to next TB */
> > +        ctx->base.is_jmp = DISAS_TOO_MANY;
>
> The problem with this is that the translator has asked for the next insn, and
> you havn't provided it.
>
> One way to fix this might be to decrement ctx->base.num_insns, to
> compensate
> for the increment that *will* happen after you return.
>
> Another way, which involves less poking about into internals is to look for 
> the
> next packet once you've completed the current packet.  This is what Arm
> does
> for thumb2 insns:
>
> >     if (dc->base.is_jmp == DISAS_NEXT
> >         && (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE
> >             || (dc->base.pc_next - dc->page_start >= TARGET_PAGE_SIZE - 3
> >                 && insn_crosses_page(env, dc)))) {
> >         dc->base.is_jmp = DISAS_TOO_MANY;
> >     }
>
> I.e. you only have to do this for the few packets that are near enough to the
> end of the page that PACKET_WORDS_MAX crosses.

I'm actually checking two conditions here.
1) packet crossing a page boundary
2) reading too many words without finding the end of the packet.
I guess it would be better to separate them.

What is the correct behavior for the second case?  Should we return an error 
code from here and have the higher level code generate the invalid packet 
exception?

> > +    tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
> > +    tcg_gen_movi_tl(hex_slot_cancelled, 0);
> > +    if (pkt->pkt_has_cof) {
> > +        tcg_gen_movi_tl(hex_branch_taken, 0);
> > +        tcg_gen_movi_tl(hex_next_PC, next_PC);
> > +    }
> > +    tcg_gen_movi_tl(hex_pred_written, 0);
> > +}
>
> Surely you don't need to actually set PC for every PC?

What do other targets do?

> Nor set hex_slot_cancelled if the packet contains nothing that can cancel
> anything.  Nor set hex_pred_written if no predicates are written.

Checking for instructions that can cancel is pretty straightforward because we 
have the CONDEXEC attribute.  Checking if any predicates are written will be 
more complex.  I'll scratch my head and figure out the cleanest way to do this.

>
> > +    TCGv cancelled = tcg_temp_local_new();
> > +    TCGLabel *label_end = gen_new_label();
> > +
> > +    /* Don't do anything if the slot was cancelled */
> > +    gen_slot_cancelled_check(cancelled, slot_num);
> > +    tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end);
>
> cancelled does not need to be local; it is consumed by the branch and not
> consumed afterward.  Just free it here.

OK

> > +        /*
> > +         * If we know the width from the DisasContext, we can
> > +         * generate much cleaner code.
> > +         * Unfortunately, not all instructions execute the fSTORE
> > +         * macro during code generation.  Anything that uses the
> > +         * generic helper will have this problem.  Instructions
> > +         * that use fWRAP to generate proper TCG code will be OK.
> > +         */
>
> OMG.  How disgusting.

The word "generic" is a typo - should be "generated".  In order to keep this 
series small, we're only overriding the helper for the minimal number of 
instructions.  Over time, we'll override all the stores and we can eliminate 
this.


> > +            value = tcg_temp_new();
> > +            tcg_gen_mov_tl(value, hex_store_val32[slot_num]);
> > +            tcg_gen_qemu_st8(value, address, ctx->mem_idx);
> > +            tcg_temp_free(value);
>
> Why are you copying to a temporary?

Will fix.

> > +        default:
> > +            /*
> > +             * If we get to here, we don't know the width at
> > +             * TCG generation time, we'll generate branching
> > +             * based on the width at runtime.
> > +             */
> > +            label_w2 = gen_new_label();
> > +            label_w4 = gen_new_label();
> > +            label_w8 = gen_new_label();
> > +            TCGv width = tcg_temp_local_new();
>
> You might as well make this a helper.  This is going to generate a *lot* of
> code.

This is the part that will go away when we override all the stores.

> > +static void gen_exec_counters(packet_t *pkt)
> > +{
> > +    int num_insns = pkt->num_insns;
> > +    int num_real_insns = 0;
> > +    int i;
> > +
> > +    for (i = 0; i < num_insns; i++) {
> > +        if (!pkt->insn[i].is_endloop &&
> > +            !pkt->insn[i].part1 &&
> > +            !GET_ATTRIB(pkt->insn[i].opcode, A_IT_NOP)) {
> > +            num_real_insns++;
> > +        }
> > +    }
> > +
> > +    tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_PKT_CNT],
> > +                    hex_gpr[HEX_REG_QEMU_PKT_CNT], 1);
> > +    if (num_real_insns) {
> > +        tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
> > +                        hex_gpr[HEX_REG_QEMU_INSN_CNT], num_real_insns);
> > +    }
>
> tcg_gen_addi_tl will check for the immediate == 0.

OK, great.  I also see that tcg_gen_mov_i32 will check that the source and dest 
are different.  So no TCG code will be generated.

> As with updating PC for every insn, this is going to be expensive.
>
> You could accumulate these values through the TB and then update them at
> the
> end.  You'd need to store the intermediate values in the space managed by
> TARGET_INSN_START_EXTRA_WORDS, so that you can update them on any
> exceptional
> path out of the TB, in restore_state_to_opc().

OK

> > +    if (end_tb) {
> > +        tcg_gen_exit_tb(NULL, 0);
>
> This misses out on ctx->base.singlestep_enabled, and almost certainly
> belongs
> in hexagon_tr_tb_stop.  Use
>
> #define DISAS_EXIT  DISAS_TARGET_0
>
> or some other appropriate naming.

OK


reply via email to

[Prev in Thread] Current Thread [Next in Thread]