qemu-devel

Re: [PATCH 08/29] tcg/aarch64: Generate TBZ, TBNZ


From: Paolo Bonzini
Subject: Re: [PATCH 08/29] tcg/aarch64: Generate TBZ, TBNZ
Date: Fri, 27 Oct 2023 06:44:10 +0200
User-agent: Mozilla Thunderbird

On 10/26/23 02:13, Richard Henderson wrote:
> +    case TCG_COND_TSTEQ:
> +    case TCG_COND_TSTNE:
> +        if (b_const && is_power_of_2(b)) {
> +            tbit = ctz64(b);
> +            need_cmp = false;
> +        }

I think another value that can be handled efficiently is 0xffffffff, which becomes a "cbz/cbnz wNN, LABEL" instruction.
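
To make the suggestion concrete, here is a minimal standalone sketch of how a test mask could be sorted into the single-bit (TBZ/TBNZ) case, the 0xffffffff (CBZ/CBNZ on the W register) case, and the generic case. The enum, the helper names and classify_test_mask() are hypothetical and only for illustration; they are not part of the patch or of the TCG backend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of a TSTEQ/TSTNE constant; not QEMU's API. */
typedef enum {
    TEST_GENERIC,   /* needs a separate TST before the branch */
    TEST_TBIT,      /* single bit set: TBZ/TBNZ on that bit */
    TEST_CBZ_W,     /* low 32 bits set: CBZ/CBNZ on the W register */
} TestKind;

/* True if exactly one bit is set (zero is not a power of two). */
static bool is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Count trailing zeros; the caller must pass a nonzero value. */
static int ctz64_sketch(uint64_t v)
{
    int n = 0;
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* On TEST_TBIT, *tbit receives the bit number to test. */
static TestKind classify_test_mask(uint64_t mask, int *tbit)
{
    if (is_pow2(mask)) {
        *tbit = ctz64_sketch(mask);
        return TEST_TBIT;
    }
    if (mask == UINT32_MAX) {
        return TEST_CBZ_W;
    }
    return TEST_GENERIC;
}
```

With this, 0x80000000 lands in the TBZ/TBNZ case with bit 31, and 0xffffffff lands in the CBZ/CBNZ case.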

This could be interesting if the i386 frontend implemented JE/JNE and JS/JNS (of sizes smaller than MO_TL) using masks like 0xffffffff and 0x80000000 respectively. For example, for ZF:

     MemOp size = (s->cc_op - CC_OP_ADDB) & 3;
     if (size == MO_TL) {
         return (CCPrepare) { .cond = TCG_COND_EQ, .reg = cpu_cc_dst,
                              .mask = -1 };
     } else {
         return (CCPrepare) { .cond = TCG_COND_TSTEQ, .reg = cpu_cc_dst,
                              .imm = (1ull << (8 << size)) - 1,
                              .mask = -1 };
     }

Then on aarch64, JE could become CBZ and JS could become TBNZ.
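
As a rough illustration of the masks involved, the sketch below computes the ZF and SF test masks for each operand size. It assumes the size is encoded as log2 of the byte count (0 for 8-bit up to 3 for 64-bit), and zf_mask() and sf_mask() are hypothetical helper names, not existing frontend functions.

```c
#include <stdint.h>

/*
 * Hypothetical helpers; 'size' is assumed to be log2 of the operand
 * width in bytes: 0 = 8-bit, 1 = 16-bit, 2 = 32-bit, 3 = 64-bit.
 */

/* Mask covering every bit of the operand (used for ZF, i.e. JE/JNE). */
static uint64_t zf_mask(unsigned size)
{
    /* 1ull << 64 would be undefined, so handle the full width apart. */
    return size == 3 ? UINT64_MAX : (1ull << (8 << size)) - 1;
}

/* Mask covering only the sign bit of the operand (used for SF, i.e. JS/JNS). */
static uint64_t sf_mask(unsigned size)
{
    return 1ull << ((8 << size) - 1);
}
```

For a 32-bit operand this gives 0xffffffff for ZF and 0x80000000 for SF, which are the two constants discussed here.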

Unfortunately, the code produced on x86 is not awful but also not great. We discussed earlier how TST against 0xffffffff and 0x80000000 can be computed efficiently using "testl reg, reg", but you never get to that point in tcg_out_testi, because the other conditions require an S32 constraint and those constants don't satisfy it. :( So you get rid of the sign extension instructions, but in exchange you get a somewhat bulky MOV to load the constant, followed by "testl reg, reg_containing_imm".
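
To spell out why those two constants fail, here is a small sketch of a "fits in a sign-extended 32-bit immediate" check on a 64-bit host. fits_s32() is a hypothetical name used only for illustration, and the sketch assumes the usual two's-complement conversion from uint64_t to int64_t.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: can the 64-bit value be encoded as a 32-bit
 * immediate that the CPU sign-extends to 64 bits?
 */
static bool fits_s32(uint64_t v)
{
    int64_t s = (int64_t)v;   /* reinterpret as signed (two's complement) */
    return s >= INT32_MIN && s <= INT32_MAX;
}
```

Both 0xffffffff and 0x80000000 have bit 31 set but bits 32..63 clear, so sign-extending their low 32 bits would not reproduce them and the check fails. By contrast 0x7fffffff and 0xffffffff80000000 pass.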

I guess in principle you could add TCG_TARGET_{br,mov,set}condi_valid(cond, const) but it's pretty ugly.

Paolo
