Re: [PATCH 7/7] tcg/ppc: Use prefixed instructions for tcg_out_goto_tb


From: Richard Henderson
Subject: Re: [PATCH 7/7] tcg/ppc: Use prefixed instructions for tcg_out_goto_tb
Date: Sun, 6 Aug 2023 07:13:13 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 8/6/23 05:55, Nicholas Piggin wrote:
On Sat Aug 5, 2023 at 7:33 AM AEST, Richard Henderson wrote:
When a direct branch is out of range, we can load the destination for
the indirect branch using PLA (for 16GB worth of buffer) and PLD from
the TranslationBlock for everything larger.

This means the patch affects exactly one instruction: B (plus filler),
PLA, or PLD, which means we can update and execute the patch atomically.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
  tcg/ppc/tcg-target.c.inc | 76 ++++++++++++++++++++++++++++++----------
  1 file changed, 58 insertions(+), 18 deletions(-)
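
For orientation, a minimal self-contained sketch of the instruction-pair
selection the patch implements. This is not the QEMU code: sext() stands in
for QEMU's sextract64(), the macros are simplified stand-ins for the
tcg-target definitions, and make_pair() is a hypothetical helper; the
encodings themselves are taken from the hunks below.

    #include <stdint.h>

    /* Stand-in for QEMU's sextract64(): sign-extend the low 'len' bits. */
    static int64_t sext(int64_t x, unsigned len)
    {
        return (int64_t)((uint64_t)x << (64 - len)) >> (64 - len);
    }

    #define OPCD(op)         ((uint32_t)(op) << 26)
    #define NOP              0x60000000u    /* ori r0,r0,0 */
    #define TAI(rt, ra, si)  (((uint32_t)(rt) << 21) | \
                              ((uint32_t)(ra) << 16) | \
                              ((uint32_t)(si) & 0xffff))

    /*
     * Choose the two words for the 8-byte goto_tb slot.  'diff' is the
     * displacement from the slot to the branch target; 'tmp' is the
     * scratch register.  The far case loads from the TranslationBlock
     * slot instead, so it is only noted here.
     */
    static void make_pair(int64_t diff, unsigned tmp,
                          uint32_t *insn1, uint32_t *insn2)
    {
        if (diff == sext(diff, 26)) {
            /* Direct branch reaches: B target, plus NOP filler. */
            *insn1 = OPCD(18) | ((uint32_t)diff & 0x3fffffc);
            *insn2 = NOP;
        } else if (diff == sext(diff, 34)) {
            /* PLA tmp, diff: pc-relative prefix word plus ADDI suffix. */
            *insn1 = OPCD(1) | (2 << 24) | (1 << 20) | ((diff >> 16) & 0x3ffff);
            *insn2 = OPCD(14) | TAI(tmp, 0, diff);
        }
        /* else: PLD tmp, <jmp_target_addr slot>@pcrel, as in the hunk below. */
    }

The 34-bit check is what bounds PLA to the 16GB window mentioned in the
commit message; everything beyond that falls back to PLD.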

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 5b243b2353..47c71bb5f2 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2642,31 +2642,41 @@ static void tcg_out_goto_tb(TCGContext *s, int which)
      uintptr_t ptr = get_jmp_target_addr(s, which);
 
      if (USE_REG_TB) {
+        /*
+         * With REG_TB, we must always use indirect branching,
+         * so that the branch destination and TCG_REG_TB match.
+         */
          ptrdiff_t offset = tcg_tbrel_diff(s, (void *)ptr);
          tcg_out_mem_long(s, LD, LDX, TCG_REG_TB, TCG_REG_TB, offset);
-
-        /* TODO: Use direct branches when possible. */
-        set_jmp_insn_offset(s, which);
          tcg_out32(s, MTSPR | RS(TCG_REG_TB) | CTR);
-
          tcg_out32(s, BCCTR | BO_ALWAYS);
 
          /* For the unlinked case, need to reset TCG_REG_TB. */
          set_jmp_reset_offset(s, which);
          tcg_out_mem_long(s, ADDI, ADD, TCG_REG_TB, TCG_REG_TB,
                           -tcg_current_code_size(s));
+        return;
+    }
+
+    if (have_isa_3_10) {
+        /* Align, so that we can patch 8 bytes atomically. */
+        if ((uintptr_t)s->code_ptr & 7) {
+            tcg_out32(s, NOP);
+        }
+        set_jmp_insn_offset(s, which);
+        /* Direct branch will be patched by tb_target_set_jmp_target. */
+        tcg_out_mls_d(s, ADDI, TCG_REG_TMP1, 0, 0, 1);
      } else {
          /* Direct branch will be patched by tb_target_set_jmp_target. */
-        set_jmp_insn_offset(s, which);
-        tcg_out32(s, NOP);
-
+        tcg_out32(s, B);
          /* When branch is out of range, fall through to indirect. */
          tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP1, ptr - (int16_t)ptr);
          tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1, (int16_t)ptr);
-        tcg_out32(s, MTSPR | RS(TCG_REG_TMP1) | CTR);
-        tcg_out32(s, BCCTR | BO_ALWAYS);
-        set_jmp_reset_offset(s, which);
      }
+
+    tcg_out32(s, MTSPR | RS(TCG_REG_TMP1) | CTR);
+    tcg_out32(s, BCCTR | BO_ALWAYS);
+    set_jmp_reset_offset(s, which);
  }
 
 void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
@@ -2674,20 +2684,50 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
  {
      uintptr_t addr = tb->jmp_target_addr[n];
      intptr_t diff = addr - jmp_rx;
-    tcg_insn_unit insn;
 
      if (USE_REG_TB) {
          return;
      }
 
-    if (in_range_b(diff)) {
-        insn = B | (diff & 0x3fffffc);
-    } else {
-        insn = NOP;
-    }
+    if (have_isa_3_10) {
+        tcg_insn_unit insn1, insn2;
+        uint64_t pair;
 
-    qatomic_set((uint32_t *)jmp_rw, insn);
-    flush_idcache_range(jmp_rx, jmp_rw, 4);
+        if (in_range_b(diff)) {
+            insn1 = B | (diff & 0x3fffffc);
+            insn2 = NOP;
+        } else if (diff == sextract64(diff, 0, 34)) {
+            /* PLA tmp1, diff */
+            insn1 = OPCD(1) | (2 << 24) | (1 << 20) | ((diff >> 16) & 0x3ffff);
+            insn2 = ADDI | TAI(TCG_REG_TMP1, 0, diff);
+        } else {
+            addr = (uintptr_t)&tb->jmp_target_addr[n];
+            diff = addr - jmp_rx;
+            tcg_debug_assert(diff == sextract64(diff, 0, 34));
+            /* PLD tmp1, diff */
+            insn1 = OPCD(1) | (1 << 20) | ((diff >> 16) & 0x3ffff);
+            insn2 = PLD | TAI(TCG_REG_TMP1, 0, diff);
+        }

B is a "patch class" word instruction as per CMODX in the ISA, which may
be patched to/from other instructions without a flush+isync sequence
between. So that part is okay, at least if you were just patching the B
word. But patching between the PLA and PLD I don't think is kosher per
ISA.

I struggle a bit with this part of the ISA, particularly with prefixed
instructions (it only talks about patching 4 bytes at a time).

If we patch something it has to go through a patch instruction, which
is a direct branch, trap, or nop. I think that makes this non-trivial.

It could work if you only patched between B and PLD. B->PLD would have
to patch the suffix word first, possibly with an interleaving sync, and
then the prefix. PLD->B could just patch the B word.

How much would losing the PLA hurt?
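
(For concreteness, the ordering described above might look like the
following sketch. flush_idcache_range() is the helper this code already
uses on QEMU's split rw/rx code mappings; the patch_* names, the GCC
__atomic builtins, and the per-word store discipline are assumptions, and
whether a sync is required between the two stores is exactly the ISA
question being raised.)

    #include <stddef.h>
    #include <stdint.h>

    /* As used by the code above, on the split rx/rw code mapping. */
    void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len);

    /*
     * B -> PLD: publish the new suffix word before the prefix word, so
     * a concurrent fetcher can never pair the new prefix with the old
     * filler word.
     */
    static void patch_b_to_pld(uint32_t *rw, uintptr_t rx,
                               uint32_t prefix, uint32_t suffix)
    {
        __atomic_store_n(&rw[1], suffix, __ATOMIC_RELEASE);
        flush_idcache_range(rx + 4, (uintptr_t)&rw[1], 4);
        __atomic_store_n(&rw[0], prefix, __ATOMIC_RELEASE);
        flush_idcache_range(rx, (uintptr_t)rw, 4);
    }

    /*
     * PLD -> B: storing the branch over the prefix word is enough; the
     * stale suffix becomes unreachable once the branch is visible.
     */
    static void patch_pld_to_b(uint32_t *rw, uintptr_t rx, uint32_t b_insn)
    {
        __atomic_store_n(&rw[0], b_insn, __ATOMIC_RELEASE);
        flush_idcache_range(rx, (uintptr_t)rw, 4);
    }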

Really? I can't imagine how some icache would see a torn prefixed insn given an atomic store (CMODX talks about prefixed instructions which "may be unaligned" -- but what if they are not?).

But if patching an aligned prefixed insn isn't allowed, I would patch between B and NOP, leave the PLD alone on the fall-through path, and drop the PLA.
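
As a sketch, that fallback reduces to the existing pre-3.10 patching path,
applied to an 8-aligned slot whose PLD/MTCTR/BCTR tail is emitted once at
translation time and never touched. set_jmp_b_or_nop() is a hypothetical
name; in_range_b(), B, NOP, qatomic_set() and flush_idcache_range() are the
helpers from the code quoted above.

    /*
     * Slot layout when only B <-> NOP is ever patched (PLA dropped,
     * PLD left alone on the fall-through path):
     *
     *     slot:  b target  -or-  nop     <- the only word ever patched
     *            pld   tmp1, <jmp_target_addr slot>@pcrel
     *            mtctr tmp1
     *            bctr
     */
    static void set_jmp_b_or_nop(uintptr_t jmp_rx, uintptr_t jmp_rw,
                                 intptr_t diff)
    {
        uint32_t insn = in_range_b(diff) ? (B | (diff & 0x3fffffc)) : NOP;

        qatomic_set((uint32_t *)jmp_rw, insn);
        flush_idcache_range(jmp_rx, jmp_rw, 4);
    }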


r~


