
Re: [PATCH v3 10/21] target/riscv: support for 128-bit loads and stores


From: Richard Henderson
Subject: Re: [PATCH v3 10/21] target/riscv: support for 128-bit loads and stores
Date: Wed, 20 Oct 2021 10:31:07 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
+# Added for 128 bit support
+%uimm_cl_q    5:2 10:3               !function=ex_shift_3
+%uimm_6bit_lq 2:3 12:1 5:2           !function=ex_shift_3
+%uimm_6bit_sq 7:3 10:3               !function=ex_shift_3

These are incorrect: LQ and LQSP are scaled by a shift of 4, not 3, and their immediate bits are swizzled differently from those of LD and LW.
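
For illustration, the RV128C encodings in the ISA manual would give formats along these lines (field specs are MSB-first; this assumes an ex_shift_4 extract function analogous to the patch's ex_shift_3):

  %uimm_cl_q    10:1 5:2 11:2         !function=ex_shift_4
  %uimm_6bit_lq 2:4 12:1 6:1          !function=ex_shift_4
  %uimm_6bit_sq 7:4 11:2              !function=ex_shift_4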


-fld               001  ... ... .. ... 00 @cl_d
+{
+  fld             001  ... ... .. ... 00 @cl_d
+  # *** RV128C specific Standard Extension (Quadrant 0) ***
+  lq              001  ... ... .. ... 00 @cl_q
+}

You need to move lq first, so that it overrides fld when RV128 is enabled. Otherwise you have to invent some c_fld_not_rv32 pattern with the proper XLEN predicate inside.
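
Something like this (a sketch; with lq listed first, its trans function can return false when RV128 is not enabled, so decoding falls through to fld):

  {
    # *** RV128C specific Standard Extension (Quadrant 0) ***
    lq            001  ... ... .. ... 00 @cl_q
    fld           001  ... ... .. ... 00 @cl_d
  }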

Likewise for all of the other groups.

+/*
+ * TODO: we should assert that src1h == 0, as we do not change the
+ *       address translation mechanism
+ */
+static bool gen_load_i128(DisasContext *ctx, arg_lb *a, MemOp memop)
+{
+    TCGv src1l = get_gpr(ctx, a->rs1, EXT_NONE);
+    TCGv src1h = get_gprh(ctx, a->rs1);
+    TCGv destl = dest_gpr(ctx, a->rd);
+    TCGv desth = dest_gprh(ctx, a->rd);
+    TCGv addrl = tcg_temp_new();
+    TCGv addrh = tcg_temp_new();
+    TCGv imml = tcg_temp_new();
+    TCGv immh = tcg_constant_tl(-(a->imm < 0));
+
+    /* Build a 128-bit address */
+    if (a->imm != 0) {
+        tcg_gen_movi_tl(imml, a->imm);
+        tcg_gen_add2_tl(addrl, addrh, src1l, src1h, imml, immh);
+    } else {
+        tcg_gen_mov_tl(addrl, src1l);
+        tcg_gen_mov_tl(addrh, src1h);
+    }

Hmm.. I thought I remembered some clause by which the top N bits of the address could be ignored, but I can't find it now.

In any case, even if it should be done eventually, I don't think it's worthwhile to compute addrh at all right now.
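
As a minimal sketch (reusing the patch's helpers, and assuming address translation only ever consumes the low 64 bits), the whole address block above reduces to:

      TCGv src1l = get_gpr(ctx, a->rs1, EXT_NONE);
      TCGv addrl = tcg_temp_new();

      /* Only the low halves participate in address translation for now. */
      tcg_gen_addi_tl(addrl, src1l, a->imm);

which also removes the branch on a->imm.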

+    if (memop != (MemOp)MO_TEO) {

Why the cast?  MO_TEO is a MemOp enumerator.
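
That is, assuming the guard itself stays, it can be written plainly as:

      if (memop != MO_TEO) {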

+        tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? desth : destl, addrl,
+                           ctx->mem_idx, MO_TEQ);
+        gen_addi2_i128(addrl, addrh, addrl, addrh, 8);
+        tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? destl : desth, addrl,
+                           ctx->mem_idx, MO_TEQ);

In addition... we need an atomic load here for aligned 128-bit addresses (unaligned addresses are allowed to be non-atomic).

We don't currently have such an operation in TCG, though we need one (the Power8 LQ instruction is also only atomic when aligned).

We should either add this right away (shouldn't be too hard), or change the default to thread=single for -cpu rv128. We should disable thread=multi if !HAVE_ATOMIC128, because we will be constantly trapping with EXCP_ATOMIC.
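
For reference, with such an op the two 8-byte loads above would collapse to a single 16-byte access, roughly like this (hypothetical: no such op existed in TCG at the time of this review; a tcg_gen_qemu_ld_i128 along these lines was added to TCG later):

      /* One (atomic-when-aligned) 16-byte load; memop handles byte order.
       * TCGv is 64-bit for this target, so destl/desth pair up as Int128. */
      TCGv_i128 t16 = tcg_temp_new_i128();
      tcg_gen_qemu_ld_i128(t16, addrl, ctx->mem_idx, MO_TE | MO_128);
      tcg_gen_extr_i128_i64(destl, desth, t16);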

Similarly for store, of course.


r~


