qemu-devel

[PATCH] tcg: Fix tcg gen for vectorized absolute value


From: Stephen Long
Subject: [PATCH] tcg: Fix tcg gen for vectorized absolute value
Date: Wed, 12 Aug 2020 15:31:10 -0700

---
 tcg/tcg-op-gvec.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

QEMU generated incorrect TCG ops for the AArch64 vector absolute value
instruction when the host did not support AVX instructions.

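Subtracting a mask that holds -1 in each negative element does not
add 1 to each of those elements, because the subtraction is a single
64-bit operation. For example, with byte elements, subtracting a mask
of 0xffff_ffff_ffff_ffff only adds one to the low byte, because
~0xffff_ffff_ffff_ffff + 1 is just 1 instead of 0x0101_0101_0101_0101.
Reduce the mask to 1 per negative element and add that instead.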
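
The arithmetic above can be reproduced outside of TCG with plain 64-bit
integers. The following is only a sketch for byte elements, not QEMU
code: ONES_PER_BYTE is a hypothetical stand-in for dup_const(MO_8, 1),
and sign_mask8, abs8_sub, abs8_add and check_abs8 are names invented
for illustration. abs8_sub mirrors the old xor/sub sequence and
abs8_add mirrors the patched xor/andi/add sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for dup_const(MO_8, 1), i.e. 1 in every byte lane. */
#define ONES_PER_BYTE UINT64_C(0x0101010101010101)

/* Build a mask with 0xff in every byte lane whose sign bit is set. */
static inline uint64_t sign_mask8(uint64_t b)
{
    uint64_t t = (b >> 7) & ONES_PER_BYTE;  /* 1 per negative lane */
    return t * 0xff;                        /* 0xff per negative lane */
}

/* Old sequence: xor with the mask, then subtract it as one 64-bit value. */
static inline uint64_t abs8_sub(uint64_t b)
{
    uint64_t t = sign_mask8(b);
    return (b ^ t) - t;
}

/* Patched sequence: xor with the mask, then add 1 per negative lane. */
static inline uint64_t abs8_add(uint64_t b)
{
    uint64_t t = sign_mask8(b);
    uint64_t d = b ^ t;                     /* msb of each negative lane is now 0 */
    return d + (t & ONES_PER_BYTE);         /* so the +1 cannot carry across lanes */
}

/* Self-check for the case in the commit message: eight lanes of -1. */
static inline void check_abs8(void)
{
    uint64_t b = UINT64_C(0xffffffffffffffff);
    assert(abs8_sub(b) == UINT64_C(1));                  /* only the low lane got +1 */
    assert(abs8_add(b) == UINT64_C(0x0101010101010101)); /* every lane is 1 */
}
```

Calling check_abs8() from a small test program exercises both
sequences on the all-negative input described above.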

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 3707c0e..793d4ba 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -2264,12 +2264,13 @@ static void gen_absv_mask(TCGv_i64 d, TCGv_i64 b, unsigned vece)
     tcg_gen_muli_i64(t, t, (1 << nbit) - 1);
 
     /*
-     * Invert (via xor -1) and add one (via sub -1).
+     * Invert (via xor -1) and add one.
      * Because of the ordering the msb is cleared,
      * so we never have carry into the next element.
      */
     tcg_gen_xor_i64(d, b, t);
-    tcg_gen_sub_i64(d, d, t);
+    tcg_gen_andi_i64(t, t, dup_const(vece, 1));
+    tcg_gen_add_i64(d, d, t);
 
     tcg_temp_free_i64(t);
 }
-- 
1.9.1



