[Qemu-commits] [qemu/qemu] 7b60ef: tcg/i386: Fix dupi/dupm for avx1 and


From:	Peter Maydell
Subject:	[Qemu-commits] [qemu/qemu] 7b60ef: tcg/i386: Fix dupi/dupm for avx1 and 32-bit hosts
Date:	Fri, 24 May 2019 03:43:13 -0700
  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: 7b60ef3264e9627ac6efb34e9a6130647e9b55c0
      
https://github.com/qemu/qemu/commit/7b60ef3264e9627ac6efb34e9a6130647e9b55c0
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/i386/tcg-target.inc.c

  Log Message:
  -----------
  tcg/i386: Fix dupi/dupm for avx1 and 32-bit hosts

The VBROADCASTSD instruction only allows %ymm registers as destination.
Rather than forcing VEX.L and writing to the entire 256-bit register,
revert to using MOVDDUP with an %xmm register.  This is sufficient for
an avx1 host since we do not support TCG_TYPE_V256 for that case.

Also fix the 32-bit avx2 case, which should have used VPBROADCASTW.

Fixes: 1e262b49b533
Tested-by: Mark Cave-Ayland <address@hidden>
Reported-by: Mark Cave-Ayland <address@hidden>
Signed-off-by: Richard Henderson <address@hidden>


  Commit: 532ba368a13712724137228b5e7e9435994d25e1
      
https://github.com/qemu/qemu/commit/532ba368a13712724137228b5e7e9435994d25e1
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/tcg-op-gvec.c

  Log Message:
  -----------
  tcg: Fix missing checks and clears in tcg_gen_gvec_dup_mem

The paths through tcg_gen_dup_mem_vec and through MO_128 were
missing the check_size_align.  The path through MO_128 was also
missing the expand_clr.  This last was not visible because the
only user is ARM SVE, which would set oprsz == maxsz, and not
require the clear.

Fix by adding the check_size_align and using do_dup directly
instead of duplicating the check in tcg_gen_gvec_dup_{i32,i64}.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 38dc12947ec9106237f9cdbd428792c985cd86ae
      
https://github.com/qemu/qemu/commit/38dc12947ec9106237f9cdbd428792c985cd86ae
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M accel/tcg/tcg-runtime-gvec.c
    M accel/tcg/tcg-runtime.h
    M tcg/README
    M tcg/aarch64/tcg-target.h
    M tcg/i386/tcg-target.h
    M tcg/tcg-op-gvec.c
    M tcg/tcg-op-gvec.h
    M tcg/tcg-op-vec.c
    M tcg/tcg-op.h
    M tcg/tcg-opc.h
    M tcg/tcg.c
    M tcg/tcg.h

  Log Message:
  -----------
  tcg: Add support for vector bitwise select

This operation performs d = (b & a) | (c & ~a), and is present
on a majority of host vector units.  Include gvec expanders.

Signed-off-by: Richard Henderson <address@hidden>
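
The select formula in the log above can be modelled on a single 64-bit lane. This is only a scalar sketch of the semantics, not the TCG expander itself; the name bitsel64 is hypothetical and does not appear in the QEMU tree.

```c
#include <stdint.h>

/*
 * Scalar model of bitwise select on one 64-bit lane (hypothetical helper):
 * each result bit comes from b where the matching bit of a is set,
 * and from c where it is clear, i.e. d = (b & a) | (c & ~a).
 */
static inline uint64_t bitsel64(uint64_t a, uint64_t b, uint64_t c)
{
    return (b & a) | (c & ~a);
}
```

With a = 0xFF00 as the mask, the upper byte of the result is taken from b and the lower byte from c.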


  Commit: f75da2988eb2457fa23d006d573220c5c680ec4e
      
https://github.com/qemu/qemu/commit/f75da2988eb2457fa23d006d573220c5c680ec4e
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/README
    M tcg/aarch64/tcg-target.h
    M tcg/i386/tcg-target.h
    M tcg/tcg-op-vec.c
    M tcg/tcg-op.h
    M tcg/tcg-opc.h
    M tcg/tcg.c
    M tcg/tcg.h

  Log Message:
  -----------
  tcg: Add support for vector compare select

Perform a per-element conditional move.  This combination operation is
easier to implement on some host vector units than plain cmp+bitsel.
Omit the usual gvec interface, as this is intended to be used by
target-specific gvec expansion call-backs.

Signed-off-by: Richard Henderson <address@hidden>
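
A per-element conditional move as described above can be sketched as a loop over lanes. The function name, the fixed signed less-than condition, and the 32-bit lane width are assumptions chosen for illustration; the real opcode takes the condition and element size as parameters.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar model of compare select over n signed 32-bit lanes:
 * r[i] = (a[i] < b[i]) ? c[i] : d[i]
 * The comparison and the selection are fused, so no intermediate
 * all-ones/all-zeros mask vector needs to be materialised.
 */
static void cmpsel_lt_i32(int32_t *r, const int32_t *a, const int32_t *b,
                          const int32_t *c, const int32_t *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        r[i] = (a[i] < b[i]) ? c[i] : d[i];
    }
}
```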


  Commit: 17f79944ebeace8bf43047a33b7775ba5ed9070e
      
https://github.com/qemu/qemu/commit/17f79944ebeace8bf43047a33b7775ba5ed9070e
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/tcg-op-vec.c

  Log Message:
  -----------
  tcg: Introduce do_op3_nofail for vector expansion

This makes do_op3 match do_op2 in allowing for failure,
and thus fallback expansions.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 72b4c792c7a576d9246207a8e9a940ed9e191722
      
https://github.com/qemu/qemu/commit/72b4c792c7a576d9246207a8e9a940ed9e191722
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/tcg-op-vec.c

  Log Message:
  -----------
  tcg: Expand vector minmax using cmp+cmpsel

Provide a generic fallback for the min/max operations.

Signed-off-by: Richard Henderson <address@hidden>
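
One way to read "minmax using cmp+cmpsel" is that each min/max is a compare select whose selected values are the compared operands themselves. The sketch below models that on one 32-bit lane; every hyp_ name is hypothetical, and the mapping of conditions to operations is an assumption about the shape of the fallback rather than a copy of tcg-op-vec.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical condition codes for this sketch only. */
typedef enum { HYP_COND_LT, HYP_COND_GT, HYP_COND_LTU, HYP_COND_GTU } hyp_cond;

static bool hyp_compare(hyp_cond cond, uint32_t a, uint32_t b)
{
    switch (cond) {
    case HYP_COND_LT:  return (int32_t)a < (int32_t)b;  /* signed */
    case HYP_COND_GT:  return (int32_t)a > (int32_t)b;  /* signed */
    case HYP_COND_LTU: return a < b;                    /* unsigned */
    case HYP_COND_GTU: return a > b;                    /* unsigned */
    }
    return false;
}

/* r = (a cond b) ? c : d */
static uint32_t hyp_cmpsel(hyp_cond cond, uint32_t a, uint32_t b,
                           uint32_t c, uint32_t d)
{
    return hyp_compare(cond, a, b) ? c : d;
}

/* min/max: select a when the comparison holds, otherwise b. */
static uint32_t hyp_smin(uint32_t a, uint32_t b) { return hyp_cmpsel(HYP_COND_LT,  a, b, a, b); }
static uint32_t hyp_smax(uint32_t a, uint32_t b) { return hyp_cmpsel(HYP_COND_GT,  a, b, a, b); }
static uint32_t hyp_umin(uint32_t a, uint32_t b) { return hyp_cmpsel(HYP_COND_LTU, a, b, a, b); }
static uint32_t hyp_umax(uint32_t a, uint32_t b) { return hyp_cmpsel(HYP_COND_GTU, a, b, a, b); }
```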


  Commit: 25c012b4009256505be3430480954a0233de343e
      
https://github.com/qemu/qemu/commit/25c012b4009256505be3430480954a0233de343e
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/tcg-opc.h

  Log Message:
  -----------
  tcg: Add TCG_OPF_NOT_PRESENT if TCG_TARGET_HAS_foo is negative

If INDEX_op_foo is always expanded by tcg_expand_vec_op, then
there may be no reasonable set of constraints to return from
tcg_target_op_def for that opcode.

Let TCG_TARGET_HAS_foo be specified as -1 in that case.  Thus a
boolean test for TCG_TARGET_HAS_foo is true, but we will not
assert within process_op_defs when no constraints are specified.

Compare this with tcg_can_emit_vec_op, which already uses this
tri-state indication.

Signed-off-by: Richard Henderson <address@hidden>
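
The tri-state convention can be sketched as two predicates over the same integer: a plain boolean test treats -1 as "available", while the flag computation treats anything not strictly positive as "no backend constraints expected". The names below are hypothetical stand-ins, not the macros from tcg-opc.h.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for TCG_OPF_NOT_PRESENT. */
enum { HYP_OPF_NOT_PRESENT = 0x10 };

/*
 * has > 0  : backend emits the opcode directly, constraints required
 * has == 0 : opcode unavailable
 * has < 0  : opcode available, but always expanded before the backend
 */
static int hyp_opf_flags(int has)
{
    return has <= 0 ? HYP_OPF_NOT_PRESENT : 0;
}

/* Boolean test as a front end would write it: -1 counts as true. */
static bool hyp_frontend_may_use(int has)
{
    return has != 0;
}
```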


  Commit: 904c5e19672778cc3349f4975437cfdf3371abb6
      
https://github.com/qemu/qemu/commit/904c5e19672778cc3349f4975437cfdf3371abb6
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/i386/tcg-target.h
    M tcg/i386/tcg-target.inc.c

  Log Message:
  -----------
  tcg/i386: Support vector comparison select value

We already had backend support for this feature.  Expand the new
cmpsel opcode using vpblendvb.  The combination allows us to avoid
an extra NOT for some comparison codes.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 3ec3538a45f2fead475b0cca6945092c87927b4f
      
https://github.com/qemu/qemu/commit/3ec3538a45f2fead475b0cca6945092c87927b4f
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/i386/tcg-target.inc.c

  Log Message:
  -----------
  tcg/i386: Remove expansion for missing minmax

This is now handled by code within tcg-op-vec.c.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: ebcfb91abed8c0fb180a968b9004419c208dcc02
      
https://github.com/qemu/qemu/commit/ebcfb91abed8c0fb180a968b9004419c208dcc02
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/i386/tcg-target.inc.c

  Log Message:
  -----------
  tcg/i386: Use umin/umax in expanding unsigned compare

Using umin(a, b) == a as an expansion for TCG_COND_LEU is a
better alternative to (a - INT_MIN) <= (b - INT_MIN).

Signed-off-by: Richard Henderson <address@hidden>
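
Both expansions of unsigned less-or-equal mentioned above can be checked against each other on a single 32-bit lane. This is a scalar sketch with hypothetical function names; the bias is written as an XOR with the sign bit, which is equivalent to subtracting INT_MIN modulo 2^32.

```c
#include <stdbool.h>
#include <stdint.h>

/* LEU via unsigned minimum: a <= b exactly when umin(a, b) == a. */
static inline bool leu_via_umin(uint32_t a, uint32_t b)
{
    uint32_t m = (a < b) ? a : b;   /* models the host umin instruction */
    return m == a;
}

/*
 * LEU via bias: flip the sign bit of both operands, then compare as
 * signed values.  The unsigned-to-signed cast is implementation-defined
 * in C, but wraps as expected on the usual two's-complement hosts.
 */
static inline bool leu_via_bias(uint32_t a, uint32_t b)
{
    return (int32_t)(a ^ 0x80000000u) <= (int32_t)(b ^ 0x80000000u);
}
```

The umin form needs one min and one equality compare, whereas the bias form needs two extra operations to adjust the operands before the signed compare.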


  Commit: a9e434a5dc16f71ee156428619fc3c3765b68f26
      
https://github.com/qemu/qemu/commit/a9e434a5dc16f71ee156428619fc3c3765b68f26
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/aarch64/tcg-target.h
    M tcg/aarch64/tcg-target.inc.c

  Log Message:
  -----------
  tcg/aarch64: Support vector bitwise select value

The instruction set has 3 insns that perform the same operation,
only varying in which operand must overlap the destination.  We
can represent the operation without overlap and choose based on
the operands seen.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 984fdcee342473dfe797897758929dad654693c8
      
https://github.com/qemu/qemu/commit/984fdcee342473dfe797897758929dad654693c8
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/aarch64/tcg-target.inc.c

  Log Message:
  -----------
  tcg/aarch64: Split up is_fimm

There are several sub-classes of vector immediate, and only MOVI
can use them all.  This will enable usage of MVNI and ORRI, which
use progressively fewer sub-classes.

This patch adds no new functionality, merely splits the function
and moves part of the logic into tcg_out_dupi_vec.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 7e308e003e5b6ddd3130e09711e1d33693230696
      
https://github.com/qemu/qemu/commit/7e308e003e5b6ddd3130e09711e1d33693230696
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/aarch64/tcg-target.inc.c

  Log Message:
  -----------
  tcg/aarch64: Use MVNI in tcg_out_dupi_vec

The complement of a subset of immediates can be computed
with a single instruction.

Signed-off-by: Richard Henderson <address@hidden>
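
As an illustration of the idea, the sketch below classifies a 32-bit element value as encodable by a shifted 8-bit MOVI immediate, by MVNI (when the complement has that form), or by neither. It covers only the "imm8 shifted left by 0, 8, 16 or 24" sub-class; the type and function names are hypothetical and the real tcg_out_dupi_vec handles more cases.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { HYP_ENC_NONE, HYP_ENC_MOVI, HYP_ENC_MVNI } hyp_enc;

/* True if v is an 8-bit value shifted left by 0, 8, 16 or 24 bits. */
static bool hyp_is_shimm32(uint32_t v, unsigned *imm8, unsigned *shift)
{
    for (unsigned s = 0; s < 32; s += 8) {
        if ((v & ~(0xffu << s)) == 0) {
            *imm8 = v >> s;
            *shift = s;
            return true;
        }
    }
    return false;
}

/* Try the value itself first (MOVI), then its complement (MVNI). */
static hyp_enc hyp_pick_encoding(uint32_t v, unsigned *imm8, unsigned *shift)
{
    if (hyp_is_shimm32(v, imm8, shift)) {
        return HYP_ENC_MOVI;
    }
    if (hyp_is_shimm32(~v, imm8, shift)) {
        return HYP_ENC_MVNI;
    }
    return HYP_ENC_NONE;
}
```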


  Commit: 02f3a5b4744885258758d07ebe09cf965de78bcf
      
https://github.com/qemu/qemu/commit/02f3a5b4744885258758d07ebe09cf965de78bcf
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/aarch64/tcg-target.inc.c

  Log Message:
  -----------
  tcg/aarch64: Build vector immediates with two insns

Use MOVI+ORR or MVNI+BIC in order to build some vector constants,
as opposed to dropping them to the constant pool.  This includes
all 16-bit constants and a similar set of 32-bit constants.

Signed-off-by: Richard Henderson <address@hidden>
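
The claim that all 16-bit constants are reachable follows from splitting the value into two bytes: the low byte goes into MOVI and the high byte is merged with ORR using an 8-bit left shift. The sketch below models only that arithmetic for the MOVI+ORR pair; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical pair of 8-bit immediates for a two-instruction sequence. */
typedef struct {
    uint8_t movi_imm8;  /* MOVI Vd.8H, #imm8            */
    uint8_t orr_imm8;   /* ORR  Vd.8H, #imm8, LSL #8    */
} hyp_two_insn16;

static hyp_two_insn16 hyp_split16(uint16_t v)
{
    hyp_two_insn16 p = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
    return p;
}

/* Value each 16-bit lane would hold after executing both instructions. */
static uint16_t hyp_rebuild16(hyp_two_insn16 p)
{
    uint16_t r = p.movi_imm8;
    r |= (uint16_t)((unsigned)p.orr_imm8 << 8);
    return r;
}
```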


  Commit: 9e27f58b9902834dffc0d66d9eb62f78d9c2a632
      
https://github.com/qemu/qemu/commit/9e27f58b9902834dffc0d66d9eb62f78d9c2a632
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/aarch64/tcg-target.inc.c

  Log Message:
  -----------
  tcg/aarch64: Allow immediates for vector ORR and BIC

This allows immediates to be used for ORR and BIC,
as well as the trivial inversions, ORC and AND.

Signed-off-by: Richard Henderson <address@hidden>


  Commit: 11e2bfef799024be4a08fcf6797fe0b22fb16b58
      
https://github.com/qemu/qemu/commit/11e2bfef799024be4a08fcf6797fe0b22fb16b58
  Author: Richard Henderson <address@hidden>
  Date:   2019-05-22 (Wed, 22 May 2019)

  Changed paths:
    M tcg/i386/tcg-target.inc.c

  Log Message:
  -----------
  tcg/i386: Use MOVDQA for TCG_TYPE_V128 load/store

This instruction raises #GP, aka SIGSEGV, if the effective address
is not aligned to 16 bytes.

We have assertions in tcg-op-gvec.c that the offset from ENV is
aligned, for vector types <= V128.  But the offset itself does not
validate that the final pointer is aligned -- one must also remember
to use the QEMU_ALIGNED() attribute on the vector member within ENV.

PowerPC Altivec has vector load/store instructions that silently
discard the low 4 bits of the address, making alignment mistakes
difficult to discover.  Aid that by making the most popular host
visibly signal the error.

Signed-off-by: Richard Henderson <address@hidden>
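
The distinction between an aligned offset and an aligned final pointer can be sketched in plain C11. Here alignas(16) stands in for the QEMU_ALIGNED() attribute, and hyp_env is a hypothetical structure, not a real CPU state type.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ENV-like structure with a 16-byte aligned vector member. */
typedef struct {
    uint64_t pad;
    alignas(16) uint8_t vreg[16];  /* forces offset and struct alignment */
} hyp_env;

static bool hyp_is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/*
 * An aligned offset is not enough: the sum of base and offset must be
 * aligned, which also requires the base itself to be 16-byte aligned.
 */
static bool hyp_final_aligned16(uintptr_t base, size_t offset)
{
    return ((base + offset) & 15) == 0;
}
```

Without the alignment attribute on the member, the structure itself may only be 8-byte aligned, so an offset that is a multiple of 16 could still yield a misaligned address.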


  Commit: 636011255dec55da4cac28240ffcaa2e740f1e81
      
https://github.com/qemu/qemu/commit/636011255dec55da4cac28240ffcaa2e740f1e81
  Author: Peter Maydell <address@hidden>
  Date:   2019-05-24 (Fri, 24 May 2019)

  Changed paths:
    M accel/tcg/tcg-runtime-gvec.c
    M accel/tcg/tcg-runtime.h
    M tcg/README
    M tcg/aarch64/tcg-target.h
    M tcg/aarch64/tcg-target.inc.c
    M tcg/i386/tcg-target.h
    M tcg/i386/tcg-target.inc.c
    M tcg/tcg-op-gvec.c
    M tcg/tcg-op-gvec.h
    M tcg/tcg-op-vec.c
    M tcg/tcg-op.h
    M tcg/tcg-opc.h
    M tcg/tcg.c
    M tcg/tcg.h

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190522' into staging

Misc gvec improvements

# gpg: Signature made Wed 22 May 2019 23:25:48 BST
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "address@hidden"
# gpg: Good signature from "Richard Henderson <address@hidden>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20190522:
  tcg/i386: Use MOVDQA for TCG_TYPE_V128 load/store
  tcg/aarch64: Allow immediates for vector ORR and BIC
  tcg/aarch64: Build vector immediates with two insns
  tcg/aarch64: Use MVNI in tcg_out_dupi_vec
  tcg/aarch64: Split up is_fimm
  tcg/aarch64: Support vector bitwise select value
  tcg/i386: Use umin/umax in expanding unsigned compare
  tcg/i386: Remove expansion for missing minmax
  tcg/i386: Support vector comparison select value
  tcg: Add TCG_OPF_NOT_PRESENT if TCG_TARGET_HAS_foo is negative
  tcg: Expand vector minmax using cmp+cmpsel
  tcg: Introduce do_op3_nofail for vector expansion
  tcg: Add support for vector compare select
  tcg: Add support for vector bitwise select
  tcg: Fix missing checks and clears in tcg_gen_gvec_dup_mem
  tcg/i386: Fix dupi/dupm for avx1 and 32-bit hosts

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/ceac83e9ba72...636011255dec