[Qemu-commits] [qemu/qemu] c6489d: docs: new design document multi-threa


From:	GitHub
Subject:	[Qemu-commits] [qemu/qemu] c6489d: docs: new design document multi-thread-tcg.txt
Date:	Sat, 25 Feb 2017 13:15:09 -0800
  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: c6489dd921e7450bced1816013eb22cc100ed07c
      
https://github.com/qemu/qemu/commit/c6489dd921e7450bced1816013eb22cc100ed07c
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    A docs/multi-thread-tcg.txt

  Log Message:
  -----------
  docs: new design document multi-thread-tcg.txt

This documents the current design for upgrading TCG emulation to take
advantage of modern CPUs by running a thread-per-CPU. The document goes
through the various areas of the code affected by such a change and
proposes design requirements for each part of the solution.

The text marked with (Current solution[s]) documents the approaches
currently in use.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 6ac3d7e845549f08473f020c1c70f14b8911a67e
      
https://github.com/qemu/qemu/commit/6ac3d7e845549f08473f020c1c70f14b8911a67e
  Author: Pranith Kumar <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M translate-all.c

  Log Message:
  -----------
  mttcg: translate-all: Enable locking debug in a debug build

Enable tcg lock debug asserts in a debug build by default instead of
relying on DEBUG_LOCKING. None of the other DEBUG_* macros have
asserts, so this patch removes DEBUG_LOCKING and enables these asserts
in a debug build.

CC: Richard Henderson <address@hidden>
Signed-off-by: Pranith Kumar <address@hidden>
[AJB: tweak ifdefs so can be early in series]
Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 4ec667042d9ac017daad318ad848cd05cd823df8
      
https://github.com/qemu/qemu/commit/4ec667042d9ac017daad318ad848cd05cd823df8
  Author: Pranith Kumar <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpu-exec.c

  Log Message:
  -----------
  mttcg: Add missing tb_lock/unlock() in cpu_exec_step()

The recent patch enabling lock assertions uncovered the missing lock
acquisition in cpu_exec_step(). This patch adds them.

Signed-off-by: Pranith Kumar <address@hidden>
Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 20937143145b8f5a4194e5c407731ba38797864e
      
https://github.com/qemu/qemu/commit/20937143145b8f5a4194e5c407731ba38797864e
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    A tcg/tcg-mo.h
    M tcg/tcg.h

  Log Message:
  -----------
  tcg: move TCG_MO/BAR types into own file

We'll be using the memory ordering definitions to define values for
both the host and guest. To avoid fighting with circular header
dependencies just move these types into their own minimal header.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 8d4e9146b3568022ea5730d92841345d41275d66
      
https://github.com/qemu/qemu/commit/8d4e9146b3568022ea5730d92841345d41275d66
  Author: KONRAD Frederic <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpus.c
    M include/qom/cpu.h
    M include/sysemu/cpus.h
    M qemu-options.hx
    M tcg/tcg.h
    M vl.c

  Log Message:
  -----------
  tcg: add options for enabling MTTCG

We know there will be cases where MTTCG won't work until additional work
is done in the front/back ends to support it. It will, however, be
useful to be able to turn it on.

As a result MTTCG will default to off unless the combination is
supported. However the user can turn it on for the sake of testing.

Signed-off-by: KONRAD Frederic <address@hidden>
[AJB: move to -accel tcg,thread=multi|single, defaults]
Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
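
The defaulting rule described in this commit can be illustrated with a
minimal sketch. It assumes a hypothetical helper,
mttcg_parse_thread_opt(), which is not a QEMU function; only the option
values "multi" and "single" come from the commit message. The caller is
assumed to know whether the guest/host combination is supported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; not QEMU's actual API. */
enum mttcg_parse_result { MTTCG_OK = 0, MTTCG_INVALID = -1 };

/*
 * Resolve the "thread=" sub-option of -accel tcg.
 *   value      the option string, or NULL if the user gave none
 *   supported  whether this guest/host combination is known to work
 *   enabled    receives the decision
 */
static int mttcg_parse_thread_opt(const char *value, bool supported,
                                  bool *enabled)
{
    assert(enabled != NULL);
    if (value == NULL) {
        *enabled = supported;   /* default: on only if supported */
        return MTTCG_OK;
    }
    if (strcmp(value, "multi") == 0) {
        *enabled = true;        /* the user may force it on for testing */
        return MTTCG_OK;
    }
    if (strcmp(value, "single") == 0) {
        *enabled = false;
        return MTTCG_OK;
    }
    return MTTCG_INVALID;       /* unknown value: leave *enabled untouched */
}
```

With no option given, the result simply mirrors the "supported" flag;
an explicit "multi" overrides it, which matches the "turn it on for the
sake of testing" case above.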


  Commit: 6546706d28bbcec5c14601b446c0a1cde5256597
      
https://github.com/qemu/qemu/commit/6546706d28bbcec5c14601b446c0a1cde5256597
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpus.c

  Log Message:
  -----------
  tcg: add kick timer for single-threaded vCPU emulation

Currently we rely on the side effect of the main loop grabbing the
iothread_mutex to give any long running basic block chains a kick to
ensure the next vCPU is scheduled. As this code is being re-factored and
rationalised we now do it explicitly here.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Reviewed-by: Pranith Kumar <address@hidden>


  Commit: 791158d93b27f22a17c2ada06621831d54f09a2c
      
https://github.com/qemu/qemu/commit/791158d93b27f22a17c2ada06621831d54f09a2c
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpu-exec-common.c
    M cpu-exec.c
    M cpus.c
    M include/exec/exec-all.h

  Log Message:
  -----------
  tcg: rename tcg_current_cpu to tcg_current_rr_cpu

..and make the definition local to cpus. In preparation for MTTCG the
concept of a global tcg_current_cpu will no longer make sense. However
we still need to keep track of it in the single-threaded case to be able
to exit quickly when required.

qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to
emphasise its use-case. qemu_cpu_kick() now kicks the relevant cpu as
well as calling qemu_cpu_kick_rr_cpu(), which will become a no-op in
MTTCG.

For the time being the setting of the global exit_request remains.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Reviewed-by: Pranith Kumar <address@hidden>


  Commit: 8d04fb55dec381bc5105cb47f29d918e579e8cbd
      
https://github.com/qemu/qemu/commit/8d04fb55dec381bc5105cb47f29d918e579e8cbd
  Author: Jan Kiszka <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpu-exec.c
    M cpus.c
    M cputlb.c
    M exec.c
    M hw/core/irq.c
    M hw/i386/kvmvapic.c
    M hw/intc/arm_gicv3_cpuif.c
    M hw/ppc/ppc.c
    M hw/ppc/spapr.c
    M include/qom/cpu.h
    M memory.c
    M qom/cpu.c
    M target/arm/helper.c
    M target/arm/op_helper.c
    M target/i386/smm_helper.c
    M target/s390x/misc_helper.c
    M translate-all.c
    M translate-common.c

  Log Message:
  -----------
  tcg: drop global lock during TCG code execution

This finally allows TCG to benefit from the iothread introduction: Drop
the global mutex while running pure TCG CPU code. Reacquire the lock
when entering MMIO or PIO emulation, or when leaving the TCG loop.

We have to revert a few optimizations for the current TCG threading
model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not
kicking it in qemu_cpu_kick. We also need to disable RAM block
reordering until we have a more efficient locking mechanism at hand.

Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here.
These numbers demonstrate where we gain something:

20338 jan       20   0  331m  75m 6904 R   99  0.9   0:50.95 qemu-system-arm
20337 jan       20   0  331m  75m 6904 S   20  0.9   0:26.50 qemu-system-arm

The guest CPU was fully loaded, but the iothread could still run mostly
independently on a second core. Without the patch we don't get beyond

32206 jan       20   0  330m  73m 7036 R   82  0.9   1:06.00 qemu-system-arm
32204 jan       20   0  330m  73m 7036 S   21  0.9   0:17.03 qemu-system-arm

We don't benefit significantly, though, when the guest is not fully
loading a host CPU.

Signed-off-by: Jan Kiszka <address@hidden>
Message-Id: <address@hidden>
[FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex]
Signed-off-by: KONRAD Frederic <address@hidden>
[EGC: fixed iothread lock for cpu-exec IRQ handling]
Signed-off-by: Emilio G. Cota <address@hidden>
[AJB: -smp single-threaded fix, clean commit msg, BQL fixes]
Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Reviewed-by: Pranith Kumar <address@hidden>
[PM: target-arm changes]
Acked-by: Peter Maydell <address@hidden>


  Commit: e5143e30fb87fbf179029387f83f98a5a9b27f19
      
https://github.com/qemu/qemu/commit/e5143e30fb87fbf179029387f83f98a5a9b27f19
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpu-exec-common.c
    M cpu-exec.c
    M cpus.c
    M include/exec/exec-all.h

  Log Message:
  -----------
  tcg: remove global exit_request

There are now only two uses of the global exit_request left.

The first ensures we exit the run_loop when we first start to process
pending work and in the kick handler. This is just as easily done by
setting the first_cpu->exit_request flag.

The second use is in the round robin kick routine. The global
exit_request ensured every vCPU would set its local exit_request and
cause a full exit of the loop. Now that the iothread lock isn't held
while running we can just rely on the kick handler to push us out as
intended.

We lightly re-factor the main vCPU thread to ensure cpu->exit_requests
cause us to exit the main loop and process any IO requests that might
come along. As a cpu->exit_request may legitimately get squashed
while processing the EXCP_INTERRUPT exception we also check
cpu->queued_work_first to ensure queued work is expedited as soon as
possible.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
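
A minimal sketch of the per-vCPU exit check described above, using C11
atomics. The struct and function names are hypothetical and far simpler
than QEMU's CPUState; only the field names exit_request and
queued_work_first are taken from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down vCPU state. */
struct vcpu {
    atomic_bool exit_request;   /* set by a kick, consumed by the vCPU */
    void *queued_work_first;    /* non-NULL while deferred work is pending */
};

/* Called by another thread (or a timer) to push the vCPU out of its loop. */
static void vcpu_kick(struct vcpu *cpu)
{
    assert(cpu != NULL);
    atomic_store(&cpu->exit_request, true);
}

/*
 * Decide whether the execution loop should return to the outer loop.
 * The exit request is consumed here. Queued work is checked as well,
 * because the flag may already have been cleared elsewhere, for example
 * while an interrupt was being handled.
 */
static bool vcpu_should_exit(struct vcpu *cpu)
{
    assert(cpu != NULL);
    bool requested = atomic_exchange(&cpu->exit_request, false);
    return requested || cpu->queued_work_first != NULL;
}
```

The second condition is the point of the sketch: even if a kick was
"squashed", pending work still forces an exit from the loop.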


  Commit: 2f1696066049c25f7f7d75352aa0cad3b0b1d87e
      
https://github.com/qemu/qemu/commit/2f1696066049c25f7f7d75352aa0cad3b0b1d87e
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M translate-all.c

  Log Message:
  -----------
  tcg: enable tb_lock() for SoftMMU

tb_lock() has long been used for linux-user mode to protect code
generation. By enabling it now we prepare for MTTCG and ensure all code
generation is serialised by this lock. The other major structure that
needs protecting is the l1_map and its PageDesc structures. For the
SoftMMU case we also use tb_lock() to protect these structures instead
of linux-user mmap_lock() which as the name suggests serialises updates
to the structure as a result of guest mmap operations.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 372579427a5040a26dfee78464b50e2bdf27ef26
      
https://github.com/qemu/qemu/commit/372579427a5040a26dfee78464b50e2bdf27ef26
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpu-exec.c
    M cpus.c

  Log Message:
  -----------
  tcg: enable thread-per-vCPU

There are a couple of changes that occur at the same time here:

  - introduce a single vCPU qemu_tcg_cpu_thread_fn

  One of these is spawned per vCPU with its own Thread and Condition
  variables. qemu_tcg_rr_cpu_thread_fn is the new name for the old
  single threaded function.

  - the TLS current_cpu variable is now live for the lifetime of MTTCG
    vCPU threads. This is for future work where async jobs need to know
    the vCPU context they are operating in.

The user can now switch on multi-thread behaviour and spawn a thread
per-vCPU. For example, a simple kvm-unit-tests test like:

  ./arm/run ./arm/locking-test.flat -smp 4 -accel tcg,thread=multi

will now use 4 vCPU threads and have an expected FAIL (instead of the
unexpected PASS) as the default mode of the test has no protection when
incrementing a shared variable.

We enable the parallel_cpus flag to ensure we generate correct barrier
and atomic code if supported by the front and backends. This doesn't
automatically enable MTTCG until default_mttcg_enabled() is updated to
check the configuration is supported.

Signed-off-by: KONRAD Frederic <address@hidden>
Signed-off-by: Paolo Bonzini <address@hidden>
[AJB: Some fixes, conditionally, commit rewording]
Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 08e73c48b053566bfe0c994f154f73991cd0ff0e
      
https://github.com/qemu/qemu/commit/08e73c48b053566bfe0c994f154f73991cd0ff0e
  Author: Pranith Kumar <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cpu-exec.c
    M cpus.c

  Log Message:
  -----------
  tcg: handle EXCP_ATOMIC exception for system emulation

The patch enables handling atomic code in the guest. This should
preferably be done in cpu_handle_exception(), but the current
assumptions regarding when we can execute atomic sections cause a
deadlock.

The current mechanism discards the flags which were set in atomic
execution. We ensure they are properly saved by calling the
cc->cpu_exec_enter/leave() functions around the loop.

As we are running cpu_exec_step_atomic() from the outermost loop we
need to avoid an abort() when single stepping over atomic code since
the debug exception longjmp will point to the sigsetjmp in
cpu_exec(). We do this by setting a new jmp_env so that it jumps back
here on an exception.

Signed-off-by: Pranith Kumar <address@hidden>
[AJB: tweak title, merge with new patches, add mmap_lock]
Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
CC: Paolo Bonzini <address@hidden>


  Commit: f0aff0f124028aaaab24e5e53fb030d389766913
      
https://github.com/qemu/qemu/commit/f0aff0f124028aaaab24e5e53fb030d389766913
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cputlb.c

  Log Message:
  -----------
  cputlb: add assert_cpu_is_self checks

For SoftMMU the TLB flushes are an example of a task that can be
triggered on one vCPU by another. To deal with this properly we need to
use safe work to ensure these changes are done safely. The new assert
can be enabled while debugging to catch these cases.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 857baec1d9e80947f0c1007c3a3d2331d62b4b53
      
https://github.com/qemu/qemu/commit/857baec1d9e80947f0c1007c3a3d2331d62b4b53
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cputlb.c

  Log Message:
  -----------
  cputlb: tweak qemu_ram_addr_from_host_nofail reporting

This moves the helper function closer to where it is called and updates
the error message to report via error_report instead of the deprecated
fprintf.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: e3b9ca810980851f93f5719a7df2044c9435f003
      
https://github.com/qemu/qemu/commit/e3b9ca810980851f93f5719a7df2044c9435f003
  Author: KONRAD Frederic <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cputlb.c
    M include/exec/exec-all.h
    M include/qom/cpu.h

  Log Message:
  -----------
  cputlb: introduce tlb_flush_* async work.

Some architectures allow flushing the TLB of other vCPUs. This is not a
problem when we have only one thread for all vCPUs, but it definitely
needs to be asynchronous work when we are in true multithreaded mode.

We take the tb_lock() when doing this to avoid racing with other threads
which may be invalidating TBs at the same time. The alternative would
be to use proper atomic primitives to clear the TLB entries en masse.

This patch doesn't do anything to protect other cputlb functions being
called in MTTCG mode making cross vCPU changes.

Signed-off-by: KONRAD Frederic <address@hidden>
[AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 0336cbf8532935d8e23c2aabf3e2ce2c0697b6ac
      
https://github.com/qemu/qemu/commit/0336cbf8532935d8e23c2aabf3e2ce2c0697b6ac
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cputlb.c
    M include/exec/exec-all.h
    M target/arm/helper.c
    M target/sparc/ldst_helper.c

  Log Message:
  -----------
  cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap

While the vargs approach was flexible the original MTTCG ended up
having to munge the bits into a bitmap so the data could be used in
deferred work helpers. Instead of hiding that in cputlb we push the
change to the API to make it take a bitmap of MMU indexes instead.

For ARM some of the resulting flushes end up being quite long so to aid
readability I've tended to move the index shifting to a new line so
all the bits being or-ed together line up nicely, for example:

    tlb_flush_page_by_mmuidx(other_cs, pageaddr,
                       (1 << ARMMMUIdx_S1SE1) |
                       (1 << ARMMMUIdx_S1SE0));

Signed-off-by: Alex Bennée <address@hidden>
[AT: SPARC parts only]
Reviewed-by: Artyom Tarasenko <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
[PM: ARM parts only]
Reviewed-by: Peter Maydell <address@hidden>
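
The "munging" that the old vargs interface required can be sketched as
follows. This is an illustration only: mmuidx_list_to_bitmap() is a
hypothetical helper, and the convention of ending the list with a
negative value is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/*
 * Fold a list of MMU indexes, terminated by a negative value, into a
 * bitmap with bit N set for index N. Hypothetical helper, not QEMU code.
 */
static uint16_t mmuidx_list_to_bitmap(int first, ...)
{
    uint16_t map = 0;
    va_list ap;

    va_start(ap, first);
    for (int idx = first; idx >= 0; idx = va_arg(ap, int)) {
        assert(idx < 16);               /* must fit the 16-bit bitmap */
        map |= (uint16_t)(1u << idx);
    }
    va_end(ap);
    return map;
}
```

A bitmap is a plain integer, so unlike a va_list it can be copied into
a deferred job and read later on another thread, which is why pushing
the bitmap into the API removes the conversion step entirely.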


  Commit: e72184455c2e479199823b617dbea0df6940e646
      
https://github.com/qemu/qemu/commit/e72184455c2e479199823b617dbea0df6940e646
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cputlb.c
    M include/qom/cpu.h

  Log Message:
  -----------
  cputlb: add tlb_flush_by_mmuidx async routines

This converts the remaining TLB flush routines to use async work when
detecting a cross-vCPU flush. The only minor complication is having to
serialise the va_list of MMU indexes into a form that can be punted
to an asynchronous job.

The pending_tlb_flush field on QOM's CPU structure also becomes a
bitfield rather than a boolean.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
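
One way a bitfield of pending flushes can avoid queueing duplicate work
is sketched below with C11 atomics. The struct and function names are
hypothetical; the sketch only shows the idea of tracking one pending
bit per MMU index, not the code in cputlb.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU record of MMU indexes with a flush queued. */
struct flush_state {
    _Atomic uint16_t pending;
};

/*
 * Mark the indexes in idxmap as pending and return the subset that was
 * not pending before, i.e. the bits for which new work must be queued.
 */
static uint16_t flush_request(struct flush_state *fs, uint16_t idxmap)
{
    assert(fs != NULL);
    uint16_t before = atomic_fetch_or(&fs->pending, idxmap);
    return (uint16_t)(idxmap & ~before);
}

/* Run by the target vCPU once it has flushed the indexes in idxmap. */
static void flush_complete(struct flush_state *fs, uint16_t idxmap)
{
    assert(fs != NULL);
    atomic_fetch_and(&fs->pending, (uint16_t)~idxmap);
}
```

With a single boolean, a request for index 2 would be dropped while a
flush of index 0 was still pending; with one bit per index only true
duplicates are skipped.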


  Commit: b0706b716769494f321a0d2bfd9fa9893992f995
      
https://github.com/qemu/qemu/commit/b0706b716769494f321a0d2bfd9fa9893992f995
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cputlb.c
    M include/exec/cputlb.h

  Log Message:
  -----------
  cputlb: atomically update tlb fields used by tlb_reset_dirty

The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flags
in TLB entries to force the slow-path on writes. This is used to mark
page ranges containing code which has been translated so it can be
invalidated if written to. To do this safely we need to ensure the TLB
entries in question for all vCPUs are updated before we attempt to run
the code otherwise a race could be introduced.

To achieve this we atomically set the flag in tlb_reset_dirty_range and
take care when setting it when the TLB entry is filled.

On 32 bit systems attempting to emulate 64 bit guests we don't even
bother as we might not have the atomic primitives available. MTTCG is
disabled in this case and can't be forced on. The copy_tlb_helper
function helps keep the atomic semantics in one place to avoid
confusion.

The dirty helper function is made static as it isn't used outside of
cputlb.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
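
The atomic flag update can be pictured as a compare-and-swap loop. The
sketch below is an assumption-laden illustration: the flag bits, the
4 KiB page mask, the struct layout and the function name are all
hypothetical and do not reproduce the real TLB entry format.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: low bits of the write address carry flags. */
#define ENTRY_NOTDIRTY  ((uintptr_t)1 << 0)
#define ENTRY_INVALID   ((uintptr_t)1 << 1)
#define ENTRY_PAGE_MASK (~(uintptr_t)0xfff)

struct tlb_entry {
    _Atomic uintptr_t addr_write;
};

/*
 * If the entry is valid and its page lies in [start, start + length),
 * set the NOTDIRTY flag with a compare-and-swap, so that a concurrent
 * refill of the entry is never overwritten with a stale value.
 * Returns true if the flag was set on a matching entry.
 */
static bool entry_mark_notdirty(struct tlb_entry *e, uintptr_t start,
                                uintptr_t length)
{
    assert(e != NULL);
    uintptr_t old = atomic_load(&e->addr_write);

    for (;;) {
        uintptr_t page = old & ENTRY_PAGE_MASK;
        /* Unsigned wrap-around makes this one test cover both bounds. */
        if ((old & ENTRY_INVALID) || page - start >= length) {
            return false;
        }
        if (atomic_compare_exchange_weak(&e->addr_write, &old,
                                         old | ENTRY_NOTDIRTY)) {
            return true;
        }
        /* old now holds the current value: re-check and retry. */
    }
}
```

The sketch uses a pointer-sized field so that the compare-and-swap is
always available; the 32-bit host with 64-bit guest case mentioned above
is exactly the situation where such a primitive may be missing.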


  Commit: c3b9a07a33de8015726b397270485c3998e7f86a
      
https://github.com/qemu/qemu/commit/c3b9a07a33de8015726b397270485c3998e7f86a
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M cputlb.c
    M include/exec/exec-all.h

  Log Message:
  -----------
  cputlb: introduce tlb_flush_*_all_cpus[_synced]

This introduces support to the cputlb API for flushing all CPUs' TLBs
with one call. This avoids the need for target helpers to iterate
through the vCPUs themselves.

An additional variant of the API (_synced) will cause the source vCPU's
work to be scheduled as "safe work". As a result, all the flush
operations will be complete by the time the originating vCPU executes
its safe work. The calling implementation can either end the TB
straight away (which will then pick up the cpu->exit_request on
entering the next block) or defer the exit until the architectural
sync point (usually a barrier instruction).

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>


  Commit: 062ba099e01ff1474be98c0a4f3da351efab5d9d
      
https://github.com/qemu/qemu/commit/062ba099e01ff1474be98c0a4f3da351efab5d9d
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M target/arm/arm-powerctl.c
    M target/arm/arm-powerctl.h
    M target/arm/cpu.c
    M target/arm/cpu.h
    M target/arm/kvm.c
    M target/arm/machine.c
    M target/arm/psci.c

  Log Message:
  -----------
  target-arm/powerctl: defer cpu reset work to CPU context

When switching a new vCPU on we want to complete a bunch of the setup
work before we start scheduling the vCPU thread. To do this cleanly we
defer vCPU setup to async work which will run in the vCPU's execution
context as the thread is woken up. The scheduling of the work will kick
the vCPU awake.

This avoids potential races in MTTCG system emulation.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Reviewed-by: Peter Maydell <address@hidden>


  Commit: c22edfebff29f63d793032e4fbd42a035bb73e27
      
https://github.com/qemu/qemu/commit/c22edfebff29f63d793032e4fbd42a035bb73e27
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M target/arm/op_helper.c
    M target/arm/translate-a64.c
    M target/arm/translate.c

  Log Message:
  -----------
  target-arm: don't generate WFE/YIELD calls for MTTCG

The WFE and YIELD instructions are really only hints and in TCG's case
they were useful to move the scheduling on from one vCPU to the next. In
the parallel context (MTTCG) this just causes an unnecessary cpu_exit
and contention of the BQL.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Reviewed-by: Peter Maydell <address@hidden>


  Commit: a67cf2772733e0ff40ed14cfed9e177b050c22a7
      
https://github.com/qemu/qemu/commit/a67cf2772733e0ff40ed14cfed9e177b050c22a7
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M target/arm/helper.c

  Log Message:
  -----------
  target-arm: ensure all cross vCPUs TLB flushes complete

Previously flushes on other vCPUs would only get serviced when they
exited their TranslationBlocks. While this isn't overly problematic it
violates the semantics of TLB flush from the point of view of the
source vCPU.

To solve this we call the cputlb *_all_cpus_synced() functions to do
the flushes which ensures all flushes are completed by the time the
vCPU next schedules its own work. As the TLB instructions are modelled
as CP writes the TB ends at this point meaning cpu->exit_request will
be checked before the next instruction is executed.

Deferring the work until the architectural sync point is a possible
future optimisation.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Reviewed-by: Peter Maydell <address@hidden>


  Commit: 4881658a4bf6dc5335e5033d0916b2e86687463d
      
https://github.com/qemu/qemu/commit/4881658a4bf6dc5335e5033d0916b2e86687463d
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M hw/misc/imx6_src.c

  Log Message:
  -----------
  hw/misc/imx6_src: defer clearing of SRC_SCR reset bits

The arm_reset_cpu/set_cpu_on/set_cpu_off() functions do their work
asynchronously in the target vCPU's context. As a result we need to
ensure the SRC_SCR reset bits correctly report the reset status at the
right time. To do this we defer the clearing of the bit with an async
job which will run after the work queued by ARM powerctl functions.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Peter Maydell <address@hidden>
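
The ordering argument above relies on deferred jobs for one vCPU running
in the order they were queued. A minimal single-threaded sketch of such
a FIFO follows. All names are hypothetical, there is no locking, and the
two demo jobs merely stand in for "reset the CPU" and "clear the reset
status bit".

```c
#include <assert.h>
#include <stddef.h>

typedef void (*work_fn)(void *opaque);

/* Hypothetical fixed-size FIFO of deferred jobs for one vCPU. */
#define WORK_MAX 8
struct work_queue {
    struct { work_fn fn; void *opaque; } item[WORK_MAX];
    size_t head, count;
};

/* Queue a job; returns 0 on success, -1 if the queue is full. */
static int work_defer(struct work_queue *q, work_fn fn, void *opaque)
{
    assert(q != NULL && fn != NULL);
    if (q->count == WORK_MAX) {
        return -1;
    }
    size_t tail = (q->head + q->count) % WORK_MAX;
    q->item[tail].fn = fn;
    q->item[tail].opaque = opaque;
    q->count++;
    return 0;
}

/* Run by the owning vCPU: drain all jobs in the order they were queued. */
static void work_drain(struct work_queue *q)
{
    assert(q != NULL);
    while (q->count > 0) {
        work_fn fn = q->item[q->head].fn;
        void *opaque = q->item[q->head].opaque;
        q->head = (q->head + 1) % WORK_MAX;
        q->count--;
        fn(opaque);
    }
}

/* Demo state: the status bit must only clear after the reset has run. */
struct demo_state { int reset_done; int reset_bit; int cleared_in_order; };

static void demo_reset_cpu(void *opaque)
{
    struct demo_state *s = opaque;
    s->reset_done = 1;
}

static void demo_clear_reset_bit(void *opaque)
{
    struct demo_state *s = opaque;
    s->cleared_in_order = s->reset_done;    /* 1 only if reset ran first */
    s->reset_bit = 0;
}
```

Queueing demo_reset_cpu and then demo_clear_reset_bit leaves the bit set
until the queue is drained, at which point the reset runs first and the
bit is cleared afterwards.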


  Commit: ca759f9e387db87e1719911f019bc60c74be9ed8
      
https://github.com/qemu/qemu/commit/ca759f9e387db87e1719911f019bc60c74be9ed8
  Author: Alex Bennée <address@hidden>
  Date:   2017-02-24 (Fri, 24 Feb 2017)

  Changed paths:
    M configure
    M target/arm/cpu.h
    M tcg/i386/tcg-target.h

  Log Message:
  -----------
  tcg: enable MTTCG by default for ARM on x86 hosts

This enables the multi-threaded system emulation by default for ARMv7
and ARMv8 guests using the x86_64 TCG backend. This is because on the
guest side:

  - The ARM translate.c/translate-a64.c have been converted to
    - use MTTCG safe atomic primitives
    - emit the appropriate barrier ops
  - The ARM machine has been updated to
    - hold the BQL when modifying shared cross-vCPU state
    - defer powerctl changes to async safe work

All the host backends support the barrier and atomic primitives but
need to provide same-or-better support for normal load/store
operations.

Signed-off-by: Alex Bennée <address@hidden>
Reviewed-by: Richard Henderson <address@hidden>
Acked-by: Peter Maydell <address@hidden>
Tested-by: Pranith Kumar <address@hidden>
Reviewed-by: Pranith Kumar <address@hidden>
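
The "same-or-better" requirement can be read as a subset test on
ordering bits. The sketch below uses illustrative names and values
chosen for this example; they are assumptions, not a copy of the
definitions in tcg/tcg-mo.h.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative ordering bits: each bit says "an earlier X is ordered
 * before a later Y". Names and values are assumptions for this sketch.
 */
enum mem_order {
    MO_LD_LD = 0x01,    /* load before load */
    MO_ST_LD = 0x02,    /* store before load */
    MO_LD_ST = 0x04,    /* load before store */
    MO_ST_ST = 0x08,    /* store before store */
    MO_ALL   = 0x0f
};

/*
 * The host is "same or better" when it guarantees every ordering the
 * guest expects, i.e. the guest set is a subset of the host set.
 */
static bool host_order_covers_guest(unsigned guest_mo, unsigned host_mo)
{
    assert((guest_mo & ~(unsigned)MO_ALL) == 0);
    assert((host_mo & ~(unsigned)MO_ALL) == 0);
    return (guest_mo & ~host_mo) == 0;
}
```

A guest that expects no implicit ordering is covered by any host, while
a guest that expects every ordering is not covered by a host that lets
stores be reordered after later loads.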


  Commit: 28f997a82cb509bf4775d4006b368e1bde8b7bdd
      
https://github.com/qemu/qemu/commit/28f997a82cb509bf4775d4006b368e1bde8b7bdd
  Author: Peter Maydell <address@hidden>
  Date:   2017-02-25 (Sat, 25 Feb 2017)

  Changed paths:
    M configure
    M cpu-exec-common.c
    M cpu-exec.c
    M cpus.c
    M cputlb.c
    A docs/multi-thread-tcg.txt
    M exec.c
    M hw/core/irq.c
    M hw/i386/kvmvapic.c
    M hw/intc/arm_gicv3_cpuif.c
    M hw/misc/imx6_src.c
    M hw/ppc/ppc.c
    M hw/ppc/spapr.c
    M include/exec/cputlb.h
    M include/exec/exec-all.h
    M include/qom/cpu.h
    M include/sysemu/cpus.h
    M memory.c
    M qemu-options.hx
    M qom/cpu.c
    M target/arm/arm-powerctl.c
    M target/arm/arm-powerctl.h
    M target/arm/cpu.c
    M target/arm/cpu.h
    M target/arm/helper.c
    M target/arm/kvm.c
    M target/arm/machine.c
    M target/arm/op_helper.c
    M target/arm/psci.c
    M target/arm/translate-a64.c
    M target/arm/translate.c
    M target/i386/smm_helper.c
    M target/s390x/misc_helper.c
    M target/sparc/ldst_helper.c
    M tcg/i386/tcg-target.h
    A tcg/tcg-mo.h
    M tcg/tcg.h
    M translate-all.c
    M translate-common.c
    M vl.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-240217-1' into staging

This is the MTTCG pull-request as posted yesterday.

# gpg: Signature made Fri 24 Feb 2017 11:17:51 GMT
# gpg:                using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <address@hidden>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-mttcg-240217-1: (24 commits)
  tcg: enable MTTCG by default for ARM on x86 hosts
  hw/misc/imx6_src: defer clearing of SRC_SCR reset bits
  target-arm: ensure all cross vCPUs TLB flushes complete
  target-arm: don't generate WFE/YIELD calls for MTTCG
  target-arm/powerctl: defer cpu reset work to CPU context
  cputlb: introduce tlb_flush_*_all_cpus[_synced]
  cputlb: atomically update tlb fields used by tlb_reset_dirty
  cputlb: add tlb_flush_by_mmuidx async routines
  cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap
  cputlb: introduce tlb_flush_* async work.
  cputlb: tweak qemu_ram_addr_from_host_nofail reporting
  cputlb: add assert_cpu_is_self checks
  tcg: handle EXCP_ATOMIC exception for system emulation
  tcg: enable thread-per-vCPU
  tcg: enable tb_lock() for SoftMMU
  tcg: remove global exit_request
  tcg: drop global lock during TCG code execution
  tcg: rename tcg_current_cpu to tcg_current_rr_cpu
  tcg: add kick timer for single-threaded vCPU emulation
  tcg: add options for enabling MTTCG
  ...

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/2421f381dc38...28f997a82cb5