[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-commits] [qemu/qemu] fb0c43: tests-aio-multithread: fix /aio/multi

From: GitHub
Subject: [Qemu-commits] [qemu/qemu] fb0c43: tests-aio-multithread: fix /aio/multi/schedule rac...
Date: Mon, 13 Nov 2017 01:47:50 -0800

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: fb0c43f34eed8b18678c6e1f481d8564b35c99ed
  Author: Stefan Hajnoczi <address@hidden>
  Date:   2017-11-08 (Wed, 08 Nov 2017)

  Changed paths:
    M tests/test-aio-multithread.c

  Log Message:
  tests-aio-multithread: fix /aio/multi/schedule race condition

test_multi_co_schedule_entry() set to_schedule[id] in the final loop
iteration before terminating the coroutine.  There is a race condition
where the main thread attempts to enter the terminating or terminated
coroutine when signalling coroutines to stop:

  atomic_mb_set(&now_stopping, true);
  for (i = 0; i < NUM_CONTEXTS; i++) {
      ctx_run(i, finish_cb, NULL);  <--- enters dead coroutine!
      to_schedule[i] = NULL;

Make sure only to set to_schedule[id] if this coroutine really needs to
be scheduled!

Reported-by: "R.Nageswara Sastry" <address@hidden>
Signed-off-by: Stefan Hajnoczi <address@hidden>
Reviewed-by: Paolo Bonzini <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>

  Commit: ef6dada8b44e1e7c4bec5c1115903af9af415b50
  Author: Sergio Lopez <address@hidden>
  Date:   2017-11-08 (Wed, 08 Nov 2017)

  Changed paths:
    M util/async.c

  Log Message:
  util/async: use atomic_mb_set in qemu_bh_cancel

Commit b7a745d added a qemu_bh_cancel call to the completion function
as an optimization to prevent it from unnecessarily rescheduling itself.

This completion function is scheduled from worker_thread, after setting
the state of a ThreadPoolElement to THREAD_DONE.

This was considered to be safe, as the completion function restarts the
loop just after the call to qemu_bh_cancel. But, as this loop lacks a HW
memory barrier, the read of req->state may actually happen _before_ the
call, seeing it still as THREAD_QUEUED, and ending the completion
function without having processed a pending TPE linked at pool->head:
    worker thread             |            I/O thread
                             | speculatively read req->state
req->state = THREAD_DONE;          |
qemu_bh_schedule(p->completion_bh) |
  bh->scheduled = 1;               |
                             | qemu_bh_cancel(p->completion_bh)
                             |   bh->scheduled = 0;
                             | if (req->state == THREAD_DONE)
                             |   // sees THREAD_QUEUED

The source of the misunderstanding was that qemu_bh_cancel is now being
used by the _consumer_ rather than the producer, and therefore now needs
to have acquire semantics just like e.g. aio_bh_poll.

In some situations, if there are no other independent requests in the
same aio context that could eventually trigger the scheduling of the
completion function, the omitted TPE and all operations pending on it
will get stuck forever.

[Added Sergio's updated wording about the HW memory barrier.

Signed-off-by: Sergio Lopez <address@hidden>
Message-id: address@hidden
Signed-off-by: Stefan Hajnoczi <address@hidden>

  Commit: 53fb28d10da4eb41b77139f3458be08fb6f658e4
  Author: Peter Maydell <address@hidden>
  Date:   2017-11-10 (Fri, 10 Nov 2017)

  Changed paths:
    M tests/test-aio-multithread.c
    M util/async.c

  Log Message:
  Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into 

Pull request

 * v1 emails 2/3 and 3/3 weren't sent due to an email failure
 * Included Sergio's updated wording in the commit description

# gpg: Signature made Wed 08 Nov 2017 19:12:01 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <address@hidden>"
# gpg:                 aka "Stefan Hajnoczi <address@hidden>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  util/async: use atomic_mb_set in qemu_bh_cancel
  tests-aio-multithread: fix /aio/multi/schedule race condition

Signed-off-by: Peter Maydell <address@hidden>

Compare: https://github.com/qemu/qemu/compare/4ffa88c99c54...53fb28d10da4

reply via email to

[Prev in Thread] Current Thread [Next in Thread]