From: Paolo Bonzini
Subject: Re: [PATCH 5/8] util/async: add smp_mb__after_rmw() around BH enqueue/dequeue
Date: Mon, 6 Mar 2023 10:55:52 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1
On 3/5/23 20:32, Richard Henderson wrote:
> On 3/3/23 09:19, Paolo Bonzini wrote:
>>       unsigned old_flags;
>>
>>       /*
>>        * The memory barrier makes sure that:
>>        * 1. any writes needed by the callback are visible from the callback
>>        *    after aio_bh_dequeue() returns bh.
>>        * 2. ctx is loaded before the callback has a chance to execute and bh
>>        *    could be freed.
>>        */
>>       old_flags = qatomic_fetch_or(&bh->flags, BH_PENDING | new_flags);
>> +    smp_mb__after_rmw();
>> +
>>       if (!(old_flags & BH_PENDING)) {
>>           QSLIST_INSERT_HEAD_ATOMIC(&ctx->bh_list, bh, next);
>>       }
>> @@ -107,14 +109,15 @@ static QEMUBH *aio_bh_dequeue(BHList *head, unsigned *flags)
>>       QSLIST_REMOVE_HEAD(head, next);
>>
>>       /*
>>        * The memory barrier is paired with aio_bh_enqueue() and it
>>        * ensures that the callback sees all writes done by the scheduling
>>        * thread. It also ensures that the scheduling thread sees the cleared
>>        * flag before bh->cb has run, and thus will call aio_notify again if
>>        * necessary.
>>        */
>
> Is it really paired with aio_bh_enqueue?
>
> Seems like the enqueue barrier really is for aio_bh_poll, and the dequeue
> barrier is for the callback.
The documentation has been quite obsolete since the introduction of bh_list.
The logic is as follows:

    aio_bh_enqueue()
      load bh->ctx
      set PENDING flag                          // (0)
      if flag was not set
        // bh becomes visible to dequeuing thread here:
        insert into bh_list                     // (1)
      aio_notify
        // Write bh->flags and bh_list before ctx->notified
        smp_wmb()                               // (2)
        set notified to true
        // Write notified before reading notify_me
        smp_mb()                                // (3)
        if notify_me then event_notifier_set()

and on the dequeue side it's tricky to see why all barriers are needed; it
boils down to the busy-wait polling done at the beginning of aio_poll():

    aio_poll()
      compute approximate timeout (*unordered* read of bh_list)
      enable notify_me
      // Write notify_me before reading notified
      smp_mb()                                  // synchronizes with (3)
      if notified then timeout = 0
      ctx->fdmon_ops->wait(timeout)
      disable notify_me
      aio_notify_accept()
        set notified to false
        // Write ctx->notified before reading e.g. bh_list
        smp_mb()                                // synchronizes with (2)
      aio_bh_poll()
        QSLIST_MOVE_ATOMIC                      // synchronizes with (1)
        aio_bh_dequeue
          remove from head
          clear PENDING/SCHEDULED/IDLE          // synchronizes with (0)
          if flags were set
            aio_bh_call

That is:

for synchronization point (0)

- the main function here is to ensure that aio_bh_dequeue() removes from the
  list before the PENDING flag is cleared, otherwise enqueuers can clobber the
  list, and vice versa for the enqueuers.  For this, it is enough that
  qatomic_fetch_and() is "release" and qatomic_fetch_or() is "acquire" (and
  they are).

for synchronization point (1)

- the bottom half machinery between aio_bh_enqueue and aio_bh_poll only needs
  release-to-acquire synchronization, and that is provided by (1).  This is
  also where ordering ensures that bh->ctx is loaded before the callback
  executes (which could free bh).
- between enqueuers, setting the PENDING flag only needs to be done before
  inserting into bh_list to avoid double insertion (and of course needs to be
  done atomically); for this purpose, QSLIST_INSERT_HEAD_ATOMIC needs to be
  "release" (which it is).  Also the call to aio_notify() in aio_bh_enqueue()
  is unconditional, so aio_bh_dequeue() need not protect against missed calls
  to aio_notify().  aio_notify() only synchronizes with aio_poll() and
  aio_notify_accept().

for synchronization point (2)

- This cannot be just a smp_rmb() because the timeout that was computed
  earlier was not ordered against either notified or notify_me.

- This is a smp_mb() for generality; as far as bh_list is concerned it could
  be smp_mb__before_rmw().

Synchronization point (3) is pretty mundane in comparison.

So I'm dropping this patch; I have prepared a documentation patch instead.

Paolo