
Re: [PATCH RFC v2 04/16] vfio-user: connect vfio proxy to remote server


From: John Johnson
Subject: Re: [PATCH RFC v2 04/16] vfio-user: connect vfio proxy to remote server
Date: Wed, 15 Sep 2021 00:21:10 +0000


> On Sep 14, 2021, at 6:06 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Mon, Sep 13, 2021 at 05:23:33PM +0000, John Johnson wrote:
>>>> On Sep 9, 2021, at 10:25 PM, John Johnson <john.g.johnson@oracle.com> wrote:
>>>>> On Sep 8, 2021, at 11:29 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>>>> On Thu, Sep 09, 2021 at 05:11:49AM +0000, John Johnson wrote:
>>>>>>  I did look at coroutines, but they seemed to work when the sender
>>>>>> triggers the coroutine on send, not when request packets arrive
>>>>>> asynchronously to the sends.
>>>>> 
>>>>> This can be done with a receiver coroutine. Its job is to be the only
>>>>> thing that reads vfio-user messages from the socket. A receiver
>>>>> coroutine reads messages from the socket and wakes up the waiting
>>>>> coroutine that yielded from vfio_user_send_recv() or
>>>>> vfio_user_pci_process_req().
>>>>> 
>>>>> (Although vfio_user_pci_process_req() could be called directly from the
>>>>> receiver coroutine, it seems safer to have a separate coroutine that
>>>>> processes requests so that the receiver isn't blocked in case
>>>>> vfio_user_pci_process_req() yields while processing a request.)
>>>>> 
>>>>> Going back to what you mentioned above, the receiver coroutine does
>>>>> something like this:
>>>>> 
>>>>> if it's a reply
>>>>>    reply = find_reply(...)
>>>>>    qemu_coroutine_enter(reply->co) // instead of signalling reply->cv
>>>>> else
>>>>>    QSIMPLEQ_INSERT_TAIL(&pending_reqs, request, next);
>>>>>    if (pending_reqs_was_empty) {
>>>>>        qemu_coroutine_enter(process_request_co);
>>>>>    }
>>>>> 
>>>>> The pending_reqs queue holds incoming requests that the
>>>>> process_request_co coroutine processes.
>>>>> 
>>>> 
>>>> 
>>>>    How do coroutines work across threads?  There can be multiple vCPU
>>>> threads waiting for replies, and I think the receiver coroutine will be
>>>> running in the main loop thread.  Where would a vCPU block waiting for
>>>> a reply?  I think coroutine_yield() returns to its coroutine_enter() caller.
>>> 
>>> 
>>> 
>>> A vCPU thread holding the BQL can iterate the event loop if it has
>>> reached a synchronous point that needs to wait for a reply before
>>> returning. I think we have this situation when a MemoryRegion is
>>> accessed on the proxy device.
>>> 
>>> For example, block/block-backend.c:blk_prw() kicks off a coroutine and
>>> then runs the event loop until the coroutine finishes:
>>> 
>>>  Coroutine *co = qemu_coroutine_create(co_entry, &rwco);
>>>  bdrv_coroutine_enter(blk_bs(blk), co);
>>>  BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE);
>>> 
>>> BDRV_POLL_WHILE() boils down to a loop like this:
>>> 
>>>  while ((cond)) {
>>>    aio_poll(ctx, true);
>>>  }
>>> 
>> 
>>      I think that would make vCPUs sending requests and the
>> receiver coroutine all poll on the same socket.  If the “wrong”
>> routine reads the message, I’d need a second level of synchronization
>> to pass the message to the “right” one.  e.g., if the vCPU coroutine
>> reads a request, it needs to pass it to the receiver; if the receiver
>> coroutine reads a reply, it needs to pass it to a vCPU.
>> 
>> 	Avoiding this complexity is one of the reasons I went with
>> a separate thread that only reads the socket, rather than the mp-qemu
>> model, which does have the sender poll but doesn’t need to handle
>> incoming requests.
> 
> Only one coroutine reads from the socket, the "receiver" coroutine. In a
> previous reply I sketched what the receiver does:
> 
>  if it's a reply
>      reply = find_reply(...)
>      qemu_coroutine_enter(reply->co) // instead of signalling reply->cv
>  else
>      QSIMPLEQ_INSERT_TAIL(&pending_reqs, request, next);
>      if (pending_reqs_was_empty) {
>          qemu_coroutine_enter(process_request_co);
>      }
> 

	Sorry, I assumed that when you said the coroutine would block in
aio_poll(), you implied it would also read messages from the socket.
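	Just so we're aligned, here is how I picture the receiver in C
terms.  This is only a sketch: vfio_user_recv_one(), find_reply(), the
VFIO_USER_REPLY flag, and the message/reply fields are placeholder names,
not code from the patches.

static void coroutine_fn vfio_user_receiver_co(void *opaque)
{
    VFIOProxy *proxy = opaque;

    for (;;) {
        /* yields back to the event loop until a full message arrives */
        VFIOUserMsg *msg = vfio_user_recv_one(proxy);

        if (msg->hdr.flags & VFIO_USER_REPLY) {
            VFIOUserReply *reply = find_reply(proxy, msg->hdr.id);

            reply->msg = msg;
            /* wake the vCPU coroutine waiting in vfio_user_send_recv() */
            qemu_coroutine_enter(reply->co);
        } else {
            bool was_empty = QSIMPLEQ_EMPTY(&proxy->pending_reqs);

            QSIMPLEQ_INSERT_TAIL(&proxy->pending_reqs, msg, next);
            if (was_empty) {
                qemu_coroutine_enter(proxy->process_request_co);
            }
        }
    }
}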

> The qemu_coroutine_enter(reply->co) call re-enters the coroutine that
> was created by the vCPU thread. Is this the "second level of
> synchronization" that you described? It's very similar to signalling
> reply->cv in the existing patch.
> 

	Yes, the only difference is that it would be woken on each message
even though it doesn’t read them, which is what I think you’re addressing
below.
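	In other words, the vCPU-side wait would look something like this
sketch (reply->complete and proxy->ctx are stand-in names, not patch code):

    /* every incoming message wakes this aio_poll(), even when the
     * message is a request or another vCPU's reply */
    while (!reply->complete) {
        aio_poll(proxy->ctx, true);
    }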


> Now I'm actually thinking about whether this can be improved by keeping
> the condvar so that the vCPU thread doesn't need to call aio_poll()
> (which is awkward because it doesn't drop the BQL and therefore blocks
> other vCPUs from making progress). That approach wouldn't require a
> dedicated thread for vfio-user.
> 

	Wouldn’t you need to acquire the BQL twice for every vCPU reply: once
to run the receiver coroutine, and once when the vCPU thread wakes up and
wants to return to the VFIO code?  The migration thread would also add a BQL
dependency where it didn’t have one before.

	Is your objection to using an iothread, or to using a separate thread
at all?  I can change to using qemu_thread_create() and running aio_poll()
from the thread routine, instead of creating an iothread.
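	Something like this sketch, where proxy->ctx, proxy->running, and
proxy->thread are illustrative fields, not names from the patch series:

static void *vfio_user_io_thread(void *opaque)
{
    VFIOProxy *proxy = opaque;

    /*
     * Blocking poll on the proxy's private AioContext; the socket read
     * handler runs here instead of in an IOThread object.
     */
    while (qatomic_read(&proxy->running)) {
        aio_poll(proxy->ctx, true);
    }
    return NULL;
}

/* at connect time, instead of creating an iothread: */
proxy->ctx = aio_context_new(&error_abort);
qatomic_set(&proxy->running, true);
qemu_thread_create(&proxy->thread, "vfio-user", vfio_user_io_thread,
                   proxy, QEMU_THREAD_JOINABLE);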


        On a related subject:

On Aug 24, 2021, at 8:14 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:

>> +    ret = qio_channel_readv_full(proxy->ioc, &iov, 1, &fdp, &numfds,
>> +                                 &local_err);
> 
> This is a blocking call. My understanding is that the IOThread is shared
> by all vfio-user devices, so other devices will have to wait if one of
> them is acting up (e.g. the device emulation process sent less than
> sizeof(msg) bytes).


	This shouldn’t block if the emulation process sends less than
sizeof(msg) bytes.  qio_channel_readv() will eventually call recvmsg(), which
only blocks on a short read if MSG_WAITALL is set, and it’s not set in this
case.  recvmsg() will return whatever data is available, and vfio_user_recv()
will treat a short read as an error.
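	To illustrate the semantics I'm relying on (not patch code; sockfd
and VfioUserHdr are stand-ins):

VfioUserHdr msg;     /* stand-in for the fixed-size message header */
struct iovec iov = { .iov_base = &msg, .iov_len = sizeof(msg) };
struct msghdr mh = { .msg_iov = &iov, .msg_iovlen = 1 };

/* flags == 0: no MSG_WAITALL, so recvmsg() returns as soon as any
 * data is available, even if it is less than sizeof(msg) */
ssize_t n = recvmsg(sockfd, &mh, 0);

if (n >= 0 && (size_t)n < sizeof(msg)) {
    /* short read: recvmsg() returned the bytes on hand rather than
     * blocking; vfio_user_recv() treats this as an error */
}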

                                                                JJ

