From: Peter Xu
Subject: Re: [RFC PATCH v2 1/6] migration/multifd: Remove channels_ready semaphore
Date: Thu, 19 Oct 2023 11:46:25 -0400

On Thu, Oct 19, 2023 at 05:00:02PM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > Fabiano,
> >
> > Sorry for looking at this series so late; I messed up my inbox after I
> > reworked how I organize my emails. ;)
> >
> > On Thu, Oct 19, 2023 at 11:06:06AM +0200, Juan Quintela wrote:
> >> Fabiano Rosas <farosas@suse.de> wrote:
> >> > The channels_ready semaphore is a global variable not linked to any
> >> > single multifd channel. Waiting on it only means that "some" channel
> >> > has become ready to send data. Since we need to address the channels
> >> > by index (multifd_send_state->params[i]), that information adds
> >> > nothing of value.
> >> 
> >> NAK.
> >> 
> >> I disagree here O:-)
> >> 
> >> The reason why channels_ready exists is multifd_send_pages().
> >> 
> >> And, simplifying the function, what it does is:
> >> 
> >> sem_wait(channels_ready);
> >> 
> >> for_each_channel()
> >>    look if it is empty()
> >> 
> >> But with the semaphore, we guarantee that when we enter the loop, there
> >> is a channel ready, so we know we don't busy-wait searching for a
> >> channel that is free.
> >> 
> >> Notice that I fully agree that the sem is not needed for locking.
> >> Locking is done with the mutex.  It is just used to make sure that we
> >> don't busy loop on that loop.
> >> 
> >> And we use a sem because it is the easiest way to know how many
> >> channels are ready (even when we only care whether there is at least
> >> one when we arrive at that code).
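> >> 
> >> To make that concrete, the shape of multifd_send_pages() is roughly the
> >> following (just a sketch of the pattern, not the exact code in the tree):
> >> 
> >> static int multifd_send_pages(QEMUFile *f)
> >> {
> >>     /* Some channel has posted, so the scan below is guaranteed a hit. */
> >>     qemu_sem_wait(&multifd_send_state->channels_ready);
> >> 
> >>     for (int i = 0; i < migrate_multifd_channels(); i++) {
> >>         MultiFDSendParams *p = &multifd_send_state->params[i];
> >> 
> >>         /* The mutex does the locking; the sem only avoids spinning. */
> >>         qemu_mutex_lock(&p->mutex);
> >>         if (!p->pending_job) {
> >>             /* free channel: queue the pages on it (elided) */
> >>             qemu_mutex_unlock(&p->mutex);
> >>             break;
> >>         }
> >>         qemu_mutex_unlock(&p->mutex);
> >>     }
> >>     ...
> >> }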
> >> 
> >> We had lost track of that counter, and we fixed it here:
> >> 
> >> commit d2026ee117147893f8d80f060cede6d872ecbd7f
> >> Author: Juan Quintela <quintela@redhat.com>
> >> Date:   Wed Apr 26 12:20:36 2023 +0200
> >> 
> >>     multifd: Fix the number of channels ready
> >> 
> >>     We don't wait in the sem when we are doing a sync_main.  Make it
> >> 
> >> And there we were addressing the problem that some users were hitting:
> >> we were busy-waiting in that loop.
> >
> > Juan,
> >
> > I can understand why send_pages needs that sem, but not why sync main does.
> > IOW, why does multifd_send_sync_main() need:
> >
> >         qemu_sem_wait(&multifd_send_state->channels_ready);
> >
> > If it has:
> >
> >         qemu_sem_wait(&p->sem_sync);
> >
> > How does a busy loop happen?
> 
> What does multifd_send_thread() do for a SYNC packet?
> 
> static void *multifd_send_thread(void *opaque)
> {
>     while (true) {
>         qemu_sem_post(&multifd_send_state->channels_ready);
>         qemu_sem_wait(&p->sem);
> 
>         qemu_mutex_lock(&p->mutex);
> 
>         if (p->pending_job) {
>             ....
>             qemu_mutex_unlock(&p->mutex);
> 
>             if (flags & MULTIFD_FLAG_SYNC) {
>                 qemu_sem_post(&p->sem_sync);
>             }
>         }
>     }
> }
> 
> I have simplified it a lot, but you get the idea.
> 
> See the first post of channels_ready at the top of the loop.
> We do it for every packet sent, even for the SYNC ones.
> 
> Now, what does multifd_send_pages() do?
> 
> static int multifd_send_pages(QEMUFile *f)
> {
>     qemu_sem_wait(&multifd_send_state->channels_ready);
>     ....
> }
> 
> See, we are decreasing the channels_ready count because we know we are
> going to use one channel.
> 
> As we also send packets for multifd_send_sync_main(), we either need a
> hack in multifd_send_thread() saying that SYNC packets don't count, or we
> need to decrement that semaphore in multifd_send_sync_main():
> 
> int multifd_send_sync_main(QEMUFile *f)
> {
>     ....
>     for (i = 0; i < migrate_multifd_channels(); i++) {
>         qemu_sem_wait(&multifd_send_state->channels_ready);
>         ...
>     }
> }
> 
> And that is what we do here.
> We didn't have this last wait before (it is not needed to make sure the
> channels are ready here).
> 
> But it is needed to make sure we keep channels_ready exact.
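> 
> To put the whole accounting in one place (my summary, not code from the
> tree):
> 
>     /*
>      * multifd_send_thread() posts channels_ready once per packet,
>      * including the SYNC packets.  Each post is consumed exactly once:
>      *
>      *   data packet -> qemu_sem_wait() in multifd_send_pages()
>      *   SYNC packet -> qemu_sem_wait() in multifd_send_sync_main()
>      *                  (one per channel, in the loop above)
>      *
>      * so the semaphore count keeps tracking how many channels are ready.
>      */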

I didn't expect it to be exact; I think that's the major source of the confusion.
For example, I see this comment:

static void *multifd_send_thread(void *opaque)
       ...
        } else {
            qemu_mutex_unlock(&p->mutex);
            /* sometimes there are spurious wakeups */
        }

So do we have spurious wakeups anywhere, for either p->sem or channels_ready?
They are related, because if we get a spurious p->sem wakeup, then we'll also
post channels_ready one extra time there.

I think two ways to go here:

  - If we want to make them all exact: we figure out where the spurious
    wakeups come from and fix all of them.  Or,

  - IMHO we can also make them not exact.  That means spurious wakeups are
    allowed, and the code can still work correctly with them.  For example,
    what happens if we get a spurious channels_ready wakeup in
    multifd_send_pages()?  We simply burn a few cpu cycles, as long as we
    double-check each channel again; we can even do better: if we loop over
    the N channels and they are all busy, "goto retry" and wait on the sem
    again (a sketch of that is below).
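
Just to make that second option concrete, here is a minimal sketch of the
retry idea (my rough idea, not a tested patch; it reuses the existing fields
like p->pending_job and p->sem):

static int multifd_send_pages(QEMUFile *f)
{
    MultiFDSendParams *p;
    int i;

retry:
    /* May wake up spuriously; that's fine, we re-check every channel. */
    qemu_sem_wait(&multifd_send_state->channels_ready);

    for (i = 0; i < migrate_multifd_channels(); i++) {
        p = &multifd_send_state->params[i];

        qemu_mutex_lock(&p->mutex);
        if (!p->pending_job) {
            /* Found a free channel: queue the pages (elided) and kick it. */
            p->pending_job++;
            qemu_mutex_unlock(&p->mutex);
            qemu_sem_post(&p->sem);
            return 1;
        }
        qemu_mutex_unlock(&p->mutex);
    }

    /* Spurious wakeup: every channel was busy, go back to sleep. */
    goto retry;
}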

What do you think?

Thanks,

-- 
Peter Xu



