Re: [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
From: Peter Xu
Subject: Re: [PATCH v5 13/21] migration: Postcopy recover with preempt enabled
Date: Mon, 16 May 2022 11:58:45 -0400
On Mon, May 16, 2022 at 08:21:23PM +0530, manish.mishra wrote:
>
> On 16/05/22 7:41 pm, Peter Xu wrote:
> > Hi, Manish,
> >
> > On Mon, May 16, 2022 at 07:01:35PM +0530, manish.mishra wrote:
> > > On 26/04/22 5:08 am, Peter Xu wrote:
> > > LGTM,
> > > Peter, I wanted to give a review-tag for this and the earlier patch too. I
> > > am new to the QEMU review process, so I am not sure how to give a
> > > review-tag; I did not find any reference on Google either. Please let me
> > > know how to do it.
> > It's here:
> >
> > https://git.qemu.org/?p=qemu.git;a=blob;f=docs/devel/submitting-a-patch.rst;hb=HEAD#l492
> >
> > Since afaict QEMU is mostly following what Linux does, you can also
> > reference to this one with more context:
> >
> > https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes
> >
> > But since you're still having question regarding this patch, no rush on
> > providing your R-bs; let's finish the discussion first.
> >
> > [...]
> >
> > > > +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis)
> > > > +{
> > > > + trace_postcopy_pause_fast_load();
> > > > + qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
> > > I may have misunderstood the synchronisation here, but there is a very,
> > > very rare chance: since both threads work independently, is it possible
> > > that qemu_sem_post on postcopy_pause_sem_fast_load is called by the main
> > > thread even before we reach qemu_sem_wait on the next line, causing some
> > > kind of deadlock? That should not be possible, as I understand it requires
> > > manually calling qmp_migration_recover, so the chances are almost nil.
> > > Just wanted to confirm it.
> > Sorry I don't quite get the question, could you elaborate? E.g., when the
> > described deadlock happened, what is both main thread and preempt load
> > thread doing? What are they waiting at?
> >
> > Note: we have already released mutex before waiting on sem.
>
> What I meant here is that a deadlock could occur because we wait forever on
> qemu_sem_wait(&mis->postcopy_pause_sem_fast_load): the main thread may have
> already called post on postcopy_pause_sem_fast_load after recovery, even
> before we moved to qemu_sem_wait(&mis->postcopy_pause_sem_fast_load) on the
> next line. Basically, the case where we miss a post on
> postcopy_pause_sem_fast_load.
>
> This is a nearly impossible case, because it requires the full recovery path
> to complete before this thread executes just the next line. Also, recovery
> needs to be triggered manually.
>
> So please ignore this.
Yes the migration state has a dependency.
The other, more obvious reason it won't happen is that a semaphore is counter
based, so it remembers posts. Please try the below:

    sem_t *sem = sem_open("/test", O_CREAT, 0644, 0);
    sem_post(sem);
    sem_wait(sem);

And sem_wait() will return immediately, because post() already raised the count
to 1.
>
> Basically I wanted to check whether we should use something like
>
>     int pthread_cond_wait(pthread_cond_t *restrict cond,
>                           pthread_mutex_t *restrict mutex);
>
> so that there is no race between releasing the mutex and calling wait.
Yes, I think a condvar should also work here, but it would be stricter.
Thanks,
--
Peter Xu