Re: [PATCH] tests/qtest/migration-test: Disable migration/multifd/tcp/pl
From: Peter Xu
Subject: Re: [PATCH] tests/qtest/migration-test: Disable migration/multifd/tcp/plain/cancel
Date: Tue, 14 Mar 2023 12:46:34 -0400
On Tue, Mar 14, 2023 at 10:11:53AM +0000, Dr. David Alan Gilbert wrote:
> OK, I think I kind of see what's happening here, one for Peter Xu.
> If I'm right it's a race something like:
> a) The test harness tells the source it wants to enter postcopy
> b) The harness then waits for the source to stop
> c) ... and the dest to start
>
> It's blocked on one of b&c but can't tell which
>
> d) The main thread in the dest is waiting for the postcopy recovery fd
> to be opened
> e) But I think the source is still trying to send normal precopy RAM
> and perhaps hasn't got around yet to opening that socket yet????
> f) But I think the dest isn't reading from the main channel at that
> point because of (d)
I think this analysis is spot on. Thanks Dave!
The source QEMU does this in the following order:
1. set up the preempt channel
   1.1. connect() --> this is done in another thread
   1.2. sem_wait(postcopy_qemufile_src_sem) --> make sure it's created
2. prepare the postcopy package (LISTEN, non-iterable states, ping-3, RUN)
3. send the package
So logically the ordering guarantees that by the time the LISTEN cmd is
processed, we have already connect()ed.
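To make the ordering concrete, here is a minimal sketch of the source-side
sequence. It is a toy model in Python, not QEMU code: `channel_ready` stands
in for postcopy_qemufile_src_sem, and `source_start_postcopy` and
`send_postcopy_package` are hypothetical names used only for illustration.

```python
import threading


def send_postcopy_package(log):
    # Hypothetical stand-in for steps 2 and 3: build and send the package.
    log.append("send package: LISTEN, non-iterable states, ping-3, RUN")


def source_start_postcopy():
    log = []
    # Plays the role of postcopy_qemufile_src_sem in this toy model.
    channel_ready = threading.Semaphore(0)

    def connect_preempt_channel():
        # Step 1.1: the connect() happens in another thread.
        log.append("connect() done")
        channel_ready.release()

    worker = threading.Thread(target=connect_preempt_channel)
    worker.start()

    # Step 1.2: block until the preempt channel has been created.
    channel_ready.acquire()

    # Steps 2 and 3 can only run after the wait above has returned.
    send_postcopy_package(log)
    worker.join()
    return log
```

Because the package is only sent after the semaphore wait returns, the log
always records the connect() before the package.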
But I think there's one thing missing on the dest. The accept() on the
dest node should run in the main thread, and the LISTEN cmd is also
processed in the main thread. So even if the listening socket tries to
kick the main thread to do the accept() (i.e. the connection has been
established), the final accept() can never run, because the main thread is
waiting on the semaphore. That causes a deadlock.
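The dest-side problem can be reproduced with a toy model, assuming the main
thread is a simple loop that runs queued handlers one at a time. Everything
here (`run_dest_main_loop`, `handle_listen`, `handle_accept`, the timeout) is
hypothetical and only illustrates the shape of the deadlock; the timeout is
there so the demo terminates instead of hanging.

```python
import queue
import threading


def run_dest_main_loop(listen_first=True, wait_timeout=0.2):
    events = queue.Queue()  # work queued for the single "main thread"
    channel_accepted = threading.Semaphore(0)
    result = {}

    def handle_accept():
        # The accept() for the preempt channel; runs on the main loop.
        channel_accepted.release()

    def handle_listen():
        # LISTEN waits for the channel while still on the main loop.
        # A real wait would have no timeout and would block forever.
        result["saw_channel"] = channel_accepted.acquire(timeout=wait_timeout)

    handlers = [handle_listen, handle_accept]
    if not listen_first:
        handlers.reverse()
    for handler in handlers:
        events.put(handler)

    # One thread dispatches everything, so a blocked handler stalls the rest.
    while not events.empty():
        events.get()()
    return result["saw_channel"]
```

With `listen_first=True` the wait times out because `handle_accept` is stuck
behind it in the same loop; with `listen_first=False` the wait succeeds,
which is why the failure only shows up as a race.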
A simple fix I can think of is moving the wait for the channel out of the
main thread, e.g. into the preempt thread.
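As a rough illustration of that idea (this is a toy model, not the attached
patch), the sketch below uses the same simplified single-loop model of the
main thread but moves the semaphore wait into a separate thread. The names
`run_dest_with_preempt_thread` and `preempt_thread` are hypothetical.

```python
import queue
import threading


def run_dest_with_preempt_thread(wait_timeout=2.0):
    events = queue.Queue()  # work queued for the single "main thread"
    channel_accepted = threading.Semaphore(0)
    result = {}
    threads = []

    def preempt_thread():
        # The wait now happens here, off the main loop.
        result["channel_ready"] = channel_accepted.acquire(timeout=wait_timeout)

    def handle_accept():
        # Still runs on the main loop, which is no longer blocked.
        channel_accepted.release()

    def handle_listen():
        # LISTEN only starts the preempt thread and returns immediately.
        worker = threading.Thread(target=preempt_thread)
        worker.start()
        threads.append(worker)

    # Same unlucky ordering as before: LISTEN is dispatched first.
    events.put(handle_listen)
    events.put(handle_accept)

    while not events.empty():
        events.get()()
    for worker in threads:
        worker.join()
    return result["channel_ready"]
```

Since `handle_listen` no longer blocks, the loop goes on to dispatch
`handle_accept`, and the wait in the other thread completes even when LISTEN
is processed first.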
I've attached that simple fix. Peter, is it easy to verify it? I'm not
sure about the reproducibility; it's also fine by me if it's easier to
just disable the preempt tests for the 8.0 release.
Thanks,
--
Peter Xu
0001-migration-Wait-on-preempt-channel-in-preempt-thread.patch
Description: Text document