qemu-devel

From: Philippe Mathieu-Daudé
Subject: Re: [PATCH] tests/acceptance: Fix race conditions in s390x tests & skip fedora on gitlab-CI
Date: Tue, 12 Jan 2021 14:53:19 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0

+Willian

On 1/12/21 2:31 PM, Thomas Huth wrote:
> On 12/01/2021 13.23, Cornelia Huck wrote:
>> On Tue, 12 Jan 2021 11:32:44 +0000
>> Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>>> Cornelia Huck <cohuck@redhat.com> writes:
>>>
>>>> On Fri,  8 Jan 2021 19:56:45 +0100
>>>> Thomas Huth <thuth@redhat.com> wrote:
>>>>  
>>>>> There was a race condition in the first test where there was already the
>>>>> "crw" output in the dmesg, but the "0.0.4711" entry has not been created
>>>>> in the /sys fs yet. Fix it by waiting until it is there.
>>>>>
>>>>> The second test has even more problems on gitlab-CI. Even after adding some
>>>>> more synchronization points (that wait for some messages in the "dmesg"
>>>>> output to make sure that the modules got loaded correctly), there are still
>>>>> occasionally some hangs in this test when it is running in the gitlab-CI.
>>>>> So far I was unable to reproduce these hangs locally on my computer, so
>>>>> this issue might take a while to debug. Thus disable the 2nd test in the
>>>>> gitlab-CI until the problems are better understood and fixed.
>>>>>
>>>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>>>> ---
>>>>>   tests/acceptance/machine_s390_ccw_virtio.py | 14 ++++++++++++--
>>>>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/tests/acceptance/machine_s390_ccw_virtio.py b/tests/acceptance/machine_s390_ccw_virtio.py
>>>>> index eccf26b262..4028c99afc 100644
>>>>> --- a/tests/acceptance/machine_s390_ccw_virtio.py
>>>>> +++ b/tests/acceptance/machine_s390_ccw_virtio.py
>>>>> @@ -12,6 +12,7 @@
>>>>>   import os
>>>>>   import tempfile
>>>>>
>>>>> +from avocado import skipIf
>>>>>   from avocado_qemu import Test
>>>>>   from avocado_qemu import exec_command_and_wait_for_pattern
>>>>>   from avocado_qemu import wait_for_console_pattern
>>>>> @@ -133,8 +134,10 @@ class S390CCWVirtioMachine(Test):
>>>>>           self.vm.command('device_add', driver='virtio-net-ccw',
>>>>>                           devno='fe.0.4711', id='net_4711')
>>>>>           self.wait_for_crw_reports()
>>>>> -        exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
>>>>> -                                          '0.0.4711')
>>>>> +        exec_command_and_wait_for_pattern(self, 'for i in 1 2 3 4 5 6 7 ; do '
>>>>> +                    'if [ -e /sys/bus/ccw/devices/*4711 ]; then break; fi ;'
>>>>> +                    'sleep 1 ; done ; ls /sys/bus/ccw/devices/',
>>>>> +                    '0.0.4711')
>>>>
>>>> I'm wondering whether we should introduce a generic helper function for
>>>> "execute command repeatedly, if the expected result did not yet show
>>>> up", or "wait for a file/directory to exist". It's probably not
>>>> uncommon for a desired outcome to arrive asynchronously, and having a
>>>> function for waiting/retrying could be handy.
>>>
>>> We don't really want to encourage fragile shell scripts in the guest, so
>>> we want something that makes it easy to encode these loops in Python.
>>> Currently the _console_interaction helper fails the test if failure_message
>>> is seen, so I guess we need a slightly more liberal interaction which
>>> accepts that a command can fail, so we can write something like:
>>>
>>>    while True:
>>>        if exec_command_and_check(self, "stat -t /sys/bus/ccw/devices/0.0.4711",
>>>                                  "/sys/bus/ccw/devices/0.0.4711"):
>>>            break
>>>
>>> ?
>>
>> Yes, something like that. The caller can decide whether they want to
>> limit retries.
> 
> Fine for me, but I think we should use a timeout, not a number of retries.

We explained this feature request to the Avocado project at the
KVM Forum 2018 in Lyon. An entry was filed on their Trello
dashboard. Then the project switched to GitLab and I lost track
of it.

Cleber, if you remember, can you point us at the new ticket please?

Thanks,

Phil.

> I already put my patch into my pull-request yesterday (so that people
> are not running into failures with their gitlab-CI), so if someone wants
> to have a go at creating such a function in python, feel free to do so
> by refactoring that code again.
> 
>  Thomas
> 
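
For reference, a minimal sketch of the timeout-based helper discussed above,
in plain Python. Nothing here is existing avocado_qemu API: the name
wait_until and its parameters are made up for illustration, and the check
callable stands in for whatever "run a command in the guest and report
whether it succeeded" helper ends up being written (for example the
exec_command_and_check proposed earlier in the thread, which does not exist
yet).

```python
import time


def wait_until(check, timeout=10.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns a truthy value or the timeout expires.

    Hypothetical helper, not part of avocado_qemu.
    Returns True if check() succeeded in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # Out of time: report failure and let the caller decide
            # whether to fail or skip the test.
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))


if __name__ == "__main__":
    # Stand-in for a guest command that only succeeds on the third attempt.
    attempts = []

    def device_present():
        attempts.append(1)
        return len(attempts) >= 3

    print(wait_until(device_present, timeout=5.0, interval=0.1))
```

In a test, the caller would pass a lambda wrapping the guest command and pick
a timeout, which matches the preference above for a time limit over a fixed
retry count. check() is always called at least once, even with timeout=0.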