qemu-devel

Re: [PATCH] gitlab: remove unreliable avocado CI jobs


From: Alex Bennée
Subject: Re: [PATCH] gitlab: remove unreliable avocado CI jobs
Date: Tue, 12 Sep 2023 17:01:26 +0100
User-agent: mu4e 1.11.17; emacs 29.1.50

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Tue, Sep 12, 2023 at 11:06:11AM -0400, Stefan Hajnoczi wrote:
>> The avocado-system-alpine, avocado-system-fedora, and
>> avocado-system-ubuntu jobs are unreliable. I identified them while
>> looking over CI failures from the past week:
>> https://gitlab.com/qemu-project/qemu/-/jobs/5058610614
>> https://gitlab.com/qemu-project/qemu/-/jobs/5058610654
>> https://gitlab.com/qemu-project/qemu/-/jobs/5030428571
>> 
>> Thomas Huth suggested on IRC today that there may be a legitimate failure
>> in there:
>> 
>>   th_huth: f4bug, yes, seems like it does not start at all correctly on
>>   alpine anymore ... and it's broken since ~ 2 weeks already, so if nobody
>>   noticed this by now, this is worrying
>> 
>> It crept in because the jobs were already unreliable.
>> 
>> I don't know how to interpret the job output, so all I can do is to
>> propose removing these jobs. A useful CI job has two outcomes: pass or
>> fail. Timeouts and other in-between states are not useful because they
>> require constant triaging by someone who understands the details of the
>> tests and they can occur when run against pull requests that have
>> nothing to do with the area covered by the test.
>> 
>> Hopefully test owners will be able to identify the root causes and solve
>> them so that these jobs can stay. In their current state the jobs are
>> not useful since I cannot tell whether job failures are real or
>> just intermittent when merging qemu.git pull requests.
>> 
>> If you are a test owner, please take a look.
>> 
>> It is likely that other avocado-system-* CI jobs have similar failures
>> from time to time, but I'll leave them as long as they are passing.
>> 
>> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1884
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>>  .gitlab-ci.d/buildtest.yml | 27 ---------------------------
>>  1 file changed, 27 deletions(-)
>> 
>> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
>> index aee9101507..83ce448c4d 100644
>> --- a/.gitlab-ci.d/buildtest.yml
>> +++ b/.gitlab-ci.d/buildtest.yml
>> @@ -22,15 +22,6 @@ check-system-alpine:
>>      IMAGE: alpine
>>      MAKE_CHECK_ARGS: check-unit check-qtest
>>  
>> -avocado-system-alpine:
>> -  extends: .avocado_test_job_template
>> -  needs:
>> -    - job: build-system-alpine
>> -      artifacts: true
>> -  variables:
>> -    IMAGE: alpine
>> -    MAKE_CHECK_ARGS: check-avocado
>
> Instead of entirely deleting, I'd suggest adding
>
>    # Disabled due to frequent random failures
>    # https://gitlab.com/qemu-project/qemu/-/issues/1884
>    when: manual
>
> See example: https://docs.gitlab.com/ee/ci/yaml/#when
>
> This disables the job from running unless someone explicitly
> tells it to run
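
For illustration, applied to the alpine job from the diff, the suggestion
above might look roughly like the sketch below. It is untested, and it
assumes that .avocado_test_job_template does not carry its own rules:
that would change how a job-level "when" is interpreted. The same three
lines would go at the end of the ubuntu and fedora jobs.

```yaml
# Sketch only: the alpine job from the patch, kept in place but
# switched to manual instead of being deleted.
avocado-system-alpine:
  extends: .avocado_test_job_template
  needs:
    - job: build-system-alpine
      artifacts: true
  variables:
    IMAGE: alpine
    MAKE_CHECK_ARGS: check-avocado
  # Disabled due to frequent random failures
  # https://gitlab.com/qemu-project/qemu/-/issues/1884
  when: manual
```

The job definition then stays in buildtest.yml next to the issue link, so
a test owner can still trigger it by hand from the pipeline page while
investigating.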

What I don't understand is why we didn't gate the release back when they
first tripped. We should have noticed between:

  https://gitlab.com/qemu-project/qemu/-/pipelines/956543770

and

  https://gitlab.com/qemu-project/qemu/-/pipelines/957154381

that the system tests were regressing. Yet we merged the changes
anyway.

>
>> -
>>  build-system-ubuntu:
>>    extends:
>>      - .native_build_job_template
>> @@ -53,15 +44,6 @@ check-system-ubuntu:
>>      IMAGE: ubuntu2204
>>      MAKE_CHECK_ARGS: check
>>  
>> -avocado-system-ubuntu:
>> -  extends: .avocado_test_job_template
>> -  needs:
>> -    - job: build-system-ubuntu
>> -      artifacts: true
>> -  variables:
>> -    IMAGE: ubuntu2204
>> -    MAKE_CHECK_ARGS: check-avocado
>> -
>>  build-system-debian:
>>    extends:
>>      - .native_build_job_template
>> @@ -127,15 +109,6 @@ check-system-fedora:
>>      IMAGE: fedora
>>      MAKE_CHECK_ARGS: check
>>  
>> -avocado-system-fedora:
>> -  extends: .avocado_test_job_template
>> -  needs:
>> -    - job: build-system-fedora
>> -      artifacts: true
>> -  variables:
>> -    IMAGE: fedora
>> -    MAKE_CHECK_ARGS: check-avocado
>> -
>>  crash-test-fedora:
>>    extends: .native_test_job_template
>>    needs:
>> -- 
>> 2.41.0
>> 
>> 
>
> With regards,
> Daniel


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


