Re: [PULL 00/33] Abstract ArchCPU


From: Daniel P . Berrangé
Subject: Re: [PULL 00/33] Abstract ArchCPU
Date: Mon, 7 Mar 2022 12:12:41 +0000
User-agent: Mutt/2.1.5 (2021-12-30)

On Mon, Mar 07, 2022 at 11:51:20AM +0000, Peter Maydell wrote:
> On Sun, 6 Mar 2022 at 21:13, Philippe Mathieu-Daudé
> <philippe.mathieu.daude@gmail.com> wrote:
> >
> > +Daniel/Alex
> >
> > On 6/3/22 20:56, Peter Maydell wrote:
> > > On Sun, 6 Mar 2022 at 19:06, Philippe Mathieu-Daudé
> > > <philippe.mathieu.daude@gmail.com> wrote:
> > >> I see. I only have access to aarch64 Darwin, not x86_64; I was relying
> > >> on our CI for that (my GitLab CI is green). I'll work a fix, thanks.
> > >
> > > This was on my ad-hoc stuff -- I guess our gitlab CI for macos
> > > doesn't build hvf ?
> >
> > No, it does:
> >
> > https://gitlab.com/philmd/qemu/-/jobs/2167582776#L6444
> >
> >    Targets and accelerators
> >      KVM support                  : NO
> >      HAX support                  : YES
> >      HVF support                  : YES
> >      WHPX support                 : NO
> >      NVMM support                 : NO
> >      Xen support                  : NO
> >      TCG support                  : YES
> >
> > But the Cirrus job are allowed to fail:
> 
> Overall I am starting to feel that we should stop having
> these CI jobs that are in the "allowed to fail" category.
> All that happens is that they eat a lot of CPU on our CI
> hosts, but they don't actually find bugs because everybody
> (rightly) treats "allowed-to-fail-and-failed" as "ignore me".
> I think our CI jobs should either be "must pass", or else
> "run only manually", with that latter category being rarely
> used and only where there's a good reason (eg somebody
> specific has taken responsibility for debugging some
> intermittent failure and having it still available in the
> CI UI for them to trigger is helpful).

The Cirrus CI jobs were introduced as allow-fail as we were
not sure the cirrus-run integration with gitlab would be
entirely stable. There was a blip a month or so ago due
to Cirrus CI breaking their REST API, but on the QEMU side
we seem to be OK. So I think we can toggle the flag to
make these Cirrus CI jobs gating.
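
For illustration, a minimal sketch of what toggling the flag might
look like in GitLab CI YAML. The job name and script line below are
hypothetical placeholders, not copied from the QEMU tree; only the
"allow_failure" keyword is standard GitLab CI syntax:

```yaml
# Hypothetical job, for illustration only.
example-cirrus-job:
  stage: build
  script:
    - ./run-cirrus-build.sh   # placeholder command, not a real QEMU script
  # Was "allow_failure: true". With false (the GitLab default for
  # non-manual jobs), a failure of this job fails the whole pipeline.
  allow_failure: false
```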

> Plus we really need to get on top of all the intermittent
> failures. The current state of the world is that we have
> some intermittents, which makes it easy for new intermittents
> to get into the tree, because everybody is in the habit of
> "just hit retry"...

A big issue IMHO is that the pain/impact hits the wrong people.
It most seriously impacts & disrupts Peter when merging, impacts
the subsystem maintainers less, and the original authors even
less.

Consider an alternative world where subsystem maintainers used
merge requests just to send pull requests. The subsystem
maintainer would open an MR and it would be their responsibility
to get a green pipeline. Peter (or the person approving pulls for
merge at the time) shouldn't even have to consider an MR until it
has got a green pipeline. That would put the primary impact of
unreliable CI onto the subsystem maintainers, blocking their work
from being considered for merge. This creates a direct incentive
for the subsystem maintainers to contribute to ensuring reliable
CI, instead of considering it somebody else's problem.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



