qemu-devel
From: Daniel P . Berrangé
Subject: Re: [RFC 1/1] docs/deve/ci-plan: define a high-level plan for the QEMU GitLab CI
Date: Wed, 15 Sep 2021 15:07:40 +0100
User-agent: Mutt/2.0.7 (2021-05-04)

On Wed, Sep 15, 2021 at 10:51:56AM -0300, Willian Rampazzo wrote:
> On Wed, Sep 15, 2021 at 6:00 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Tue, Sep 14, 2021 at 03:48:30PM -0300, Willian Rampazzo wrote:
> > > This adds a high-level plan for the QEMU GitLab CI based on use cases.
> > > The idea is to have a base for evolving the QEMU CI. It sets high-level
> > > characteristics for the QEMU CI use cases, which helps guide its
> > > development.
> > >
> > > Signed-off-by: Willian Rampazzo <willianr@redhat.com>
> > > ---
> > >  docs/devel/ci-plan.rst | 77 ++++++++++++++++++++++++++++++++++++++++++
> > >  docs/devel/ci.rst      |  1 +
> > >  2 files changed, 78 insertions(+)
> > >  create mode 100644 docs/devel/ci-plan.rst
> > >
> > > diff --git a/docs/devel/ci-plan.rst b/docs/devel/ci-plan.rst
> > > new file mode 100644
> > > index 0000000000..5e95b6bcea
> > > --- /dev/null
> > > +++ b/docs/devel/ci-plan.rst
> > > @@ -0,0 +1,77 @@
> > > +The GitLab CI structure
> > > +=======================
> > > +
> > > +This section describes the current state of the QEMU GitLab CI and the
> > > +high-level plan for its future.
> > > +
> > > +Current state
> > > +-------------
> > > +
> > > +The mainstream QEMU project considers the GitLab CI its primary CI system.
> > > +Currently, it runs 120+ jobs, where ~36 are container build jobs, 69 are QEMU
> > > +build jobs, ~22 are test jobs, 1  is a web page deploy job, and 1 is an
> > > +external job covering Travis jobs execution.
> > > +
> > > +In the current state, every push a user does to its fork runs most of the jobs
> > > +compared to the jobs running on the main repository. The exceptions are the
> > > +acceptance tests jobs, which run automatically on the main repository only.
> > > +Running most of the jobs in the user's fork or the main repository is not
> > > +viable. The job number tends to increase, becoming impractical to run all of
> > > +them on every single push.
> > > +
> > > +Future of QEMU GitLab CI
> > > +------------------------
> > > +
> > > +Following is a proposal to establish a high-level plan and set the
> > > +characteristics for the QEMU GitLab CI. The idea is to organize the CI by use
> > > +cases, avoid wasting resources and CI minutes, anticipating the time GitLab
> > > +starts to enforce minutes limits soon.
> > > +
> > > +Use cases
> > > +^^^^^^^^^
> > > +
> > > +Below is a list of the most common use cases for the QEMU GitLab CI.
> > > +
> > > +Gating
> > > +""""""
> > > +
> > > +The gating set of jobs runs on the maintainer's pull requests when the project
> > > +leader pushes them to the staging branch of the project. The gating CI pipeline
> > > +has the following characteristics:
> > > +
> > > + * Jobs tagged as gating run as part of the gating CI pipeline;
> > > + * The gating CI pipeline consists of stable jobs;
> > > + * The execution duration of the gating CI pipeline should, as much as possible,
> > > +   have an upper bound limit of 2 hours.
> > > +
> > > +Developers
> > > +""""""""""
> > > +
> > > +A developer working on a new feature or fixing an issue may want to run/propose
> > > +a specific set of tests. Those tests may, eventually, benefit other developers.
> > > +A developer CI pipeline has the following characteristics:
> > > +
> > > + * It is easy to run current tests available in the project;
> > > + * It is easy to add new tests or remove unneeded tests;
> > > + * It is flexible enough to allow changes in the current jobs.
> > > +
> > > +Maintainers
> > > +"""""""""""
> > > +
> > > +When accepting developers' patches, a maintainer may want to run a specific
> > > +test set. A maintainer CI pipeline has the following characteristics:
> > > +
> > > + * It consists of tests that are valuable for the subsystem;
> > > + * It is easy to run a set of specific tests available in the project;
> > > + * It is easy to add new tests or remove unneeded tests.
> >
> >
> > Neither of these describe why I use CI as a developer and/or subsys
> > maintainer.
> >
> > My desire with using CI is to (as close as possible) be able to
> > execute the exact same  set of tests that will be run by gating CI
> > on pull requests.
> 
> I totally understand your desire and I think it is valid.
> 
> What I'm trying with this proposal is the same strategy we used when
> we started planning for the gating CI. The decision was to start
> small. Today the CI grew and we don't have a so called gating CI yet,
> but a bunch of jobs that runs on staging branch, some needing
> reevaluation whether they should run on staging or not.

Of course we have a gating CI today, it is the very thing you just
described. Whether or not the set of CI jobs that run on staging is
designed ground up, or evolved organically is irrelevant. It is what
exists today and is used to test merges to master, so by definition
it is our gating CI. The set of jobs will never be perfect because
we're in a changing world, so they will always need re-evaluation
periodically to judge whether they're the right mix for our current
needs.

> > My goal is to minimize (ideally eliminate) the risk that a patch
> > series or pull request gets rejected with a need to re-spin due
> > to CI failures. Each such rejection causes a round trip delaying
> > merge, and this wastes my time & the maintainer/gate keepers' time.
> > It is also hard to debug failures if I can't replicate the gating
> > CI myself.
> 
> Again, I totally agree with you. That would be the perfect scenario.

Aside from the custom runners, it is the scenario that exists today
and is relied on by many people. That existing usage and starting 
point has to be acknowledged in any CI plan that is proposed.

> The barrier I see to have it working the way you described is the
> hardware access. The staging branch runs on two different custom
> runners. We have two possible solutions to accomplish the scenario you
> described: remove the custom runners from the staging branch and let
> the jobs run on the GitLab CI shared runners, which everyone with
> access to GitLab can use, or allow developers to access the custom
> runners.

It isn't that large of a barrier IMHO. It will be slow, but people
can bring up custom runners for ppc/s390 using QEMU VMs if they lack
access to hardware. The most important thing is the build coverage, and
that's already achieved via the cross compilers. The custom runners
essentially only add "make check" as a benefit.

> Today, I don't think any of those options are feasible or bring value
> to the project. That is one of the reasons I'm not covering it now in
> the future plan. As I mentioned before, let's take another small step
> and organize a gating CI with some ground rules. When we reach it, the
> future future step can be to implement merge requests, think about
> reproducibility, and so on.

Being able to replicate gating CI jobs as a contributor is the most 
critical starting point for any plan. Historically, diagnosing failures
in gating CI has been the biggest pain point in submitting code to QEMU,
and it is why I and others have spent so much time on Travis, and now
GitLab, config to give us a well-defined environment and ruleset for
build jobs. That can't be ignored by any proposed CI plan.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



