qemu-s390x

Re: [RFC] gitlab: introduce s390x wasmtime job


From: Ilya Leoshkevich
Subject: Re: [RFC] gitlab: introduce s390x wasmtime job
Date: Mon, 19 Dec 2022 22:42:16 +0100
User-agent: Evolution 3.46.1 (3.46.1-1.fc37)

On Fri, 2022-12-16 at 15:10 +0000, Alex Bennée wrote:
> 
> Ilya Leoshkevich <iii@linux.ibm.com> writes:
> 
> > On Tue, 2022-07-05 at 15:40 +0100, Peter Maydell wrote:
> > > On Tue, 5 Jul 2022 at 15:37, Ilya Leoshkevich <iii@linux.ibm.com>
> > > wrote:
> > > > 
> > > > On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
> > > > > On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé
> > > > > <berrange@redhat.com>
> > > > > wrote:
> > > > > > If we put this job in QEMU CI someone will have to be able
> > > > > > to
> > > > > > interpret the results when it fails.
> > > > > 
> > > > > In particular since this is qemu-user, the answer is probably
> > > > > at least some of the time going to be "oh, well, qemu-user
> > > > > isn't
> > > > > reliable
> > > > > if you do complicated things in the guest". I'd be pretty
> > > > > wary of
> > > > > our
> > > > > having
> > > > > a "pass a big complicated guest code test suite under linux-
> > > > > user
> > > > > mode"
> > > > > in the CI path.
> > > 
> > > > Actually exercising qemu-user is one of the goals here: just as
> > > > an
> > > > example, one of the things that the test suite found was commit
> > > > 9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal
> > > > handlers"),
> > > > so it's not only about the ISA.
> > > > 
> > > > At least for s390x, we've noticed that various projects use
> > > > qemu-user-based setups in their CI (either calling it
> > > > explicitly,
> > > > or
> > > > via binfmt-misc), and we would like these workflows to be
> > > > reliable,
> > > > even if they try complicated (within reason) things there.
> > > 
> > > I also would like them to be reliable. But I don't think
> > > *testing* these things is the difficulty: it is having
> > > people who are willing to spend time on the often quite
> > > difficult tasks of identifying why something intermittently
> > > fails and doing the necessary design and implementation work
> > > to correct the problem. Sometimes this is easy (as in the
> > > s390 regression above) but quite often it is not (eg when
> > > multiple threads are in use, or the guest wants to do
> > > something complicated with clone(), etc).
> > > 
> > > thanks
> > > -- PMM
> > > 
> > 
> > For what it's worth, we can help analyzing and fixing failures
> > detected
> > by the s390x wasmtime job. If something breaks, we will have to
> > look at
> > it anyway, and it's better to do this sooner than later.
> 
> Sorry for necroing an old thread but I just wanted to add my 2p.

Thanks for that though; I've been cherry-picking this patch into my
private trees for some time now, and would be happy to see it go
upstream in some form.

> I think making 3rd party test suites easily available to developers
> is a worthy
> goal and there are a number that I would like to see including LTP
> and
> kvm-unit-tests. As others have pointed out I'm less sure about adding
> it
> to the gating CI.

Another third-party test suite that I found useful is valgrind's.
I'll post my thoughts about integrating the wasmtime and valgrind
test suites below. Unfortunately, I'm not too familiar with LTP and
kvm-unit-tests.

Not touching the gating CI is fine for me.

> If we want to go forward with this we should probably think about how
> we
> would approach this generally:
> 
>   - tests/third-party-suites/FOO?

Sounds good to me.

>   - should we use avocado as a wrapper or something else?
>     - make check-?

avocado sounds good; we might have to add a second wrapper for
producing tap output (see below).

One should definitely be able to specify the testsuite and the
architecture, e.g. `make check-third-party-wasmtime-s390x`.

In addition, we need to either hardcode or let the user choose
the way the testsuite is built and executed. I see 3 possibilities:

- Fully on the host. Easiest to implement, the results are also easy
  to debug. But this requires installing cross-toolchains manually,
  which is simple on some distros and not-so-simple on the others.

- Provide the toolchain as a Docker image. For wasmtime, the toolchain
  would include the Rust compiler and Cargo. This solves the problem
  with configuring the host, but introduces the next choice one has to
  make:

  - Build qemu on the host. Then the qemu binary would have to be
    compatible with the container (e.g. no references to the latest
    and greatest glibc functions).

    This is because the wasmtime testsuite needs to run inside the
    container: it's driven by Cargo, which is not available on the
    host. It is possible to only build the tests with Cargo and then
    run the resulting binaries manually, but there is more than one
    and I'm not sure how to get a list of them (if we decide to do
    this, in the worst case the list can be hardcoded).

    For valgrind it's a bit easier, since the test runner is not as
    complex as Cargo, and can therefore just follow the check-tcg
    approach.

  - Build qemu inside the container. 2x space and time required, one
    might also have to install additional -dev packages for extra qemu
    features. Also, a decision needs to be made on whether the qemu
    build directory ends up in the container (needs a rebuild on every
    run), in a volume (volume lifetime needs to be managed) or in a
    mounted host directory (this can cause selinux/ownership issues if
    not done carefully).

- Provide both toolchain and testsuite as a Docker image. Essentially
  the same as above, but trades build time for download time. Also,
  the results are slightly harder to debug, since the test binaries
  are now located inside the container.
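
On the open question of how to get the list of test binaries when the
tests are only built with Cargo: one possible approach, sketched below,
is to parse the JSON messages that `cargo test --no-run
--message-format=json` prints and keep the "executable" field of every
"compiler-artifact" message built with the test profile. The function
name is hypothetical, and I have not verified this against the wasmtime
workspace, so treat it as an assumption rather than a tested recipe.

```python
import json


def list_test_executables(cargo_json_lines):
    """Return paths of test binaries reported by Cargo's JSON messages.

    Expects the lines printed by
    `cargo test --no-run --message-format=json`.
    """
    binaries = []
    for line in cargo_json_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # not a JSON message
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or wrapped lines
        if message.get("reason") != "compiler-artifact":
            continue
        # Only artifacts built with the test profile are test binaries.
        if not message.get("profile", {}).get("test"):
            continue
        executable = message.get("executable")
        if executable:
            binaries.append(executable)
    return binaries
```

Each returned path could then be run outside of Cargo, e.g. under
qemu-s390x on the host, which would avoid hardcoding the list.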

Sorry for the long list, it's just that since we are discussing how to
enable this for a larger audience, I felt I needed to enumerate all the
options and pitfalls I could think of.
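
To illustrate the mounted-host-directory variant from the list above,
here is a rough sketch of how a wrapper could assemble the container
command line. The helper name, the image name in the test and the
/build mount point are all made up. The `:z` volume suffix and
podman's `--userns=keep-id` are the usual ways to deal with the
selinux and ownership issues, but whether they are sufficient for our
setup is an assumption.

```python
import os


def container_command(image, build_dir, command, engine="podman"):
    """Build an argv list that runs `command` in `image` with the host
    build directory mounted at /build (a hypothetical mount point)."""
    build_dir = os.path.abspath(build_dir)
    argv = [
        engine, "run", "--rm",
        # ":z" asks the engine to relabel the directory for selinux.
        "--volume", "%s:/build:z" % build_dir,
        "--workdir", "/build",
    ]
    if engine == "podman":
        # Rootless podman: keep the host uid/gid inside the container.
        argv.append("--userns=keep-id")
    else:
        # docker: run as the invoking user so files are not root-owned.
        argv.extend(["--user", "%d:%d" % (os.getuid(), os.getgid())])
    argv.append(image)
    argv.extend(command)
    return argv
```

The resulting list can be passed to subprocess.run() as is, which also
fits the idea of calling docker/podman directly instead of going
through docker.py.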

>   - ensuring the suites output tap for meson

At the moment Rust can output either json like this:

$ cargo test -- -Z unstable-options --format=json
{ "type": "suite", "event": "started", "test_count": 1 }
{ "type": "test", "event": "started", "name": "test::hello" }
{ "type": "test", "name": "test::hello", "event": "ok" }
{ "type": "suite", "event": "ok", "passed": 1, "failed": 0, "ignored": 0, "measured": 0, "filtered_out": 0, "exec_time": 0.001460307 }

or xUnit like this:

$ cargo test -- -Z unstable-options --format=junit

# the following is on a single line; formatted for clarity

<?xml version="1.0" encoding="UTF-8"?>
<testsuites>
  <testsuite name="test" package="test" id="0" errors="0" failures="0"
tests="1" skipped="0">
    <testcase classname="integration" name="test::hello" time="0"/>
    <system-out/>
    <system-err/>
  </testsuite>
</testsuites>

I skimmed the avocado docs and couldn't find whether it can convert
between different test output formats. Based on the source code, we can
add an XUnitRunner the same way the TAPRunner was added.
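
Independently of avocado, converting the xUnit output to tap does not
look hard. A minimal sketch follows, assuming the usual JUnit
convention that a failed test case carries a <failure> or <error>
child and a skipped one a <skipped> child. I have not checked which of
these Rust's junit formatter actually emits, and the function name is
hypothetical.

```python
import xml.etree.ElementTree as ET


def junit_to_tap(xml_text):
    """Convert a JUnit/xUnit XML report into a list of TAP lines."""
    root = ET.fromstring(xml_text)
    cases = list(root.iter("testcase"))
    tap = ["1..%d" % len(cases)]  # TAP plan line
    for number, case in enumerate(cases, start=1):
        name = case.get("name", "unknown")
        failed = (case.find("failure") is not None
                  or case.find("error") is not None)
        if failed:
            tap.append("not ok %d - %s" % (number, name))
        elif case.find("skipped") is not None:
            tap.append("ok %d - %s # SKIP" % (number, name))
        else:
            tap.append("ok %d - %s" % (number, name))
    return tap
```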

In the worst case we can pipe json to a script that would output tap.
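
Such a script could be quite small. Here is a minimal sketch that
handles the event types shown in the json sample above, plus "failed"
and "ignored", which I'm assuming are the names libtest uses for the
other outcomes. The function names are hypothetical.

```python
import json
import sys


def json_to_tap(lines):
    """Convert libtest JSON event lines into a list of TAP lines."""
    results = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip Cargo's own non-JSON output
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or wrapped lines
        if event.get("type") != "test":
            continue  # suite-level events carry no per-test result
        status = event.get("event")
        name = event.get("name", "unknown")
        if status == "ok":
            results.append(("ok", name, ""))
        elif status == "failed":
            results.append(("not ok", name, ""))
        elif status == "ignored":
            results.append(("ok", name, " # SKIP"))
        # "started" events are dropped
    tap = ["1..%d" % len(results)]  # TAP plan line
    for number, (verdict, name, directive) in enumerate(results, start=1):
        tap.append("%s %d - %s%s" % (verdict, number, name, directive))
    return tap


def main(stream=sys.stdin):
    """Filter mode: read JSON events from `stream`, print TAP."""
    print("\n".join(json_to_tap(stream)))
```

Saved as, say, json2tap.py with a call to main() at the end, it could
sit at the end of the pipe:
cargo test -- -Z unstable-options --format=json | python3 json2tap.py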

Enhancing Rust is also an option, of course, even though this might
take some time.

>   - document in docs/devel/testing.rst

Right, we need this too; I totally ignored it in this patch.

> Also I want to avoid adding stuff to tests/docker/dockerfiles that
> aren't directly related to check-tcg and the cross builds. I want to
> move away from docker.py so for 3rd party suites lets just call
> docker/podman directly.

We could add the dockerfiles (if we decide we need them based on
the discussion above) to tests/third-party-suites/FOO. My question is,
would it be possible to build and publish the images on GitLab? Or
is it better to build them on developers' machines?

> > Best regards,
> > Ilya


