qemu-s390x

Re: [RFC] gitlab: introduce s390x wasmtime job


From: Alex Bennée
Subject: Re: [RFC] gitlab: introduce s390x wasmtime job
Date: Mon, 19 Dec 2022 22:18:39 +0000
User-agent: mu4e 1.9.7; emacs 29.0.60

Ilya Leoshkevich <iii@linux.ibm.com> writes:

> On Fri, 2022-12-16 at 15:10 +0000, Alex Bennée wrote:
>> 
>> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>> 
>> > On Tue, 2022-07-05 at 15:40 +0100, Peter Maydell wrote:
>> > > On Tue, 5 Jul 2022 at 15:37, Ilya Leoshkevich <iii@linux.ibm.com>
>> > > wrote:
>> > > > 
>> > > > On Tue, 2022-07-05 at 14:57 +0100, Peter Maydell wrote:
>> > > > > On Tue, 5 Jul 2022 at 14:04, Daniel P. Berrangé
>> > > > > <berrange@redhat.com>
>> > > > > wrote:
>> > > > > > If we put this job in QEMU CI someone will have to be able
>> > > > > > to
>> > > > > > interpret the results when it fails.
>> > > > > 
>> > > > > In particular since this is qemu-user, the answer is probably
>> > > > > at least some of the time going to be "oh, well, qemu-user
>> > > > > isn't
>> > > > > reliable
>> > > > > if you do complicated things in the guest". I'd be pretty
>> > > > > wary of
>> > > > > our
>> > > > > having
>> > > > > a "pass a big complicated guest code test suite under linux-
>> > > > > user
>> > > > > mode"
>> > > > > in the CI path.
>> > > 
>> > > > Actually exercising qemu-user is one of the goals here: just as
>> > > > an
>> > > > example, one of the things that the test suite found was commit
>> > > > 9a12adc704f9 ("linux-user/s390x: Fix unwinding from signal
>> > > > handlers"),
>> > > > so it's not only about the ISA.
>> > > > 
>> > > > At least for s390x, we've noticed that various projects use
>> > > > qemu-user-based setups in their CI (either calling it
>> > > > explicitly,
>> > > > or
>> > > > via binfmt-misc), and we would like these workflows to be
>> > > > reliable,
>> > > > even if they try complicated (within reason) things there.
>> > > 
>> > > I also would like them to be reliable. But I don't think
>> > > *testing* these things is the difficulty: it is having
>> > > people who are willing to spend time on the often quite
>> > > difficult tasks of identifying why something intermittently
>> > > fails and doing the necessary design and implementation work
>> > > to correct the problem. Sometimes this is easy (as in the
>> > > s390 regression above) but quite often it is not (eg when
>> > > multiple threads are in use, or the guest wants to do
>> > > something complicated with clone(), etc).
>> > > 
>> > > thanks
>> > > -- PMM
>> > > 
>> > 
>> > For what it's worth, we can help analyzing and fixing failures
>> > detected
>> > by the s390x wasmtime job. If something breaks, we will have to
>> > look at
>> > it anyway, and it's better to do this sooner than later.
>> 
>> Sorry for necroing an old thread but I just wanted to add my 2p.
>
> Thanks for that though; I've been cherry-picking this patch into my
> private trees for some time now, and would be happy to see it go
> upstream in some form.
>
>> I think making 3rd party test suites easily available to developers
>> is a worthy
>> goal and there are a number that I would like to see including LTP
>> and
>> kvm-unit-tests. As others have pointed out I'm less sure about adding
>> it
>> to the gating CI.
>
> Another third-party test suite that I found useful is valgrind's.
> I'll post my thoughts about integrating wasmtime's and valgrind's
> test suites below, unfortunately I'm not too familiar with LTP and
> kvm-unit-tests.
>
> Not touching the gating CI is fine for me.
>
>> If we want to go forward with this we should probably think about how
>> we
>> would approach this generally:
>> 
>>   - tests/third-party-suites/FOO?
>
> Sounds good to me.
>
>>   - should we use avocado as a wrapper or something else?
>>     - make check-?
>
> avocado sounds good; we might have to add a second wrapper for
> producing tap output (see below).
>
> One should definitely be able to specify the testsuite and the
> architecture, e.g. `make check-third-party-wasmtime-s390x`.
>
> In addition, we need to either hardcode or let the user choose
> the way the testsuite is built and executed. I see three possibilities:
>
> - Fully on the host. Easiest to implement, the results are also easy
>   to debug. But this requires installing cross-toolchains manually,
>   which is simple on some distros and not-so-simple on others.
>
> - Provide the toolchain as a Docker image. For wasmtime, the toolchain
>   would include the Rust compiler and Cargo. This solves the problem
>   with configuring the host, but introduces the next choice one has to
>   make:
>
>   - Build qemu on the host. Then the qemu binary would have to be
>     compatible with the container (e.g. no references to the latest
>     and greatest glibc functions).
>
>     This is because the wasmtime testsuite needs to run inside the
>     container: it's driven by Cargo, which is not available on the 
>     host. It is possible to only build tests with Cargo and then run
>     the resulting binaries manually, but there is more than one and I'm
>     not sure how to get a list of them (if we decide to do this, in the
>     worst case the list can be hardcoded).
>
>     For valgrind it's a bit easier, since the test runner is not as
>     complex as Cargo, and can therefore just follow the check-tcg
>     approach.
>
>   - Build qemu inside the container. 2x space and time required, one
>     might also have to install additional -dev packages for extra qemu
>     features. Also, a decision needs to be made on whether the qemu
>     build directory ends up in the container (needs a rebuild on every
>     run), in a volume (volume lifetime needs to be managed) or in a
>     mounted host directory (this can cause selinux/ownership issues if
>     not done carefully).

I think building inside the container is the easiest way to ensure you
have all the bits. We can provide a persistent ccache and follow the
same TARGET_LIST and option rules as the cross builds to allow for
selecting a minimal subset.
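
To make that concrete, here is a rough sketch as a shell function. The
image name (qemu-wasmtime-toolchain) and the ccache volume name are
hypothetical stand-ins for whatever we end up calling them:

```shell
# Sketch: build qemu inside the toolchain container, with a persistent
# ccache in a named volume and a restricted target list. Image and
# volume names are placeholders.
build_qemu_in_container() {
    docker run --rm \
        -v qemu-ccache:/ccache -e CCACHE_DIR=/ccache \
        -v "$PWD":/src -w /src \
        qemu-wasmtime-toolchain \
        sh -c './configure --cc="ccache gcc" \
                   --target-list=s390x-linux-user && make -j"$(nproc)"'
}
```

The same shape works with podman by swapping the binary name.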

> - Provide both toolchain and testsuite as a Docker image. Essentially
>   same as above, but trades build time for download time. Also the
>   results are slightly harder to debug, since the test binaries are
>   now located inside the container.

There certainly seems to be some mileage in having the test binaries in
a volume that is on the host system - especially if they are
self-contained or built statically.
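
On the open question above of enumerating the Cargo-built test
binaries: one possible answer is to build the tests without running
them and let jq pull the executable paths out of Cargo's JSON build
messages. This is only a sketch (it assumes jq is available); each
printed path could then be run under qemu-s390x or copied into the
volume:

```shell
# Sketch: list the test executables Cargo would run, without running
# them. "compiler-artifact" messages for test profiles carry an
# "executable" field with the binary path.
list_cargo_test_binaries() {
    cargo test --no-run --message-format=json 2>/dev/null \
        | jq -r 'select(.reason == "compiler-artifact"
                        and .profile.test == true)
                 | .executable
                 | select(. != null)'
}
```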

> Sorry for the long list, it's just that since we are discussing how to
> enable this for a larger audience, I felt I needed to enumerate all the
> options and pitfalls I could think of.
>
>>   - ensuring the suites output tap for meson
>
> At the moment Rust can output either json like this:
>
> $ cargo test -- -Z unstable-options --format=json
> { "type": "suite", "event": "started", "test_count": 1 }
> { "type": "test", "event": "started", "name": "test::hello" }
> { "type": "test", "name": "test::hello", "event": "ok" }
> { "type": "suite", "event": "ok", "passed": 1, "failed": 0, "ignored":
> 0, "measured": 0, "filtered_out": 0, "exec_time": 0.001460307 }
>
> or xUnit like this:
>
> $ cargo test -- -Z unstable-options --format=junit
>
> # the following is on a single line; formatted for clarity
>
> <?xml version="1.0" encoding="UTF-8"?>
> <testsuites>
>   <testsuite name="test" package="test" id="0" errors="0" failures="0"
> tests="1" skipped="0">
>     <testcase classname="integration" name="test::hello" time="0"/>
>     <system-out/>
>     <system-err/>
>   </testsuite>
> </testsuites>
>
> I skimmed the avocado docs and couldn't find whether it can convert
> between different test output formats. Based on the source code, we can
> add an XUnitRunner the same way the TAPRunner was added.
>
> In the worst case we can pipe json to a script that would output tap.

That certainly works; there are plenty of interoperability options.
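
As a sketch of the "pipe json to a script" fallback, a jq filter
wrapped in a function could do the conversion (assuming jq; a small
Python script would work equally well):

```shell
# Hypothetical json-to-tap filter: reads `cargo test --format=json`
# events on stdin and emits minimal TAP. Only "ok"/"failed" test events
# are counted; suite and "started" events are ignored.
cargo_json_to_tap() {
    jq -r -s '
        [ .[] | select(.type == "test"
                       and (.event == "ok" or .event == "failed")) ]
        | "1..\(length)",
          ( to_entries[]
            | "\(if .value.event == "ok" then "ok" else "not ok" end) \(.key + 1) - \(.value.name)" )
    '
}
```

Fed the example json above, this would print a plan line followed by
one result line per test.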

>
> Enhancing Rust is also an option, of course, even though this might
> take some time.
>
>>   - document in docs/devel/testing.rst
>
> Right, we need this too; I totally ignored it in this patch.
>
>> Also I want to avoid adding stuff to tests/docker/dockerfiles that
>> aren't directly related to check-tcg and the cross builds. I want to
>> move away from docker.py so for 3rd party suites let's just call
>> docker/podman directly.
>
> We could add the dockerfiles (if we decide we need them based on
> the discussion above) to tests/third-party-suites/FOO. My question is,
> would it be possible to build and publish the images on GitLab? Or
> is it better to build them on developers' machines?

Probably assume they get built on developers' machines, especially as
the actual CI already hammers our GitLab storage quotas and we are not
expecting everyone to be interested in such detail.

>
>> > Best regards,
>> > Ilya


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


