bug#35521: Mariadb test suite failures on x86_64-linux
From: Chris Marusich
Subject: bug#35521: Mariadb test suite failures on x86_64-linux
Date: Tue, 09 Jul 2019 23:18:57 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
Hi,
I've been encountering this failure off and on for a few weeks now, and
I'd like to help fix it. In short, the test failures look
non-deterministic to me. I think we should gather data and report the
issue upstream, and maybe disable the offending tests in the meantime.
Mariadb failed for me earlier today with a different error than the ones
observed in this bug report so far. My error was the following (when
building mariadb 10.1.40 on an x86_64-linux system using Guix 9b2644c):
Failure: Failed 1/1990 tests, 99.95% were successful.
Failing test(s): tokudb_bugs.5733_innodb
The log files in var/log may give you some hint of what went wrong.
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
558 tests were skipped, 169 by the test itself
I kept the failed build directory, but there is no "var" directory to be
found there. I guess they meant system logs; I am not sure where such
logs would go when emitted from within a derivation.
The MySQL website suggested running mysql-test-run.pl with the --force
option, which I casually tried after invoking ". environment-variables"
from the failed build directory; however, it promptly failed because it
could not find 'my_safe_process' - maybe I didn't have everything set up
just so to run the tests manually.
Curiously, on a different x86_64-linux machine, using Guix commit
6c83c48 (which is only a few commits ahead of 9b2644c), I was able to
build mariadb successfully, although I am not sure when I built it
(running "guix build mariadb" currently results in quick success for me,
so on this machine I probably built or substituted it some time ago).
The derivation (without grafts) was identical to the one that failed to
build on the other machine, which is strange because I would normally
expect the same derivation to succeed on both machines. For the record,
this was the derivation:
$ guix build --no-grafts -d mariadb
/gnu/store/9yw33r8r84qrsic7fiq0lqqkbzisv1cj-mariadb-10.1.40.drv
Perhaps these tests fail non-deterministically? Or perhaps they fail in
a way that is specific to something not isolated from the build process
by Guix, such as the kernel, the file system, or the hardware?
I tried to check the status of mariadb in Cuirass. However, I only
found the following information:
https://ci.guix.gnu.org/search?query=mariadb-10.1.40
For x86_64-linux, build 1304242 supposedly failed at 10 May 20:32 +0200
after about 3 hours of runtime:
https://ci.guix.gnu.org/build/1304242/details
I say "supposedly failed" because I'm not sure why it failed. The build
log seems to indicate no problems:
https://ci.guix.gnu.org/build/1304242/log/raw
Has Cuirass tried to build mariadb since then? May 10th was a long time
ago, and I am surprised there is not another build of it from master.
Mark H Weaver <address@hidden> writes:
> Mark H Weaver <address@hidden> writes:
>
>> The same build also failed twice in a row on my Thinkpad X200, and with
>> the same error each time, although it's a different error than happens
>> on hydra.gnunet.org. On my X200, I get this instead:
>>
>>> Failure: Failed 1/1091 tests, 99.91% were successful.
>>>
>>> Failing test(s): tokudb_bugs.mdev4533
>
> and it just failed a third time on my X200, again with the same error.
It seems like the tests may be flaky. The test failure I saw was
different from yours. And in my case, I actually was able to build (or
substitute) mariadb once. So maybe what we need to do is gather enough
data to report the problem upstream, to enlist their help?
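To make "gather data" a bit more concrete, here is a rough sketch of how
the failing tests could be tallied across several saved build logs. It
only relies on the "Failing test(s):" summary line shown above; the
function names (failing_tests, tally) are made up for illustration, and I
am assuming that several failing tests would be listed space-separated on
that one line, which I have not verified.

```python
import re
import sys
from collections import Counter

# Matches the summary line printed at the end of a failed test run, e.g.
#   Failing test(s): tokudb_bugs.5733_innodb
# Assumption: multiple failing tests are space-separated on this line.
FAILING_RE = re.compile(r"^[ \t]*Failing test\(s\):[ \t]*(.+)$", re.MULTILINE)


def failing_tests(log_text):
    """Return the test names listed on 'Failing test(s):' lines."""
    names = []
    for match in FAILING_RE.finditer(log_text):
        names.extend(match.group(1).split())
    return names


def tally(logs):
    """Count, for each test, in how many logs it was reported as failing."""
    counts = Counter()
    for text in logs:
        # Use a set so a test is counted at most once per log.
        counts.update(set(failing_tests(text)))
    return counts


if __name__ == "__main__":
    # Each command-line argument is the path of a plain-text build log.
    texts = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as log_file:
            texts.append(log_file.read())
    for name, count in tally(texts).most_common():
        print(f"{count:4d}  {name}")
```

Run over the logs from several machines, a table like this might show
whether the same few tokudb tests keep failing or whether the failures
are spread across the suite, which seems like useful information for an
upstream report.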
Platoxia <address@hidden> writes:
> This problem persists and is preventing successful completion of guix system
> reconfigure for pre-1.0.0 systems (at least mine, which is still at kernel
> 4.20), not only for those using mariadb but also for anyone using any of the
> 544 packages that depend on it (as per the command guix graph
> --type=reverse-package mariadb | grep -c label).
>
> This could, potentially, be fixed by simply adding this test to the list of
> disabled tests in the package definition:
>
> --- snip ---
> (add-after 'unpack 'adjust-tests
>   (lambda _
>     (let ((disabled-tests
>            '(;; These fail because root@hostname == root@localhost in
>              ;; the build environment, causing a user count mismatch.
>              ;; See <https://jira.mariadb.org/browse/MDEV-7761>.
>              "main.join_cache"
>              "main.explain_non_select"
>              "main.stat_tables_innodb"
>              "roles.acl_statistics"
>
>              ;; This file contains a time bomb which makes it fail after
>              ;; 2030-12-31. See <https://bugs.gnu.org/34351> for details.
>              "main.mysqldump"
>
>              ;; XXX: Fails sporadically.
>              "innodb_fts.crash_recovery"
>
>              ;; FIXME: This test fails on i686:
>              ;; -myisampack: Can't create/write to file (Errcode: 17 "File exists")
>              ;; +myisampack: Can't create/write to file (Errcode: 17 "File exists)
>              ;; When running "myisampack --join=foo/t3 foo/t1 foo/t2"
>              ;; (all three tables must exist and be identical)
>              ;; in a loop it produces the same error around 1/240 times.
>              ;; montywi on #maria suggested removing the real_end check in
>              ;; "strings/my_vsnprintf.c" on line 503, yet it still does not
>              ;; reach the ending quote occasionally. Disable it for now.
>              "main.myisampack"
>              ;; FIXME: This test fails on armhf-linux:
>              "mroonga/storage.index_read_multiple_double"))
>
>           ;; This file contains a list of known-flaky tests for this
>           ;; release. Append our own items.
>           (unstable-tests (open-file "mysql-test/unstable-tests" "a")))
>       (for-each (lambda (test)
>                   (format unstable-tests "~a : ~a\n"
>                           test "Disabled in Guix"))
>                 disabled-tests)
>       (close-port unstable-tests)
> --- snip ---
>
> I say "potentially" because after getting this failure I happened to notice
> that approximately one and a half minutes after beginning the build of
> /gnu/store/c46sn2yfllcfi86p8227wvvr1bxssgxj-mariadb-10.1.38.drv the kernel
> throws this message: "traps: cmTC_35af5[27766] trap invalid opcode
> ip:555555555174 sp:7fffffffcc90 error:0 in cmTC_35af5[555555555000+1000]".
>
> I have retested this several times and confirmed that this occurs each and
> every time mariadb-10.1.38.drv tries to build and in approximately the same
> amount of time after starting the build. I say approximately because the
> closest I could get to a timeframe on this kernel message in relation to the
> mariadb build is by sending the stdout from guix system reconfigure through
> logger so that it gets printed with a timestamp to the kernel messages
> terminal (alt-F12).
>
> Specifically, the message sequence is always as follows, without deviation
> (other than the cmTC_#), with no related messages in between; as per the
> command cat /dev/vcs12:
>
> --- snip ---
> May 9 16:36:35 localhost root cmd: guix system reconfigure: building
> /gnu/store/c46sn2yfllcfi86p8227wvvr1bxssgxj-mariadb-10.1.38.drv...
> May 9 16:38:08 localhost vmunix: [ 9169.050496] traps: cmTC_35af5[27766]
> trap invalid opcode ip:555555555174 sp:7fffffffcc90 error:0 in
> cmTC_35af5[555555555000+1000]
> --- snip ---
>
> I really suggest trying to simply add the tokudb_alter_table.hcad_all_add
> test to the package definition before trying to solve the overall problem,
> though. Maybe we can get this in for 1.0.1?
>
> I would be willing to do this myself and report the results here but I'm
> baffled at how to achieve this simple task. Perhaps someone could walk me
> through it?
I'm not sure about the kernel error. I haven't seen an error like that
myself. But perhaps this is yet another test which is failing
non-deterministically?
I think we need more data. It would be nice if we could build this
repeatedly on Cuirass. When the build is 3 hours long, it is difficult
to test it on my machine, and I often forget about it by the time it is
done running.
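Until something like that is possible on Cuirass, a small wrapper could
at least let the builds run unattended and keep every log. The following
is only a sketch: run_repeatedly is a name I made up, and the command to
run is left as a parameter because I am not sure which guix invocation is
best for forcing a rebuild on a given machine.

```python
import subprocess
from pathlib import Path


def run_repeatedly(command, rounds, log_dir):
    """Run COMMAND (a list of strings) ROUNDS times, one log file per run.

    Returns a list of (log_path, exit_code) pairs, so that the runs which
    failed can be picked out afterwards.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for i in range(1, rounds + 1):
        log_path = log_dir / f"run-{i:03d}.log"
        with open(log_path, "wb") as log:
            # Capture stdout and stderr together so the test summary
            # and any error messages end up in the same file.
            proc = subprocess.run(command, stdout=log,
                                  stderr=subprocess.STDOUT)
        results.append((log_path, proc.returncode))
    return results
```

For example, run_repeatedly(["guix", "build", "--check", "--no-grafts",
"mariadb"], 5, "mariadb-logs") might work on a machine where mariadb is
already in the store, since --check asks for a rebuild of an existing
store item; where the build has never succeeded, a plain "guix build
mariadb" should rebuild each time anyway. The saved logs can then be
searched for the "Failing test(s):" line.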
If I get more time, I will try to dig in more. In the meantime, any
thoughts about this would be welcome.
--
Chris