bug-guix

bug#21097: verify-store test failure on armhf-linux


From: Chris Marusich
Subject: bug#21097: verify-store test failure on armhf-linux
Date: Fri, 08 Jun 2018 01:21:33 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)

Ludovic Courtès writes:

> I’ve become convinced that this is due to parallelism: several
> guix-daemon processes run at the same time.  In this case, I bet this
> process tries to remove an item from the ValidPaths table while another
> is trying to add it in the Refs table or something.
>
> In dc57d527 I added #:parallel-tests? #f for ‘guix-devel’.  Eventually
> we should fix the makefile to run this test alone, as is done for
> ‘guix-gc.sh’.

In the 2 years and 7 months since we disabled parallel tests in commit
dc57d527aee4eb18ec5fb345f90d6637bbd1a4d2 to work around this bug, we may
have allowed other parallelism bugs to quietly creep in.  Today, I
observed a parallel test failure that seems unrelated to the original
bug reported here.  And anecdotally, I feel that the tests frequently
fail spuriously when I run them in parallel.  Until we get to the bottom
of this, I agree that the best thing to do is to always run the tests in
serial.

For completeness, below I'll report the failure I observed today.

On my x86_64-linux GuixSD machine, using Guix version
0ec430f79530ee343c175347952f91a78adca5ec (this is what my
~/.config/guix/latest points to), I entered a Guix development
environment via "guix environment guix".  In Guix's Git repository, I
checked out commit 4dd91dff477b9717b3fa494b23976e4d69ab7dfc (the current
tip of core-updates) and ran the following commands:

    ./bootstrap && ./configure --localstatedir=/var && make -j \
        && make -j check

The following tests failed:

    FAIL: tests/guix-hash.sh
    FAIL: tests/guix-download.sh
    FAIL: tests/guix-build.sh
    FAIL: tests/guix-package.sh
    FAIL: tests/guix-system.sh

When I immediately ran "make recheck" without making any changes, the
same 5 tests passed.  Note that this ran the tests in serial because I
omitted -j.  When I ran the same 5 tests again in parallel using the
following command, they all passed:

    make -j check TESTS="tests/guix-hash.sh tests/guix-download.sh \
        tests/guix-build.sh tests/guix-package.sh tests/guix-system.sh"

I also tried running just tests/guix-hash.sh and tests/guix-download.sh
together 10 times in serial and then 10 times in parallel.
Unfortunately, this didn't reproduce the failure, either (i.e., all 20
test runs passed).
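
For anyone who wants to repeat this kind of experiment, here is a rough
sketch of a loop that runs a command several times and counts the
failures.  The helper name "repeat_runs" is made up for illustration and
is not part of Guix; the "make check" invocations in the comments simply
reuse the TESTS variable shown above.

```shell
# Hypothetical helper (not part of Guix): run a command COUNT times and
# print how many of those runs exited with a non-zero status.
# Usage: repeat_runs COUNT COMMAND [ARGS...]
repeat_runs() {
    count=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$count" ]; do
        # The command's own output is discarded in this sketch; only
        # its exit status is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example invocations, to be run from a configured Guix checkout:
#   repeat_runs 10 make check TESTS="tests/guix-hash.sh tests/guix-download.sh"
#   repeat_runs 10 make -j check TESTS="tests/guix-hash.sh tests/guix-download.sh"
```

A result of 0 from both invocations would match what I saw.  Because the
sketch throws away the command output, a failing run would need to be
repeated by hand to inspect its log.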

All in all, this suggests that the failures I observed are probably
caused by a parallelism bug that shows up only when the entire test
suite runs in parallel.

Regarding the cause of failure, the 5 tests all failed with a message
like the following:

--8<---------------cut here---------------start------------->8---
ERROR: In procedure canonicalize-path:
In procedure canonicalize-path: No such file or directory
+ guix download --version
Backtrace:
In ice-9/boot-9.scm:
  2875:24 19 (_)
   222:17 18 (map1 (((guix utils)) ((guix config)) ((guix #)) ((…)) …))
  2788:17 17 (resolve-interface (guix utils) #:select _ #:hide _ # _ …)
  2714:10 16 (_ (guix utils) _ _ #:ensure _)
  2982:16 15 (try-module-autoload _ _)
   2312:4 14 (save-module-excursion #<procedure 1397630 at ice-9/boo…>)
  3002:22 13 (_)
In unknown file:
          12 (primitive-load-path "guix/utils" #<procedure 130d260 a…>)
In guix/utils.scm:
     26:0 11 (_)
In ice-9/boot-9.scm:
   2862:4 10 (define-module* _ #:filename _ #:pure _ #:version _ # _ …)
  2875:24  9 (_)
   222:17  8 (map1 (((guix config)) ((srfi srfi-1)) ((srfi #)) (#) …))
  2788:17  7 (resolve-interface (guix config) #:select _ #:hide _ # _ …)
  2714:10  6 (_ (guix config) _ _ #:ensure _)
  2982:16  5 (try-module-autoload _ _)
   2312:4  4 (save-module-excursion #<procedure 13975d0 at ice-9/boo…>)
  3002:22  3 (_)
In unknown file:
           2 (primitive-load-path "guix/config" #<procedure 130d1a0 …>)
In guix/config.scm:
     86:6  1 (_)
In unknown file:
           0 (canonicalize-path "/home/marusich/guix/test-tmp/db")
--8<---------------cut here---------------end--------------->8---

All the test failures looked the same, except that instead of "guix
download --version", the equivalent command (e.g., "guix system
--version") was invoked.
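
The last frame of the backtrace is a canonicalize-path call on the
test-tmp/db directory that fails with "No such file or directory".  The
following is only a shell analogue of that failure mode, not Guix code:
the function name "canonical_dir" and the temporary directory are made
up, and the "rm" merely stands in for whatever left the directory
missing.  Whether the directory was removed by another test or had not
been created yet is something I have not established.

```shell
# Resolve a directory to its canonical absolute path, failing when the
# directory does not exist.  This is a rough shell stand-in for what
# canonicalize-path does; it is an illustration, not Guix code.
canonical_dir() {
    ( cd "$1" 2>/dev/null && pwd -P )
}

demo_root=$(mktemp -d)
mkdir "$demo_root/db"

# While the directory exists, resolution succeeds.
canonical_dir "$demo_root/db" >/dev/null && first=ok || first=missing

# Assumption for the demo: something else removes the directory.
rm -rf "$demo_root/db"

# Now resolution fails, much like the error in the backtrace.
canonical_dir "$demo_root/db" >/dev/null && second=ok || second=missing

echo "before removal: $first; after removal: $second"
rm -rf "$demo_root"
```

This prints "before removal: ok; after removal: missing".  It shows
only that any command resolving the directory fails while the directory
is absent, which is consistent with all 5 tests failing in the same way.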

I realize this information doesn't help solve the original bug reported
here.  However, it's a real failure, so I hope it'll be useful.  In any
case, it shows that there are probably multiple parallelism bugs lurking
in our code now.  We're going to have to solve all those parallelism
bugs before we can reliably run the tests in parallel again.

-- 
Chris


