bug#59884: ‘gui-installed-desktop-os-encrypted’ test intermittent failures
From: Ludovic Courtès
Subject: bug#59884: ‘gui-installed-desktop-os-encrypted’ test intermittent failures
Date: Fri, 09 Dec 2022 23:32:05 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Hi,
Mathieu Othacehe <othacehe@gnu.org> skribis:
> I spent days on that issue before. It used to show up on all installer
> tests, and even on real hardware, then
> 8ce6f4dc2879919c12bc76a2f4b01200af97e01 mitigated it.
>
> The installation is now made in a container to make sure that we are
> later on able to umount the store overlay even though some background
> processes such as kmscon or udev opened files from the overlay.
Right.
> Now the issue only shows up on that specific test and is intermittent as
> you noticed.
On this particular test it seems to be frequent.
> To be honest, that was quite painful to debug and I'm a bit scared to
> jump back in. I think I had the marionette produce some lsof reports
> back then, or something like that. I very much regret not to have kept
> notes somewhere.
Yeah. One possibility is timing: (restart-service 'guix-daemon) kills
the daemon’s process group, waits for the group leader to terminate,
then starts the daemon. I think there’s a possibility that other
processes in the group (like a ‘guix substitute’ child) are still alive
at that point and they might be the ones keeping the device busy.
Sleeping for a couple of seconds should allow us to kinda verify that
hypothesis. However, the fact that it’s only in the cryptsetup case
suggests this hypothesis might well be completely bogus.
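To make the timing hypothesis concrete, here is a minimal sketch in Python
(not the Shepherd's actual code) of waiting for a whole process group to
disappear rather than only its leader. The helper names `group_is_alive`
and `wait_for_group_exit`, the timeout, and the polling interval are all
assumptions made for illustration.

```python
import os
import time


def group_is_alive(pgid):
    """Return True if at least one process still belongs to group PGID."""
    try:
        # Signal 0 performs the existence and permission checks only;
        # nothing is actually delivered to the processes.
        os.killpg(pgid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The group exists but we may not signal it: still alive.
        return True
    return True


def wait_for_group_exit(pgid, timeout=5.0, poll_interval=0.1):
    """Poll until group PGID is empty; return False if TIMEOUT expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not group_is_alive(pgid):
            return True
        time.sleep(poll_interval)
    return not group_is_alive(pgid)
```

With something like this, a restart would only proceed once
`wait_for_group_exit` returns true, instead of sleeping for a fixed couple
of seconds. Whoever is the parent of the terminated processes still has to
reap them, otherwise the group may keep looking alive.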
But yeah, invoking ‘lsof’ would tell us.
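Short of running ‘lsof’ itself, a rough approximation is to scan
/proc/PID/fd on Linux for descriptors that point below a given directory.
The sketch below is Python, and the function name `open_files_under` is
hypothetical. Unlike ‘lsof’, it only looks at open file descriptors; it
ignores working directories and memory mappings, and it silently skips
processes it is not allowed to inspect.

```python
import os


def open_files_under(prefix):
    """Map PID -> list of open file paths located under directory PREFIX."""
    prefix = os.path.realpath(prefix)
    root = prefix.rstrip(os.sep) + os.sep
    holders = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process vanished or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed in the meantime
            if target == prefix or target.startswith(root):
                holders.setdefault(int(entry), []).append(target)
    return holders
```

Calling `open_files_under` on the mount point right before the umount
attempt would list the processes still holding files there, as far as this
process is permitted to see them.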
I guess we’ll leave that as future work though.
Thanks for your feedback!
Ludo’.