From: Christopher Baines
Subject: bug#40525: inferior process on core-updates crashes: mmap(PROT_NONE) failed
Date: Thu, 16 Apr 2020 20:24:15 +0100
User-agent: mu4e 1.2.0; emacs 26.3

Christopher Baines <address@hidden> writes:
> Ludovic Courtès <address@hidden> writes:
>
>> Hi Christopher,
>>
>> Christopher Baines <address@hidden> writes:
>>
>>> I've attached a script that when run should reproduce the issue. I
>>> extracted the code relating to lint warnings from the Guix Data
>>> Service. The script attached runs this code twice against the inferior,
>>> once will often be enough to cause it to crash, but twice should
>>> reproduce it more reliably.
>>
>> Thanks a lot.
>>
>> Here’s a backtrace from the core dumped by the inferior:
>
> ...
>
>> It could be an unbounded growth of libgc’s finalizer table or our weak
>> tables as we experienced in <https://bugs.gnu.org/28590>.
>>
>> We should be able to reproduce it with something like:
>>
>> guix time-machine --commit=d523eb5c9c2659cbbaf4eeef3691234ae527ee6a -- \
>>   lint -c inputs-should-be-native,license,mirror-url,source-file-name,source-unstable-tarball,derivation,patch-file-names,formatting,synopsis
>>
>> In top one can see that heap usage keeps growing, which may well be a
>> bug in Guix proper rather than in Guile… but it doesn’t crash.
>>
>> I would propose three actions here:
>>
>> 1. Run linters under ‘gcprof’ to see what’s eating memory and hopefully
>> find and address the leak. As a start, maybe just start reducing
>> the list of checkers to see if there’s one of them that’s causing
>> it.
>>
>> The ‘derivation’ checker is definitely responsible for a lot of the
>> heap consumption because of the various caches in (guix packages) &
>> co. Perhaps add calls to ‘invalidate-derivation-caches!’ as in
>> (gnu ci).
>>
>> 2. Work around the problem in Guix Data Service by running, say, one
>> inferior per checker instead of one inferior for all checkers for
>> all packages.
>>
>> 3. If #1 didn’t help, let’s see if we can isolate a Guile weak-table
>> bug or something like that.
>>
>> Thoughts?
>
> Thanks, that's useful to know.
>
> I think I've now managed to find a way of reproducing this without the
> inferior getting in the way. I was testing if triggering garbage
> collection in Guile would help avoid the problem, but actually it seems
> to cause it. I guess given the mentions of GC in the above stacktrace,
> and the major version change of libgc, some GC related bug seems quite
> likely here.
>
> I've been testing with a checkout of Guix built with Guix from the
> core-updates branch. I think that provides the same broken Guile that
> the guix repl is using.
>
> When trying to just use a checkout of the core-updates branch, and guile
> built from that branch I get the following odd error:
>
> → ./pre-inst-env /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
> guile: warning: failed to install locale
> warning: failed to load '(gnu packages abiword)': Function not implemented
> error: git-fetch: unbound variable
> hint: Did you forget `(use-modules (guix git-download))'?
>
> error: git-version: unbound variable
>
>
>
> No idea what's happening there, but when I ./configure and make with
> packages from core-updates, I seem to end up with a setup that works:
>
> This is the guile I'm using:
> /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile
>
> If you just run the script, you should see:
>
> → ./pre-inst-env guile ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
>
> ;;; ("%package-table-setup" #<hash-table 7f5f329278a0 13275/28099>)
> mmap(PROT_NONE) failed
> Aborted
>
>
> For more information, you can pipe the script to the REPL. What you
> should see is that it's slow to compute the lint warnings the first
> time, but the subsequent times are quick, and it crashes in one of the
> (gc) calls.
>
> I'm going to try and continue looking into this, at least it'll be
> easier to delve into guile now that I can directly control what guile
> is used.
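
For what it's worth, the suggestion quoted above of narrowing the list
of checkers could be scripted roughly as below. This is only a sketch:
run_checkers_separately and LINT_CMD are names made up here, and it
assumes that "guix lint -c CHECKER" with no package arguments lints
every package with just that checker.

```shell
#!/bin/sh
# Run each lint checker in its own process, so that a crash or a runaway
# heap can be attributed to a single checker.
# Assumption: LINT_CMD is the command to run; it defaults to "guix lint".
LINT_CMD=${LINT_CMD:-guix lint}

run_checkers_separately() {
    # $1: comma-separated checker names, in the form accepted by "-c".
    failed=""
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the list on commas
    IFS=$old_ifs
    for checker in "$@"; do
        if $LINT_CMD -c "$checker" >/dev/null 2>&1; then
            echo "ok $checker"
        else
            echo "FAILED $checker"
            failed=1
        fi
    done
    # Succeed only if every checker ran to completion.
    [ -z "$failed" ]
}

# Hypothetical usage:
# run_checkers_separately "license,derivation,synopsis"
```

Setting LINT_CMD to the "guix time-machine --commit=... -- lint" form
quoted above should make it run against that commit instead.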

Following up on this, I've built Guile on core-updates with libgc@7
rather than libgc@8 (which is what's used above), and I can't reproduce
the issue. So I'm increasingly confident that this is a regression
introduced by the libgc upgrade.

Would it be feasible to keep Guile, or at least the Guile that Guix uses,
on libgc@7 for now?
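
Since a single run of the script doesn't always crash, a comparison
between the two builds is more convincing over several runs. A minimal
sketch of how that could be done follows; count_failures, GUILE_GC7 and
GUILE_GC8 are hypothetical names, the latter two standing for the paths
of the two Guile builds.

```shell
#!/bin/sh
# Run a command N times and print how many runs exited with a non-zero
# status (an abort such as "mmap(PROT_NONE) failed" counts as one).
count_failures() {
    # $1: number of runs; remaining arguments: the command to run.
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Hypothetical usage, with GUILE_GC7 and GUILE_GC8 set by the caller:
# count_failures 5 ./pre-inst-env "$GUILE_GC8" ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
# count_failures 5 ./pre-inst-env "$GUILE_GC7" ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
```

If the libgc@8 build fails in most runs and the libgc@7 build in none,
that would support the regression theory without proving it.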