From: Maxim Cournoyer
Subject: bug#48903: guix substitute: error: TLS error in procedure 'read_from_session_record_port': Error decoding the received TLS packet.
Date: Wed, 30 Jun 2021 12:26:32 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Hello!
Ludovic Courtès <ludo@gnu.org> writes:
> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> $ sudo ps -eF | grep guix-daemon
>> root 25193 216 0 3074 1524 3 Jun28 ? 00:00:00
>> /gnu/store/vphx2839xv0qj9xwcwrb95592lzrrnx7-guix-1.3.0-3.50dfbbf/bin/guix-daemon
>> 25178 guixbuild --max-silent-time 0 --timeout 0 --log-compression
>> none --discover=no --substitute-urls http://127.0.0.1:8080
>> https://ci.guix.gnu.org --max-jobs=4
>> --8<---------------cut here---------------end--------------->8---
>>
>> I can rather easily (and annoyingly!) trigger the problem (and a few
>> variations of it, it seems) with something like:
>>
>> $ packages=$(guix refresh -l protobuf | sed 's/^.*: //')
>> $ guix build -v3 --keep-going $packages
>>
>> For example, running the above, I just got:
>>
>> guix build: error: corrupt input while restoring archive from #<closed:
>> file 7fc95acfc2a0>
>> --8<---------------cut here---------------end--------------->8---
>>
>> Do the above commands succeed the first time on your end? If you
>> already have lots of things cached, you can try for an architecture you
>> don't often build for by adding the '--system=i686-linux' option; that
>> should cause a massive amount of downloads, likely to trigger the
>> problem. Perhaps also try to use --max-jobs=4.
>
> I’ve tried that, with --max-jobs=4, and it fills my disk just fine. :-/
>
>> If you have ideas of how to debug this when I hit the issue I'm all ears
>> :-).
>
> The attached patch substitutes a number of store items in a row; run:
>
> guix repl -- substitute-stress.scm
>
> and it’ll fill /tmp/substitute-test with 200 substitutes, which should
> be equivalent to the kind of stress test you had above.
>
> It doesn’t crash for me. There are a few “error: no valid substitute
> for /gnu/store/…” errors, but these are expected: we ask for
> substitutes for 200 packages without first checking whether substitutes
> are available.
>
> Could you run it and report back?
>
> You can try with more packages, different substitute URLs, etc.
>
> TIA!
>
> Ludo’.
[...]
I've tried with the following modified version which runs multiple
threads in parallel (to mimic --max-jobs=4 on the daemon), and I've yet
to trigger it, although the hard drive is grinding heavily:
--8<---------------cut here---------------start------------->8---
(use-modules (guix) (gnu packages)
             (guix scripts substitute)
             (guix grafts)
             (guix build utils)
             (srfi srfi-1)
             (ice-9 match)
             (ice-9 threads))

(define test-directory "/tmp/substitute-test")
(define max-jobs 4)

(define packages
  ;; Subset of packages for which we request substitutes.
  (append (map specification->package '("libreoffice"
                                        "ungoogled-chromium"
                                        "openjdk"
                                        "texmacs"))
          (take (fold-packages cons '()) 1000)))

(define (spawn-substitution-thread input urls)
  "Spawn a 'guix substitute' thread that reads commands from INPUT and uses
URLS as the substitute servers."
  (call-with-new-thread
   (lambda ()
     (parameterize ((%reply-file-descriptor #f)
                    (current-input-port input))
       (setenv "_NIX_OPTIONS"
               (string-append "substitute-urls=" (string-join urls)))
       (let loop ()
         (format (current-error-port) "starting substituter~%")
         ;; Catch "no valid substitute" errors.
         (catch 'quit
           (lambda ()
             (guix-substitute "--substitute"))
           (const #f))
         (unless (eof-object? (peek-char input))
           (loop)))))))

(for-each (lambda (job)
            (match (pipe)
              ((input . output)
               (let ((test-directory* (string-append test-directory "-"
                                                     (number->string job)))
                     (thread (spawn-substitution-thread
                              input %default-substitute-urls)))
                 ;; Remove the per-job test directory.
                 (when (file-exists? test-directory*)
                   (for-each (lambda (f)
                               (false-if-exception (make-file-writable f)))
                             (find-files test-directory* #:directories? #t))
                   (delete-file-recursively test-directory*))
                 (mkdir-p test-directory*)
                 (parameterize ((%graft? #false))
                   (with-store store
                     ;; Ask for substitutes for PACKAGES.
                     (for-each (lambda (package n)
                                 (define item
                                   (run-with-store store
                                     (package-file package)))
                                 (format output "substitute ~a ~a/~a~%"
                                         item test-directory* n))
                               packages
                               (iota (length packages))))
                   (format #t "sent ~a substitution requests...~%"
                           (length packages))
                   (close-port output)
                   ;; Wait for substitution to complete.
                   (join-thread thread))))))
          (iota max-jobs))
--8<---------------cut here---------------end--------------->8---
I wonder if there's something more happening in the real scenario
(validating signatures when putting things in the store? or something
similar) that may have a role in the failure.
That's a tough nut to crack!
I'll keep looking for clues.
Thanks for your time!
Maxim