From: Ludovic Courtès
Subject: bug#55441: [cuirass] hang in "In progress..."; runs out of pgsql connections
Date: Tue, 24 May 2022 23:02:13 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hi!

Ludovic Courtès <ludo@gnu.org> skribis:

> Fixed in Guix commit a4994d739306abcf3f36706012fb88b35a970e6b with a
> test that reproduces the issue.
>
> Commit d02b7abe24fac84ef1fb1880f51d56fc9fb6cfef updates the ‘guix’
> package so we should be able to reconfigure berlin now and hopefully
> (crossing fingers!) be done with it.

An update: Cuirass is now up-to-date on berlin.guix, built from Guix
commit adf5ae5a412ed13302186dd4ce8e2df783d4515d.

Unfortunately, while evaluations now run to completion, child processes
of ‘cuirass evaluate’ stick around at the end:

--8<---------------cut here---------------start------------->8---
(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0x7f5b1d054f08) at ../sysdeps/nptl/futex-internal.h:146
#1  __lll_lock_wait (futex=futex@entry=0x7f5b1d054f08, private=0) at lowlevellock.c:52
#2  0x00007f5b1d873ef3 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f5b1d054f08) at ../nptl/pthread_mutex_lock.c:80
#3  0x00007f5b1d995303 in scm_c_weak_set_remove_x (pred=<optimized out>, closure=0x7f5b13dd8d00, raw_hash=1824276156261873434, set=#<weak-set 7f5b156772f0>) at weak-set.c:794
#4  scm_weak_set_remove_x (obj=#<port #<port-type file 7f5b1567ab40> 7f5b13dd8d00>, set=#<weak-set 7f5b156772f0>) at weak-set.c:817
#5  close_port (explicit=<optimized out>, port=#<port #<port-type file 7f5b1567ab40> 7f5b13dd8d00>) at ports.c:891
#6  close_port (port=#<port #<port-type file 7f5b1567ab40> 7f5b13dd8d00>, explicit=<optimized out>) at ports.c:874
#7  0x00007f5af3a7df82 in ?? ()
#8  0x0000000000dbd860 in ?? ()
#9  0x00007f5af3a7df60 in ?? ()
#10 0x0000000000db82b8 in ?? ()
#11 0x00007f5b1d972ccc in scm_jit_enter_mcode (thread=0x7f5b157bf240, mcode=0xdbd86c "\034\217\003") at jit.c:6038
#12 0x00007f5b1d9c7f3c in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:360
#13 0x00007f5b1d9d55e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=0) at vm.c:1608
#14 0x00007f5b1d939a0e in scm_call_with_unblocked_asyncs (proc=#<program 7f5aebcd7f40>) at async.c:406
#15 0x00007f5b1d9c8336 in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:972
#16 0x00007f5b1d9d55e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=0) at vm.c:1608
#17 0x00007f5b1d9c4be6 in really_launch (d=0x7f5aebccac80) at threads.c:778
#18 0x00007f5b1d93b85a in c_body (d=0x7f5aea691d80) at continuations.c:430
#19 0x00007f5aeeb118c2 in ?? ()
#20 0x00007f5b1553d7e0 in ?? ()
#21 0x00007f5b138a7370 in ?? ()
#22 0x0000000000000048 in ?? ()
#23 0x00007f5b1d972ccc in scm_jit_enter_mcode (thread=0x7f5b157bf240, mcode=0xdbc874 "\034<\003") at jit.c:6038
#24 0x00007f5b1d9c7f3c in vm_regular_engine (thread=0x7f5b157bf240) at vm-engine.c:360
#25 0x00007f5b1d9d55e9 in scm_call_n (proc=<optimized out>, argv=<optimized out>, nargs=2) at vm.c:1608
#26 0x00007f5b1d93d09a in scm_call_2 (proc=<optimized out>, arg1=<optimized out>, arg2=<optimized out>) at eval.c:503
#27 0x00007f5b1d9f3752 in scm_c_with_exception_handler.constprop.0 (type=#t, handler_data=handler_data@entry=0x7f5aea691d10, thunk_data=thunk_data@entry=0x7f5aea691d10, thunk=<optimized out>, handler=<optimized out>) at exceptions.c:170
#28 0x00007f5b1d9c588f in scm_c_catch (tag=<optimized out>, body=<optimized out>, body_data=<optimized out>, handler=<optimized out>, handler_data=<optimized out>, pre_unwind_handler=<optimized out>, pre_unwind_handler_data=0x7f5b156b2040) at throw.c:168
#29 0x00007f5b1d93de66 in scm_i_with_continuation_barrier (pre_unwind_handler=0x7f5b1d93db80 <pre_unwind_handler>, pre_unwind_handler_data=0x7f5b156b2040, handler_data=0x7f5aea691d80, handler=0x7f5b1d9448b0 <c_handler>, body_data=0x7f5aea691d80, body=0x7f5b1d93b850 <c_body>) at continuations.c:368
#30 scm_c_with_continuation_barrier (func=<optimized out>, data=<optimized out>) at continuations.c:464
#31 0x00007f5b1d9c4b39 in with_guile (base=0x7f5aea691e08, data=0x7f5aea691e30) at threads.c:645
#32 0x00007f5b1d89b0ba in GC_call_with_stack_base () from /gnu/store/2lczkxbdbzh4gk7wh91bzrqrk7h5g1dl-libgc-8.0.4/lib/libgc.so.1
#33 0x00007f5b1d9bd16d in scm_i_with_guile (dynamic_state=<optimized out>, data=0x7f5aebccac80, func=0x7f5b1d9c4b70 <really_launch>) at threads.c:688
#34 launch_thread (d=0x7f5aebccac80) at threads.c:787
#35 0x00007f5b1d871d7e in start_thread (arg=0x7f5aea692640) at pthread_create.c:473
#36 0x00007f5b1d46feff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) info threads
  Id   Target Id             Frame
* 1    process 53801 "guile" futex_wait (private=0, expected=2, futex_word=0x7f5b1d054f08) at ../sysdeps/nptl/futex-internal.h:146
--8<---------------cut here---------------end--------------->8---

Notice there’s a single thread: this looks very much like the random
breakage one gets when forking a multithreaded process.  In this case,
the surviving thread is a finalization thread, but it runs in a child
process that no longer has the other Guile threads, so the mutex it is
waiting on was locked at fork time and will never be released.  The
fork+threads problem is manifesting after all.
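
To illustrate what the child is up against, here is a minimal C sketch
(illustration only, not Cuirass or Guile code; all names are made up):
a helper thread owns a mutex when the main thread calls fork(), and the
child, which has only one thread left, blocks forever when it tries to
take that same lock, much like the finalization thread stuck in
futex_wait above.

--8<---------------cut here---------------start------------->8---
/* Sketch of the fork+threads hazard described above; illustration
   only, not Cuirass code.  A helper thread owns a mutex when the main
   thread calls fork(); the child inherits the locked mutex but not
   its owner, so locking it in the child blocks forever.  */
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *
hold_lock (void *arg)
{
  pthread_mutex_lock (&lock);     /* helper thread takes the lock...  */
  sleep (5);                      /* ...and holds it across the fork  */
  pthread_mutex_unlock (&lock);
  return NULL;
}

int
main (void)
{
  pthread_t tid;
  pthread_create (&tid, NULL, hold_lock, NULL);
  sleep (1);                      /* let the helper acquire the lock  */

  pid_t pid = fork ();
  if (pid == 0)
    {
      /* Child: this is the only thread here; the lock's owner was not
         duplicated, so this call never returns.  */
      pthread_mutex_lock (&lock);
      _exit (0);                  /* never reached */
    }

  sleep (3);
  kill (pid, SIGKILL);            /* reap the wedged child */
  waitpid (pid, NULL, 0);
  pthread_join (tid, NULL);
  return 0;
}
--8<---------------cut here---------------end--------------->8---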

I’ll try and come up with a solution to that, if nobody beats me to it.
What’s annoying is that it’s not easy to test: the problem doesn’t
manifest on my 4-core laptop, but it does on the 96-core berlin.
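
For reference, the textbook mitigation for this class of problem is to
register pthread_atfork(3) handlers so the problematic lock is acquired
around fork() and thus never inherited in a locked state, as sketched
below (hypothetical handler names; whether this, or simply avoiding the
offending code paths in the child, is the right fix for Guile/Cuirass
remains to be seen):

--8<---------------cut here---------------start------------->8---
/* Sketch of the classic pthread_atfork() mitigation (illustration
   only, not the actual fix): take the lock before fork() and release
   it in both parent and child, so the child never starts life with
   the mutex held by a thread it does not have.  */
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void
prepare (void)
{
  pthread_mutex_lock (&lock);
}

static void
parent (void)
{
  pthread_mutex_unlock (&lock);
}

static void
child (void)
{
  pthread_mutex_unlock (&lock);
}

static void
install_fork_handlers (void)
{
  pthread_atfork (prepare, parent, child);
}
--8<---------------cut here---------------end--------------->8---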

To be continued…

Ludo’.
