Re: Hanging conftest

bug-gnulib

From:	Eric Blake
Subject:	Re: Hanging conftest
Date:	Wed, 27 Nov 2013 10:27:20 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0

[adding glibc]

On 11/27/2013 09:58 AM, Michal Privoznik wrote:
> Hey guys,
> 
> I've just discovered a bug, well a hang in conftest. This is what I ran:
> 
> libvirt.git $ git clean -fxd; ./autogen.sh --system
> 
> and all looked good until this:
> 
> checking whether readlink signature is correct... yes
> checking whether readlink handles trailing slash correctly... yes
> checking for working re_compile_pattern... 
> 
> Then the configure script hung and didn't continue. Attaching a debugger to the
> hanging conftest process showed:
> 

> __lll_lock_wait_private () at 
> ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
> 93      ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file 
> or directory.
> (gdb) bt
> #0  __lll_lock_wait_private () at 
> ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93
> #1  0x0000003274e7fbd3 in _L_lock_11326 () at malloc.c:5236
> #2  0x0000003274e7dd55 in __GI___libc_malloc (bytes=53) at malloc.c:2921

Sounds like glibc is trying to obtain the malloc lock...

> #3  0x0000003274a0533a in local_strdup (s=0x7feab7a0bf21 
> "/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.2/libgcc_s.so.1") at dl-load.c:162
> #4  0x0000003274a08588 in _dl_map_object (address@hidden, address@hidden 
> "libgcc_s.so.1", address@hidden, address@hidden, address@hidden, 
> nsid=<optimized out>)
>     at dl-load.c:2249
> #5  0x0000003274a12a2c in dl_open_worker (address@hidden) at dl-open.c:225
> #6  0x0000003274a0e8c4 in _dl_catch_error (address@hidden, address@hidden, 
> address@hidden, address@hidden <dl_open_worker>, 
>     address@hidden) at dl-error.c:178
> #7  0x0000003274a124b1 in _dl_open (file=0x3274f60e30 "libgcc_s.so.1", 
> mode=-2147483647, caller_dlopen=<optimized out>, nsid=-2, argc=1, 
> argv=0x7fff58d2c9d8, env=0x7fff58d2c9e8) at dl-open.c:639
> #8  0x0000003274f1b202 in do_dlopen (address@hidden) at dl-libc.c:89
> #9  0x0000003274a0e8c4 in _dl_catch_error (objname=0x7fff58d2b9c0, 
> errstring=0x7fff58d2b9c8, mallocedp=0x7fff58d2b9bf, operate=0x3274f1b1c0 
> <do_dlopen>, args=0x7fff58d2b9e0) at dl-error.c:178
> #10 0x0000003274f1b29f in dlerror_run (address@hidden <do_dlopen>, 
> address@hidden) at dl-libc.c:48
> #11 0x0000003274f1b311 in __GI___libc_dlopen_mode (address@hidden 
> "libgcc_s.so.1", address@hidden) at dl-libc.c:165
> #12 0x0000003274ef7895 in init () at ../sysdeps/x86_64/../ia64/backtrace.c:53
> #13 0x0000003274ef79e5 in __GI___backtrace (address@hidden, address@hidden) 
> at ../sysdeps/x86_64/../ia64/backtrace.c:104
> #14 0x0000003274e74364 in __libc_message (address@hidden, address@hidden "*** 
> glibc detected *** %s: %s: 0x%s ***\n") at 
> ../sysdeps/unix/sysv/linux/libc_fatal.c:178

...in order to report malloc arena corruption...

> #15 0x0000003274e79d2e in malloc_printerr (action=3, str=0x3274f6248b 
> "malloc(): memory corruption", ptr=<optimized out>) at malloc.c:5007
> #16 0x0000003274e7b7e4 in _int_malloc (av=0x32751a1620 <main_arena>, 
> bytes=<optimized out>) at malloc.c:3555
> #17 0x0000003274e7ed90 in __libc_calloc (n=216713008672, address@hidden, 
> elem_size=0, address@hidden) at malloc.c:3274

...detected while the malloc lock is already held.  That explains the
deadlock.  Sounds like a glibc bug worth fixing (if it isn't already) -
if glibc is going to go to the effort of informing the user about
memory corruption, it should not use malloc() in the attempt.
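
To make the pattern concrete, here is a minimal sketch of the same
shape of bug.  It is not glibc's code: `arena_lock`, `toy_alloc` and
`toy_report_corruption` are hypothetical names, and the lock is a plain
C11 `atomic_flag` rather than glibc's low-level lock.  The re-entrant
call returns an error where a blocking lock would wait forever, so the
sketch terminates instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the allocator's internal lock (an assumption for this
   sketch; glibc's real arena lock is a different mechanism).  */
static atomic_flag arena_lock = ATOMIC_FLAG_INIT;

enum toy_result { TOY_OK = 0, TOY_WOULD_DEADLOCK = 1 };

enum toy_result toy_alloc (int simulate_corruption);

/* Hypothetical error reporter that itself needs the allocator, much as
   frames #2 through #14 end up in malloc while loading libgcc_s.so.1.  */
enum toy_result
toy_report_corruption (void)
{
  return toy_alloc (0);
}

/* Hypothetical allocator entry point.  A real blocking lock would spin
   or sleep when the flag is already set; here we return instead.  */
enum toy_result
toy_alloc (int simulate_corruption)
{
  if (atomic_flag_test_and_set (&arena_lock))
    return TOY_WOULD_DEADLOCK;  /* lock already held by this thread */

  enum toy_result result = TOY_OK;
  if (simulate_corruption)
    result = toy_report_corruption ();  /* re-enters with the lock held */

  atomic_flag_clear (&arena_lock);
  return result;
}

/* Exercise the sketch: only the corrupted path trips over the lock.  */
void
toy_demo (void)
{
  assert (toy_alloc (0) == TOY_OK);
  assert (toy_alloc (1) == TOY_WOULD_DEADLOCK);
  assert (toy_alloc (0) == TOY_OK);
}
```

Calling `toy_demo ()` from a test program passes silently.  Replacing
the early return with a loop that waits for the flag to clear would
reproduce the hang.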

> #18 0x0000003274ec237b in create_cd_newstate (hash=4, context=0, 
> nodes=0x7fff58d2c5f0, dfa=0x1d0b6c0) at regex_internal.c:1671
> #19 re_acquire_state_context (address@hidden, address@hidden, address@hidden, 
> context=0) at regex_internal.c:1546
> #20 0x0000003274ec7d4d in transit_state_mb (pstate=<optimized out>, 
> pstate=<optimized out>, mctx=0x7fff58d2c620) at regexec.c:2575
> #21 transit_state (state=0x1d0d340, mctx=0x7fff58d2c620, err=0x7fff58d2c5e8) 
> at regexec.c:2286
> #22 check_matching (p_match_first=0x7fff58d2c5e4, fl_longest_match=1, 
> mctx=0x7fff58d2c620) at regexec.c:1172
> #23 re_search_internal (address@hidden <regex.3883>, address@hidden 
> <data.3891> "ကျွန်ုပ်x", address@hidden, start=<optimized out>, 
> address@hidden, address@hidden, 
>     address@hidden, nmatch=<optimized out>, address@hidden, address@hidden, 
> address@hidden) at regexec.c:843

Aha - you have an older glibc.  This is
https://sourceware.org/bugzilla/show_bug.cgi?id=15078, which has been
fixed in 2.18.  Gnulib is intentionally testing for the flaw.  The
problem is that the flaw involves memory corruption, and one side
effect of memory corruption is possible deadlock (as your stack trace
shows), so gnulib needs to ensure that the test gracefully times out if
no progress is being made.

But I'm confused: gnulib already has this in the conftest in question:


#if HAVE_DECL_ALARM
            /* Some builds of glibc go into an infinite loop on this
               test.  */
            signal (SIGALRM, SIG_DFL);
            alarm (2);
#endif

Why is that not working to kill the test after 2 seconds rather than
going into deadlock?
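
For reference, here is a minimal sketch of how that watchdog is
expected to behave, plus one condition under which it would not fire.
`run_stuck_child` and `watchdog_demo` are hypothetical helpers, not
part of gnulib, and the blocked-signal case is only an assumption about
one possible cause: nothing in the backtrace above establishes that
SIGALRM was blocked in the hanging conftest.  The delays are shortened
to keep the demo quick.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork a child that arms the same watchdog as the conftest and then
   makes no progress.  If block_alarm is nonzero the child blocks
   SIGALRM first.  Returns the number of the signal that killed the
   child, 0 if it exited on its own, or -1 on error.  */
int
run_stuck_child (int block_alarm)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      sigset_t set;
      sigemptyset (&set);
      sigaddset (&set, SIGALRM);
      /* Start from a known mask, since the mask is inherited.  */
      if (sigprocmask (block_alarm ? SIG_BLOCK : SIG_UNBLOCK,
                       &set, NULL) != 0)
        _exit (2);

      /* Same watchdog as the conftest, with a shorter delay.  */
      signal (SIGALRM, SIG_DFL);
      alarm (1);

      /* Stand-in for the hang: idle for about 2 seconds.  */
      struct timespec left = { 2, 0 };
      while (nanosleep (&left, &left) != 0 && errno == EINTR)
        continue;
      _exit (0);
    }

  int status;
  while (waitpid (pid, &status, 0) < 0)
    if (errno != EINTR)
      return -1;
  return WIFSIGNALED (status) ? WTERMSIG (status) : 0;
}

/* The watchdog kills the child only when SIGALRM can be delivered.  */
void
watchdog_demo (void)
{
  assert (run_stuck_child (0) == SIGALRM);
  assert (run_stuck_child (1) == 0);
}
```

Resetting the disposition with `signal (SIGALRM, SIG_DFL)` does not
touch the signal mask, which is inherited across fork and exec, so a
blocked SIGALRM stays pending and never terminates the child.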

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
