Re: __thread errno
From: Thomas Schwinge
Subject: Re: __thread errno
Date: Thu, 2 Aug 2012 02:05:44 +0200
User-agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/23.3.1 (i486-pc-linux-gnu)
Hi!
While the issue is Hurd-specific, non-Hurd people might nevertheless be
able to help here with their glibc/TLS expertise.
I'm working on a patch to move the Hurd's errno from the Hurd-specific
threadvar (in short, a mechanism somewhat equivalent to TLS, using a
portion of space at the beginning of a thread's stack for storing
thread-specific data) to TLS proper.
The specific glibc tree is
<http://git.savannah.gnu.org/cgit/hurd/glibc.git/tree/?id=cba1c83ad62a11347684a9daf349e659237a1741>,
but apart from Hurd-specific patches this is equivalent to mainline commit
fc56c5bbc1a0d56b9b49171dd377c73c268ebcfd.
On Thu, 10 May 2012 17:25:59 +0800, I wrote:
> $ gdb -q --args ./ld.so
> Reading symbols from /home/tschwinge/tmp/ld.so...done.
> (gdb) r
> Starting program: /home/tschwinge/tmp/ld.so
>
> Program received signal EXC_BAD_ACCESS, Could not access memory.
> 0x00015797 in __strerror_r (errnum=0, buf=0x0, buflen=2) at
> dl-minimal.c:173
> 173 dl-minimal.c: No such file or directory.
> in dl-minimal.c
> (gdb) bt
> #0 0x00015797 in __strerror_r (errnum=0, buf=0x0, buflen=2) at
> dl-minimal.c:173
> #1 0x00000000 in ?? ()
> (gdb) info registers
> eax 0x0 0
> ecx 0xa 10
> edx 0x2 2
> ebx 0x26ff4 159732
> esp 0x1028c60 0x1028c60
> ebp 0x1028cb8 0x1028cb8
> esi 0xa 10
> edi 0x21b4c 138060
> eip 0x15797 0x15797 <__strerror_r+167>
> eflags 0x10202 [ IF RF ]
> cs 0x17 23
> ss 0x1f 31
> ds 0x1f 31
> es 0x1f 31
> fs 0x1f 31
> gs 0x1f 31
>
> 0x15797 is bogus: it's not even an instruction boundary.
>
> Apparently I forgot how to debug ld.so from the very beginning...
>
> It seems that gs is not set up, but even if that were an invalid TLS gs:X
> access, that doesn't explain to me how the PC would be badly affected by
> that?
It turns out that GDB's understanding of addresses (.text only?) is off
by 0x1000 (ld.so has been relocated, I assume), so after hitting a
breakpoint you have to »set $pc = $pc - 0x1000« to be able to make sense
of backtraces, etc. (For posterity, in case this is useful to someone who
then remembers these words: I eventually figured this out by sprinkling a
few »__asm __volatile ("hlt");« (to transfer control to GDB) before the
places in ld.so code where TLS data (errno, specifically) is accessed,
and then comparing the disassembly and looking for magic constants. I
found »movl $0x40000009,%gs:(%eax)« (»errno = EBADF«), with that constant
used in only two places, one of them being __writev -- oh, it's trying to
print something? -- etc., etc.) Manually offsetting each frame's PC by
-0x1000 I then got a backtrace, which included:
#3 0x00013fb6 in __assert_fail (assertion=0x1e114 "info[30] == ((void *)0)
   || (info[30]->d_un.d_val & ~0x00000008) == 0", file=0x1f4e3
   "dynamic-link.h", line=207, function=0x1f6ec "elf_get_dynamic_info")
   at dl-minimal.c:208
#4 0x00003f69 in elf_get_dynamic_info (temp=0x0, l=0x24604)
   at dynamic-link.h:206
#5 _dl_start (arg=0x1027000) at rtld.c:416
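To illustrate the two bits of arithmetic used above, here is a minimal sketch in C. It assumes that Hurd error codes are formed by placing the code in Mach error system 0x10, which is consistent with the 0x40000009 constant seen for EBADF; the macro and helper names are hypothetical and not taken from glibc.

```c
#include <stdint.h>

/* Assumption: Hurd errno values carry the Mach error system 0x10 in
   the top bits and the error code in the low 14 bits.  This macro is
   a hypothetical stand-in; check bits/errno.h in your tree.  */
#define SKETCH_HURD_ERRNO(n) ((0x10u << 26) | ((uint32_t) (n) & 0x3fffu))

/* Hypothetical helper: undo a constant load offset on a PC value
   reported by GDB, as done by hand for the backtraces in this
   message.  */
static uint32_t
sketch_unrelocate_pc (uint32_t reported_pc, uint32_t load_offset)
{
  return reported_pc - load_offset;
}
```

With these, SKETCH_HURD_ERRNO (9) yields the constant found in the disassembly, and sketch_unrelocate_pc (0x15797, 0x1000) gives the address that should fall on an instruction boundary.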
In my understanding of x86 TLS (and that understanding is not too
detailed), »movl $0x40000009,%gs:(%eax)« is local-exec TLS, which causes
the linker to set the DF_STATIC_TLS flag, and thus the assertion in
elf/dynamic-link.h, line 206 to fail:
202 #ifdef RTLD_BOOTSTRAP
203 /* Only the bind now flags are allowed. */
204 assert (info[VERSYMIDX (DT_FLAGS_1)] == NULL
205 || (info[VERSYMIDX (DT_FLAGS_1)]->d_un.d_val & ~DF_1_NOW) == 0);
206 assert (info[DT_FLAGS] == NULL
207 || (info[DT_FLAGS]->d_un.d_val & ~DF_BIND_NOW) == 0);
208 /* Flags must not be set for ld.so. */
209 assert (info[DT_RUNPATH] == NULL);
210 assert (info[DT_RPATH] == NULL);
211 #else
(Again for posterity, and as GDB would not access the variable properly,
I confirmed this by putting »volatile Elf32_Word tmp =
info[DT_FLAGS]->d_un.d_val; __asm __volatile ("hlt");« before the assert,
and then GDB could »print tmp« to confirm it was 0x10 (DF_STATIC_TLS).)
(At this time, _hurd_init_dtablesize is zero, so it can't print anything
yet, and errno is set to EBADF, triggering the faulting TLS access.)
Not knowing what this assert is good for, I simply made it allow the
DF_STATIC_TLS case, too, and this allowed ld.so to progress a little bit
further: if invoked without arguments, it is now able to print its usage
information, elf/rtld.c:dl_main, line 1017.
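The relaxed check can be sketched as follows. This is only an illustration of the mask logic, not the actual patch: the SKETCH_ names and the helper function are hypothetical, and the flag values are the ones observed above (0x8 for DF_BIND_NOW, 0x10 for DF_STATIC_TLS).

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as observed in the assertion text and in GDB above.  */
#define SKETCH_DF_BIND_NOW   0x00000008u
#define SKETCH_DF_STATIC_TLS 0x00000010u

/* Same shape as the bootstrap assertion: true if DT_FLAGS contains no
   bit outside the ALLOWED mask.  */
static bool
sketch_flags_within (uint32_t dt_flags, uint32_t allowed)
{
  return (dt_flags & ~allowed) == 0;
}
```

With dt_flags equal to 0x10, the original mask (SKETCH_DF_BIND_NOW only) rejects the value, while the widened mask (SKETCH_DF_BIND_NOW | SKETCH_DF_STATIC_TLS) accepts it.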
Yet, something like »./ld.so --library-path $PWD ./libc.so« still fails,
and I (again manually with 0x1000 offset) obtained the following
backtrace:
#0 0x00004a69 in open_verify (name=0x25ae0 "/home/thomas/libc.so",
   fbp=0x1026a28, loader=0x0, whatcode=0, found_other_class=0x1026a27,
   free_name=true) at dl-load.c:1722
#1 0x00007915 in _dl_map_object (loader=0x0, name=0x102703b
   "/home/thomas/libc.so", type=1, trace_mode=0, mode=536870912, nsid=0)
   at dl-load.c:2285
#2 0x00002078 in dl_main (phdr=0x1034, phnum=7, user_entry=0x1026eac,
   auxv=0x0) at rtld.c:1084
#3 0x00012d25 in go (argdata=0x1026d90)
   at ../sysdeps/mach/hurd/dl-sysdep.c:213
#4 0x00015f46 in _hurd_startup (argptr=0x1027000, main=0x1026f94)
   at hurdstartup.c:188
#5 0x00013be3 in _dl_sysdep_start (start_argptr=0x1027000,
   dl_main=0x275a <dl_main+4096>) at ../sysdeps/mach/hurd/dl-sysdep.c:281
#6 0x0000421b in _dl_start_final (arg=0x1027000) at rtld.c:338
#7 _dl_start (arg=0x1027000) at rtld.c:564
dl-load.c:1722 again is an errno access, and the processor's segment
register setup tells me TLS has not yet been initialized at that point.
Now what is important is that glibc's Hurd-specific code, contrary to the
Linux kernel-specific code, does not have a private errno for ld.so:
sysdeps/mach/hurd/dl-sysdep.h:
/* The private errno doesn't make sense on the Hurd. errno is always the
thread-local slot shared with libc, and it matters to share the cell
with libc because after startup we use libc functions that set errno
(open, mmap, etc). */
#define RTLD_PRIVATE_ERRNO 0
And thus in the GNU Hurd configuration, ld.so code uses the TLS errno.
In sysdeps/generic/dl-sysdep.h, this is explained/defined as follows:
/* This macro must be defined to either 0 or 1.
If 1, then an errno global variable hidden in ld.so will work right with
all the errno-using libc code compiled for ld.so, and there is never a
need to share the errno location with libc. This is appropriate only if
all the libc functions that ld.so uses are called without PLT and always
get the versions linked into ld.so rather than the libc ones. */
#ifdef IS_IN_rtld
# define RTLD_PRIVATE_ERRNO 1
#else
# define RTLD_PRIVATE_ERRNO 0
#endif
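To make the difference concrete, here is a minimal sketch of how such a macro could select between the two kinds of errno storage. All sketch_ and SKETCH_ names are hypothetical; this is not the glibc implementation, and __thread is the GCC extension for thread-local variables.

```c
/* Hypothetical switch standing in for RTLD_PRIVATE_ERRNO.  */
#ifndef SKETCH_PRIVATE_ERRNO
# define SKETCH_PRIVATE_ERRNO 1
#endif

#if SKETCH_PRIVATE_ERRNO
/* Ordinary variable: reachable with a plain data access, so it works
   before any thread pointer has been set up.  */
static int sketch_errno;
#else
/* Thread-local variable: on x86 each access goes through %gs, so it
   is only valid once TLS has been initialized.  */
static __thread int sketch_errno;
#endif

static int *
sketch_errno_location (void)
{
  return &sketch_errno;
}

/* Stand-in for a failing libc call that records an error code.  */
static int
sketch_failing_call (int code)
{
  *sketch_errno_location () = code;
  return -1;
}
```

With SKETCH_PRIVATE_ERRNO set to 0, the store in sketch_failing_call is the kind of access that faults when it runs before TLS setup, which matches the situation described for the Hurd configuration.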
Now, in elf/rtld.c:dl_main, TLS will eventually be initialized (at
earliest when »we have auditing DSOs to load«) -- but this is after
mapping in objects (_dl_map_object, which then invokes open_verify, which
contains the errno access).
My naïve attempt to simply move »tcbp = init_tls ();« before mapping
objects did not work out -- any suggestions to help me back onto firm
ground?
And what, by the way, is the story with elf/rtld.c still containing code
conditioned on USE___THREAD (and that code looking somewhat relevant for
my case), while USE___THREAD is not defined anywhere?
Grüße,
Thomas