Re: deadlock in NPTL, FUTEX
From: Paul Pluzhnikov
Subject: Re: deadlock in NPTL, FUTEX
Date: Sat, 20 Sep 2008 14:14:15 -0700
User-agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Jumbo Shrimp, linux)
Bernhard Ibertsberger <elgringo@gmx.at> writes:
> since Kernel 2.5 signals seems to contain some trickier
> pitfalls. "Native POSIX Threads Library" vs linuxthreads-0.10
This isn't new to NPTL; you could have encountered the exact same
deadlock with LinuxThreads.
> Investigating that issue i found:
> http://lwn.net/Articles/124747/
The article above deals with the kernel side of things, and has
absolutely nothing to do with your problem, which is entirely in
user space.
> * what exactly is the anatomy of the deadlock inside the libc and the
> * kernel?
Huh? The deadlock is inside libc; the kernel has nothing to do with it.
And what exactly do you mean by "anatomy"?
> Does it necessarily need a pagefault or can the deadlock
> * occue depending on other circumstances?
Your test case doesn't encounter any pagefaults, and pagefaults
have nothing to do with calling non-reentrant functions.
> * where is the context between localtime() und vsprintf()?
Huh? You seem to use words like "context" and "anatomy" in a meaning
that is entirely unfamiliar to me.
> In the
> * source of localtime() i can't find anything (just to __tz_convert())?
> * where can i find further information concerning this problem?
What further information do you need?
Consider the following function:
#include <unistd.h>
int x;  /* stands in for a lock: nonzero means "held" */
void foo(void) { while (x) sleep(1); x += 1; sleep(60); x -= 1; }
Do you understand that if this function is interrupted while it is
inside 'sleep(60)', and if the interrupt handler calls it again,
then this function (and the interrupt handler) will *never* make
any further progress?
If you understand that, then substitute 'x' with a mutex, and you'll
understand exactly the nature of the deadlock you are observing.
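Here is a minimal sketch of that situation, assuming a single-threaded
program. It uses a plain flag in place of a real mutex, so it reports the
re-entry instead of hanging. The names lock_held, would_deadlock,
on_signal and demo are hypothetical and appear nowhere in libc.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Stand-in for a mutex: nonzero while the "critical section" is active. */
static volatile sig_atomic_t lock_held;
/* Set by the handler if it arrives while the lock is already taken. */
static volatile sig_atomic_t would_deadlock;

/* Hypothetical handler: a real one calling printf() or ctime() would try
 * to take the lock here and wait forever for the code it interrupted. */
static void on_signal(int sig)
{
    (void)sig;
    if (lock_held)
        would_deadlock = 1;   /* a blocking lock call would never return */
}

/* Returns 1 if the handler ran while the lock was held, 0 if it did not,
 * or -1 if the handler could not be installed. */
int demo(void)
{
    struct sigaction sa;
    sa.sa_handler = on_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    would_deadlock = 0;
    lock_held = 1;            /* "lock" taken, like x += 1 in foo() */
    raise(SIGUSR1);           /* signal arrives inside the critical section */
    lock_held = 0;            /* "unlock", like x -= 1 in foo() */
    return would_deadlock;
}
```

Calling demo() from main() shows the handler finding the lock held. With
a real blocking lock, the handler would wait at that point for an unlock
that the interrupted code can never perform.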
> If i run the demo[2] in form
> $ ./printf-hang | grep t
> it deadlocks to although there are no calls to ctime.
ctime is not the only function in libc that needs a lock.
*Every* function that may need a lock may exhibit the same problem.
There is a very short list of functions that are guaranteed not to
show this problem; they are called async-signal safe. Neither ctime
nor printf is on that list.
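For contrast, write() is on the POSIX async-signal-safe list, so a handler
may use it to emit a fixed message. This is a rough sketch, not taken from
your test case. The names out_fd, on_usr1 and install_on_usr1 are
hypothetical, and the descriptor is a parameter only so that the output
can be redirected.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical: descriptor the handler writes to; stderr by default. */
static volatile sig_atomic_t out_fd = STDERR_FILENO;

/* write() is async-signal safe; printf() is not. */
static void on_usr1(int sig)
{
    static const char msg[] = "got SIGUSR1\n";
    int saved_errno = errno;              /* write() may change errno */
    ssize_t n;
    (void)sig;
    n = write(out_fd, msg, sizeof msg - 1);
    (void)n;                              /* nothing safe to do on failure */
    errno = saved_errno;
}

/* Point the handler's output at fd and install it; returns 0 on success. */
int install_on_usr1(int fd)
{
    struct sigaction sa;
    out_fd = fd;
    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

A caller would run install_on_usr1(STDERR_FILENO) once at startup. The
handler takes no lock, so it cannot block on one held by the code it
interrupted.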
> This means the concurrent printfs to the (non-threadsave) stdout are enough to
> deadlock.
Stdout *is* thread safe, and so is printf. But printf is not
async-signal safe, and may deadlock if called from a signal
handler. Solution: don't do that.
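One common way to "not do that" is to have the handler only set a flag
and leave the printing to normal code. The sketch below assumes your
program has a main loop that can poll the flag. The names got_signal,
note_signal, install_handler and report_pending_signal are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdio.h>

/* The only state the handler touches. */
static volatile sig_atomic_t got_signal;

/* The handler takes no lock and calls no libc function. */
static void note_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Install the handler for SIGUSR1; returns 0 on success, -1 on failure. */
int install_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = note_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from normal code: does the printing the handler must not do.
 * Returns 1 if a signal had been recorded, 0 otherwise. */
int report_pending_signal(void)
{
    if (!got_signal)
        return 0;
    got_signal = 0;
    printf("signal received\n");   /* fine here: not inside a handler */
    return 1;
}
```

The main loop would call report_pending_signal() on each iteration.
Signals that arrive between two polls are reported as one.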
Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.