Re: Test-lock hang (not 100% reproducible) on GNU/Linux
From: Pavel Raiskup
Subject: Re: Test-lock hang (not 100% reproducible) on GNU/Linux
Date: Mon, 02 Jan 2017 17:37:25 +0100
User-agent: KMail/5.3.3 (Linux/4.8.15-300.fc25.x86_64; KDE/5.27.0; x86_64; ; )
On Monday, January 2, 2017 4:50:28 PM CET Bruno Haible wrote:
> Hi Pavel,
>
> > One thing I'm afraid of is that writers could finish too
> > early. Could we artificially slow them down?
>
> In test_rwlock the test does this:
>
>   /* Wait for the threads to terminate. */
>   for (i = 0; i < THREAD_COUNT; i++)
>     gl_thread_join (threads[i], NULL);
>   set_atomic_int_value (&rwlock_checker_done, 1);
>   for (i = 0; i < THREAD_COUNT; i++)
>     gl_thread_join (checkerthreads[i], NULL);
>
> It waits until all 10 mutator threads are terminated, then sets a
> lock-protected variable rwlock_checker_done to 1, which signals to the
> 10 checker threads that they can terminate at the next occasion, and
> then waits for them to terminate.
>
> Are you saying that the kernel will schedule the 10 checker threads
> with higher priority than the 10 mutator threads, although I have *not*
> specified anything about priorities? That would be a kernel bug, IMO.
That's what I'm not sure about. As discussed in [1], POSIX says (for
pthread_rwlock_wrlock()):

    Implementations may favor writers over readers to avoid writer starvation.

But 'may favor' is a long way from 'shall favor'. And when I looked at my man
page for pthread_rwlockattr_setkind_np(3), it says:
    PTHREAD_RWLOCK_PREFER_READER_NP
        This is the default. A thread may hold multiple read locks;
        that is, read locks are recursive. According to The Single Unix
        Specification, the behavior is unspecified when a reader tries
        to place a lock, and there is no write lock but writers are
        waiting. Giving preference to the reader, as is set by
        PTHREAD_RWLOCK_PREFER_READER_NP, implies that the reader will
        receive the requested lock, even if a writer is waiting. As
        long as there are readers, the writer will be starved.
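
For reference, here is a minimal sketch of how a lock could be asked to favor
waiting writers through the non-portable attribute described in that man page.
The helper name init_writer_preferring_rwlock is hypothetical and is not part
of test-lock.c. The sketch uses PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP
because, as far as I can tell from the same man page, glibc ignores plain
PTHREAD_RWLOCK_PREFER_WRITER_NP; treat that choice as an assumption about
glibc, not as something POSIX guarantees.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* glibc: expose the non-portable *_NP names */
#endif
#include <pthread.h>

/* Hypothetical helper, not taken from test-lock.c: initialize *lock so
   that waiting writers take precedence over newly arriving readers.
   Returns 0 on success, or the error code of the failing pthread call.
   With this kind, a thread must not take the read lock recursively.  */
static int
init_writer_preferring_rwlock (pthread_rwlock_t *lock)
{
  pthread_rwlockattr_t attr;
  int err = pthread_rwlockattr_init (&attr);
  if (err != 0)
    return err;

  /* glibc extension; other libcs may not provide it at all.  */
  err = pthread_rwlockattr_setkind_np
          (&attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
  if (err == 0)
    err = pthread_rwlock_init (lock, &attr);

  /* The attribute object is no longer needed once the lock exists.  */
  pthread_rwlockattr_destroy (&attr);
  return err;
}
```

A caller would use the resulting lock with the usual pthread_rwlock_rdlock(),
pthread_rwlock_wrlock() and pthread_rwlock_unlock() calls; only the
initialization differs from the default.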
> Especially since the problem occurs only on one architecture.
I've been able to reproduce this on i686 in the meantime too, sorry -- I just
reported what I observed :(. See [1].
> > Could we set PTHREAD_RWLOCK_PREFER_WRITER_NP (in test-lock.c) to avoid
> > those issues?
>
> I disagree. The test is a minimal test of the kernel's multithreading
> support. If among 10 mutator threads and 10 checker threads, all started
> with the same priority, it has such a severe bias that the mutator threads
> never get to run, you have a kernel bug. I should not need a non-portable
> threading function in order to get 20 threads to run reasonably.
>
> Imagine what scenarios you would then get with an application server and
> 400 threads.
It might be a bug in libpthread, too, but based on the POSIX specs and manual
pages, I am not sure whether this can actually be considered a bug.
[1] https://lists.fedoraproject.org/archives/list/address@hidden/thread/PQD576JZLERFY6ROI3GF7UYXKZIRI33G/
Pavel