bug-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

glibc THREAD_GSCOPE and excessive gsync_wake ()


From: Sergey Bugaev
Subject: glibc THREAD_GSCOPE and excessive gsync_wake ()
Date: Sat, 8 May 2021 22:35:23 +0300

I've noticed that even a simple hello world does a lot of gsync calls:

$ rpctrace echo hello world |& grep -c gsync
53

These are in fact all identical gsync_wake () calls on a single address:

$ rpctrace echo hello world |& grep gsync | uniq
task132(pid18549)->gsync_wake (202820 0 0);

This immediately screams that something is wrong to me, for two reasons:
* Surely there's just a single thread in /bin/echo! Well, there's
always the implicit signal/msg thread, but it shouldn't be creating
*that* much contention.
* gsync_wake () calls are useless unless paired with gsync_wait ().

I traced the gsync_wake () calls to the THREAD_GSCOPE macros in glibc
(sysdeps/mach/hurd/tls.h). As I understand it from looking at the
Linux version, THREAD_GSCOPE is supposed to be an RCU-like mechanism,
where a thread can wait until all the other threads have left the
"global scope" at least once. The Hurd version is different, it waits
for all the other threads to be outside the "global scope" *at the
same time*. Which also sounds wrong to me, because this way threads
constantly entering and exiting the "global scope" at the same time
can completely starve a waiting thread, unlike with RCU. I'm probably
misunderstanding something though.

Anyway, the Hurd implementation seems pretty straightforward: it keeps
a single global counter, increments it when entering "global scope",
decrements it when exiting it. If this was the last thread exiting, it
calls gsync_wake (), and this is where the excessive gsync_wake ()
calls come from.

In my experience building synchronization primitives on top of
futexes, one of the most important things to do is make it so that the
fast (uncontended) path never requires calling into the kernel. Any
(reasonably small) amount of userspace logic is far cheaper than
calling into the kernel. Which is why you add various "have waiters"
flags, to only call futex(FUTEX_WAKE_PRIVATE) if there are, in fact,
any waiters.

So I took a stab at implementing this for THREAD_GSCOPE. I'm currently
building glibc to see if it works; but I might have easily messed up
("futexes are tricky"!). Please tell me if I did :) This is not
supposed to be a patch ready for merging (I don't even know if this is
the right list to send glibc patches to); just an RFC.

Sergey

-- >8 --
diff --git a/sysdeps/mach/hurd/tls.h b/sysdeps/mach/hurd/tls.h
index f83956d3..9fccd9f2 100644
--- a/sysdeps/mach/hurd/tls.h
+++ b/sysdeps/mach/hurd/tls.h
@@ -55,22 +55,46 @@
 /* Global scope switch support.  */
 #define THREAD_GSCOPE_IN_TCB      0
 #define THREAD_GSCOPE_GLOBAL
+
+#define THREAD_GSCOPE_WAITERS (1 << 31)
+
 #define THREAD_GSCOPE_SET_FLAG() \
   atomic_exchange_and_add_acq (&GL(dl_thread_gscope_count), 1)
+
 #define THREAD_GSCOPE_RESET_FLAG() \
-  do      \
-    if (atomic_exchange_and_add_rel (&GL(dl_thread_gscope_count), -1) == 1)   \
-      lll_wake (GL(dl_thread_gscope_count), 0);      \
+  do \
+    { \
+      int count; \
+      count = atomic_exchange_and_add_rel (&GL(dl_thread_gscope_count), -1); \
+      if (count == (1 | THREAD_GSCOPE_WAITERS)) \
+        { \
+          /* We're going to wake everyone; clear the waiters flag.  */ \
+          count = atomic_and_val (&GL(dl_thread_gscope_count),
~THREAD_GSCOPE_WAITERS); \
+          if (count & THREAD_GSCOPE_WAITERS) \
+            lll_wake (GL(dl_thread_gscope_count), GSYNC_BROADCAST); \
+        } \
+    } \
   while (0)
+
 #define THREAD_GSCOPE_WAIT() \
-  do      \
-    {      \
-      int count;      \
-      atomic_write_barrier ();      \
-      while ((count = GL(dl_thread_gscope_count)))      \
-        lll_wait (GL(dl_thread_gscope_count), count, 0);      \
-    }      \
-  while (0)
+  do \
+    { \
+      int count; \
+      count = atomic_load_acquire (&GL(dl_thread_gscope_count));\
+      if (count == 0) \
+        break; \
+      if (!(count & THREAD_GSCOPE_WAITERS)) \
+        { \
+          /* Record that we're going to wait.  */ \
+          if (atomic_compare_and_exchange_bool_acq ( \
+                &GL(dl_thread_gscope_count), \
+                count | THREAD_GSCOPE_WAITERS, count)) \
+            continue; \
+          count |= THREAD_GSCOPE_WAITERS; \
+        } \
+      lll_wait (GL(dl_thread_gscope_count), count, 0); \
+    } \
+  while (1)

 #endif /* !ASSEMBLER */



reply via email to

[Prev in Thread] Current Thread [Next in Thread]