Re: RFC: Lightweight synchronization mechanism for gnumach v2.0

From: Agustina Arzille
Subject: Re: RFC: Lightweight synchronization mechanism for gnumach v2.0
Date: Tue, 29 Mar 2016 22:37:27 -0300

Hello, Samuel

On 03/28/2016 06:29 PM, Samuel Thibault wrote:
Agustina Arzille, on Fri 04 Mar 2016 11:32:10 -0300, wrote:
+#define GSYNC_NBUCKETS   512
+static struct gsync_hbucket gsync_buckets[GSYNC_NBUCKETS];
I'm wondering whether we could want to have buckets per task too?  What
does Linux do for private futexes?  Is 512 enough?  How much does Linux use?

(not saying that we need this for committing, rather for future work).

I actually started out with separate hash tables: one for process-local
objects, and a global one for shared gsync. However, after doing some
benchmarking, I found that it wasn't any more efficient, and it also
cluttered up the code. So I ended up unifying both into a single table.

512 is simply a magic number that I found to work fairly well without
being too big. I suppose we could run more tests and find a more
appropriate number.

As for Linux, they too use a global hash table, but its size is determined
by the number of CPUs (The actual size is 256 * NCPUS).
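To make the discussion concrete, here is a minimal sketch of how a key for such a unified table might be folded into one of the 512 buckets. The hash function, constants, and the `gsync_hash` name are all illustrative assumptions, not the actual gnumach code:

```c
#include <stdint.h>

#define GSYNC_NBUCKETS 512

/* Illustrative only: mix the two words of a gsync-style key (roughly
 * (task, address) for local objects, (vm object, offset) for shared
 * ones) into a bucket index.  The real gnumach hash may differ. */
static unsigned int
gsync_hash (uintptr_t w1, uintptr_t w2)
{
  /* Knuth-style multiplicative mix, then fold into the table size. */
  uintptr_t h = w1 ^ (w2 * 2654435761u);
  h ^= h >> 16;
  return (unsigned int) (h % GSYNC_NBUCKETS);
}
```

Since both local and shared keys are two words, one hash covers both cases of the unified table.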

+static int
+valid_access_p (vm_map_t map, vm_offset_t addr,
+  vm_offset_t size, vm_prot_t prot)
+{
+  vm_map_entry_t entry;
+  return (vm_map_lookup_entry (map, addr, &entry) &&
+    entry->vme_end >= addr + size &&
+    (prot & entry->protection) == prot);
+}
Mmm, userland could be running gsync_wait in one thread and munmap
in another thread, and make the kernel crash because munmap() took
effect between the valid_access_p check and the actual dereference
in gsync_wait. I believe you need a vm_map_lock_read(current_map()) /
vm_map_unlock_read(current_map()) pair around the valid_access_p + actual
access part (and the same for wake and requeue).

Good catch. There's an issue, however:
For shared gsync objects, I'm using vm_map_lookup to find out the
vm object and offset. The problem is that said function expects the map
to be unlocked; as such, we still have that race. A simple workaround is
to do valid_access_p *again* after vm_map_lookup returns (it should be
faster, since the vm hint should be set by then). However, it's a bit ugly.
I can't think of a better solution, though, and I have to admit I understand
very little about the vm system.
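A userspace model of that workaround, for clarity. The `toy_map`, `valid_access_p`, and `probe_shared` names and types here are stand-ins I made up for illustration; the real code operates on vm_map_t and vm_map_entry_t:

```c
#include <stdbool.h>

/* Toy stand-in for a vm map: one valid range with a protection mask. */
struct toy_map { unsigned long start, end; unsigned prot; };

static bool
valid_access_p (const struct toy_map *map, unsigned long addr,
                unsigned long size, unsigned prot)
{
  return addr >= map->start && addr + size <= map->end
         && (prot & map->prot) == prot;
}

/* Shared-object path: the vm_map_lookup-like step must run with the map
 * unlocked, so the range is validated *again* afterwards, as proposed. */
static bool
probe_shared (const struct toy_map *map, unsigned long addr, unsigned prot)
{
  if (!valid_access_p (map, addr, sizeof (unsigned int), prot))
    return false;
  /* ... here vm_map_lookup (unlocked) would resolve (object, offset);
   * a concurrent munmap could invalidate the range in this window ... */
  return valid_access_p (map, addr, sizeof (unsigned int), prot);
}
```

The second check closes the window only if it runs under the map lock in the real kernel; this sketch just shows the check / lookup / recheck sequence.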

+      /* XXX: These codes aren't really descriptive, but it's
+       * the best I can think of right now. */
+      ret = ret == THREAD_INTERRUPTED ?
I'd say invent new KERN_* values: neither ABORTED nor TERMINATED really
match the two cases happening here.

Will do.

+  simple_lock (&hbp->lock);
+  if (flags & GSYNC_MUTATE)
+    __atomic_store_n ((unsigned int *)addr,
+      val, __ATOMIC_RELEASE);
Why use an atomic operation? AIUI, simple_lock already provides
atomicity.  You'd need atomics on the reader side too to really get
proper semantics, actually.

It was mostly to enforce some ordering, but you're right in that it's
useless given the use of simple locks.
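For illustration, a userspace model of the point: the MUTATE store runs with the bucket lock held, so a plain store is already ordered with respect to other lock holders; it is a hypothetical lock-free reader that would need the atomic. The `slot`, `mutate_locked`, and `read_lockfree` names are invented for this sketch; the `__atomic_*` builtins are the same GCC primitives the patch uses:

```c
static unsigned int slot;

/* Writer side, as in GSYNC_MUTATE: assumed to run with the bucket's
 * simple lock held, so a plain store suffices between lock holders. */
static void
mutate_locked (unsigned int val)
{
  slot = val;
}

/* A lock-free reader is the side that would need an atomic load to get
 * well-defined ordering against the writer above. */
static unsigned int
read_lockfree (void)
{
  return __atomic_load_n (&slot, __ATOMIC_ACQUIRE);
}
```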

+   * Setting the amount to UINT_MAX
+   * should do the trick until we can manage a ridiculously
+   * large amount of waiters. */
+  unsigned int nw = (flags & GSYNC_BROADCAST) ? ~0U : 1;
+      while (--nw != 0 && !list_end (&hbp->entries, runp) &&
+        gsync_key_eq (&node_to_waiter(runp)->key, &w.key));
It doesn't seem too costly to me to do instead

+      while ((flags & GSYNC_BROADCAST || --nw != 0) &&
+        !list_end (&hbp->entries, runp) &&
Which avoids introducing the not-so-happy trick altogether (and the same
goes for requeue).

Yeah, I ended up simplifying things a bit with this.
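The effect of the simplified condition can be modelled in isolation. Everything here is illustrative: the `GSYNC_BROADCAST` value, the `wake_count` helper, and counting a plain integer instead of walking the real waiter list:

```c
#define GSYNC_BROADCAST 1   /* illustrative flag value */

/* Model of the wake loop: with broadcast, keep waking matching waiters
 * until the queue is exhausted; without it, wake exactly one.  This is
 * what the suggested condition achieves without the UINT_MAX trick. */
static unsigned int
wake_count (unsigned int flags, unsigned int queued)
{
  unsigned int woken = 0, nw = 1;
  while (woken < queued)
    {
      ++woken;   /* wake one matching waiter */
      if (!(flags & GSYNC_BROADCAST) && --nw == 0)
        break;
    }
  return woken;
}
```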

+  else if ((unsigned long)bp1 < (unsigned long)bp2)
+    {
+      simple_unlock (&bp2->lock);
+      simple_unlock (&bp1->lock);
+    }
+  else
+    {
+      simple_unlock (&bp1->lock);
+      simple_unlock (&bp2->lock);
+    }
There's no real need to order unlocking, but why not :)

A pointless micro-optimization. It's gone now.
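Worth noting: while the *unlock* order is indeed irrelevant, the *lock* side of a two-bucket operation like requeue does need a consistent order (e.g. by address) to avoid deadlock. A sketch with pthread mutexes standing in for gnumach's simple locks; `lock_two` and `unlock_two` are made-up names:

```c
#include <pthread.h>

/* Acquire two bucket locks in a fixed (address) order so that two
 * concurrent requeues between the same pair of buckets cannot deadlock. */
static void
lock_two (pthread_mutex_t *a, pthread_mutex_t *b)
{
  if (a == b)
    pthread_mutex_lock (a);
  else if ((unsigned long) a < (unsigned long) b)
    {
      pthread_mutex_lock (a);
      pthread_mutex_lock (b);
    }
  else
    {
      pthread_mutex_lock (b);
      pthread_mutex_lock (a);
    }
}

static void
unlock_two (pthread_mutex_t *a, pthread_mutex_t *b)
{
  pthread_mutex_unlock (a);
  if (a != b)
    pthread_mutex_unlock (b);   /* release order does not matter */
}
```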

Thanks for your work!

BTW, how is your copyright assignment going?

Slow. I sent the letter about 20 days ago, still no answer.

Anyway, here's the new patch. If someone knows of a better solution for
the vm_map_lookup issue as explained above, I'll gladly implement it.

Attachment: diff
Description: Text document
