dotgnu-general
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [DotGNU]Monitor weirdness


From: Russell Stuart
Subject: RE: [DotGNU]Monitor weirdness
Date: 06 May 2004 16:32:30 +1000

On Thu, 2004-05-06 at 12:59, Thong (Tum) Nguyen wrote:
> I'm not convinced your MonitorAbortDuringWait MonitorInterrup* tests are
> accurate.  Your test tests to see if a waiting thread that is aborted
> reaquires the lock that it was waiting on.  By my tests, MS.NET does not
> require that a thread reaquire a lock.  It does the "pretend" reaquire thing
> where Wait is allowed to exit even though the monitor hasn't been aquired.
> It also lets the thread call an appropriate number of "Exits" even though
> the thread doesn't really own the lock.

Hmmm.  I did try to check for the very thing you are describing in the
tests.  I had another look - and in two cases I got it wrong.  I tried
to eliminate the "pretend" bit by:
   lock (..) {
     :
     thread.Abort()/thread.Interrupt();
     Thread.Sleep(2*1000);
     this.seen = true;
   }
and verifying that "seen" was in fact true when Monitor.Wait(...)
finally exited.  If it was true, there is no "pretending" going on - the
lock was definitely re-acquired.  In the Monitor*DuringWait cases I got
it right.  Ergo they are not pretending under MS.

I got it wrong for the Monitor*AfterWait tests.  When I corrected the
test, I got the result you describe (ie the two Mointor*AfterWait tests
fail under MS fail) - there obviously is some pretending going on.

I have now put the corrected code back into CVS.

Just so it is clear, I will spell out the difference between the two
tests.  As you are aware, Monitor.Wait does two waits:
  wait for pulse.
  wait to re-aquire monitor lock.
The Monitor*DuringWait tests aborting/interrupting the first wait. 
These all pass under MS.  The Monitor*AfterWait tests
aborting/interrupting the second wait.  These fail under MS.

I have posted Russell's example code to
nttp://microsoft.public.dotnet.general.  So far I have one response.  I
wouldn't place too much faith in it, but the responder said he thought
it was a bug.

Finally, the way the TestMonitorAbortEnter test fails, at least for me,
is decidedly odd.  (It doesn't fail under MS.)  I get this result:
  Test failed: Monitor.Enter threw an exception we couldn't catch!
Icky, to say the least.  Something has gone badly wrong.

> Is this correct behaviour?  I don't think so.  Java doesn't allow a thread
> to be interrupted if it can't reaquire the lock it needs to exit wait with a
> consistant state.  Java's behaviour with stop (java's equivalent of abort)
> seems to be that it allows the thread to exit wait without aquiring the lock
> as well.  I would like to point out now that Sun wasn't being stupid when
> the depreciated Thread.stop.

Yes, I agree.

> BTW, I'm still looking over your code.  I'm integrating some of your ideas
> with the existing code.  I still believe that the current algorithm is
> faster and by my tests, it can be 4-10 times faster (though in typical
> desktop apps it may not make too much difference).  The current algorithm is
> very fast when the monitor is uncontested and doesn't require all those
> "global locks" you mentioned except for one compare and exchange.

Hmmm.  My first reaction was impossible - I know my critical path is
shorter than yours!  But I rang my own tests and got the same result. 
Errk!

But I hadn't noticed that you had added ILWaitMonitorFastClaim while I
was writing my version of monitor.c.  I altered my code to use it, and
now it is faster than yours.  It ranges from being about the same for
the wait test, to twice as fast when there is contention.  The lock
(...) without contention (which I assume is the typical case), runs
about 1.5 times as fast.  The benchmark I used to get these results is
attached.  The new version of the patch has been uploaded to savannah.

Unless I am missing something, I don't see how my new version could
possibly ever be slower than your code given they use the same routines
in ILWait*.  The only reason it was slower before is that they weren't
yours used the new ILWaitMonitorFastClaim.

BTW, there is no reason for ILWaitMonitorFastClaim to check for
Interrupts or Aborts.  Interrupts do not take effect until the thread
next blocks, ie is in the WaitJoinSleep state.  The MS Doco actually
says this explicitly.  Also I have written code to verify it is the case
under MS.  God knows what aborts do - but since they can take effect any
time, there hardly seems any point in testing for them in the critical
path.

>   On most
> platforms where the CAX is optimised, Monitor.Enter will require no context
> switches to the kernel because it never has to call ILWaitMonitorTryEnter
> (which uses kernel level locks).  There are a lot of good ideas like not
> incrementing monitor->waiters with InterlockedIncrement if
> InterlockedIncrement is implemented using global locks.  I've decided that
> putting monitors on the GC heap is a good idea.  If a lock is entered
> without being exited before it gets GC-ed then the monitor will leak unless
> it is GC allocated.  The currently implementation of ILWaitMutexClose
> doesn't allow mutexes to close while still being owned so your
> implementation (without the fixes) leaks as well.

Well - both implementations would leak for the same reason.  I did
assume ILWaitHandleClose would be the equivalent of a Destroy.  If it
isn't then perhaps we should add a Destroy.

> I also still like thread
> local free lists for monitors -- combined with the current algorithm it
> gives good performance.

But till not as fast as my implementation ... yet.

> Leaks can be prevented by limiting the free list to
> a reasonable number of monitors (32?).  Keeping count the number of monitors
> (etc) on the list will be fast and lock free because the lists are
> thread-local.

This is additional complexity.  Right now, you have an example of
something that is about 200 lines shorter, and runs up to twice as
fast.  If you can beat that then start thinking about adding complexity.

> FYI, the reason the current monitor implementation failed 5 tests instead of
> 3 is because Monitor.Wait assumed that the monitor is only reclaimed the
> call to ILWaitMonitorWait succeeds without being aborted or interrupted.  If
> there is an abort or interrupt request it assumes that the monitor hasn't
> been reaquired.  As mentioned above, the correct behaviour is ambiguous but
> I'm leaning towards the Sun's implementation rather than Microsoft's (it's
> also easier to implement ;-)).

OK.

> Anyway, try the following program (put a break point in the
> ThreadInterruptedException handler) and cry like I did.  I'd be pretty keen
> to meet with you on IRC and chat about threading issues further...

Hmmm.  I have just read this ... I possibly missed you.  You can always
get me on MSN/ICQ/Yahoo at any time.  One day I will all Jabber as
well.  My IM handles are at the end of this page:

http://www.lubemobile.com.au/ras/agenda/

Finally, back to where this all started.  My real program still hangs
... and it doesn't use interrupt or abort.  It hangs in both your and my
version of monitor.c.  I added some trace to monitor.c, matched up every
entrance to Monitor.Enter to its return.  Ie .. the program is not
waiting in Monitor.Enter.  It doesn't (or at least my code doesn't) use
Monitor.Wait.  I don't have the faintest idea about what is going on.  I
am off to investigate.

Attachment: x.cs
Description: Text document


reply via email to

[Prev in Thread] Current Thread [Next in Thread]