|
From: | Richard Frith-Macdonald |
Subject: | Re: Another multithreading bug (I think) |
Date: | Thu, 7 Sep 2006 12:23:04 +0100 |
On 7 Sep 2006, at 11:26, Richard Frith-Macdonald wrote:
On 7 Sep 2006, at 10:56, Wim Oudshoorn wrote:Richard Frith-Macdonald <richard@tiptree.demon.co.uk> writes:2 - the connection_table_gate is a non-recursive lock.Changed to be recursive.3 - the connection_table_gate should NOT change to a recursive lock because it guards a hash table.That's not a problem because the table does not retain its contents, so removing/adding an object has no side effects.Hm, well, the connection_table_gate is used to lock access to the connection_table. connection_table is a NSHashTable and if the lock is recursive it can happen that during enumeration, by having side effects, indirectly a new connection will/can be added or removed.Of course it is a bug if this happens. Like the +[NSConnection _threadWillExit]bug I mentioned recently. So if the lock is recursive such anconnection_table usage bug will lead to a crash and in the other case to a deadlock. I think the deadlock is slightly easier to debug (if the stacktrace works at least.)Sure ... what I was trying to say was that there are no side effects ... none that I can find anyway ... so a recursive lock would be OK. However, I agree that it's nicer to use non-recursive locks where possible.a more elegant solution here would be to scrap the special processingin -release and have -dealloc do the work: 1. lock the connection table lock 2. check the current retain count3a. ... if it is greater than 1, then something else has retained theobject ... so we cancel the dealloc and unlock 3b. if it is 1, we remove the connection from the table, unlock, and proceed to finalize/deallocateIf it happens in dealloc you could conceivably use the _refGate lock.This is already recursive and an ivar, so you don't have threads fightingfor the global lock.Unfortunately that wouldn't work ... since another thread could find the connection in the table while our thread is deallocating it, and then be left trying to retain a deallocated object. We really have to prevent other threads from trying to use the object if we are deallocating it.But ... I think this reveals a bug in NSObject ... the -retain and -release implementations are not currently entirely thread-safe. Theyhandle the testing and changing of the reference counts in a thread- safe manner, but during release there is a gap between testing/ decrementing the reference count and calling the -dealloc method.Hm I see, yuck.Not too difficult to fix I think. I'm short of good test programs for threading though :-(
Had more thoughts about this ... protecting the whole of the deallocation with the same lock that protects the reference counting does not seem good ... it would mean that other threads are locked out of doing *any* retain/release on *any* object while a connection is being deallocated ... if it didn't cause deadlocks it would be bound to at least be a performance issue. Not to mention that all deallocation operations would need to be wrapped in exception handlers to ensure that the locks got unlocked again. I also checked with a test program on MacOS-X and found that it has the same unprotected gap between the check/update of the reference count and the call to -dealloc.
So ... I think we have to assume that the interpretation of this is ...If a release causes the extra reference count to drop to zero then we assume that no other thread can access the object (if another thread could access it, it should have retained it, in which case the extra reference count could not have dropped to zero).
The implication of this seems to be that if we are going to store an object (like a connection) in any sort of collection where it's not retained and should remove itsself upon deallocation, we need to implement custom retain/release methods to manage thread-safety. So there may well be similar thread safety problems with other classes.
As for the NSConnection class, it seems to me that the simplest solution now is to make nsconnection_table_gate a recursive lock and use it to protect both the retain and release methods (and make very sure that there really are no important side effects when manipulating the table).
[Prev in Thread] | Current Thread | [Next in Thread] |