[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: NSHashTable with weak objects crashing
From: |
Fred Kiefer |
Subject: |
Re: NSHashTable with weak objects crashing |
Date: |
Sun, 24 Nov 2019 15:50:13 +0100 |
Thank you David for looking into this issue. I am rather not an expert on this
area but looking through the existing code I would have expected that a fix
should rather go into the NSConcretePointerFunctions.m or
NSConcretePointerFunctions..h file. There we seem to set up the matching
functions when working with the different pointer semantics. I understand that
this mechanism is a lot more clumsy than what is possible with C++ templates
and the compiler support that is offered there for weak pointers.
But I may be completely off here.
Fred
> Am 24.11.2019 um 13:27 schrieb David Chisnall <gnustep@theravensnest.org>:
>
> Hi,
>
> I have spent some time looking at this issue. It appears to be an issue in
> how the GSIMap macros are used in NSConcrete{Hash,Map}Table. This is
> horribly complex code to try to fix, because:
>
> - It is generic code that is using C macros rather than type-safe C++
> templates, so has a load of casts and is very difficult to follow where any
> of the parts actually work.
>
> - It is trying to implement ARC semantics, but the raw operations are retain
> and release (which doesn’t really match to the behaviour of weak references),
> rather than read / assign / delete. At a high level, ARC defines operations
> in terms of ownership by variables, manual RR defines operations in terms of
> operations on objects.
>
> One specific problem in this instance is that GSIMap was that GSIMap was
> passing node->key to the HASH macro, but in ARC mode a weak object pointer
> can’t just be loaded, you need to use the correct read barrier.
>
> There was a comment from 2013 from RFM saying that the old code caused double
> retains, but unfortunately his new code was also incorrect and did not use
> the weak read / write barriers at all in the weak reference case, completely
> breaking NSMapTable with weak keys or values.
>
> I have now fixed it at least enough that the test passes and submitted a PR:
>
> https://github.com/gnustep/libs-base/pull/84
>
> If I were writing this code from scratch, I’d make a templated class that
> took C++ smart pointers as arguments and then have an Objective-C class
> cluster, one for each of the valid combinations of NSPointerFunctionOptions.
> This would make it impossible for bugs like this to creep in, because any
> access would implicitly go via the smart pointer wrappers (if the compiler
> supports ARC, then the ObjC ones would be fairly trivial). WinObjC uses a
> similar approach, though with run-time specialisation.
>
> Note that, even when correct, the current code is quite dependent on the ARC
> optimiser for performance. It will repeatedly call objc_loadWeak(), which
> returns an autoreleased object. The ARC optimiser can hopefully transform
> some of these into objc_loadWeakRetained() calls followed by objc_release()
> calls and avoid filling up the autorelease pool, but I’m not convinced that
> it will for this code. If someone were willing to rewrite GSIMap to expose
> an interface cleanly in terms of operations on variables that hold references
> rather than on objects, then it would be possible for the hash function, for
> example, to be defined as objc_loadWeakRetained(), -hash, objc_release()
> directly, which would avoid touching the autorelease pool.
>
> For reference, a std::unordered_map with strong or weak object pointers Just
> Works™ with clang. If we were using C++ and ARC in GNUstep, the NSMapTable
> and NSHashTable code would be a fraction of their current size and a lot more
> correct. There are various reasons that I rarely contribute to GNUstep
> anymore, but the fact that the community actively prevents me from using
> tools that make my life easier was a significant one.
>
> As a high-level observation, not using ARC within GNUstep itself means that
> we ship bugs. This code has been broken since at least 2013, possibly
> earlier. If GNUstep developers are not using modern Objective-C features
> then anyone who tries to use them will find that things are broken.
>
> David
>
> P.S. I spent about an hour of this debugging process fighting the tests
> suite. Please can someone who understands the test suite make it provide
> simple instructions for building a single test? Or even doing a -j32 build
> of all of the tests, rather than waiting for each one to run before moving on
> to the next one. It takes me about 10 seconds to compile -base and then a
> few minutes to wait for gmake check -j32 to build the test that I care about.
>
>
>> On 11 Nov 2019, at 22:16, Fred Kiefer <fredkiefer@gmx.de> wrote:
>>
>> Hi David,
>>
>> you are aware that Frederik is talking about the travis CI system on Github
>> where his tests are failing? Either the whole build system is set up
>> incorrectly or we have a general issue for week pointers.
>> We probably should not expect these tests to work with gcc, maybe there is a
>> way to disable them for this setup?
>>
>> Fred
>>
>>> Am 11.11.2019 um 18:36 schrieb David Chisnall <gnustep@theravensnest.org>:
>>>
>>>
>>> Where do the tests fail currently? These look sufficiently simple that
>>> they should just be calling a few of the ARC functions. The implementation
>>> of these classes is in NSConcrete{PointerFunctions,HashTable}.m and their
>>> interaction with the runtime is defined here:
>>>
>>> https://github.com/gnustep/libs-base/blob/3d77109fb634f11e5d51dd9b13002102ab419729/Source/NSConcretePointerFunctions.h#L33
>>>
>>> From what you've said, I wonder if you're compiling GNUstep-base without
>>> access to the libobjc2 headers? If so, then it will provide the
>>> old-runtime versions of these functions, where weak is a synonym for
>>> unsafe_unretained (these classes on macOS did this back in the 10.7 or so
>>> days). Can you put a #error in NSConcretePointerFunctions.h on line 33 and
>>> check that this breaks the build for you? If the build still works, then
>>> you're building GNUstep in a configuration that does not provide weak
>>> pointer semantics.
>>>
>>> It might be better, now, to have these methods error at run time if this
>>> support is not available, rather than give an unsafe version.
>>>
>>> If you are compiling the correct version, the most likely cause is that
>>> we're using objc_storeWeak() on a location in memory that does not contain
>>> either 0 or a valid object pointer (e.g. if NSHashTable is getting unzeroed
>>> memory). Please can you put a breakpoint on objc_storeWeak and see what
>>> happens?
>>>
>>> David
>>>
>>> On 11/11/2019 15:00, Frederik Seiffert wrote:
>>>> Hi all,
>>>> I’ve opened a pull request with some very simple tests for the
>>>> NSHashTable/NSMapTable weak object support:
>>>> https://github.com/gnustep/libs-base/pull/80
>>>> Unfortunately they are failing on the CI, so it seems that the issues
>>>> described below are not specific to our setup. I also confirmed the tests
>>>> pass when run against Apple’s Foundation.
>>>> Does anyone have knowledge of the weak pointer support in GNUstep and
>>>> could take a look at this?
>>>> Thanks!
>>>> Frederik
>>>>> Am 07.10.2019 um 16:51 schrieb Frederik Seiffert <frederik@algoriddim.com
>>>>> <mailto:frederik@algoriddim.com>>:
>>>>>
>>>>> Hi David,
>>>>>
>>>>>> While replacing the hash table, I managed to turn it into a
>>>>>> reproduceable fault on at least one machine. The bug is quite subtle:
>>>>>>
>>>>>> We have a map from objects to weak reference structures.
>>>>>> The weak reference structure points to the object.
>>>>>> When we delete the last weak reference, we delete the object from the
>>>>>> map.
>>>>>> We delete the object from the map using the object as the key.
>>>>>> But by the time a weak reference is deallocated, its object pointer has
>>>>>> been zeroed...
>>>>>> ...so we always remove a random element from the map and leave a
>>>>>> dangling pointer in the table.
>>>>>
>>>>> Should this already be fixed on the latest libobjc2 master?
>>>>>
>>>>> Going back to my original issue about NSHashTable with weak objects, I’m
>>>>> still seeing it crash with the latest libobjc2 master and also using the
>>>>> arc-cxx branch.
>>>>>
>>>>> It reproduces quite easily with the following code (compiled with ARC),
>>>>> which crashes either directly in addObject: on the first invocation (when
>>>>> using weakObjectsHashTable), or in -allObjects on the second or third
>>>>> invocation (when using hashTableWithWeakObjects).
>>>>>
>>>>> static __strong NSHashTable *_hashTable = nil;
>>>>> static __strong NSObject *_test = nil;
>>>>> - (void)testHashTable
>>>>> {
>>>>> @autoreleasepool {
>>>>> if (!_hashTable) {
>>>>> _hashTable = [NSHashTable weakObjectsHashTable];// or
>>>>> hashTableWithWeakObjects
>>>>> _test = [NSObjectnew];
>>>>> [_hashTable addObject:_test];// crash 1
>>>>> }else {
>>>>> _test =nil;
>>>>> }
>>>>>
>>>>> NSLog(@"HashTable %@ (_test: %@)", _hashTable.allObjects, _test);//
>>>>> crash 2
>>>>> }
>>>>> }
>>>>>
>>>>> Similarly, NSMapTable crashes as well in the second invocation of the
>>>>> following function, although I’m not sure if this is the same root cause
>>>>> as the hash table crash:
>>>>>
>>>>> static __strong NSMapTable *_mapTable = nil;
>>>>> static __strong NSObject *_test = nil;
>>>>> - (void)testMapTable
>>>>> {
>>>>> @autoreleasepool {
>>>>> if (!_mapTable) {
>>>>> _mapTable = [NSMapTable strongToWeakObjectsMapTable];
>>>>> _test = [NSObjectnew];
>>>>> [_mapTable setObject:_test forKey:@"test"];
>>>>> }else {
>>>>> NSLog(@"obj pre: %@", [_mapTable objectForKey:@"test"]);//
>>>>> crash!!!
>>>>> _test =nil;
>>>>> NSLog(@"obj post: %@", [_mapTable objectForKey:@"test"]);
>>>>> }
>>>>>
>>>>> NSLog(@"MapTable %@ (_test: %@)",
>>>>> _mapTable.dictionaryRepresentation, _test);
>>>>> }
>>>>> }
>>>>>
>>>>> I’m pasting the stack traces below. I’d appreciate if anyone can shed
>>>>> some light on this.
>>>>>
>>>>> Thanks!
>>>>> Frederik
>>>>>
>>>>>
>>>>>
>>>>> *NSHashTable crash using weakObjectsHashTable:*
>>>>>
>>>>> * frame #0: 0xeca8c1d0 libart.so`art_sigsegv_fault
>>>>> frame #1: 0xeca8c774 libart.so`art::FaultManager::HandleFault(int,
>>>>> siginfo*, void*) + 372
>>>>> frame #2: 0xeca8c49b libart.so`art::art_fault_handler(int, siginfo*,
>>>>> void*) (.llvm.650222801) + 43
>>>>> frame #3: 0x625bd6af
>>>>> app_process32`___lldb_unnamed_symbol22$$app_process32 + 623
>>>>> frame #4: 0xefee7c50 libc.so`___lldb_unnamed_symbol2$$libc.so + 1
>>>>> frame #5: 0xd1e56762 libobjc.so`objc_msgSend at
>>>>> objc_msgSend.x86-32.S:120
>>>>> frame #6: 0xd14f69cd libgnustep-base.so`hashObject(item=0xcf40c100,
>>>>> size=0x00000000) at NSConcretePointerFunctions.m:126
>>>>> frame #7: 0xd1400a72
>>>>> libgnustep-base.so`pointerFunctionsHash(PF=0xd0ea7900, item=0xcf40c100)
>>>>> at NSConcretePointerFunctions.h:180
>>>>> frame #8: 0xd13fd9b6
>>>>> libgnustep-base.so`GSIMapBucketForKey(map=0xd0ea78d4, key=GSIMapKey @
>>>>> 0xff99f2a4) at GSIMap.h:410
>>>>> frame #9: 0xd1400ce3
>>>>> libgnustep-base.so`GSIMapAddNodeToMap(map=0xd0ea78d4, node=0xcf40c0f0) at
>>>>> GSIMap.h:453
>>>>> frame #10: 0xd13fcb64 libgnustep-base.so`GSIMapAddKey(map=0xd0ea78d4,
>>>>> key=GSIMapKey @ 0xff99f334) at GSIMap.h:1118
>>>>> frame #11: 0xd13fe82f libgnustep-base.so`-[NSConcreteHashTable
>>>>> addObject:](self=0xd0ea78d4, _cmd="C", anObject=0xd0efd14c) at
>>>>> NSConcreteHashTable.m:848
>>>>> frame #12: 0xd1e2e7ff libnative-lib.so`::-[ObjectiveCTest
>>>>> testHashTable](self=0xcf40c0e4, _cmd="V\x1c") at ObjectiveCTest.mm:78
>>>>>
>>>>> *NSHashTable crash using hashTableWithWeakObjects:*
>>>>>
>>>>> * frame #0: 0xeca8c1d0 libart.so`art_sigsegv_fault
>>>>> frame #1: 0xeca8c774 libart.so`art::FaultManager::HandleFault(int,
>>>>> siginfo*, void*) + 372
>>>>> frame #2: 0xeca8c49b libart.so`art::art_fault_handler(int, siginfo*,
>>>>> void*) (.llvm.650222801) + 43
>>>>> frame #3: 0x625bd6af
>>>>> app_process32`___lldb_unnamed_symbol22$$app_process32 + 623
>>>>> frame #4: 0xefee7c50 libc.so`___lldb_unnamed_symbol2$$libc.so + 1
>>>>> frame #5: 0xd1ea975d libobjc.so`objc_msgSend at
>>>>> objc_msgSend.x86-32.S:120
>>>>> frame #6: 0xd13170bd libgnustep-base.so`-[GSInlineArray
>>>>> initWithObjects:count:](self=0xcf4fd184, _cmd="\xffffff81",
>>>>> objects=0xff99f320, count=1) at GSArray.m:420
>>>>> frame #7: 0xd131a41d libgnustep-base.so`-[GSPlaceholderArray
>>>>> initWithObjects:count:](self=0xcf4fd184, _cmd="\xffffff81",
>>>>> objects=0xff99f320, count=1) at GSArray.m:1199
>>>>> frame #8: 0xd13c2b80 libgnustep-base.so`-[NSConcreteHashTable
>>>>> allObjects](self=0xd0da9a14, _cmd="m\x02") at NSConcreteHashTable.m:875
>>>>> frame #9: 0xd1df7844 libnative-lib.so`::-[ObjectiveCTest
>>>>> testHashTable](self=0xcf391444, _cmd="V\x1c") at ObjectiveCTest.mm:84
>>>>>
>>>>> *NSMapTable crash:*
>>>>>
>>>>> * frame #0: 0xeca8c1d1 libart.so`art_sigsegv_fault + 1
>>>>> frame #1: 0xeca8c774 libart.so`art::FaultManager::HandleFault(int,
>>>>> siginfo*, void*) + 372
>>>>> frame #2: 0xeca8c49b libart.so`art::art_fault_handler(int, siginfo*,
>>>>> void*) (.llvm.650222801) + 43
>>>>> frame #3: 0x625bd6af
>>>>> app_process32`___lldb_unnamed_symbol22$$app_process32 + 623
>>>>> frame #4: 0xefee7c50 libc.so`___lldb_unnamed_symbol2$$libc.so + 1
>>>>> frame #5: 0xd1e29c6d
>>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(id) [inlined]
>>>>> isPersistentObject(obj=0xd0bfb27c) at arc.mm:291
>>>>> frame #6: 0xd1e29c66
>>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(id) [inlined]
>>>>> retain(obj=0xd0bfb27c) at arc.mm:296
>>>>> frame #7: 0xd1e29c66
>>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(id) [inlined]
>>>>> objc_retain(obj=0xd0bfb27c) at arc.mm:587
>>>>> frame #8: 0xd1e29c62
>>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(obj=0xd0bfb27c) at
>>>>> arc.mm:581
>>>>> frame #9: 0xd1175a04 libnative-lib.so`::-[ObjectiveCTest
>>>>> testMapTable](self=0xcf66c064, _cmd="X\x1c") at ObjectiveCTest.mm:97
>>
>>
>
>