bug-hurd
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: refcounts assertion


From: Justus Winter
Subject: Re: refcounts assertion
Date: Mon, 07 Mar 2016 11:40:26 +0100
User-agent: alot/0.3.8.dev

Quoting Samuel Thibault (2016-03-07 01:40:01)
> Hello,
> 
> Some kind of refcount assertion failure we have been having is "refcount
> detected use-after-free!" from the call to refcounts_ref inside
> _ports_bucket_class_iterate, called for instance by the periodic sync.
> I'm thinking: what about this scenario:
> 
> - thread A calls diskfs_nput()
> - that calls refcounts_deref_weak(), which gets both hard and weak
>   counts to 0
> - thread B calls diskfs_sync_everything, which calls
>   ports_bucket_iterate, which calls _ports_bucket_class_iterate
> - _ports_bucket_class_iterate iterates over nodes, and calls
>   refcounts_ref for the node deref-ed above. It then traps on the
>   assertion.
> 
> So the situation is that diskfs_nput is releasing a node while
> sync_everything is trying to sync it.

That hypothesis should be easy enough to test.  It's a good bet
though, the iteration code is fishy enough as it is.

I have a patch that revamps it so that we avoid the copying.  The
thing with the patch is (iirc) that it requires the called function to
be idempotent, which afaics every function used with the
_ports_bucket_class_iterate were.  Or maybe the patch was for the node
iteration function, I don't remember and cannot check currently.

> Just using refcounts_unsafe_ref is most probably not a solution since
> the node is on its way out, the function called by the iterator will
> most probably get it all wrong on a node being destroyed.
> 
> Another way would be as attached: just skip nodes which didn't have any
> reference.  I however don't know libports enough to say whether that's
> correct behavior for libports or not:

That looks like a brown-paper fix that could easily use-after-free.
refcounts_unsafe_ref can be safely used in libports because of the
quiescent thread thing.

> AIUI callers of refcounts_deref* are supposed to check for 0,0 and
> are then responsible for destroying the node?

Yes.

> 
> Samuel



reply via email to

[Prev in Thread] Current Thread [Next in Thread]