bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#28211: Grafting code triggers GC/thread-safety issue on Guile 2.2.2


From: Andy Wingo
Subject: bug#28211: Grafting code triggers GC/thread-safety issue on Guile 2.2.2
Date: Wed, 09 May 2018 11:11:21 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)

On Wed 09 May 2018 02:32, Mark H Weaver <address@hidden> writes:

> However, I think it's _far_ more likely that the NULL argument on the
> stack was copied from memory shared by multiple threads without proper
> thread synchronization.

I think this is unlikely on x86 given its total-store-ordering memory
model.  I agree with you about the value of barriers, but I don't think
they are part of this bug that Ludo is seeing.

>> I commented out the MADV_DONTNEED call to be sure, but I can still
>> reproduce the bug.
>
> I strongly doubt that the MADV_DONTNEED is relevant to this issue.

It could be.  It would zero out VM stack frames, and if GC is called
when/if vp->sp is out of date, then that would be possible.  However I
think vp->sp is never out of date, so that's probably not it.  The
things that can be out of date are the on-heap copy of the IP (vp->ip)
and the local register copy of the sp (sp).  It's more likely for the
local "sp" cache to be out of date -- if we recursed through Scheme in a
call out from the VM, eventualy causing stack expansion and relocation,
then on the return forgot to re-cache the sp value, that could be it.

Similarly, forgetting to set vp->ip before calling out to something that
could GC could likewise cause a problem because the stack map wouldn't
be right and the precise stack marker could clear a value by mistake.
This only happens for non-innermost frames though; the innermost frame
is marked conservatively.

The rules are: update vp->ip before something that can allocate, and
update local "sp" after returning from a C function that could have
recursively called Scheme.

I did find a couple places in the VM where we forgot to do one of these,
e.g. 07b7490f73fb4a6cb0c9577d082d37c8d9cee7b0 and just now
9a72e212622fa3bd118d7c02c4386601285b3224.  These two patches aren't
shipped yet fwiw.

Andy





reply via email to

[Prev in Thread] Current Thread [Next in Thread]