lightning

Re: Atomic operations


From: Paulo César Pereira de Andrade
Subject: Re: Atomic operations
Date: Fri, 12 Aug 2022 08:15:53 -0300

On Thu, Aug 11, 2022 at 17:52, Marc Nieper-Wißkirchen
<marc.nieper+gnu@gmail.com> wrote:

  Hi Marc,

> PS: This document may be helpful as well: 
> https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html.

  Right now, my idea is to have a
jit_casr(bool_return_in_register, address_in_register,
new_value_in_register, old_value_in_register);

  It would need a new pattern of four-register instruction, in which only
the first register is modified. The current qmul* and qdiv* ones put
their results in the first two register arguments.

  And while at it, also have a:

jit_tasr(old_value_in_register, address_in_register, new_value_in_register);

  Likely these can have the _c, _s, _i, and _l modifiers, or there could
be only a 32-bit variant.
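
  To make the intended semantics concrete, here is a rough C11 model of the
two proposed instructions. It is only a sketch: casr_model and tasr_model are
hypothetical names, the word size is assumed to be intptr_t, and reading
jit_tasr as an unconditional atomic exchange is an assumption based on the
argument names above, not something settled in this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the proposed jit_casr: if *address still holds
 * old_value, store new_value and return true; otherwise leave *address
 * unchanged and return false. */
static bool casr_model(_Atomic intptr_t *address, intptr_t new_value,
                       intptr_t old_value)
{
    /* The strong variant never fails spuriously. old_value is a local
     * copy, so the caller's value is not overwritten on failure. */
    return atomic_compare_exchange_strong(address, &old_value, new_value);
}

/* Hypothetical model of the proposed jit_tasr (assumed to be an exchange):
 * unconditionally store new_value and return the previous contents. */
static intptr_t tasr_model(_Atomic intptr_t *address, intptr_t new_value)
{
    return atomic_exchange(address, new_value);
}
```

  Only the boolean result of casr_model is an output, which matches the
"only the first register is modified" pattern described above.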

  Very few CPUs should lack some construct for it. For those, we just add
a software fallback with some kind of spin lock. In either case, we "cheat" and
look at what assembly gcc generates, or what the kernel uses, for the
equivalent construct.
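
  A software fallback of that kind might look roughly like the sketch below.
The names cas_fallback and fallback_lock are hypothetical, and a single global
lock is assumed for simplicity. It relies on atomic_flag, the one type C11
guarantees to be lock-free, and it is only correct if every access to the
guarded word goes through the same lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* One global spin lock guarding every emulated operation (hypothetical). */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;

/* Emulated strong compare-and-swap on a plain (non-atomic) word. */
static bool cas_fallback(intptr_t *address, intptr_t new_value,
                         intptr_t old_value)
{
    bool swapped = false;

    /* Spin until the flag was previously clear, i.e. we now own the lock. */
    while (atomic_flag_test_and_set_explicit(&fallback_lock,
                                             memory_order_acquire))
        ;

    /* Compare and store are atomic with respect to other lock holders. */
    if (*address == old_value) {
        *address = new_value;
        swapped = true;
    }

    atomic_flag_clear_explicit(&fallback_lock, memory_order_release);
    return swapped;
}
```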

  As for the ABA problem, we ignore it: just add a note describing the issue
and how to avoid it. I would suggest using the instruction with
address_in_register pointing to a dedicated lock word, not to the actual variable.
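
  One way to read that suggestion, sketched in C11 with hypothetical names
(guarded_counter, guarded_add): the compare-and-swap is only ever applied to
a lock word that takes the values 0 and 1, and the real variable is touched
only while the lock is held. Since the CAS then answers "is the lock free?"
rather than "is the data unchanged?", an A-B-A sequence on the data cannot
fool it.

```c
#include <stdatomic.h>
#include <stdint.h>

struct guarded_counter {
    _Atomic intptr_t lock;  /* 0 = free, 1 = held; the only CAS target */
    intptr_t value;         /* actual variable, touched only with lock held */
};

static void guarded_lock(struct guarded_counter *g)
{
    intptr_t expected = 0;
    /* Strong CAS 0 -> 1; on failure, expected is overwritten with the
     * current contents, so reset it before retrying. */
    while (!atomic_compare_exchange_strong_explicit(
               &g->lock, &expected, 1,
               memory_order_acquire, memory_order_relaxed))
        expected = 0;
}

static void guarded_unlock(struct guarded_counter *g)
{
    atomic_store_explicit(&g->lock, 0, memory_order_release);
}

/* Add delta to the guarded value and return the new value. */
static intptr_t guarded_add(struct guarded_counter *g, intptr_t delta)
{
    guarded_lock(g);
    intptr_t result = (g->value += delta);
    guarded_unlock(g);
    return result;
}
```

  The cost is the one already mentioned in this thread: waiters spin, which
can get expensive under contention.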

> On Thu, Aug 11, 2022 at 21:35, Marc Nieper-Wißkirchen
> <marc.nieper+gnu@gmail.com> wrote:
>>
>> Hi Paulo,
>>
>> On Tue, Aug 9, 2022 at 12:40, Paulo César Pereira de Andrade
>> <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
>>>
>>> > Here is a minimal API, albeit written for Scheme: 
>>> > https://srfi.schemers.org/srfi-230/srfi-230.html.  What is an atomic 
>>> > (fixnum) box there should be a word-sized memory location in GNU lightning. 
>>> > Atomic pairs (two words) are important for some algorithms.  If they are 
>>> > not easily implementable on a particular architecture, GNU lightning 
>>> > should report this so that the user can call C library routines (from 
>>> > stdatomic) or GCC builtins themselves.
>>> >
>>> > As for GNU lightning instructions, we would probably at least need the 
>>> > following instructions (for word-sized integers):
>>> >
>>> > - loads and stores with relaxed memory order (if I have understood 
>>> > correctly, we can use the usual GNU lightning load/store instructions)
>>> > - loads with acquire memory order
>>> > - stores with release memory order
>>> > - swap (load and store) with relaxed memory order
>>> > - swap (load and store) with acquire-release memory order
>>> > - compare-and-swap with relaxed memory order
>>> > - compare-and-swap with acquire-release memory order
>>>
>>>   If lightning were to provide such primitives, I believe it should
>>> only "make a contract" of supporting strong compare-and-swap,
>>> not on shared memory (a different process might die with the
>>> lock held), to allow some kind of mutex implementation, which
>>> could be expensive if there are too many waiters spinning.
>>
>>
>> I am not sure whether I have understood your "contract".
>>
>> In any case, if a mutex is needed we could just call the GCC-provided 
>> software implementation in libatomic ([1]) (after checking that libatomic's 
>> ABI is supposed to be stable and works with different compilers as well).  
>> Alternatively, we can roll out our own hash table of mutexes where the hash 
>> is calculated from the memory address that is to be accessed atomically in 
>> software.
>>
>>>
>>>   Still not trivial to get it on all supported ports, at least with the same
>>> semantics, because if it needs to be implemented as an external function call,
>>> it would need to save/restore all JIT_R* and JIT_F* registers in the
>>> worst case. Most of the time we could just inline what gcc generates.
>>
>>
>> As atomic operations are usually quite costly, the overhead of saving and 
>> restoring registers is probably not too bad.
>>
>>>
>>> > And the same, if supported, for double-word-sized memory operands.
>>> >
>>> > And then the following arithmetic operations (in relaxed and 
>>> > acquire-release semantics):
>>> >
>>> > - fetch-add
>>> > - fetch-sub
>>> > - fetch-or
>>> > - fetch-xor
>>> > - fetch-and
>>> >
>>> > And then an instruction to emit a memory order (release, acquire, 
>>> > acquire-release, sequential consistency) as atomic_thread_fence in 
>>> > stdatomic.h.
>>> > To simplify the interface, it may make sense to offer all operations (but 
>>> > the thread fence instruction) only with relaxed semantics so that the 
>>> > programmer has to emit thread fence instructions explicitly.
>>>
>>>   The simplest way to implement it is to have some PIC code
>>> implementing it, and use two jit_jmpr to/from the code, but lightning
>>> would still treat the jit_jmpr as function calls, that is, invalidate
>>> non-callee-save registers.
>>
>>
>> If internally, lightning uses an instruction different from jit_jmpr to 
>> "call" the atomic code, it can have more detailed knowledge about the 
>> registers that have to be saved.
>>
>>>
>>>
>>>   As long as only jmpr is used and registers are not modified, it should be
>>> enough to call jit_live() after "returning" for any non-callee-save register
>>> used in the construct, or that must stay alive for other use.
>>>
>>> Thanks!
>>> Paulo
>>
>>
>> --
>>
>> [1] 
>> https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=libatomic;h=7e61d96034e3b2f3c697d30e9d88ef9482f047e5;hb=HEAD
>>


