Re: Atomic operations

From:	Paulo César Pereira de Andrade
Subject:	Re: Atomic operations
Date:	Thu, 11 Aug 2022 18:00:29 -0300

On Thu, Aug 11, 2022 at 16:35, Marc Nieper-Wißkirchen
<marc.nieper+gnu@gmail.com> wrote:
>
> Hi Paulo,

  Hi Marc,

> On Tue, Aug 9, 2022 at 12:40, Paulo César Pereira de Andrade
> <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
>>
>> > Here is a minimal API, albeit written for Scheme: 
>> > https://srfi.schemers.org/srfi-230/srfi-230.html.  What is an atomic 
>> > (fixnum) box there should be word-sized memory location in GNU lightning.  
>> > Atomic pairs (two words) are important for some algorithms.  If they are 
>> > not easily implementable on a particular architecture, GNU lightning 
>> > should report this so that the user can call C library routines (from 
>> > stdatomic) or GCC builtins themselves.
>> >
>> > As for GNU lightning instructions, we would probably at least need the 
>> > following instructions (for word-sized integers):
>> >
>> > - loads and stores with relaxed memory order (if I have understood 
>> > correctly, we can use the usual GNU lightning load/store instructions)
>> > - loads with acquire memory order
>> > - stores with release memory order
>> > - swap (load and store) with relaxed memory order
>> > - swap (load and store) with acquire-release memory order
>> > - compare-and-swap with relaxed memory order
>> > - compare-and-swap with acquire-release memory order
>>
>>   If lightning were to provide such primitives, I believe it should
>> only "make a contract" of supporting strong compare-and-swap,
>> not on shared memory (a different process might die with the
>> lock held), to allow some kind of mutex implementation, which
>> could be expensive if there are too many waiters spinning.
>
>
> I am not sure whether I have understood your "contract".

  I meant officially supporting atomic operations. I was also initially
thinking of supporting only some minimal features for common
uses like spin locks and/or rwlocks. These are not cheap, but
they avoid context switches.
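
  As an illustration of the minimal feature set mentioned above, here is
a sketch of a spin lock built only on a strong compare-and-swap, written
with C11 <stdatomic.h> rather than lightning instructions. The names
demo_spinlock, demo_spin_init, demo_spin_trylock, demo_spin_lock and
demo_spin_unlock are hypothetical and only serve the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin lock: 0 means unlocked, 1 means locked. */
typedef struct {
    atomic_int locked;
} demo_spinlock;

static void demo_spin_init(demo_spinlock *l)
{
    atomic_init(&l->locked, 0);
}

/* One strong compare-and-swap attempt: 0 -> 1 with acquire ordering. */
static bool demo_spin_trylock(demo_spinlock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

static void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l)) {
        /* Wait on a plain relaxed load until the lock looks free, so
           waiters do not hammer the cache line with failed CAS writes. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            ;
    }
}

/* Release ordering publishes the critical section to the next owner. */
static void demo_spin_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

  No context switch happens while waiting, which is the benefit, but
every waiter burns CPU time, which is the cost when many threads spin.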

> In any case, if a mutex is needed we could just call the GCC-provided 
> software implementation in libatomic ([1]) (after checking that libatomic's 
> ABI is supposed to be stable and works with different compilers as well).  
> Alternatively, we can roll out our own hash table of mutexes where the hash 
> is calculated from the memory address that is to be accessed atomically in 
> software.

  The libatomic ABI should be stable.
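
  For reference, the alternative from the quoted paragraph (a hash table
of locks indexed by the address being accessed) could look roughly like
the sketch below. It is an assumption-laden illustration, not libatomic's
actual code: the table size, the hash function, the word_pair type and
all function names are made up, and a simple spin flag stands in for a
real mutex. It emulates a two-word compare-and-swap in software.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_COUNT 64u /* hypothetical table size, must be a power of two */

/* Static atomics are zero-initialized, so every slot starts unlocked. */
static atomic_int lock_table[LOCK_COUNT];

/* Hypothetical two-word cell, e.g. an atomic pair. */
typedef struct {
    intptr_t first;
    intptr_t second;
} word_pair;

/* Hash the address to a table slot; low bits are dropped because
   nearby bytes of one object should map to the same lock. */
static size_t lock_index(const void *addr)
{
    uintptr_t a = (uintptr_t)addr >> 4;
    return (size_t)((a ^ (a >> 6)) & (LOCK_COUNT - 1u));
}

static void table_lock(const void *addr)
{
    atomic_int *l = &lock_table[lock_index(addr)];
    while (atomic_exchange_explicit(l, 1, memory_order_acquire) != 0)
        ; /* spin until the previous owner stores 0 */
}

static void table_unlock(const void *addr)
{
    atomic_store_explicit(&lock_table[lock_index(addr)], 0,
                          memory_order_release);
}

/* Software double-word compare-and-swap under the address's lock. */
static bool pair_compare_and_swap(word_pair *loc, word_pair expected,
                                  word_pair desired)
{
    bool ok;
    table_lock(loc);
    ok = loc->first == expected.first && loc->second == expected.second;
    if (ok)
        *loc = desired;
    table_unlock(loc);
    return ok;
}
```

  This is only atomic with respect to other accesses that go through the
same table; a plain load or store of the pair elsewhere bypasses the lock.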

>>   It is still not trivial to get it on all supported ports, at least with the same
>> semantics, because if it needs to be implemented as an external function call,
>> it would need to save/restore all JIT_R* and JIT_F* registers in the
>> worst case. Most of the time it could just inline what gcc generates.
>
>
> As atomic operations are usually quite costly, the overhead of saving and 
> restoring registers is probably not too bad.

  This would require either lightning having a static libatomic, or
requiring anything that links to liblightning to also link with libatomic.
There is already some code like this.
  On ports where JIT_R* and JIT_F* are callee save, it can just call
the libatomic functions. On others, it should save/restore all registers
that map to JIT_R* or JIT_F*.
  Functions in jit that call default atomic operations would need to
make a hidden alloca call to save/restore such registers.

>> > And the same, if supported, for double-word-sized memory operands.
>> >
>> > And then the following arithmetic operations (in relaxed and 
>> > acquire-release semantics):
>> >
>> > - fetch-add
>> > - fetch-sub
>> > - fetch-or
>> > - fetch-xor
>> > - fetch-and
>> >
>> > And then an instruction to emit a memory order (release, acquire, 
>> > acquire-release, sequential consistency) as atomic_thread_fence in 
>> > stdatomic.h.
>> > To simplify the interface, it may make sense to offer all operations (but 
>> > the thread fence instruction) only with relaxed semantics so that the 
>> > programmer has to emit thread fence instructions explicitly.
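
  To make the "relaxed operations plus explicit fences" idea concrete,
here is a small C11 sketch of message passing in that style. It uses
<stdatomic.h>, not lightning instructions, and the names payload, ready,
publish and try_consume are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;       /* ordinary, non-atomic data */
static atomic_bool ready; /* flag, only touched with relaxed operations */

/* Writer: the release fence orders the payload write before the
   relaxed store that makes the flag visible. */
static void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: the acquire fence after the relaxed load pairs with the
   writer's release fence, so reading payload afterwards is safe. */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return true;
}
```

  The interface stays small (one relaxed variant per operation plus a
fence), at the price of fences that may be stronger than an operation
with built-in acquire or release semantics on some architectures.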
>>
>>   The simplest way to implement it is to have some PIC code
>> implementing it, and use two jit_jmpr to/from the code, but lightning
>> would still treat the jit_jmpr as function calls, that is, invalidate
>> non callee save registers.
>
>
> If internally, lightning uses an instruction different from jit_jmpr to 
> "call" the atomic code, it can have more detailed knowledge about the 
> registers that have to be saved.

  The problem is the registers modified by libatomic code.

>>   As long as it uses only jmpr and does not modify registers, it should be
>> enough to call jit_live() once "returning" for any non callee save register
>> used in the construct, or that must be alive for other use.

  Note that the registers must be restored first, and then jit_live()
called on them so that they are not used as temporaries.

>> Thanks!
>> Paulo
>
>
> --
>
> [1] 
> https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=libatomic;h=7e61d96034e3b2f3c697d30e9d88ef9482f047e5;hb=HEAD

  Surely this can be done, but it is not a small project. Probably
inline code, which is what really matters, would be implemented only for
x86_64 for a significant amount of time.

  Isn't it easier to just call C code from lightning, and then, your
code handles the fact that non callee save registers might have
been trashed?
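
  A sketch of what such C code might look like: thin wrappers around
C11 atomics that take and return only word-sized values, so that
generated code could call them with lightning's usual call sequence
(jit_prepare, jit_pushargr, jit_finishi, jit_retval). The helper names
below are hypothetical, and the choice of acquire-release ordering is
just one option.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: strong compare-and-swap on one word.
   Returns the value observed at *loc, which equals `expected`
   exactly when the swap succeeded. */
intptr_t helper_cas_word(_Atomic intptr_t *loc, intptr_t expected,
                         intptr_t desired)
{
    /* On failure the call writes the current value into `expected`;
       on success `expected` is left unchanged. */
    atomic_compare_exchange_strong_explicit(
        loc, &expected, desired,
        memory_order_acq_rel, memory_order_acquire);
    return expected;
}

/* Hypothetical helper: fetch-add on one word, returns the old value. */
intptr_t helper_fetch_add_word(_Atomic intptr_t *loc, intptr_t delta)
{
    return atomic_fetch_add_explicit(loc, delta, memory_order_acq_rel);
}
```

  The caller on the JIT side then treats these like any other C call:
values held in non callee save registers that are still needed must be
saved before the call and reloaded afterwards.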

  Still, if you have some scratch sample implementation, I would
like to see it :)

Thanks!
Paulo
