lightning

Re: Atomic operations


From: Paulo César Pereira de Andrade
Subject: Re: Atomic operations
Date: Fri, 12 Aug 2022 14:01:17 -0300

On Fri, Aug 12, 2022 at 13:39, Marc Nieper-Wißkirchen
<marc.nieper+gnu@gmail.com> wrote:
>
> On Fri, Aug 12, 2022 at 18:27, Paulo César Pereira de Andrade
> <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
>>
>> On Fri, Aug 12, 2022 at 11:45, Marc Nieper-Wißkirchen
>> <marc.nieper+gnu@gmail.com> wrote:
>> >
>> > I agree that casr/tasr (maybe also casi/tasi) are the most important 
>> > instructions.  As long as it can be implemented on the supported set of 
>> > CPUs, I don't see why we shouldn't have _c, _s, _i, _l variants alongside 
>> > the word-size variant.
>> >
>> > The only problem I see is that an atomic store in one thread together with 
>> > an atomic load of the same memory location in another thread will become 
>> > costly if release-acquire semantics are needed.  This can be emulated with 
>> > casr/tasr (I assume that their semantics is sequentially consistent) but 
>> > this is a costly operation while on many important CPUs like x86 all 
>> > release-acquire semantics actually do not need special instructions.
>>
>>   At first, my idea would be to implement tas (test and set). During this
>> exercise, I would learn in more detail what the different backends provide.
>>   Then, implement cas (compare and swap), which also requires a new pattern
>> for 4-argument instructions.
>>
>> > It is probably enough to provide a global release and an acquire 
>> > instruction (which does the equivalent of C11's atomic_thread_fence 
>> > (memory_order_release) and atomic_thread_fence (memory_order_acquire), 
>> > respectively).  (These would be no-ops on x86, for example.)  And maybe 
>> > also an operation corresponding to atomic_thread_fence 
>> > (memory_order_seq_cst).
>>
>>   And later, consider any kind of transactional memory support. That will
>> depend on how much is required in software fallbacks.
>>
>>   This has very low priority for me, so if you want to start, I can help :)
>
>
> You are right that we should start with tas and cas first.  It will already 
> cover a lot of use cases (and everything else can be emulated).
>
> At some point, I will have a look at how to implement it. (First, I have to
> dive a bit more into your code, though.)
> However, an instruction that can be used to embed data is of higher priority 
> for me.  Would this be implemented with the various is, ic, ... macros in 
> each jit-XXX-cpu.c?  After such an instruction, would the alignment have to 
> be adjusted, or is it done by lightning automatically on those ports where 
> instructions have to be aligned?

  A good name for it could be jit_embed(void *data, jit_int32_t length);
it would just memcpy the data verbatim. A copy of the argument would have
to be kept until the code is finally emitted.
  It should not embed more than jit_get_max_instr() bytes, which is basically
JIT_MAX_INSTR; otherwise, it might end up writing code out of bounds.
  Alignment is not done automatically before emitting code, so whatever is
embedded should leave the code aligned.
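
  As a rough sketch of the two constraints above (this is not lightning
code): EMBED_MAX_BYTES is a hypothetical stand-in for whatever
jit_get_max_instr() would return, and INSN_ALIGN is an assumed 4-byte
instruction alignment. The helpers only check the length and compute how
many padding bytes would be needed after the data.

```c
#include <stddef.h>

/* Hypothetical stand-in for the value jit_get_max_instr() would return. */
#define EMBED_MAX_BYTES 64u

/* Assumed instruction alignment; 4 bytes is typical of fixed-width ISAs. */
#define INSN_ALIGN 4u

/* Returns 1 if a blob of `length` bytes respects the size limit, else 0. */
int embed_length_ok(size_t length)
{
    return length > 0 && length <= EMBED_MAX_BYTES;
}

/* Padding bytes needed after `length` bytes written at `offset` so that
 * the next instruction starts on an INSN_ALIGN boundary. */
size_t embed_padding(size_t offset, size_t length)
{
    size_t end = offset + length;
    return (INSN_ALIGN - end % INSN_ALIGN) % INSN_ALIGN;
}
```

  On a port without an alignment requirement, the padding step would
simply not be needed.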
  There is an example, far more complicated than needed, in jit_arm.c;
basically, _flush_consts() does:
    jit_memcpy(_jitc->consts.data, _jitc->consts.values, _jitc->consts.size);
    _jit->pc.w += _jitc->consts.size;
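
  The same copy-then-advance pattern can be sketched on a plain byte
buffer. The names below (struct code_buf, emit_bytes) are hypothetical and
are not part of lightning; the explicit bounds check is an addition for
illustration, not something the two lines above do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical code buffer: `pc` is the next write position. */
struct code_buf {
    unsigned char *pc;   /* next byte to be written */
    unsigned char *end;  /* one past the last usable byte */
};

/* Copy `size` bytes verbatim and advance the write position.
 * Returns 0 on success, -1 if the data would run past the buffer. */
int emit_bytes(struct code_buf *buf, const void *data, size_t size)
{
    if (size > (size_t)(buf->end - buf->pc))
        return -1;               /* refuse to write out of bounds */
    memcpy(buf->pc, data, size); /* data is copied as-is, no encoding */
    buf->pc += size;             /* following code is emitted after it */
    return 0;
}
```

  A caller would initialize pc and end from its own storage, call
emit_bytes() with the blob to embed, and treat a -1 result as "no room".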

> [...]
>


