Re: SMP, barriers, etc.

From: Samuel Thibault
Subject: Re: SMP, barriers, etc.
Date: Mon, 28 Dec 2009 12:28:14 +0100
User-agent: Mutt/1.5.12-2006-07-14

Da Zheng, on Mon 28 Dec 2009 19:16:16 +0800, wrote:
> On 09-12-28 11:00 AM, Samuel Thibault wrote:
> > Da Zheng, on Mon 28 Dec 2009 10:31:26 +0800, wrote:
> >> On 09-12-27 6:38 PM, Samuel Thibault wrote:
> >>> Da Zheng, on Sun 27 Dec 2009 16:39:04 +0800, wrote:
> >>>> Is the process above correct?
> >>>
> >>> I have never actually programmed the architectures where things work
> >>> like this (powerpc & such), but that's what I have understood from the
> >>> code and explanations here and there, yes.  It's a sort of transactional
> >>> memory actually.
> >> I just think it's a bit too expensive that a processor has to monitor
> >> other processors' caches even though it's only one address.
> > 
> > It's not more expensive than usual operations: it already has to do it
> > to process cache line invalidations for everything that is in the cache.
> I don't understand. Do you mean processing cache line invalidations in
> the local cache?

Yes: a processor already has to listen to what other processors want to
do with data that it has in its cache.

> >> That conditional store instruction needs to do more if it succeeds. It
> >> has to invalidate cache lines specified by the monitored address in
> >> other processors.
> > 
> > Locked operations on Intel have to do the same :)
> Doesn't the Intel processor maintain cache coherency in hardware?

Err, yes. But I guess that's also the case with the Alpha, no?

> All instructions that modify memory should invalidate the relevant cache
> lines in other processors.

Yes, but that's all handled by the hardware.  What software has to do,
however, is make sure this happens according to the few ordering
requirements it has.

> >> Now it seems to me that the memory barrier is only to ensure that the
> >> processor executes instructions in the order we want.
> > 
> > Not only that, but also as a clobber for the compiler. Note however
> > that atomic_add doesn't have it, and I believe it could be dropped for
> > add_return too. In Linux, atomic.h operations do _not_ necessarily
> atomic_add_return can be used for implementing something like locks, so the
> clobber for the compiler cannot be dropped.


> >> But the data dependency barrier seems to imply cache coherency
> >> according to linux/Documentation/memory-barriers.txt.
> > 
> > Err, there is _some_ cache coherency introduced by dependency barriers,
> > yes.
> > 
> >> A bit confused:-(
> > 
> > By what?
> I'm confused by whether memory barrier instructions imply cache coherency.
> The memory model usually says all other cache lines can be updated
> *eventually*. It doesn't say that memory barrier instructions can update
> cache lines in other processors. So does the data dependency barrier
> invalidate all out-of-date cache lines in the local processor?

Ah, no, it's just a barrier: it doesn't assert that everything that
happened in the machine is visible to the processor; that would be way
too expensive. What it asserts is just the _ordering_ of visibility of
the changes.  That's why in general a memory barrier is needed both on
the read and the write side, so that both the writing processor and the
reading processor cooperate on the ordering of the visibility of the
changes.  That's much more lightweight for the hardware cache coherency
protocol and still enough to implement locks, RCU lists, etc.

