Memory ordering properties of Atomic::r-m-w operations
David Holmes
david.holmes at oracle.com
Tue Nov 8 10:35:17 UTC 2016
On 8/11/2016 8:18 PM, Andrew Haley wrote:
> On 08/11/16 01:11, David Holmes wrote:
>> On 6/11/2016 8:54 PM, Andrew Haley wrote:
>>> On 05/11/16 18:43, David Holmes wrote:
>>>> Forking new discussion from:
>>>>
>>>> RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
>>>>
>>>> On 1/11/2016 7:44 PM, Andrew Haley wrote:
>>>>> On 31/10/16 21:30, David Holmes wrote:
>>
>>> if you have
>>>
>>> store_relaxed(a)
>>> load_seq_cst(b)
>>> store_seq_cst(c)
>>> load_relaxed(d)
>>>
>>> there's nothing to prevent
>>>
>>> load_seq_cst(b)
>>> load_relaxed(d)
>>> store_relaxed(a)
>>> store_seq_cst(c)
>>>
>>> It is true that neither store a nor load d have moved across this
>>> operation, but they have exchanged places. As far as GCC is concerned
>>> this is a correct implementation, and it does meet the requirement of
>>> sequential consistency as defined in the C++ memory model.
>>
>> It does? Then it emphasises what I just said about not knowing what it
>> means to implement an operation with seq_cst semantics.
>
> I take your point, but seq_cst is not a real mystery, it's just a
> matter of looking it up: it's all defined in the C++11 standard. And
> it's not significantly different from Java volatile.
I have looked at it of course, but still find it rather "mysterious".
>> I would have expected full ordering of all loads and stores to get
>> "sequential consistency".
>
> Why? There are only two sequentially-consistent loads and stores in
> that block of code. Of course those two have a total order. But you
> surely wouldn't expect a sequentially-consistent store to be ordered
> with respect to a relaxed load.
I guess I think of sequentially consistent as a global property of a
system, not relative to just atomic operations.
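For reference, the four accesses quoted above can be written with C++11 atomics roughly as follows. This is only a sketch: the variables a, b, c, d and the helper four_accesses are hypothetical names, not HotSpot code. Only the two seq_cst operations take part in the single total order. The relaxed store to a and the relaxed load of d are not ordered against them in the directions described in the quoted text.

```cpp
#include <atomic>

// Hypothetical shared variables matching the names in the quoted example.
std::atomic<int> a{0}, b{0}, c{0}, d{0};

struct Observed {
    int b;
    int d;
};

Observed four_accesses() {
    // Relaxed store: atomic, but imposes no ordering on other locations.
    a.store(1, std::memory_order_relaxed);
    // seq_cst load: acts as an acquire, so the earlier relaxed store to a
    // is permitted to become visible after it.
    int rb = b.load(std::memory_order_seq_cst);
    // seq_cst store: acts as a release, so the later relaxed load of d
    // is permitted to be performed before it.
    c.store(1, std::memory_order_seq_cst);
    // Relaxed load: no ordering with respect to the store to c above.
    int rd = d.load(std::memory_order_relaxed);
    return {rb, rd};
}
```

Run on a single thread this always behaves as written. The permitted reordering only matters to another thread observing a, b, c and d concurrently.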
>>> Ouch. Yes, I agree that something needs fixing. That comment:
>>>
>>> // Use release_store_fence to update values like the thread state,
>>> // where we don't want the current thread to continue until all our
>>> // prior memory accesses (including the new thread state) are visible
>>> // to other threads.
>>>
>>> ... seems very unhelpful, at least because a release fence (using
>>> conventional terminology) does not have that property: a release
>>> fence is only LoadStore|StoreStore.
>>
>> In release_store_fence the release and fence are distinct memory
>> ordering components. It is not a store combined with a "release
>> fence" but a store between a "release" and a "fence". And critically
>> in hotspot that "fence" must have visibility guarantees to ensure
>> correctness of Dekker-duality algorithms.
>
> Ah, that is a slightly misleading name. The "_fence" at the end of
> the name is really a StoreLoad fence, got it. I noticed that once
> before, but I'd forgotten. I guess what's intended here is a
> sequentially-consistent store.
It is intended to be:
release(); store; fence();
but might be implementable in a more efficient manner when combined in a
single function.
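In C++11 terms, that three-step sequence might be sketched as below. The function name release_store_fence_sketch is hypothetical, and mapping fence() to a seq_cst thread fence is an assumption made for illustration. It is not how orderAccess.hpp implements it on any particular platform.

```cpp
#include <atomic>

// Hypothetical stand-in for release(); store; fence(); using C++11 fences.
inline void release_store_fence_sketch(std::atomic<int>& field, int value) {
    // release(): prior loads and stores may not move below the store.
    std::atomic_thread_fence(std::memory_order_release);
    // The store itself carries no ordering of its own here.
    field.store(value, std::memory_order_relaxed);
    // fence(): assumed here to correspond to a full (seq_cst) fence, which
    // also orders the store before any subsequent loads.
    std::atomic_thread_fence(std::memory_order_seq_cst);
}
```

A caller updating something like a thread state would call release_store_fence_sketch(state, new_value) and could then go on to read the other side's flag, as in a Dekker-style handshake.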
I have a problem with referring to a "storeload fence". storeload is one
form of memory barrier - a full fence represents all four forms to me.
Terminology is a disaster in this field, unfortunately: one
architecture's barrier is another's fence. :(
>> Note the equivalence of release() with LoadStore|StoreStore is a
>> definition within orderAccess.hpp, it is not a general equivalence.
>
> OK. It would certainly be nice if HotSpot could move to using
> standard terminology. Then, in time, we could just use the C++11
> atomics.
The stand-alone (unbound) release() and acquire() are defined as they
are to allow them to be associated with a subsequent store, or previous
load, in cases where we cannot access the variable directly to apply a
release_store, or load_acquire operation. This is somewhat independent
of the atomic API.
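As a rough illustration of that pairing, here is a sketch using C++11 fences. The names flag, payload, publish and try_consume are hypothetical, and the relaxed accesses stand in for a store or load that is performed elsewhere, where release_store or load_acquire cannot be applied directly.

```cpp
#include <atomic>

// Hypothetical flag and payload, for illustration only.
std::atomic<int> flag{0};
int payload = 0;

void publish(int v) {
    payload = v;
    // Stand-alone release, associated with the subsequent store.
    std::atomic_thread_fence(std::memory_order_release);
    flag.store(1, std::memory_order_relaxed);
}

bool try_consume(int& out) {
    if (flag.load(std::memory_order_relaxed) == 0) {
        return false;
    }
    // Stand-alone acquire, associated with the previous load.
    std::atomic_thread_fence(std::memory_order_acquire);
    out = payload;
    return true;
}
```

If try_consume sees the flag set, the fence pair ensures it also sees the payload written before publish set it.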
David
-----
> Andrew.
>