Presentation: Understanding OrderAccess

David Holmes david.holmes at oracle.com
Mon Nov 28 21:22:29 UTC 2016


Hi Martin,

I've added Erik explicitly to the cc as he and I have been discussing 
fences and "visibility", and of course he most recently revised the 
descriptions in orderAccess.hpp.

On 29/11/2016 2:29 AM, Doerr, Martin wrote:
> Hi David,
>
> sending the email again with corrected subject + removed confusing statement. My spam filter had added "[JUNK]". I have no clue what it didn't like. Sorry for that.
>
>> Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee?
>
> This is really hard to explain. Maybe there are better explanations out there, but I'll give it a try:
>
> I think the comment in orderAccess.hpp is not bad:
> // Finally, we define a "fence" operation, as a bidirectional barrier.
> // It guarantees that any memory access preceding the fence is not
> // reordered w.r.t. any memory accesses subsequent to the fence in program
> // order.
>
> One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B.
> If A contains a load, one has to include the corresponding store which may have been performed by another thread into A.
> Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one.

But again that attribution of global properties is not something I think 
is necessarily implied or intended by OrderAccess. Or maybe it is, but 
as it is only an issue on non-multicopy-atomic systems, it has never 
been called out explicitly. ?? And those global properties must also be 
a part of the other barriers (as the fence is just the combination of 
them all) - but I don't know how you would describe the effects of the 
other barriers (like loadload) in "global" terms.
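
To make the scenario concrete, here is a minimal sketch of the IRIW shape 
in C++11 terms. It is only an illustration: the names IriwResult, 
run_iriw_once, readers_disagree and count_disagreements are hypothetical 
and nothing here is taken from hotspot. Two writers store to x and y, and 
two readers load them in opposite orders. With every access tagged 
seq_cst, the C++ memory model does not allow the two readers to disagree 
about the order of the writes. As far as I understand it, weakening the 
accesses to acquire/release would permit that outcome on hardware that is 
not multi-copy atomic.

```cpp
#include <atomic>
#include <thread>

// Values observed by the two reader threads in one run.
struct IriwResult {
    int r1, r2;  // reader 1: x first, then y
    int r3, r4;  // reader 2: y first, then x
};

// One run of the IRIW ("independent reads of independent writes") shape.
// All accesses are seq_cst, so they take part in a single total order.
IriwResult run_iriw_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    IriwResult res{0, 0, 0, 0};

    std::thread writer1([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread writer2([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread reader1([&] {
        res.r1 = x.load(std::memory_order_seq_cst);
        res.r2 = y.load(std::memory_order_seq_cst);
    });
    std::thread reader2([&] {
        res.r3 = y.load(std::memory_order_seq_cst);
        res.r4 = x.load(std::memory_order_seq_cst);
    });

    writer1.join();
    writer2.join();
    reader1.join();
    reader2.join();
    return res;
}

// True if reader 1 saw x written before y while reader 2 saw y written
// before x, i.e. the readers disagree on the order of the two writes.
bool readers_disagree(const IriwResult& r) {
    return r.r1 == 1 && r.r2 == 0 && r.r3 == 1 && r.r4 == 0;
}

// Repeats the run and counts how often the readers disagreed.
int count_disagreements(int iterations) {
    int disagreements = 0;
    for (int i = 0; i < iterations; ++i) {
        if (readers_disagree(run_iriw_once())) {
            ++disagreements;
        }
    }
    return disagreements;
}
```

With seq_cst throughout, count_disagreements(n) should be 0 for any n. In 
the PPC mapping cited as [3] below, this is what the sync before each 
seq_cst load pays for.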

David
-----

>
>> Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ...
>> but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones?
>
> "Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4]
>
> So acquire+release orders wrt. all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5].
>
>
> Best regards,
> Martin
>
> [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering
> [5] http://g.oswego.edu/dl/jmm/cookbook.html
>
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Montag, 28. November 2016 13:56
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
> Subject: Re: Presentation: Understanding OrderAccess
>
> Hi Martin,
>
> On 28/11/2016 8:43 PM, Doerr, Martin wrote:
>> Hi David,
>>
>> I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved.
>> I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]).
>>
>> The term "multiple-copy atomicity" is described as "... in a machine
>> which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...".
>>
>> I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example.
>> The key property of the architectures is that "... writes can be propagated to different threads in different orders ...".
>
> Thanks for the reminder of that discussion. :)
>
>> A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses.
>
> Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomic architectures to use just for this purpose.
>
>> Since you have asked about C++11, there's an example implementation for PPC [3].
>> Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" operations observe writes in a globally consistent order.
>
> Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones?
>
>> Btw.: We have implemented the Java volatile accesses very similarly to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation.
>> But I guess the Java memory model is beyond the scope of your presentation.
>
> Oh yes way out of scope! :)
>
> Cheers,
> David
>
>> Best regards,
>> Martin
>>
>>
>> [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
>> [2] http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030212.html
>> [3] http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html
>>
>>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Montag, 28. November 2016 06:56
>> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev developers
>> <hotspot-dev at openjdk.java.net>
>> Subject: Re: Presentation: Understanding OrderAccess
>>
>> Hi Martin
>>
>> On 24/11/2016 2:20 AM, Doerr, Martin wrote:
>>> Hi David,
>>>
>>> thank you very much for the presentation. I think it provides a good guideline for hotspot development.
>>
>> Thanks.
>>
>>>
>>> Would you like to add something about multi-copy atomicity?
>>
>> Not really. :)
>>
>>> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue<E, F, N>::pop_global which is only needed on platforms which don't provide this property (PPC and ARM).
>>>
>>> It is needed in the following scenario:
>>> - Different threads write 2 variables.
>>> - Readers of these 2 variables expect a globally consistent order of the write accesses.
>>>
>>> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity".
>>
>> Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ...
>>
>>> (While taking a look at it, the condition "#if !(defined SPARC ||
>>> defined IA32 || defined AMD64)" is not accurate and should be
>>> improved. E.g. s390 is multi-copy atomic.)
>>>
>>>
>>> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64.
>>
>> I still can't get my head around the C++11 terminology for this and
>> how you are expected to use it - what does it mean for an individual
>> operation to be "sequentially consistent" ? :(
>>
>> Cheers,
>> David
>>
>>>
>>> Thanks and best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On
>>> Behalf Of David Holmes
>>> Sent: Mittwoch, 23. November 2016 06:08
>>> To: hotspot-dev developers <hotspot-dev at openjdk.java.net>
>>> Subject: Presentation: Understanding OrderAccess
>>>
>>> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers.
>>>
>>> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf
>>>
>>> Cheers,
>>> David
>>>

