Presentation: Understanding OrderAccess
martin.doerr at sap.com
Mon Nov 28 10:43:22 UTC 2016
I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved.
I think a good explanation is given in the paper  which we had discussed some time ago (email thread ).
The term "multiple-copy atomicity" is described as
"... in a machine which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...".
I think "IRIW" (in  "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example.
The key property of the architectures is that "... writes can be propagated to different threads in different orders ...".
A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses.
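To make the IRIW scenario concrete, here is a minimal sketch in standard C++ (not hotspot code). It assumes std::atomic_thread_fence(std::memory_order_seq_cst) as a stand-in for OrderAccess::fence(). The names IriwResult, full_fence, run_iriw_once, readers_disagree and count_disagreements are made up for this illustration. Two writers store to x and y independently, and two readers load both variables in opposite order with a full fence between the loads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values observed by the two reader threads in one IRIW run.
struct IriwResult {
  int r1 = 0;  // reader 1: first load  (x)
  int r2 = 0;  // reader 1: second load (y)
  int r3 = 0;  // reader 2: first load  (y)
  int r4 = 0;  // reader 2: second load (x)
};

// Assumption: a seq_cst fence plays the role of OrderAccess::fence().
inline void full_fence() {
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// One IRIW round: two independent writers, two readers reading in opposite order.
IriwResult run_iriw_once() {
  std::atomic<int> x{0};
  std::atomic<int> y{0};
  IriwResult res;

  std::thread writer1([&] { x.store(1, std::memory_order_relaxed); });
  std::thread writer2([&] { y.store(1, std::memory_order_relaxed); });
  std::thread reader1([&] {
    res.r1 = x.load(std::memory_order_relaxed);
    full_fence();  // the fence between the two read accesses
    res.r2 = y.load(std::memory_order_relaxed);
  });
  std::thread reader2([&] {
    res.r3 = y.load(std::memory_order_relaxed);
    full_fence();  // the fence between the two read accesses
    res.r4 = x.load(std::memory_order_relaxed);
  });

  writer1.join();
  writer2.join();
  reader1.join();
  reader2.join();
  return res;
}

// True if reader 1 saw "x before y" while reader 2 saw "y before x",
// i.e. the readers disagree about the order of the two writes.
bool readers_disagree(const IriwResult& r) {
  return r.r1 == 1 && r.r2 == 0 && r.r3 == 1 && r.r4 == 0;
}

// Repeats the round and counts how often the readers disagreed.
int count_disagreements(int iterations) {
  assert(iterations >= 0);
  int disagreements = 0;
  for (int i = 0; i < iterations; ++i) {
    if (readers_disagree(run_iriw_once())) {
      ++disagreements;
    }
  }
  return disagreements;
}
```

With the fences in place, count_disagreements(n) is expected to return 0. Note that on multi-copy-atomic hardware such as x86 the disagreeing outcome does not show up even without the fences, so a clean run there says nothing about PPC.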
Since you have asked about C++11, there's an example implementation for PPC .
Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" operations observe writes in a globally consistent order.
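The "Load Seq Cst" shape described above can be approximated in portable C++ as a full fence followed by an acquire load. This is only a sketch: load_seq_cst_sketch is a hypothetical helper name, and the comments about which PPC instructions a compiler would emit are assumptions based on the commonly cited C++11 mapping, not something this code enforces.

```cpp
#include <atomic>

// Sketch of a "Load Seq Cst" built from a leading full fence plus an acquire load.
template <typename T>
T load_seq_cst_sketch(const std::atomic<T>& v) {
  // Full fence before the load (assumed to become a heavy-weight sync on PPC;
  // the analogue of OrderAccess::fence()).
  std::atomic_thread_fence(std::memory_order_seq_cst);
  // Acquire load: later accesses in this thread are not reordered before it.
  return v.load(std::memory_order_acquire);
}
```

In ordinary code one would simply write v.load(std::memory_order_seq_cst) and leave the instruction selection to the compiler; the helper only spells out where the fence sits relative to the load.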
Btw.: We have implemented the Java volatile accesses very similar to  for PPC64 even though the recent Java memory model does not strictly require this implementation.
But I guess the Java memory model is beyond the scope of your presentation.
From: David Holmes [mailto:david.holmes at oracle.com]
Sent: Monday, 28 November 2016 06:56
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
Subject: Re: Presentation: Understanding OrderAccess
On 24/11/2016 2:20 AM, Doerr, Martin wrote:
> Hi David,
> thank you very much for the presentation. I think it provides a good guideline for hotspot development.
> Would you like to add something about multi-copy atomicity?
Not really. :)
> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue<E, F, N>::pop_global which is only needed on platforms which don't provide this property (PPC and ARM).
> It is needed in the following scenario:
> - Different threads write 2 variables.
> - Readers of these 2 variables expect a globally consistent order of the write accesses.
> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity".
Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ...
> (While taking a look at it, the condition "#if !(defined SPARC ||
> defined IA32 || defined AMD64)" is not accurate and should be
> improved. E.g. s390 is multi-copy atomic.)
> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64.
I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? :(
> Thanks and best regards,
> -----Original Message-----
> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On
> Behalf Of David Holmes
> Sent: Wednesday, 23 November 2016 06:08
> To: hotspot-dev developers <hotspot-dev at openjdk.java.net>
> Subject: Presentation: Understanding OrderAccess
> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers.