Presentation: Understanding OrderAccess

Doerr, Martin martin.doerr at
Mon Nov 28 10:43:22 UTC 2016

Hi David,

I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved.
I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]).
The term "multiple-copy atomicity" is described as
"... in a machine which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...".

I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example.
The key property of the architectures is that "... writes can be propagated to different threads in different orders ...".

A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses.

Since you have asked about C++11, there's an example implementation for PPC [3].
Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" operations observe writes in a globally consistent order.
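
To make the IRIW scenario concrete, here is a minimal C++11 sketch. It is not hotspot code, and the names (IriwResult, run_iriw_once, readers_disagree, count_disagreements) are made up for illustration. Two writers store to independent variables, and two readers load both variables in opposite order. With seq_cst loads and stores, the language forbids the outcome in which the two readers disagree about the order of the writes. With weaker orderings (e.g. acquire loads), a machine that is not multi-copy atomic may produce it.

```cpp
#include <atomic>
#include <thread>

// Values observed by the two readers in one IRIW run.
struct IriwResult {
    int r1, r2;  // reader A: x first, then y
    int r3, r4;  // reader B: y first, then x
};

// Runs the IRIW litmus test once using sequentially consistent accesses.
IriwResult run_iriw_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    IriwResult res{0, 0, 0, 0};

    // Two independent writers.
    std::thread w1([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread w2([&] { y.store(1, std::memory_order_seq_cst); });

    // Two readers that load the variables in opposite order.
    std::thread ra([&] {
        res.r1 = x.load(std::memory_order_seq_cst);
        res.r2 = y.load(std::memory_order_seq_cst);
    });
    std::thread rb([&] {
        res.r3 = y.load(std::memory_order_seq_cst);
        res.r4 = x.load(std::memory_order_seq_cst);
    });

    w1.join();
    w2.join();
    ra.join();
    rb.join();
    return res;
}

// True if reader A saw "x written before y" while reader B saw
// "y written before x", i.e. there is no globally consistent write order.
bool readers_disagree(const IriwResult& r) {
    return r.r1 == 1 && r.r2 == 0 && r.r3 == 1 && r.r4 == 0;
}

// Repeats the test and counts the runs in which the readers disagree.
int count_disagreements(int iterations) {
    int disagreements = 0;
    for (int i = 0; i < iterations; ++i) {
        if (readers_disagree(run_iriw_once())) {
            ++disagreements;
        }
    }
    return disagreements;
}
```

With seq_cst everywhere, count_disagreements should always return 0. The cost is where the platforms differ: the mapping in [3] needs the heavy-weight sync before each such load on PPC, while a multi-copy atomic machine can get the same guarantee more cheaply.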

By the way, we have implemented the Java volatile accesses very similarly to [3] for PPC64, even though the current Java memory model does not strictly require this implementation.
But I guess the Java memory model is beyond the scope of your presentation.

Best regards,


-----Original Message-----
From: David Holmes [mailto:david.holmes at] 
Sent: Montag, 28. November 2016 06:56
To: Doerr, Martin <martin.doerr at>; hotspot-dev developers <hotspot-dev at>
Subject: Re: Presentation: Understanding OrderAccess

Hi Martin

On 24/11/2016 2:20 AM, Doerr, Martin wrote:
> Hi David,
> thank you very much for the presentation. I think it provides a good guideline for hotspot development.


> Would you like to add something about multi-copy atomicity?

Not really. :)

> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue<E, F, N>::pop_global which is only needed on platforms which don't provide this property (PPC and ARM).
> It is needed in the following scenario:
> - Different threads write 2 variables.
> - Readers of these 2 variables expect a globally consistent order of the write accesses.
> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity".

Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ...

> (While taking a look at it, the condition "#if !(defined SPARC || 
> defined IA32 || defined AMD64)" is not accurate and should be 
> improved. E.g. s390 is multi-copy atomic.)
> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64.

I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? :(


> Thanks and best regards,
> Martin
> -----Original Message-----
> From: hotspot-dev [mailto:hotspot-dev-bounces at] On 
> Behalf Of David Holmes
> Sent: Mittwoch, 23. November 2016 06:08
> To: hotspot-dev developers <hotspot-dev at>
> Subject: Presentation: Understanding OrderAccess
> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers.
> ccess-v1.1.pdf
> Cheers,
> David
