Presentation: Understanding OrderAccess

Mon Nov 28 10:43:22 UTC 2016

Hi David,

I know multi-copy atomicity is hard to understand. It is relevant in complex scenarios that involve more than two threads.
I think paper [1], which we discussed some time ago (email thread [2]), gives a good explanation.

The term "multiple-copy atomicity" is described as
"... in a machine which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...".

I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example.
The key property of architectures that are not multiple-copy atomic is that "... writes can be propagated to different threads in different orders ...".

In hotspot terms, a globally consistent order can be enforced by adding OrderAccess::fence() between the read accesses.
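
A minimal standard C++ sketch of the IRIW scenario with a full fence between the two loads of each reader. It assumes that std::atomic_thread_fence(std::memory_order_seq_cst) is an acceptable stand-in for OrderAccess::fence(); the names full_fence and count_iriw_violations are made up for this illustration and do not exist in hotspot. The guarantee that the fences exclude the disagreeing outcome relies on the C++20 fence rules as I understand them.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for HotSpot's OrderAccess::fence() (assumption:
// a seq_cst fence is the closest standard C++ equivalent).
inline void full_fence() {
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Runs the IRIW litmus test `iterations` times and returns how often the
// two readers disagreed about the order of the two independent writes.
int count_iriw_violations(int iterations) {
  assert(iterations > 0);
  int violations = 0;
  for (int i = 0; i < iterations; ++i) {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0, r3 = 0, r4 = 0;

    // Two independent writers, each writing a different variable.
    std::thread w1([&] { x.store(1, std::memory_order_relaxed); });
    std::thread w2([&] { y.store(1, std::memory_order_relaxed); });

    // Two readers loading the variables in opposite order, with a full
    // fence between the two loads.
    std::thread rd1([&] {
      r1 = x.load(std::memory_order_relaxed);
      full_fence();
      r2 = y.load(std::memory_order_relaxed);
    });
    std::thread rd2([&] {
      r3 = y.load(std::memory_order_relaxed);
      full_fence();
      r4 = x.load(std::memory_order_relaxed);
    });

    w1.join();
    w2.join();
    rd1.join();
    rd2.join();

    // Reader 1 saw x written but not y; reader 2 saw y written but not x.
    if (r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0) {
      ++violations;
    }
  }
  return violations;
}
```

Calling count_iriw_violations(1000) is expected to return 0 as long as the fences are in place. Removing the two full_fence() calls is the variant that may show disagreeing readers on PPC.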

Since you have asked about C++11, there's an example implementation for PPC [3].
"Load Seq Cst" uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" operations observe writes in a globally consistent order.
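
For comparison, a small sketch of the same IRIW shape written purely with C++11 seq_cst loads and stores, i.e. without explicit fences in the source. According to the mapping in [3], a compiler for PPC would place the sync before each of these loads. IriwResult, run_iriw_seq_cst and readers_disagree are hypothetical names used only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values observed by the two readers in one IRIW run.
struct IriwResult {
  int r1, r2;  // reader 1: x first, then y
  int r3, r4;  // reader 2: y first, then x
};

// One IRIW run in which every access is seq_cst (the std::atomic default).
IriwResult run_iriw_seq_cst() {
  std::atomic<int> x{0}, y{0};
  IriwResult r{0, 0, 0, 0};

  std::thread w1([&] { x.store(1, std::memory_order_seq_cst); });
  std::thread w2([&] { y.store(1, std::memory_order_seq_cst); });
  std::thread rd1([&] {
    r.r1 = x.load(std::memory_order_seq_cst);
    r.r2 = y.load(std::memory_order_seq_cst);
  });
  std::thread rd2([&] {
    r.r3 = y.load(std::memory_order_seq_cst);
    r.r4 = x.load(std::memory_order_seq_cst);
  });

  w1.join();
  w2.join();
  rd1.join();
  rd2.join();
  return r;
}

// True if the readers saw the two writes in opposite orders, which a
// single total order over all seq_cst operations rules out.
bool readers_disagree(const IriwResult& r) {
  return r.r1 == 1 && r.r2 == 0 && r.r3 == 1 && r.r4 == 0;
}
```

Because all four loads and both stores take part in one total order, readers_disagree(run_iriw_seq_cst()) should never be true, regardless of whether the hardware is multi-copy atomic.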

By the way, we have implemented the Java volatile accesses for PPC64 very similarly to [3], even though the current Java memory model does not strictly require this implementation.
But I guess the Java memory model is beyond the scope of your presentation.

Best regards,
Martin

[1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
[2] http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030212.html
[3] http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html

-----Original Message-----
From: David Holmes [mailto:david.holmes at oracle.com] 
Sent: Monday, 28 November 2016 06:56
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
Subject: Re: Presentation: Understanding OrderAccess

Hi Martin

On 24/11/2016 2:20 AM, Doerr, Martin wrote:
> Hi David,
>
> thank you very much for the presentation. I think it provides a good guideline for hotspot development.

Thanks.

>
> Would you like to add something about multi-copy atomicity?

Not really. :)

> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue<E, F, N>::pop_global which is only needed on platforms which don't provide this property (PPC and ARM).
>
> It is needed in the following scenario:
> - Different threads write 2 variables.
> - Readers of these 2 variables expect a globally consistent order of the write accesses.
>
> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity".

Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ...

> (While taking a look at it, I noticed that the condition "#if !(defined SPARC ||
> defined IA32 || defined AMD64)" is not accurate and should be
> improved. E.g. s390 is multi-copy atomic.)
>
>
> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64.

I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? :(

Cheers,
David

>
> Thanks and best regards,
> Martin
>
>
> -----Original Message-----
> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of David Holmes
> Sent: Wednesday, 23 November 2016 06:08
> To: hotspot-dev developers <hotspot-dev at openjdk.java.net>
> Subject: Presentation: Understanding OrderAccess
>
> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers.
>
> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf
>
> Cheers,
> David
>