RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64

Doerr, Martin martin.doerr at sap.com
Fri Oct 21 12:57:42 UTC 2016

Hi all,

thank you very much for reviewing. I fully agree with the latest replies.

I think Hiroshi's latest webrev (http://cr.openjdk.java.net/~horii/8154736/webrev.05/) is already pretty close to what was discussed.
The only thing left is the remaining acquire barriers, which could be replaced by a comment like "We rely on memory_order_consume here.".
I'd prefer that, too, even though acquire barriers in the failure cases would probably not really hurt.
Cmpxchg Release,Relaxed + Load Consume seems to be the pattern that matches the needs exactly.
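
To make the pattern concrete, here is a minimal sketch in standard C++11 atomics rather than HotSpot's own Atomic/OrderAccess API. It reads "Release,Relaxed" as release ordering on CAS success and relaxed ordering on CAS failure, which is my interpretation. The names Obj, forward and forwarded_payload are hypothetical and do not appear in the webrev.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a heap object with a forwarding pointer.
struct Obj {
    int payload = 0;
    std::atomic<Obj*> forwardee{nullptr};
};

// Copy old_obj into new_obj, then try to publish new_obj as the forwardee.
// Returns whichever copy ended up installed.
Obj* forward(Obj* old_obj, Obj* new_obj) {
    new_obj->payload = old_obj->payload;  // the copy that must be visible first

    Obj* expected = nullptr;
    if (old_obj->forwardee.compare_exchange_strong(
            expected, new_obj,
            std::memory_order_release,    // success: publish the finished copy
            std::memory_order_relaxed)) { // failure: no barrier
        return new_obj;
    }
    // Lost the race. We rely on memory_order_consume here: reload the
    // winner's pointer so that dereferencing it is ordered after its copy.
    return old_obj->forwardee.load(std::memory_order_consume);
}

// Reader side: load the forwardee, then load values through it.
int forwarded_payload(const Obj* old_obj) {
    const Obj* f = old_obj->forwardee.load(std::memory_order_consume);
    return f != nullptr ? f->payload : old_obj->payload;
}
```

Calling forward() twice on the same old object with two different copies returns the first copy both times. Note that current mainstream compilers implement memory_order_consume as acquire, so on ppc64 this sketch would still emit a barrier on the load; avoiding that is exactly why the webrev relies on the hardware's address dependency ordering instead.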

The webrev also contains a logging change in psPromotionManager.inline.hpp; I'm not sure whether it is still wanted.

I'm not sure whether aarch64 should be addressed in a separate change.

Besides that, it looks good to me.

Best regards,

-----Original Message-----
From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of Andrew Haley
Sent: Tuesday, October 11, 2016 11:26
To: Kim Barrett; David Holmes
Cc: hotspot-compiler-dev; Hiroshi H Horii; Tim Ellison; ppc-aix-port-dev at openjdk.java.net; Michihiro Horie; hotspot-gc-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64

On 06/10/16 23:16, Kim Barrett wrote:

> The key issue here is that we copy obj into new_obj, and then make
> new_obj accessible to other threads via the CAS.  Those other
> threads might attempt to access data in new_obj.  This suggests the
> CAS ought to have at least a release fence to ensure the copy is
> complete before the CAS is performed.  No amount of fencing on the
> read side (such as in the work stealing) can remove that need.

I agree.

> And that might be all that is needed.  On the post-CAS side, we load
> the forwardee and then load values from it.  I think we can use
> implicit consume with dependent loads (except on Alpha) plus the
> suggested release fence to get the desired effect.

That's probably true, except that there's not really any such thing as
"implicit consume" in C++.  While all of the hardware we use respects
address dependencies, it's not something that the compiler knows
about, and it's explicitly undefined behaviour in the C++ memory
model.  If we're depending on memory_order_consume, perhaps we ought
to think about adding it to Atomic, even though it's just a volatile
load in older compilers.
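
As a rough illustration of what such an addition could look like, here is a sketch of a consume-load helper written against std::atomic. The name load_consume and the Node type are hypothetical; this is not an existing HotSpot Atomic function, and a pre-C++11 port would have to fall back to a plain volatile load as described above.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: load a pointer so that later loads through it
// carry a dependency on this load (memory_order_consume).
template <typename T>
inline T* load_consume(const std::atomic<T*>& src) {
    return src.load(std::memory_order_consume);
}

// Hypothetical payload type used only to exercise the helper.
struct Node {
    int value = 0;
};
```

A writer would fill in a Node, publish it with a release store or release CAS, and readers would go through load_consume() before dereferencing. Having a named entry point documents the dependency in the source, even where the compiler currently strengthens consume to acquire.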

