[vector]: RFR (XS): Small bug fixes
Lupusoru, Razvan A
razvan.a.lupusoru at intel.com
Tue Feb 27 01:40:46 UTC 2018
Gotcha - thanks to both of you for clarifying!
I recommend that we instead update it to Unsafe.loadFence() followed by Unsafe.storeFence(). From my understanding, this should be safe while having the desired performance characteristics. As far as I can tell, this will generate: MemBarCPUOrder, LoadFence, MemBarCPUOrder, StoreFence. MemBarCPUOrder seems to be what I want, since it provides memory ordering guarantees in the compiler. And LoadFence and StoreFence translate to nothing on x86 because of its memory model.
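For illustration only, here is a minimal sketch of the loadFence() followed by storeFence() pairing. It assumes sun.misc.Unsafe obtained reflectively through its theUnsafe field (the vector code itself may use a different Unsafe class); FenceSketch, orderingBarrier and copyThenFence are hypothetical names, and the sketch does not show which nodes C2 actually emits.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class FenceSketch {
    // Assumption: sun.misc.Unsafe is reachable via reflection on this JDK.
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Hypothetical helper: the proposed replacement for fullFence().
    static void orderingBarrier() {
        UNSAFE.loadFence();   // loads before this point are not reordered with later accesses
        UNSAFE.storeFence();  // stores before this point are not reordered with later stores
    }

    // Hypothetical usage: plain memory accesses followed by the barrier.
    static int[] copyThenFence(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        orderingBarrier();
        return dst;
    }

    public static void main(String[] args) {
        int[] out = copyThenFence(new int[] {1, 2, 3});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

Whether the pair really compiles to nothing on x86 would have to be confirmed by inspecting the generated code; the sketch only shows the call pattern.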
Even better, we could update the C2 side to manually insert MemBarCPUOrder without the load and store fences. This would allow us to move forward in the absence of an actual solution for aliasing in the presence of wide memory accesses.
Thoughts?
Thanks,
Razvan
From: John Rose [mailto:john.r.rose at oracle.com]
Sent: Monday, February 26, 2018 5:33 PM
To: Paul Sandoz <paul.sandoz at oracle.com>
Cc: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; panama-dev at openjdk.java.net
Subject: Re: [vector]: RFR (XS): Small bug fixes
On Feb 26, 2018, at 5:18 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
On Feb 26, 2018, at 5:04 PM, Lupusoru, Razvan A <razvan.a.lupusoru at intel.com> wrote:
Hi Vladimir,
I am not too familiar with what Unsafe.fullFence() ends up generating in terms of code. However, if it generates something like an "mfence" instruction, it is undesirable. Ideally we want a scheduling barrier, so that C2 will not move memory operations across the barrier, but one that has zero cost in terms of generated code. If it does indeed have zero cost, then the patch looks fine to me.
It will generate something equivalent to mfence, ‘lock addl’ (see matches for MemBarVolatileNode and also OrderAccess::fence).
I thought this was just a temporary reprieve to avoid crashes until something better is worked out.
Yes, the fullFence is stronger, associated with volatile reads and writes.
Most raw unsafe accesses use CPU order barriers which are what Razvan wants.
But the Java Unsafe API doesn't directly provide such a fence.
Perhaps one could trick C2 into placing a CPU order barrier next to code
which folds up to nothing, but such a trick would be fragile. Certainly
CPU order fences must have been considered as a possible fence,
but they aren't in today's kit.
We could try Reference.reachabilityFence(null), which (as it happens)
is not reorderable. That's cheaper than fullFence.
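As a rough sketch of that suggestion, the call below places Reference.reachabilityFence(null) after a run of plain memory accesses. ReachabilityFenceSketch and sumThenFence are hypothetical names, the method requires Java 9 or later, and the sketch only shows where the call would sit; it does not demonstrate how C2 treats it.

```java
import java.lang.ref.Reference;

public class ReachabilityFenceSketch {
    // Hypothetical usage: plain loads followed by the cheaper fence candidate.
    static long sumThenFence(long[] data) {
        long sum = 0;
        for (long v : data) {
            sum += v;
        }
        // Passing null is allowed; the call is used here purely as an ordering point.
        Reference.reachabilityFence(null);
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumThenFence(new long[] {1, 2, 3}));
    }
}
```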
— John