Acquire/Release vs Volatile in VarHandle

Andrew Haley aph at redhat.com
Tue Jul 18 16:49:22 UTC 2017


On 18/07/17 09:30, Gianluca Stivan wrote:

> apologies if this is not the correct mailing list, it seemed the most
> relevant :).

hotspot-dev would be it.

> I've been experimenting with the new VarHandle APIs and especially the
> Acquire/Release and Opaque modes. To my understanding they should map
> pretty much 1-1 to the memory_order_{acquire,release} and
> memory_order_relaxed in C++11.
> 
> As I was benchmarking, I didn't notice any significant change in
> performance from relaxing my memory constraints from Volatile to
> Acquire/Release. I am using ARMv7, as it's hardware with a weaker memory
> model.
> 
> I am just a beginner in this topic, but I started digging into it and
> noticed that it appears that all the calls to Acquire/Release just become
> calls to Volatile.
> 
> // X-VarHandle.java.template
> @ForceInline
> static $type$ getAcquire(FieldInstanceReadOnly handle, Object holder) {
>   return UNSAFE.get$Type$Acquire(
>       Objects.requireNonNull(handle.receiverType.cast(holder)),
>       handle.fieldOffset);
> }
> 
> // Unsafe
> @HotSpotIntrinsicCandidate
> public final Object getObjectAcquire(Object o, long offset) {
>   return getObjectVolatile(o, offset);
> }
> 
> Am I correct in that? Why is that?

The clue is "HotSpotIntrinsicCandidate": it means that your call to
getObjectAcquire might be replaced with a compiler intrinsic.  But only
might: it depends on how much work the author of that port has done.
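
For reference, a minimal sketch of the access modes under discussion, all
applied to one field. The class `Flag` and its field `state` are hypothetical
names chosen for illustration; only the VarHandle calls are real API. Each
call is guaranteed to be at least as strong as the mode it names, so a port
that lacks the intrinsic can legally execute all of them as volatile accesses.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class Flag {
    // Hypothetical field; it is only ever accessed through STATE below.
    private int state;

    private static final VarHandle STATE;
    static {
        try {
            STATE = MethodHandles.lookup()
                    .findVarHandle(Flag.class, "state", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Strongest mode: sequentially consistent, like a Java volatile field.
    public void setVolatile(int v) { STATE.setVolatile(this, v); }
    public int getVolatile() { return (int) STATE.getVolatile(this); }

    // Release store paired with an acquire load.
    public void setRelease(int v) { STATE.setRelease(this, v); }
    public int getAcquire() { return (int) STATE.getAcquire(this); }

    // Weakest of the three modes shown here.
    public void setOpaque(int v) { STATE.setOpaque(this, v); }
    public int getOpaque() { return (int) STATE.getOpaque(this); }

    public static void main(String[] args) {
        Flag f = new Flag();
        f.setRelease(1);
        System.out.println(f.getAcquire());
        f.setOpaque(2);
        System.out.println(f.getOpaque());
        f.setVolatile(3);
        System.out.println(f.getVolatile());
    }
}
```

In a single thread all three modes behave identically; any difference shows
up only in the barriers the JIT emits and in what other threads may observe.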

> I then thought, if I can't lower my constraints using the VarHandle APIs,
> maybe I could use fences. But again the results are similar. After some
> more digging, it appears that release/acquire fences compile to a full
> fence (at least on linux_arm).
> 
> // orderAccess_linux_arm.inline.hpp
> inline void OrderAccess::acquire()    { dmb_ld(); }
> inline void OrderAccess::release()    { dmb_sy(); }
> inline void OrderAccess::fence()      { dmb_sy(); }
> 
> `dmb_ld()` is just `dmb_sy()` unless the architecture is AARCH64 (which is
> ARM >= v8, as I understand)
> 
> As I mentioned, I'm new to all of this, but am I correct in understanding
> that there seems to be no way to actually get true release/acquire/relaxed
> behavior (as you would in C++) on the current build of JDK9?
> Similar algorithms in C++ do show some differences in performance when I
> relax memory constraints from seq_cst to a mix of relaxed and
> acquire/release.

It probably doesn't mean much more than that the port for the target you're
running on does not generate fully optimized code for acquire/release.  To
find out, I'd build the disassembler and look at the generated code.
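
One possible way to do that is sketched below: a small harness that calls an
acquire load and a volatile load often enough for the JIT to compile them, so
the two code sequences can be compared. The class and method names
(`AcquireVsVolatile`, `readAcquire`, `readVolatile`, `warmUp`) are
hypothetical, and the iteration count is an arbitrary assumption. Printing
the assembly needs the hsdis disassembler plugin on the JVM's library path.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Run with, for example:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly AcquireVsVolatile
final class AcquireVsVolatile {
    private int x = 1;

    private static final VarHandle X;
    static {
        try {
            X = MethodHandles.lookup()
                    .findVarHandle(AcquireVsVolatile.class, "x", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The two loads whose generated code is to be compared.
    static int readAcquire(AcquireVsVolatile o) {
        return (int) X.getAcquire(o);
    }

    static int readVolatile(AcquireVsVolatile o) {
        return (int) X.getVolatile(o);
    }

    // Calls both loads in a hot loop so that the JIT compiles them.
    // The sum is returned so the loads cannot be discarded as dead code.
    static long warmUp(int iterations) {
        AcquireVsVolatile o = new AcquireVsVolatile();
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += readAcquire(o) + readVolatile(o);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(warmUp(1_000_000));
    }
}
```

In the output, compare the barrier instructions around the two loads (on ARM,
the `dmb` variants). If both loads come out with the same barriers, the port
is treating acquire as volatile. The JIT may inline both methods into
`warmUp`, in which case the loads appear in the compiled loop body instead.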

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


More information about the jdk9-dev mailing list