[concurrency-interest] RFR: 8065804: JEP 171: Clarifications/corrections for fence intrinsics

Sun Dec 7 22:40:41 UTC 2014

On 6/12/2014 7:49 AM, Martin Buchholz wrote:
> On Thu, Dec 4, 2014 at 5:55 PM, David Holmes <david.holmes at oracle.com> wrote:
>
>> In general phrasing like: " also known as a LoadLoad plus LoadStore barrier
>> ..." is misleading to me as these are not "aliases"- the loadFence (in this
>> case) is being specified to have the same semantics as the
>> loadload|storeload. It should say "corresponds to a LoadLoad plus LoadStore
>> barrier"
>
> +     * Ensures that loads before the fence will not be reordered with loads and
> +     * stores after the fence; also known as a LoadLoad plus LoadStore barrier,
>
> I don't understand this.  I believe they _are_ aliases.  The first
> clause perfectly describes a "LoadLoad plus LoadStore barrier".

I find the language use inappropriate - you are defining the first to be 
the second.

>   - as per the "Corresponds to a C11 ...". And referring to things
>> like "load-acquire fence" is meaningless without some reference to a
>> definition - who defines a load-acquire fence? Is there a universal
>> definition? I would be okay with something looser eg:
>
> Well, I'm defining "load-acquire fence" here in the javadoc - I'm
> claiming that loadFence is also known via other terminology, including
> "load-acquire fence".  Although it's true that "load-acquire fence" is
> also used to refer to the corresponding C11 fence, which has subtly
> different semantics.

When you say "also known as XXX" it means that XXX is already defined 
elsewhere. Unless there is a generally accepted definition of XXX then 
this doesn't add much value.

>> /**
>>    * Ensures that loads before the fence will not be reordered with loads and
>>    * stores after the fence. Corresponds to a LoadLoad plus LoadStore
>> barrier,
>>    * and also to the C11 atomic_thread_fence(memory_order_acquire).
>>    * Sometimes referred to as a "load-acquire fence".
>>    *
>>
>> Also I find this comment strange:
>>
>> !      * A pure StoreStore fence is not provided, since the addition of
>> LoadStore
>> !      * is almost always desired, and most current hardware instructions
>> that
>> !      * provide a StoreStore barrier also provide a LoadStore barrier for
>> free.
>>
>> because inside hotspot we use storeStore barriers a lot, without any
>> loadStore at the same point.
>
> I believe the use of e.g. OrderAccess::storestore in the hotspot
> sources is unfortunate.

I don't! Not at all!

> The actual implementations of storestore (see below) seem to
> universally give you the stronger ::release barrier,

Don't conflate hardware barriers and compiler barriers. On TSO systems 
storestore() is a no-op for the hardware but a compiler-barrier is still 
required. The compiler barrier is stronger than storestore but that's 
because we don't have fine-grained compiler barriers. As I've said 
previously there is an open bug to clean up the orderAccess definitions 
because semantically it is very misleading to define things like 
storestore() as release(). However if you look at the implementation of 
things we are not actually adding any additional semantics to the 
storestore(). Eg for x86 (after a recent change top cleanup the 
compiler-barrier:

inline void OrderAccess::release() {
   compiler_barrier();
}

So storestore() is also just a compiler barrier, thought I'd prefer to 
see it expressed as:

inline void OrderAccess::storestore() { compiler_barrier(); }

than the misleading (and very wrong on non-TSO):

inline void OrderAccess::storestore() { release(); }

And of course the non-TSO platforms use the barrier necessary on that 
platform.

> and it seems
> likely that hotspot engineers are implicitly relying on that, that
> some uses of ::storestore in the hotspot sources are bugs (should be
> ::release instead)

That seems a rather baseless speculation on your part. Having been 
involved in a lot of discussions involving memory barrier usage in 
various algorithms in different areas of hotspot I can assure you that 
the use of storestore() has been quite deliberate and does not assume an 
implicit loadstore().

Also as I've said many many times now release() as a stand-alone 
operation is meaningless.

David
-----

and that there is very little potential performance
> benefit from using ::storestore instead of ::release, precisely
> because the additional loadstore barrier is very close to free on all
> current hardware.  Writing correct code using ::storestore is harder
> than ::release, which is already difficult enough. C11 doesn't provide
> a corresponding fence, which is a strong hint.
>
>
> ./bsd_zero/vm/orderAccess_bsd_zero.inline.hpp:71:inline void
> OrderAccess::storestore() { release(); }
> ./linux_sparc/vm/orderAccess_linux_sparc.inline.hpp:35:inline void
> OrderAccess::storestore() { release(); }
> ./aix_ppc/vm/orderAccess_aix_ppc.inline.hpp:73:inline void
> OrderAccess::storestore() { inlasm_lwsync();  }
> ./linux_zero/vm/orderAccess_linux_zero.inline.hpp:70:inline void
> OrderAccess::storestore() { release(); }
> ./solaris_sparc/vm/orderAccess_solaris_sparc.inline.hpp:40:inline void
> OrderAccess::storestore() { release(); }
> ./linux_ppc/vm/orderAccess_linux_ppc.inline.hpp:75:inline void
> OrderAccess::storestore() { inlasm_lwsync();  }
> ./solaris_x86/vm/orderAccess_solaris_x86.inline.hpp:40:inline void
> OrderAccess::storestore() { release(); }
> ./linux_x86/vm/orderAccess_linux_x86.inline.hpp:35:inline void
> OrderAccess::storestore() { release(); }
> ./bsd_x86/vm/orderAccess_bsd_x86.inline.hpp:35:inline void
> OrderAccess::storestore() { release(); }
> ./windows_x86/vm/orderAccess_windows_x86.inline.hpp:35:inline void
> OrderAccess::storestore() { release(); }
>