missing memory barrier in acmp with C2

Aleksey Shipilev shade at redhat.com
Wed Oct 26 16:19:37 UTC 2016



On 10/26/2016 02:35 PM, Andrew Haley wrote:
> On 26/10/16 12:27, Roman Kennke wrote:
>> Am Mittwoch, den 26.10.2016, 13:24 +0200 schrieb Roland Westrelin:
>>> http://cr.openjdk.java.net/~roland/shenandoah/membar-acmp/webrev.00/
>>>
>>> The code generated for acmp is missing a memory barrier.
>>
>> Great!
>>
>>> Should it be a loadstore + loadload as in
>>> ShenandoahBarrierSet::asm_acmp_barrier() on aarch64 or simply a
>>> loadload?
>>
>> I can come up with a reason for loadload, but not for loadstore, I
>> think loadstore is not necessary there. I'd go for the less restrictive
>> fence unless we come up with a good reason not to.
> 
> The general rule is that you can get away with loadload fences if you
> really know what you are doing, but it is exceedingly subtle.
> 
> Imagine this.  We have two variables, a boolean x_init and an oop
> x.
> 
> Thread 1:
> <Initialize x>
> x_init.store_release(true);
> 
> Thread 2:
> if (x_init.load_aquire())
>     x.blah = y
> 
> If you replace the load acquire with a loadload fence, the store of
> x.blah can become visible before the initialization of x.  In this
> particular case you are probably OK, but in general it's not worth the
> risk of using naked loadload or laodstore fences IMO.  This is an
> optimization that can break things and is very unlikely to result in a
> significant performance improvement.  Default to correct!

I understand the sentiment, and have nothing against it.

However, in the particular case of acmp barrier, loadload seems enough,
because we are indeed only ordering the loads. No potential stores are
of our interest here, and Hans' example talks about stores. As far as I
understood Hans' argument over the years, it was basically about "think
about what is happening around too", and we don't care about that for acmp.

Thanks,
-Aleksey



More information about the shenandoah-dev mailing list