[vector]: RFR (XS): Small bug fixes

Kandu, Rahul rahul.kandu at intel.com
Tue Feb 27 18:48:35 UTC 2018


Would it still generate "mfence" or "lock addl" with the use of loadFence() and storeFence()?

-Rahul

"And LoadFence and StoreFence translate to nothing in x86 due to memory model."

-----Original Message-----
From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Ivanov
Sent: Tuesday, February 27, 2018 7:55 AM
To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; John Rose <john.r.rose at oracle.com>; Paul Sandoz <paul.sandoz at oracle.com>
Cc: panama-dev at openjdk.java.net
Subject: Re: [vector]: RFR (XS): Small bug fixes

Paul, John, Razvan, thanks for reviews. Pushed both fixes.

Best regards,
Vladimir Ivanov

On 2/27/18 6:37 PM, Lupusoru, Razvan A wrote:
> The updated solution via Unsafe.loadFence() (or storeFence()) will work for me. Thanks also for the other fix on NegVI.
> 
> --Razvan
> 
> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Tuesday, February 27, 2018 7:31 AM
> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; John Rose 
> <john.r.rose at oracle.com>; Paul Sandoz <paul.sandoz at oracle.com>
> Cc: panama-dev at openjdk.java.net
> Subject: Re: [vector]: RFR (XS): Small bug fixes
> 
> Good point, Razvan.
> 
>> I recommend that we instead update it to Unsafe().loadFence() 
>> followed by Unsafe().storeFence(). From my understanding, this should 
>> be safe while having the desired performance characteristics. From 
>> what it seems to me, this will generate: MemBarCPUOrder LoadFence 
>> MemBarCPUOrder StoreFence. MemBarCPUOrder seems to be what I want, 
>> since it provides memory ordering guarantees in the compiler. And 
>> LoadFence and StoreFence translate to nothing in x86 due to memory model.
> 
> As John pointed out, bare MemBarCPUOrder is enough. There's no way right now to insert it, but Unsafe.loadFence() or Unsafe.storeFence() will fix the problem as well.
>> Even better, we could update the C2 side to manually insert 
>> MemBarCPUOrder without load and store fences. This would allow us to 
>> move forward in absence of an actual solution for aliasing in 
>> presence of wide memory accesses.
> 
> I proposed the fix as a stopgap solution.
> 
> I agree that having it implemented in C2 is appealing (e.g., there are some optimization opportunities to avoid the barrier depending on whether reboxing actually happens or not) and a viable solution in the longer term.
> 
> Best regards,
> Vladimir Ivanov
> 
>> From: John Rose [mailto:john.r.rose at oracle.com]
>> Sent: Monday, February 26, 2018 5:33 PM
>> To: Paul Sandoz <paul.sandoz at oracle.com>
>> Cc: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; 
>> panama-dev at openjdk.java.net
>> Subject: Re: [vector]: RFR (XS): Small bug fixes
>>
>> On Feb 26, 2018, at 5:18 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>>
>>      On Feb 26, 2018, at 5:04 PM, Lupusoru, Razvan A
>>      <razvan.a.lupusoru at intel.com> wrote:
>>
>>
>>          Hi Vladimir,
>>
>>          I am not too familiar with what Unsafe.fullFence() ends up
>>          generating in terms of code. However, if it generates something
>>          like "mfence" instruction, it is undesirable. Ideally we want a
>>          scheduling barrier so that C2 will not move memory operations
>>          across the barrier but that has zero cost in terms of generated
>>          code. If it has a zero cost indeed, then the patch looks fine to me.
>>
>>
>>      It will generate something equivalent to mfence, ‘lock addl’ (see
>>      matches for MemBarVolatileNode and also OrderAccess::fence).
>>
>>      I thought this was just a temporary reprieve to avoid crashes until
>>      something better is worked out.
>>
>> Yes, the fullFence is stronger, associated with volatile reads and writes.
>>
>> Most raw unsafe accesses use CPU order barriers which are what Razvan wants.
>> But the Java Unsafe API doesn't directly provide such a fence.
>>
>> Perhaps one could trick C2 into placing a CPU order barrier next to
>> code which folds up to nothing, but such a trick would be fragile.
>> Certainly CPU order fences must have been considered as a possible
>> fence, but they aren't in today's kit.
>>
>> We could try Reference.reachabilityFence(null), which (as it happens)
>> is not reorderable. That's cheaper than fullFence.
>>
>> — John
>>
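
For readers following the thread, here is a minimal sketch of where the fences discussed above would sit in Java code. It is not the patch that was pushed. The class FenceSketch and the method copyThenSum are hypothetical names used only for illustration. The sketch uses the public VarHandle.acquireFence() and VarHandle.releaseFence() methods as stand-ins for Unsafe.loadFence() and Unsafe.storeFence(); treating them as equivalent is an assumption of this sketch. In a single-threaded run the fences have no observable effect, so the code shows placement only and says nothing about which machine instructions C2 emits.

```java
import java.lang.invoke.VarHandle;
import java.lang.ref.Reference;

// Hypothetical illustration class; not part of the Vector API sources.
final class FenceSketch {

    // Copies src into dst, places fences, then reads dst back and sums it.
    static int copyThenSum(int[] src, int[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);

        // Assumed public counterparts of Unsafe.loadFence() and
        // Unsafe.storeFence(), the pair proposed in the thread.
        VarHandle.acquireFence();
        VarHandle.releaseFence();

        // The alternative John mentions; passing null is permitted.
        Reference.reachabilityFence(null);

        int sum = 0;
        for (int i = 0; i < src.length; i++) {
            sum += dst[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // prints 6
        System.out.println(copyThenSum(new int[] {1, 2, 3}, new int[3]));
    }
}
```

VarHandle.fullFence() is the public counterpart of the stronger Unsafe.fullFence() that the thread wants to avoid, so it is deliberately left out of the sketch.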
