[vector]: RFR (XS): Small bug fixes
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Feb 27 15:54:54 UTC 2018
Paul, John, Razvan, thanks for reviews. Pushed both fixes.
Best regards,
Vladimir Ivanov
On 2/27/18 6:37 PM, Lupusoru, Razvan A wrote:
> The updated solution via Unsafe.loadFence() (or storeFence()) will work for me. And thanks for the other fix on NegVI.
>
> --Razvan
>
> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Tuesday, February 27, 2018 7:31 AM
> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; John Rose <john.r.rose at oracle.com>; Paul Sandoz <paul.sandoz at oracle.com>
> Cc: panama-dev at openjdk.java.net
> Subject: Re: [vector]: RFR (XS): Small bug fixes
>
> Good point, Razvan.
>
>> I recommend that we instead update it to Unsafe().loadFence() followed
>> by Unsafe().storeFence(). From my understanding, this should be safe
>> while having the desired performance characteristics. From what it
>> seems to me, this will generate: MemBarCPUOrder LoadFence
>> MemBarCPUOrder StoreFence. MemBarCPUOrder seems to be what I want,
>> since it provides memory ordering guarantees in the compiler. And
>> LoadFence and StoreFence translate to nothing on x86 due to its memory model.
>
> As John pointed out, bare MemBarCPUOrder is enough. There's no way right now to insert it, but Unsafe.loadFence() or Unsafe.storeFence() will fix the problem as well.
>
>> Even better, we could update the C2 side to manually insert
>> MemBarCPUOrder without load and store fences. This would allow us to
>> move forward in absence of an actual solution for aliasing in presence
>> of wide memory accesses.
>
> I proposed the fix as a stopgap solution.
>
> I agree that having it implemented in C2 is appealing (e.g., there are some optimization opportunities to avoid the barrier depending on whether reboxing actually happens or not) and a viable solution in the longer term.
>
> Best regards,
> Vladimir Ivanov
>
>> *From:* John Rose [mailto:john.r.rose at oracle.com]
>> *Sent:* Monday, February 26, 2018 5:33 PM
>> *To:* Paul Sandoz <paul.sandoz at oracle.com>
>> *Cc:* Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>;
>> panama-dev at openjdk.java.net
>> *Subject:* Re: [vector]: RFR (XS): Small bug fixes
>>
>> On Feb 26, 2018, at 5:18 PM, Paul Sandoz <paul.sandoz at oracle.com
>> <mailto:paul.sandoz at oracle.com>> wrote:
>>
>> On Feb 26, 2018, at 5:04 PM, Lupusoru, Razvan A
>> <razvan.a.lupusoru at intel.com <mailto:razvan.a.lupusoru at intel.com>>
>> wrote:
>>
>>
>> Hi Vladimir,
>>
>> I am not too familiar with what Unsafe.fullFence() ends up
>> generating in terms of code. However, if it generates something
>> like an "mfence" instruction, it is undesirable. Ideally we want a
>> scheduling barrier so that C2 will not move memory operations
>> across the barrier but that has zero cost in terms of generated
>> code. If it has a zero cost indeed, then the patch looks fine to me.
>>
>>
>> It will generate something equivalent to mfence, ‘lock addl’ (see
>> matches for MemBarVolatileNode and also OrderAccess::fence).
>>
>> I thought this was just a temporary reprieve to avoid crashes until
>> something better is worked out.
>>
>> Yes, the fullFence is stronger, associated with volatile reads and writes.
>> Most raw unsafe accesses use CPU order barriers which are what Razvan wants.
>>
>> But the Java Unsafe API doesn't directly provide such a fence.
>> Perhaps one could trick C2 into placing a CPU order barrier next to code
>> which folds up to nothing, but such a trick would be fragile. Certainly
>> CPU order fences must have been considered as a possible fence,
>> but they aren't in today's kit.
>>
>> We could try Reference.reachabilityFence(null), which (as it happens)
>> is not reorderable. That's cheaper than fullFence.
>>
>> — John
>>
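For readers following the fence options discussed in this thread, here is a minimal Java sketch of where such fences would sit between two groups of memory operations. It is not the pushed fix. The thread talks about the internal Unsafe.loadFence() and Unsafe.storeFence(); the sketch uses the public static fences on java.lang.invoke.VarHandle (acquireFence() and releaseFence(), available since JDK 9) on the assumption that they are the public counterparts of those Unsafe methods. The class name FenceSketch and the method copyWithFences are hypothetical, chosen only for illustration.

```java
import java.lang.invoke.VarHandle;
import java.lang.ref.Reference;

public class FenceSketch {

    // Hypothetical helper: copies src through a temporary buffer, with fences
    // placed between the first group of accesses and the second group.
    static int[] copyWithFences(int[] src) {
        int[] tmp = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            tmp[i] = src[i];
        }

        // Loads before this point are not reordered with loads and stores after it.
        // Assumed public counterpart of Unsafe.loadFence().
        VarHandle.acquireFence();

        // Loads and stores before this point are not reordered with stores after it.
        // Assumed public counterpart of Unsafe.storeFence().
        VarHandle.releaseFence();

        int[] dst = new int[src.length];
        for (int i = 0; i < tmp.length; i++) {
            dst[i] = tmp[i];
        }

        // The alternative floated in the thread: a reachability fence on null.
        // Whether the compiler treats it as a reordering barrier is an
        // implementation detail, not something the API documents.
        Reference.reachabilityFence(null);
        return dst;
    }

    public static void main(String[] args) {
        int[] out = copyWithFences(new int[] {1, 2, 3});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

The fences do not change the result of this single-threaded copy. The sketch only shows the call sites. What code each fence produces on a given platform is the subject of the discussion above and is not demonstrated here.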