[vector]: RFR (XS): Small bug fixes
Kandu, Rahul
rahul.kandu at intel.com
Tue Feb 27 20:19:05 UTC 2018
Yes, my question was about mfence on a combined sequence of calls.
-----Original Message-----
From: Paul Sandoz [mailto:paul.sandoz at oracle.com]
Sent: Tuesday, February 27, 2018 11:07 AM
To: Kandu, Rahul <rahul.kandu at intel.com>
Cc: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; John Rose <john.r.rose at oracle.com>; panama-dev at openjdk.java.net
Subject: Re: [vector]: RFR (XS): Small bug fixes
> On Feb 27, 2018, at 10:48 AM, Kandu, Rahul <rahul.kandu at intel.com> wrote:
>
> Would it still generate "mfence" or "lock addl" when loadFence() and storeFence() are used?
>
No, neither independently nor for a combined sequence of calls (I think your question refers to the latter?).
Paul.
> -Rahul
>
> "And LoadFence and StoreFence translate to nothing in x86 due to memory model."
>
> -----Original Message-----
> From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On
> Behalf Of Vladimir Ivanov
> Sent: Tuesday, February 27, 2018 7:55 AM
> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; John Rose
> <john.r.rose at oracle.com>; Paul Sandoz <paul.sandoz at oracle.com>
> Cc: panama-dev at openjdk.java.net
> Subject: Re: [vector]: RFR (XS): Small bug fixes
>
> Paul, John, Razvan, thanks for reviews. Pushed both fixes.
>
> Best regards,
> Vladimir Ivanov
>
> On 2/27/18 6:37 PM, Lupusoru, Razvan A wrote:
>> The updated solution via Unsafe.loadFence() (or storeFence()) will work for me. Thanks also for the other fix on NegVI.
>>
>> --Razvan
>>
>> -----Original Message-----
>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>> Sent: Tuesday, February 27, 2018 7:31 AM
>> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; John Rose
>> <john.r.rose at oracle.com>; Paul Sandoz <paul.sandoz at oracle.com>
>> Cc: panama-dev at openjdk.java.net
>> Subject: Re: [vector]: RFR (XS): Small bug fixes
>>
>> Good point, Razvan.
>>
>>> I recommend that we instead update it to Unsafe().loadFence()
>>> followed by Unsafe().storeFence(). From my understanding, this
>>> should be safe while having the desired performance characteristics.
>>> From what it seems to me, this will generate: MemBarCPUOrder
>>> LoadFence MemBarCPUOrder StoreFence. MemBarCPUOrder seems to be what
>>> I want, as it provides memory ordering guarantees in the compiler. And
>>> LoadFence and StoreFence translate to nothing in x86 due to memory model.
>>
>> As John pointed out, bare MemBarCPUOrder is enough. There's no way right now to insert it, but Unsafe.loadFence() or Unsafe.storeFence() will fix the problem as well.
>>
>>> Even better, we could update the C2 side to manually insert
>>> MemBarCPUOrder without load and store fences. This would allow us to
>>> move forward in absence of an actual solution for aliasing in
>>> presence of wide memory accesses.
>>
>> I proposed the fix as a stopgap solution.
>>
>> I agree that having it implemented in C2 is appealing (e.g., there are some optimization opportunities to avoid the barrier depending on whether reboxing actually happens or not) and a viable solution in the longer term.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> *From:*John Rose [mailto:john.r.rose at oracle.com]
>>> *Sent:* Monday, February 26, 2018 5:33 PM
>>> *To:* Paul Sandoz <paul.sandoz at oracle.com>
>>> *Cc:* Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>;
>>> panama-dev at openjdk.java.net
>>> *Subject:* Re: [vector]: RFR (XS): Small bug fixes
>>>
>>> On Feb 26, 2018, at 5:18 PM, Paul Sandoz <paul.sandoz at oracle.com
>>> <mailto:paul.sandoz at oracle.com>> wrote:
>>>
>>> On Feb 26, 2018, at 5:04 PM, Lupusoru, Razvan A
>>> <razvan.a.lupusoru at intel.com <mailto:razvan.a.lupusoru at intel.com>>
>>> wrote:
>>>
>>>
>>> Hi Vladimir,
>>>
>>> I am not too familiar with what Unsafe.fullFence() ends up
>>> generating in terms of code. However, if it generates something
>>> like "mfence" instruction, it is undesirable. Ideally we want a
>>> scheduling barrier so that C2 will not move memory operations
>>> across the barrier but that has zero cost in terms of generated
>>> code. If it has a zero cost indeed, then the patch looks fine to me.
>>>
>>>
>>> It will generate something equivalent to mfence, ‘lock addl’ (see
>>> matches for MemBarVolatileNode and also OrderAccess::fence).
>>>
>>> I thought this was just a temporary reprieve to avoid crashes until
>>> something better is worked out.
>>>
>>> Yes, the fullFence is stronger, associated with volatile reads and writes.
>>>
>>> Most raw unsafe accesses use CPU order barriers which are what Razvan wants.
>>>
>>>
>>> But the Java Unsafe API doesn't directly provide such a fence.
>>>
>>> Perhaps one could trick C2 into placing a CPU order barrier next to
>>> code which folds up to nothing, but such a trick would be fragile.
>>> Certainly CPU order fences must have been considered as a possible
>>> fence, but they aren't in today's kit.
>>>
>>> We could try Reference.reachabilityFence(null), which (as it
>>> happens) is not reorderable. That's cheaper than fullFence.
>>>
>>> — John
>>>
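
For readers following the thread, the three fence options discussed above can be sketched side by side. This is only an illustration, not the code from the patch: the class and method names are hypothetical, and the public VarHandle.acquireFence(), VarHandle.releaseFence() and VarHandle.fullFence() methods are used here as stand-ins for Unsafe.loadFence(), Unsafe.storeFence() and Unsafe.fullFence(), on the assumption that they delegate to the same operations.

```java
import java.lang.invoke.VarHandle;
import java.lang.ref.Reference;

// Hypothetical sketch class; not part of the vector API or the patch.
public class FenceSketch {

    // Option 1: full fence. Per the thread, C2 emits something
    // equivalent to mfence ('lock addl') for this on x86.
    static int storeThenLoadFull(int[] data, int index, int value) {
        data[index] = value;
        VarHandle.fullFence();      // stand-in for Unsafe.fullFence()
        return data[index];
    }

    // Option 2: load fence followed by store fence. Per the thread,
    // these order memory operations in the compiler but are expected
    // to produce no fence instruction on x86.
    static int storeThenLoadAcqRel(int[] data, int index, int value) {
        data[index] = value;
        VarHandle.acquireFence();   // stand-in for Unsafe.loadFence()
        VarHandle.releaseFence();   // stand-in for Unsafe.storeFence()
        return data[index];
    }

    // Option 3: the reachabilityFence(null) idea floated in the thread.
    // Whether it really acts as a scheduling barrier is an
    // implementation detail, so treat this as an experiment only.
    static int storeThenLoadReachability(int[] data, int index, int value) {
        data[index] = value;
        Reference.reachabilityFence(null);
        return data[index];
    }

    public static void main(String[] args) {
        int[] data = new int[4];
        System.out.println(storeThenLoadFull(data, 0, 1));
        System.out.println(storeThenLoadAcqRel(data, 1, 2));
        System.out.println(storeThenLoadReachability(data, 2, 3));
    }
}
```

In single-threaded code like this the fences have no observable effect; the sketch only shows where each call would sit between a store and a later load. Checking which instructions are actually generated would need a look at the compiled code (for example with -XX:+PrintAssembly and hsdis).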