RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
David Holmes
david.holmes at oracle.com
Wed Nov 20 12:05:49 PST 2013
First apologies that the email you replied to here didn't go to the
lists. Due to the way it was filtered I assumed it was a direct email
only. :)
On 21/11/2013 12:22 AM, Lindenmaier, Goetz wrote:
> Hi David,
>
> thanks for the comprehensive explanation. That's in agreement with
> what I understand. Except that I think, unidirectional barriers only
> make sense if you have Acquire/Release operations dedicated to
> a certain operation. Then other operations of the same kind can pass
> the barrier in one direction.
>
> This also matches the extension I contributed to hotspot (8024921),
> which now allows to issue such operations right along with the
> loads/stores.
>
> This change here tries to separate the barriers used after load/stores from
> the 'wide' or better 'fence' variants.
But Unsafe.loadFence and Unsafe.storeFence _are_ intended for use after
loads/stores - but with stronger semantics than just simple loadLoad or
storeStore barriers.
> Given the basic four barriers (L-L, L-S, S-L, S-S) there are 16 possible
> operations. Unfortunately, Hotspot knows only four of these:
> MemBarStoreStore: S-S
Okay - nice and clear.
> MemBarAcquire: L-S, L-L
I'm not sure what makes this "acquire" but this seems to be exactly what
Unsafe.loadFence requires.
> MemBarRelease: L-S, S-S
Again I'm not sure what makes this "release" - I also find it odd that
it is L-S not S-L. If S-L this would be exactly what Unsafe.storeFence
needs.
> MemBarVolatile: L-L, L-S, S-L, S-S
A full fence, needed in the worst case when accessing a volatile variable.
Also note that in different parts of the VM we have different
expressions of these, both at the JIT level and runtime eg see
orderAccess.hpp (which is not in itself self-consistent or clear: see
https://bugs.openjdk.java.net/browse/JDK-7143664).
Also I said that the Unsafe ops require bi-directional fences but that
was imprecise - the _load and _store ops only require two of the four
possible barriers which may be why you see these as distinct from the
membars associated directly with variable accesses ?
> and these don't map, e.g., the operations available on PPC:
> lwsync: L-S, L-L, S-S
> sync: L-L, L-S, S-L, S-S
> What is the reason why it's hard for us to generate optimal code
> with the available optimizations:
> MemBarAcquire;
> MemBarRelease;
> are mapped to
> lwsync
> lwsync
> which is obviously redundant.
Are you referring to generated code if these nodes are adjacent? If so I
assume the problem is that you can't logically coalesce these, even if
the actual generated code could be coalesced?
> A generic solution would be to have a single operation, that can be
> Tagged with the barriers (L-L, L-S, S-L, S-S) it should guarantee.
> Two operations could be merged by an optimization simply by
> or-ing the tags.
> But this is not reality in HotSpot, and out of scope of this change.
Ok
>> I think perhaps that Vladimir's suggestion is best - if these are the
>> Membar nodes for these unsafe ops (and only these) then name them to
>> match the unsafe ops. At least then it is clearer where the
>> specification for their behaviour comes from.
> OK, so I'll rename them as Vladimir proposed, and post a new webrev.
I still have this concern regarding the fact you think these are not
related to load/store operations. But I'll wait for the webrev.
Thanks for your patience on this. Hopefully we can sort it out in the
next 24 hours before I takeoff.
David
-----
> Best regards and a good flight home,
> Goetz.
>
>
>
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Mittwoch, 20. November 2013 14:30
> To: Lindenmaier, Goetz
> Cc: David Holmes
> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>
> Hi Goetz,
>
> On 20/11/2013 7:36 PM, Lindenmaier, Goetz wrote:
>> Hi David,
>> I don't know what you mean by uni/bi-directional.
>
> Pretty much as you describe below. A unidirectional barrier allows
> things to cross in one direction only ie it blocks motion one way; a
> bi-directional barrier ("full fence") blocks both ways and so prevents
> any reordering.
>
> As I described in earlier emails these kind of unreferenced
> acquire/release semantics ie not associated with any stores/loads is
> used to describe the memory barrier properties of a synchronized block,
> for example. Monitor acquisition has acquire semantics, while monitor
> release has store semantics. The implied barriers mean that any accesses
> can move into the synchronized region but none can move out - the "roach
> motel" semantics.
>
> Now as you say this kind of barrier is not what is wanted with these
> loadFence/storeFence operations, hence I think the generic
> MembarAcquire/MembarRelease are inappropriate names (whether the
> implementation is correct is a different matter). Similarly I find the
> MembarAcquireFence and MembarReleaseFence to be somewhat inappropriate -
> what do they mean ???
>
> The loadFence semantics imply a loadLoad|loadStore barrier.
> The storeFence semantics imply a storeLoad|storeStore barrier.
> The fullFence is of course all four.
>
> Again I'm unclear if the current or proposed MembarXXX actions actually
> map to barriers that implement those semantics.
>
> So yes this comes down to naming initially but also verification that
> the implementation does as required.
>
> I think perhaps that Vladimir's suggestion is best - if these are the
> Membar nodes for these unsafe ops (and only these) then name them to
> match the unsafe ops. At least then it is clearer where the
> specification for their behaviour comes from.
>
> I'll post a follow up email to the list when I get a chance. I'm
> currently in the US and attending lots of meetings then head back to
> Australia tomorrow evening.
>
> Thanks,
> David
>
>> I guess we agree that the documentation of the intrinsics, let's take
>> loadFence
>> for an example, talks about loads before the loadFence, and stores and
>> loads after it:
>> loads
>> ------------loadFence()--------------
>> loads, stores
>> With this, you can achieve any ordering by moving nodes up OR down:
>> Op1 Move Op1 Op2
>> Op2 down -------------
>> --------- =========> Op3
>> Op3 Op1
>> Alternatively, you can move op3 up, and get the same order:
>> Op1 Reorder Op2 Move Op3 Op2
>> Op2 Ops 1 and 2 Op1 up Op3
>> --------- =========> -------------- =========> Op1
>> Op3 Not restricted Op3 ----------
>> by later loadFence
>> So, if the operation restricts only moves in one direction, it's of no
>> help because
>> you can still get an execution order you did not expect. So
>> unidirectional operations
>> (e.g., only hindering operations from moving up) are pointless.
>> By the way, are you questioning whether the implementation is correct, or
>> whether the names express what is desired?
>> If it's the first, it's of no concern to this change, which is only about
>> matching these operations independently from those, that address
>> a dedicated memory operation.
>> If It's the second, please tell me what names to use, and I'll fix it.
>> (I hope you can properly read the 'drawings')
>> Best regards,
>> Goetz.
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Dienstag, 19. November 2013 20:51
>> To: Lindenmaier, Goetz
>> Cc: 'Vladimir Kozlov'; 'ppc-aix-port-dev at openjdk.java.net';
>> 'hotspot-dev at openjdk.java.net'
>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce
>> MemBarAcquire/ReleaseWide.
>> On 20/11/2013 5:34 AM, Lindenmaier, Goetz wrote:
>>> Hi David,
>>>
>>> this are the instrinsics for sun/misc/Unsafe, which are documented
>>> as shown below. So I guess that's just what is desired.
>> I don't think so! What is described for Unsafe are full bi-directional
>> fences and "acquire" and "release" are only uni-directional fences!
>> David
>> -----
>>> Best regards,
>>> Goetz.
>>>
>>> /**
>>> * Ensures lack of reordering of loads before the fence
>>> * with loads or stores after the fence.
>>> * @since 1.8
>>> */
>>> public native void loadFence();
>>>
>>> /**
>>> * Ensures lack of reordering of stores before the fence
>>> * with loads or stores after the fence.
>>> * @since 1.8
>>> */
>>> public native void storeFence();
>>>
>>> /**
>>> * Ensures lack of reordering of loads or stores before the fence
>>> * with loads or stores after the fence.
>>> * @since 1.8
>>> */
>>> public native void fullFence();
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>> Sent: Tuesday, November 19, 2013 8:08 PM
>>> To: Lindenmaier, Goetz
>>> Cc: Vladimir Kozlov; 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net'
>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>
>>> So I like the rename but I am concerned there are semantic issues here:
>>>
>>> switch(id) {
>>> case vmIntrinsics::_loadFence:
>>> - insert_mem_bar(Op_MemBarAcquire);
>>> + insert_mem_bar(Op_MemBarFenceAcquire);
>>> return true;
>>> case vmIntrinsics::_storeFence:
>>> - insert_mem_bar(Op_MemBarRelease);
>>> + insert_mem_bar(Op_MemBarFenceRelease);
>>> return true;
>>>
>>> What is a _loadFence operation? Does it really map to
>>> MembarFenceAcquire? Ditto for the _storeFence. These sound like
>>> bi-directional fences - are they? Are they really only concerned with
>>> loads or stores ?
>>>
>>> David
>>>
>>> On 19/11/2013 6:34 PM, Lindenmaier, Goetz wrote:
>>>> Hi,
>>>>
>>>> I made a new webrev, using the new names:
>>>> http://cr.openjdk.java.net/~goetz/webrevs/8028515-1-wide/
>>>>
>>>> Thanks,
>>>> Goetz.
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Montag, 18. November 2013 23:28
>>>> To: Lindenmaier, Goetz; 'David Holmes'
>>>> Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net'
>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>>
>>>> On 11/18/13 2:19 PM, Lindenmaier, Goetz wrote:
>>>>> Hi David, Vladimir,
>>>>>
>>>>> I also would prefer MemBarFenceAquire/Release, for two reasons
>>>>> - the same prefix shows they are all similar nodes.
>>>>> - I don't have to change MemBarVolatile to FullFence, which I didn't
>>>>> change yet and which is used in more places.
>>>>
>>>> Okay.
>>>>
>>>> Vladimir
>>>>
>>>>>
>>>>> Best regards,
>>>>> Goetz.
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>> Sent: Monday, November 18, 2013 10:49 PM
>>>>> To: Vladimir Kozlov
>>>>> Cc: Lindenmaier, Goetz; 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net'
>>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>>>
>>>>> Vladimir,
>>>>>
>>>>> On 19/11/2013 5:36 AM, Vladimir Kozlov wrote:
>>>>>> I think David asked for different nodes not just one.
>>>>>> We can have new membar nodes with names matching Unsafe methods names:
>>>>>> LoadFence, StoreFence, FullFence. I am fine with it.
>>>>>
>>>>> But I don't think that actually describes them - they don't distinguish
>>>>> between loads and stores only the direction in which the barrier
>>>>> operates. "acquire" only allows accesses to move below it; "release"
>>>>> only allows accesses to move above it ie:
>>>>>
>>>>> load a
>>>>> store b, c
>>>>> acquire();
>>>>> load d
>>>>> store e,f
>>>>> release();
>>>>> load g
>>>>> store h, i
>>>>>
>>>>> allows the loads of 'a' and 'g' to move into the region between acquire
>>>>> and release; similarly for the stores to b and h. But the load of d nor
>>>>> the store to e can not move in relation to either acquire or release.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> MemBar and Fence have similar meaning so lets don't mix them in node names.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 11/18/13 8:19 AM, Lindenmaier, Goetz wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> as reply to your comment on the bug:
>>>>>>>
>>>>>>> Well, I sitll would need 2 different nodes, as on PPC we do
>>>>>>> MemBarAcquireWide --> lwsync
>>>>>>> MemBarReleaseWide --> lwsync
>>>>>>> MemBarVolatile --> sync.
>>>>>>> On Sparc, you even do 3 different operations.
>>>>>>>
>>>>>>> Or should I name them MemBarFenceAcquire and MemBarFenceRelease?
>>>>>>> This all depends a lot on the available instructions of the processors.
>>>>>>> Therefore I think a really clean representation that, at the same
>>>>>>> time, allows
>>>>>>> to find the cheapest set of instructions to express it on all
>>>>>>> processors, is impossible.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Goetz
>>>>>>>
>>>>>>> PS: Should I respond to comments in the bug right in the bug
>>>>>>> or on the mailing lists?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From:ppc-aix-port-dev-bounces at openjdk.java.net
>> <mailto:ppc-aix-port-dev-bounces at openjdk.java.net>
>>>>>>> [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of
>>>>>>> Lindenmaier, Goetz
>>>>>>> Sent: Montag, 18. November 2013 15:19
>>>>>>> To: 'hotspot-dev at openjdk.java.net';
>>>>>>> 'ppc-aix-port-dev at openjdk.java.net'; Vladimir Kozlov
>>>>>>> Subject: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce
>>>>>>> MemBarAcquire/ReleaseWide.
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> The c2 compiler inserts MemBarAcquire/Release nodes to enforce memory
>>>>>>> ordering in various places. Some order a certain load/store with other
>>>>>>> operations. Inline_unsafe_fence() inserts MemBars that do not
>>>>>>> correspont to a memory operation. So far, the same nodes were used.
>>>>>>>
>>>>>>> This change introduces MemBarAcquire/ReleaseWide to use where no
>>>>>>> dedicated load/store is ordered. With this change, these nodes can be
>>>>>>> matched differently, what is needed on PPC64.
>>>>>>>
>>>>>>> When reviewing 8024921 (part 113) we decided to avoid #defines in
>>>>>>> inline_unsafe_fence() and to introduce new MemBar operations.
>>>>>>>
>>>>>>> Please review and test this change.
>>>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8028515-0-wide/
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Goetz.
>>>>>>>
More information about the ppc-aix-port-dev
mailing list