RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
Vladimir Kozlov
vladimir.kozlov at oracle.com
Mon Nov 25 23:49:25 PST 2013
I prefer Load/StoreFence. I need to prepare closed changes and do testing before push. I will do it tomorrow.
Thanks,
Vladimir
On 11/25/13 11:29 PM, David Holmes wrote:
> On 26/11/2013 5:24 PM, Lindenmaier, Goetz wrote:
>> Hi David,
>>
>> OK, thank you!
>>
>> Do you think this change can be pushed now?
>
> I think so.
>
>> Which naming would you prefer, finally?
>> http://cr.openjdk.java.net/~goetz/webrevs/8028515-1-wide MemBarFenceAcquire
>> http://cr.openjdk.java.net/~goetz/webrevs/8028515-2-wide loadFence
>
> loadFence out of those two choices, but I'll leave that to Vladimir to confirm.
>
> Thanks,
> David
>
>> Best regards,
>> Goetz.
>>
>>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Tuesday, November 26, 2013 1:43 AM
>> To: Lindenmaier, Goetz
>> Cc: 'Vladimir Kozlov'; 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net'
>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>
>> Hi Goetz & Martin,
>>
>> On 25/11/2013 9:26 PM, Lindenmaier, Goetz wrote:
>>> Hi David,
>>>
>>> I promised to comment on your concern about individual load/stores:
>>>> As you state ld.acq and st.rel only
>>>> relate to the target load/store _but_ there are no actions in the Java
>>>> Memory Model that only provide ordering with respect to a single load or
>>>> store!
>>> We derive this from the well-known cookbook:
>>> http://g.oswego.edu/dl/jmm/cookbook.html
>>> Have a look at the table "required Barriers".
>>> It talks about individual operations. E.g., imagine
>>> Normal load 1
>>> Volatile load 2
>>> Normal load 3
>>> Then load1 may float arbitrarily to any place, but load3 may not
>>> float above load2.
>>> If I issue a 'wide' Acquire barriers after load2 (lwsync), this hinders load1 to
>>> float below this barrier, introducing a constraint not specified by this
>>> table.
>>>
>>> Does this satisfy your concerns?
>>
>> I was far too general in my comment re the JMM, I was thinking in terms
>> of happens-before which is a transitive relation, and generalizing that
>> to infer that all barriers have more far reaching affects that just the
>> current load/store. But of course when you drill down into the actual
>> requirements there are places where these more directed barriers are
>> perfectly fine.
>>
>> Thanks,
>> David
>>
>>> Best regards,
>>> Goetz and Martin
>>>
>>>
>>> -----Original Message-----
>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>> Sent: Donnerstag, 21. November 2013 15:30
>>> To: Lindenmaier, Goetz
>>> Cc: 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net'; 'Vladimir Kozlov'
>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>
>>> Hi Goetz,
>>>
>>> On 21/11/2013 8:23 PM, Lindenmaier, Goetz wrote:
>>>> Hi David,
>>>>
>>>> here the new webrev, naming the nodes LoadFence/StoreFence:
>>>> http://cr.openjdk.java.net/~goetz/webrevs/8028515-2-wide/
>>>
>>> Okay. I think I now disagree (when seeing in context) that Membar should
>>> have been dropped from the name but I'll let that lie.
>>>
>>> Aside: I find the descriptions of all those "membar" nodes very unclear
>>> - it would be better, for me, if there were expressed more clearly in
>>> terms of storeStore, storeLoad etc.
>>>
>>> Could you add to the comments in memnode.hpp that those nodes are only
>>> used by the respective Unsafe intrinisics? Or at least use that as an
>>> example usage?
>>>
>>>> The thing with the mail was my fault, I had sent the first mail
>>>> only to you accidentially. Sorry for that.
>>>>
>>>>> But Unsafe.loadFence and Unsafe.storeFence _are_ intended for use after
>>>>> loads/stores - but with stronger semantics than just simple loadLoad or
>>>>> storeStore barriers.
>>>> Yes. I mean they do not intend to sort a single load/store with succeeding
>>>> operations. This can not be implemented this way, as the operation
>>>> is not known in the intrinsic. I mean cases where you can use ia64 ld.acq
>>>> or ppc ld-twi-lwsync.
>>>
>>> If you are saying that loadFence and storeFence can not be implemented
>>> using ld.acq and st.rel then good I agree with you. But I would argue
>>> that general ld;MembarAcquire and st;MembarRelease can also not be
>>> implemented with those instructions! As you state ld.acq and st.rel only
>>> relate to the target load/store _but_ there are no actions in the Java
>>> Memory Model that only provide ordering with respect to a single load or
>>> store! The Unsafe *Ordered() operations may come closer to this but I
>>> would expect them to be handled via an intrinsic - and they are outside
>>> the JMM. So I would think the applicability of using ld.acq and st.rel
>>> would be very limited, so it concerns me that it seems to be associated
>>> with the very general MembarAcquire and MembarRelease nodes (ditto for
>>> the changes in 8024921).
>>>
>>> But how you implement this is your concern. :) From the perspective of
>>> this shared code my concern is that it is not clear exactly what the
>>> practical distinction is between MembarAcquire and LoadFence, and
>>> MembarRelease and StoreFence. I'm not a compiler person but I would find
>>> it quite difficult to know which can/should be used when. And given for
>>> most platforms they would be implemented the same, that would add to the
>>> uncertainty (with a risk that someone will make an arbitrary choice).
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>> Also I said that the Unsafe ops require bi-directional fences but that
>>>>> was imprecise - the _load and _store ops only require two of the four
>>>>> possible barriers which may be why you see these as distinct from the
>>>>> membars associated directly with variable accesses ?
>>>> On ia64, ld.acq sorts _this_ _single_ load with all following ld/st operations.
>>>> For the intrinsic I must use mf, which sorts _all_ previous loads with all
>>>> following ld/st operations (and more). So my distinction is if I have a
>>>> store on one processor, and a load on the other, and I want to make sure
>>>> the other processor's load sees the stored value, I can use - on ia64 - st.rel/ld.acq and I'm fine.
>>>> If I did some arbitrary stores, and I want assure an other processor doing
>>>> some arbitrary load,s then I must use - on ia64- mf on both.
>>>> On sparc I can do a
>>>> membar( Assembler::Membar_mask_bits(Assembler::LoadStore | Assembler::LoadLoad) );
>>>> membar( Assembler::Membar_mask_bits(Assembler::LoadStore | Assembler::StoreStore) );
>>>> But, on sparc, there is no operation as ld.acq as far as I know.
>>>>
>>>> Change 8024921 enables to distinguish normal loads and such
>>>> that should do ld.acq. This change enables the compiler to
>>>> differentiate the MemBar nodes used better: such coming after
>>>> a dedicated load, and such valid for any load before the barrier.
>>>> This is already visible in the IR by some means: MemBars after
>>>> dedicated loads have an input edge at position MemBar::precedence
>>>> connecting them with the load, but that does not suffice to
>>>> match it (in the matcher) correctly.
>>>> If I use ld.acq, the following MemBarAcquire is superfluous,
>>>> but the MemBarAcquire used in the intrinsic is not. So want to
>>>> have different nodes.
>>>> In our VM, we use #ifdef PPC and issue MemBarRelease instead of MemBarAcquire,
>>>> and with #ifdef ia64 we use MemBarVolatile, as we know these operations will
>>>> issue the proper assembly instructions. But this is bad - I want to do it better
>>>> here.
>>>>
>>>>> Are you referring to generated code if these nodes are adjacent? If so I
>>>>> assume the problem is that you can't logically coalesce these, even if
>>>>> the actual generated code could be coalesced?
>>>> Yes. On ia64 it is worst, as we had to implement MemBarRelease, Acquire
>>>> and Volatile with mf() if we would not use ld.acq/st.rel. Then there
>>>> would be even more redundant operations. But that's out of scope of this change.
>>>>
>>>>> I still have this concern regarding the fact you think these are not
>>>>> related to load/store operations. But I'll wait for the webrev.
>>>> As stated above, yes, they are related to store/load operations. But not
>>>> to a dedicated ld/st -- I can not use ld.acq/st.rel as the ld/st are not
>>>> part of the intrinsic.
>>>>
>>>> (I'm using the ia64 mnemonics in the implementation because I think
>>>> they are nicely orthogonal. Also it's important to know that
>>>> ld-tdi-lwsync on ppc is cheaper than ld-lwsync because it affects only
>>>> the one load mentioned in the sequence. That's some hack in the
>>>> processor.)
>>>>
>>>> Best regards,
>>>> Goetz.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>> Sent: Mittwoch, 20. November 2013 21:06
>>>> To: Lindenmaier, Goetz
>>>> Cc: 'hotspot-dev at openjdk.java.net'; 'ppc-aix-port-dev at openjdk.java.net'; 'Vladimir Kozlov'
>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>>
>>>> First apologies that the email you replied to here didn't go to the
>>>> lists. Due to the way it was filtered I assumed it was a direct email
>>>> only. :)
>>>>
>>>> On 21/11/2013 12:22 AM, Lindenmaier, Goetz wrote:
>>>>> Hi David,
>>>>>
>>>>> thanks for the comprehensive explanation. That's in agreement with
>>>>> what I understand. Except that I think, unidirectional barriers only
>>>>> make sense if you have Acquire/Release operations dedicated to
>>>>> a certain operation. Then other operations of the same kind can pass
>>>>> the barrier in one direction.
>>>>>
>>>>> This also matches the extension I contributed to hotspot (8024921),
>>>>> which now allows to issue such operations right along with the
>>>>> loads/stores.
>>>>>
>>>>> This change here tries to separate the barriers used after load/stores from
>>>>> the 'wide' or better 'fence' variants.
>>>>
>>>> But Unsafe.loadFence and Unsafe.storeFence _are_ intended for use after
>>>> loads/stores - but with stronger semantics than just simple loadLoad or
>>>> storeStore barriers.
>>>>
>>>>> Given the basic four barriers (L-L, L-S, S-L, S-S) there are 16 possible
>>>>> operations. Unfortunately, Hotspot knows only four of these:
>>>>> MemBarStoreStore: S-S
>>>>
>>>> Okay - nice and clear.
>>>>
>>>>> MemBarAcquire: L-S, L-L
>>>>
>>>> I'm not sure what makes this "acquire" but this seems to be exactly what
>>>> Unsafe.loadFence requires.
>>>>
>>>>> MemBarRelease: L-S, S-S
>>>>
>>>> Again I'm not sure what makes this "release" - I also find it odd that
>>>> it is L-S not S-L. If S-L this would be exactly what Unsafe.storeFence
>>>> needs.
>>>>
>>>>> MemBarVolatile: L-L, L-S, S-L, S-S
>>>>
>>>> A full fence, needed in the worst case when accessing a volatile variable.
>>>>
>>>> Also note that in different parts of the VM we have different
>>>> expressions of these, both at the JIT level and runtime eg see
>>>> orderAccess.hpp (which is not in itself self-consistent or clear: see
>>>> https://bugs.openjdk.java.net/browse/JDK-7143664).
>>>>
>>>> Also I said that the Unsafe ops require bi-directional fences but that
>>>> was imprecise - the _load and _store ops only require two of the four
>>>> possible barriers which may be why you see these as distinct from the
>>>> membars associated directly with variable accesses ?
>>>>
>>>>> and these don't map, e.g., the operations available on PPC:
>>>>> lwsync: L-S, L-L, S-S
>>>>> sync: L-L, L-S, S-L, S-S
>>>>> What is the reason why it's hard for us to generate optimal code
>>>>> with the available optimizations:
>>>>> MemBarAcquire;
>>>>> MemBarRelease;
>>>>> are mapped to
>>>>> lwsync
>>>>> lwsync
>>>>> which is obviously redundant.
>>>>
>>>> Are you referring to generated code if these nodes are adjacent? If so I
>>>> assume the problem is that you can't logically coalesce these, even if
>>>> the actual generated code could be coalesced?
>>>>
>>>>> A generic solution would be to have a single operation, that can be
>>>>> Tagged with the barriers (L-L, L-S, S-L, S-S) it should guarantee.
>>>>> Two operations could be merged by an optimization simply by
>>>>> or-ing the tags.
>>>>> But this is not reality in HotSpot, and out of scope of this change.
>>>>
>>>> Ok
>>>>
>>>>>> I think perhaps that Vladimir's suggestion is best - if these are the
>>>>>> Membar nodes for these unsafe ops (and only these) then name them to
>>>>>> match the unsafe ops. At least then it is clearer where the
>>>>>> specification for their behaviour comes from.
>>>>> OK, so I'll rename them as Vladimir proposed, and post a new webrev.
>>>>
>>>> I still have this concern regarding the fact you think these are not
>>>> related to load/store operations. But I'll wait for the webrev.
>>>>
>>>> Thanks for your patience on this. Hopefully we can sort it out in the
>>>> next 24 hours before I takeoff.
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Best regards and a good flight home,
>>>>> Goetz.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>> Sent: Mittwoch, 20. November 2013 14:30
>>>>> To: Lindenmaier, Goetz
>>>>> Cc: David Holmes
>>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>>>
>>>>> Hi Goetz,
>>>>>
>>>>> On 20/11/2013 7:36 PM, Lindenmaier, Goetz wrote:
>>>>>> Hi David,
>>>>>> I don't know what you mean by uni/bi-directional.
>>>>>
>>>>> Pretty much as you describe below. A unidirectional barrier allows
>>>>> things to cross in one direction only ie it blocks motion one way; a
>>>>> bi-directional barrier ("full fence") blocks both ways and so prevents
>>>>> any reordering.
>>>>>
>>>>> As I described in earlier emails these kind of unreferenced
>>>>> acquire/release semantics ie not associated with any stores/loads is
>>>>> used to describe the memory barrier properties of a synchronized block,
>>>>> for example. Monitor acquisition has acquire semantics, while monitor
>>>>> release has store semantics. The implied barriers mean that any accesses
>>>>> can move into the synchronized region but none can move out - the "roach
>>>>> motel" semantics.
>>>>>
>>>>> Now as you say this kind of barrier is not what is wanted with these
>>>>> loadFence/storeFence operations, hence I think the generic
>>>>> MembarAcquire/MembarRelease are inappropriate names (whether the
>>>>> implementation is correct is a different matter). Similarly I find the
>>>>> MembarAcquireFence and MembarReleaseFence to be somewhat inappropriate -
>>>>> what do they mean ???
>>>>>
>>>>> The loadFence semantics imply a loadLoad|loadStore barrier.
>>>>> The storeFence semantics imply a storeLoad|storeStore barrier.
>>>>> The fullFence is of course all four.
>>>>>
>>>>> Again I'm unclear if the current or proposed MembarXXX actions actually
>>>>> map to barriers that implement those semantics.
>>>>>
>>>>> So yes this comes down to naming initially but also verification that
>>>>> the implementation does as required.
>>>>>
>>>>> I think perhaps that Vladimir's suggestion is best - if these are the
>>>>> Membar nodes for these unsafe ops (and only these) then name them to
>>>>> match the unsafe ops. At least then it is clearer where the
>>>>> specification for their behaviour comes from.
>>>>>
>>>>> I'll post a follow up email to the list when I get a chance. I'm
>>>>> currently in the US and attending lots of meetings then head back to
>>>>> Australia tomorrow evening.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> I guess we agree that the documentation of the intrinsics, let's take
>>>>>> loadFence
>>>>>> for an example, talks about loads before the loadFence, and stores and
>>>>>> loads after it:
>>>>>> loads
>>>>>> ------------loadFence()--------------
>>>>>> loads, stores
>>>>>> With this, you can achieve any ordering by moving nodes up OR down:
>>>>>> Op1 Move Op1 Op2
>>>>>> Op2 down -------------
>>>>>> --------- =========> Op3
>>>>>> Op3 Op1
>>>>>> Alternatively, you can move op3 up, and get the same order:
>>>>>> Op1 Reorder Op2 Move Op3 Op2
>>>>>> Op2 Ops 1 and 2 Op1 up Op3
>>>>>> --------- =========> -------------- =========> Op1
>>>>>> Op3 Not restricted Op3 ----------
>>>>>> by later loadFence
>>>>>> So, if the operation restricts only moves in one direction, it's of no
>>>>>> help because
>>>>>> you can still get an execution order you did not expect. So
>>>>>> unidirectional operations
>>>>>> (e.g., only hindering operations from moving up) are pointless.
>>>>>> By the way, are you questioning whether the implementation is correct, or
>>>>>> whether the names express what is desired?
>>>>>> If it's the first, it's of no concern to this change, which is only about
>>>>>> matching these operations independently from those, that address
>>>>>> a dedicated memory operation.
>>>>>> If It's the second, please tell me what names to use, and I'll fix it.
>>>>>> (I hope you can properly read the 'drawings')
>>>>>> Best regards,
>>>>>> Goetz.
>>>>>> -----Original Message-----
>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>> Sent: Dienstag, 19. November 2013 20:51
>>>>>> To: Lindenmaier, Goetz
>>>>>> Cc: 'Vladimir Kozlov'; 'ppc-aix-port-dev at openjdk.java.net';
>>>>>> 'hotspot-dev at openjdk.java.net'
>>>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce
>>>>>> MemBarAcquire/ReleaseWide.
>>>>>> On 20/11/2013 5:34 AM, Lindenmaier, Goetz wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> this are the instrinsics for sun/misc/Unsafe, which are documented
>>>>>>> as shown below. So I guess that's just what is desired.
>>>>>> I don't think so! What is described for Unsafe are full bi-directional
>>>>>> fences and "acquire" and "release" are only uni-directional fences!
>>>>>> David
>>>>>> -----
>>>>>>> Best regards,
>>>>>>> Goetz.
>>>>>>>
>>>>>>> /**
>>>>>>> * Ensures lack of reordering of loads before the fence
>>>>>>> * with loads or stores after the fence.
>>>>>>> * @since 1.8
>>>>>>> */
>>>>>>> public native void loadFence();
>>>>>>>
>>>>>>> /**
>>>>>>> * Ensures lack of reordering of stores before the fence
>>>>>>> * with loads or stores after the fence.
>>>>>>> * @since 1.8
>>>>>>> */
>>>>>>> public native void storeFence();
>>>>>>>
>>>>>>> /**
>>>>>>> * Ensures lack of reordering of loads or stores before the fence
>>>>>>> * with loads or stores after the fence.
>>>>>>> * @since 1.8
>>>>>>> */
>>>>>>> public native void fullFence();
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>> Sent: Tuesday, November 19, 2013 8:08 PM
>>>>>>> To: Lindenmaier, Goetz
>>>>>>> Cc: Vladimir Kozlov; 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net'
>>>>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>>>>>
>>>>>>> So I like the rename but I am concerned there are semantic issues here:
>>>>>>>
>>>>>>> switch(id) {
>>>>>>> case vmIntrinsics::_loadFence:
>>>>>>> - insert_mem_bar(Op_MemBarAcquire);
>>>>>>> + insert_mem_bar(Op_MemBarFenceAcquire);
>>>>>>> return true;
>>>>>>> case vmIntrinsics::_storeFence:
>>>>>>> - insert_mem_bar(Op_MemBarRelease);
>>>>>>> + insert_mem_bar(Op_MemBarFenceRelease);
>>>>>>> return true;
>>>>>>>
>>>>>>> What is a _loadFence operation? Does it really map to
>>>>>>> MembarFenceAcquire? Ditto for the _storeFence. These sound like
>>>>>>> bi-directional fences - are they? Are they really only concerned with
>>>>>>> loads or stores ?
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On 19/11/2013 6:34 PM, Lindenmaier, Goetz wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I made a new webrev, using the new names:
>>>>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8028515-1-wide/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Goetz.
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>> Sent: Montag, 18. November 2013 23:28
>>>>>>>> To: Lindenmaier, Goetz; 'David Holmes'
>>>>>>>> Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net'
>>>>>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>>>>>>
>>>>>>>> On 11/18/13 2:19 PM, Lindenmaier, Goetz wrote:
>>>>>>>>> Hi David, Vladimir,
>>>>>>>>>
>>>>>>>>> I also would prefer MemBarFenceAquire/Release, for two reasons
>>>>>>>>> - the same prefix shows they are all similar nodes.
>>>>>>>>> - I don't have to change MemBarVolatile to FullFence, which I didn't
>>>>>>>>> change yet and which is used in more places.
>>>>>>>>
>>>>>>>> Okay.
>>>>>>>>
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Goetz.
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>> Sent: Monday, November 18, 2013 10:49 PM
>>>>>>>>> To: Vladimir Kozlov
>>>>>>>>> Cc: Lindenmaier, Goetz; 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev at openjdk.java.net'
>>>>>>>>> Subject: Re: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce MemBarAcquire/ReleaseWide.
>>>>>>>>>
>>>>>>>>> Vladimir,
>>>>>>>>>
>>>>>>>>> On 19/11/2013 5:36 AM, Vladimir Kozlov wrote:
>>>>>>>>>> I think David asked for different nodes not just one.
>>>>>>>>>> We can have new membar nodes with names matching Unsafe methods names:
>>>>>>>>>> LoadFence, StoreFence, FullFence. I am fine with it.
>>>>>>>>>
>>>>>>>>> But I don't think that actually describes them - they don't distinguish
>>>>>>>>> between loads and stores only the direction in which the barrier
>>>>>>>>> operates. "acquire" only allows accesses to move below it; "release"
>>>>>>>>> only allows accesses to move above it ie:
>>>>>>>>>
>>>>>>>>> load a
>>>>>>>>> store b, c
>>>>>>>>> acquire();
>>>>>>>>> load d
>>>>>>>>> store e,f
>>>>>>>>> release();
>>>>>>>>> load g
>>>>>>>>> store h, i
>>>>>>>>>
>>>>>>>>> allows the loads of 'a' and 'g' to move into the region between acquire
>>>>>>>>> and release; similarly for the stores to b and h. But the load of d nor
>>>>>>>>> the store to e can not move in relation to either acquire or release.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> MemBar and Fence have similar meaning so lets don't mix them in node names.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>> On 11/18/13 8:19 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> as reply to your comment on the bug:
>>>>>>>>>>>
>>>>>>>>>>> Well, I sitll would need 2 different nodes, as on PPC we do
>>>>>>>>>>> MemBarAcquireWide --> lwsync
>>>>>>>>>>> MemBarReleaseWide --> lwsync
>>>>>>>>>>> MemBarVolatile --> sync.
>>>>>>>>>>> On Sparc, you even do 3 different operations.
>>>>>>>>>>>
>>>>>>>>>>> Or should I name them MemBarFenceAcquire and MemBarFenceRelease?
>>>>>>>>>>> This all depends a lot on the available instructions of the processors.
>>>>>>>>>>> Therefore I think a really clean representation that, at the same
>>>>>>>>>>> time, allows
>>>>>>>>>>> to find the cheapest set of instructions to express it on all
>>>>>>>>>>> processors, is impossible.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Goetz
>>>>>>>>>>>
>>>>>>>>>>> PS: Should I respond to comments in the bug right in the bug
>>>>>>>>>>> or on the mailing lists?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> From:ppc-aix-port-dev-bounces at openjdk.java.net
>>>>>> <mailto:ppc-aix-port-dev-bounces at openjdk.java.net>
>>>>>>>>>>> [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of
>>>>>>>>>>> Lindenmaier, Goetz
>>>>>>>>>>> Sent: Montag, 18. November 2013 15:19
>>>>>>>>>>> To: 'hotspot-dev at openjdk.java.net';
>>>>>>>>>>> 'ppc-aix-port-dev at openjdk.java.net'; Vladimir Kozlov
>>>>>>>>>>> Subject: RFR(M): 8028515: PPC64 (part 113.2): opto: Introduce
>>>>>>>>>>> MemBarAcquire/ReleaseWide.
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> The c2 compiler inserts MemBarAcquire/Release nodes to enforce memory
>>>>>>>>>>> ordering in various places. Some order a certain load/store with other
>>>>>>>>>>> operations. Inline_unsafe_fence() inserts MemBars that do not
>>>>>>>>>>> correspont to a memory operation. So far, the same nodes were used.
>>>>>>>>>>>
>>>>>>>>>>> This change introduces MemBarAcquire/ReleaseWide to use where no
>>>>>>>>>>> dedicated load/store is ordered. With this change, these nodes can be
>>>>>>>>>>> matched differently, what is needed on PPC64.
>>>>>>>>>>>
>>>>>>>>>>> When reviewing 8024921 (part 113) we decided to avoid #defines in
>>>>>>>>>>> inline_unsafe_fence() and to introduce new MemBar operations.
>>>>>>>>>>>
>>>>>>>>>>> Please review and test this change.
>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8028515-0-wide/
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Goetz.
>>>>>>>>>>>
More information about the ppc-aix-port-dev
mailing list