RFR (S): 8007898: Incorrect optimization of Memory Barriers in Matcher::post_store_load_barrier()
Doerr, Martin
martin.doerr at sap.com
Thu Jul 11 09:06:15 PDT 2013
Hi Vladimir,
thanks for your explanation. If there's always a MemBarAcquire before the MembarVolatile, it should work. I agree.
Searching for MemBarRelease for optimizing out redundant MemBarAcquire is different idea.
We can discuss our solution when it comes with the PPC64 port (which will probably not be too soon).
Anyway, your change looks good. We'll test it in our SAPJVM as well.
Kind regards,
Martin
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Donnerstag, 11. Juli 2013 17:41
To: Doerr, Martin
Cc: hotspot compiler; Aleksey Shipilev; Lindenmaier, Goetz
Subject: Re: RFR (S): 8007898: Incorrect optimization of Memory Barriers in Matcher::post_store_load_barrier()
Hi Martin,
On 7/11/13 6:06 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> thanks for fixing these problems. The change looks good.
>
> There's only one question remaining. We're not sure if the following pattern can occur:
There are release and acquire membars in graph (I do not show proj nodes for simplicity):
>
> MemBarVolatile
> ^ ^
> | |
> CtrlProj MemProj
> ^ ^
> | |
> | Load volatile
| ^
| |
MemBarAcquire
^
|
MemBarRelease
^
|
Store volatile
^
> |
> MemBarVolatile
>
So if there is volatile load between two stores, we should have MemBarAcquire between as well. That is why our current
code should work (we don't skip first MemBarVolatile if we see MemBarAcquire).
> If this can occur, we will have to traverse the memory edges in addition.
I also thought so before but control path will have MemBarAcquire before we see second MemBarVolatile we may not need to
look on memory path.
> We have implemented a similar optimization to optimize MemBarAcquire before MemBarVolatile away
> and we had to check the memory edges. An excerpt of our code is below.
Volatile and acquire are different types of membar in hw: StoreLoad vs LoadStore.
Yes, release membar is also LoadStore so in this sense you can "skip" MemBarAcquire (as in the graph above) but you need
to search for MemBarRelease in such case.
Thanks,
Vladimir
>
> Kind regards,
> Martin and Goetz
>
>
>
> // Get the Proj node, mem_proj, that can be used to iterate forward
> Node *mem_proj = NULL;
> DUIterator_Fast imax, i = mem->fast_outs(imax);
> while( true ) {
> mem_proj = mem->fast_out(i); // Throw out-of-bounds if proj not found
> assert( mem_proj->is_Proj(), "only projections here" );
> ProjNode *proj = (ProjNode*)mem_proj;
> if( proj->_con == TypeFunc::Memory &&
> !C->node_arena()->contains(mem_proj) ) // Unmatched old-space only
> break;
> i++;
> }
>
> bool found_membar = false;
> for( DUIterator_Fast jmax, j = mem_proj->fast_outs(jmax); j < jmax; j++ ) {
> Node *x = mem_proj->fast_out(j);
> // Proj might have an edge to a store or load node which precedes the membar
> if (x->is_Mem()) return false;
>
> // We can only allow xop == Op_MemBarVolatile when we traverse TypeFunc::Memory instead of TypeFunc::Control
> // because a LoadNode might get scheduled before the MemBarVolatile when considering TypeFunc::Control only.
> int xop = x->Opcode();
> if (xop == Op_MemBarVolatile) {
> // Make sure we're not missing Call/Phi/MergeMem by checking control edges.
> // That means we only set found_membar=true when x can be found via control proj AND memory proj.
> Node *in = x->in(0);
> if (in->is_Proj() && in->in(0)==vmb) found_membar = true;
> }
> }
>
> return found_membar;
>
>
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Donnerstag, 11. Juli 2013 05:39
> To: hotspot compiler
> Cc: Aleksey Shipilev; Doerr, Martin; Lindenmaier, Goetz
> Subject: RFR (S): 8007898: Incorrect optimization of Memory Barriers in Matcher::post_store_load_barrier()
>
> http://cr.openjdk.java.net/~kvn/8007898/webrev/
>
> The problem is caused by complicated (and questionable) barriers code
> generated for volatile stores. It is questionable from performance view
> because second volatile membar is for bottom memory (AliasIdxBot). So
> any memory operations can not pass it.
> The fix is to generate one "fat" membar instead of set of barriers.
>
> I also cleaned/fixed corresponding code in Matcher which tries to
> optimize out those multiply barriers. The problem with it was it did not
> take into account biased locking when cas could be not executed.
> Also code for FastUnlock ('If' check) was not removed before because of,
> I think, incorrect comment "upcoming fastlock block" - it should be
> "fastUnlock block".
>
> Added regression test which shows the problem with -XX:+StressLCM[GCM]
> flags added by Aleksey some time ago.
>
> Aleksey is testing performance of these changes and no regression is
> observed so far.
>
> Thanks,
> Vladimir
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
More information about the hotspot-compiler-dev
mailing list