RFR (S): 8007898: Incorrect optimization of Memory Barriers in Matcher::post_store_load_barrier()

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Jul 11 09:32:23 PDT 2013


Thank you, Martin

Vladimir

On 7/11/13 9:06 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> thanks for your explanation. If there's always a MemBarAcquire before the MembarVolatile, it should work. I agree.
>
> Searching for MemBarRelease for optimizing out redundant MemBarAcquire is different idea.
> We can discuss our solution when it comes with the PPC64 port (which will probably not be too soon).
>
> Anyway, your change looks good. We'll test it in our SAPJVM as well.
>
> Kind regards,
> Martin
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Donnerstag, 11. Juli 2013 17:41
> To: Doerr, Martin
> Cc: hotspot compiler; Aleksey Shipilev; Lindenmaier, Goetz
> Subject: Re: RFR (S): 8007898: Incorrect optimization of Memory Barriers in Matcher::post_store_load_barrier()
>
> Hi Martin,
>
> On 7/11/13 6:06 AM, Doerr, Martin wrote:
>> Hi Vladimir,
>>
>> thanks for fixing these problems. The change looks good.
>>
>> There's only one question remaining. We're not sure if the following pattern can occur:
>
> There are release and acquire membars in graph (I do not show proj nodes for simplicity):
>
>>
>> MemBarVolatile
>>    ^         ^
>>    |         |
>> CtrlProj  MemProj
>>    ^         ^
>>    |         |
>>    |       Load volatile
>       |         ^
>       |         |
>     MemBarAcquire
>       ^
>       |
>     MemBarRelease
>       ^
>       |
>     Store volatile
>       ^
>>    |
>> MemBarVolatile
>>
>
> So if there is volatile load between two stores, we should have MemBarAcquire between as well. That is why our current
> code should work (we don't skip first MemBarVolatile if we see MemBarAcquire).
>
>> If this can occur, we will have to traverse the memory edges in addition.
>
> I also thought so before but control path will have MemBarAcquire before we see second MemBarVolatile we may not need to
> look on memory path.
>
>> We have implemented a similar optimization to optimize MemBarAcquire before MemBarVolatile away
>> and we had to check the memory edges. An excerpt of our code is below.
>
> Volatile and acquire are different types of membar in hw: StoreLoad vs LoadStore.
> Yes, release membar is also LoadStore so in this sense you can "skip" MemBarAcquire (as in the graph above) but you need
> to search for MemBarRelease in such case.
>
> Thanks,
> Vladimir
>
>>
>> Kind regards,
>> Martin and Goetz
>>
>>
>>
>>     // Get the Proj node, mem_proj, that can be used to iterate forward
>>     Node *mem_proj = NULL;
>>     DUIterator_Fast imax, i = mem->fast_outs(imax);
>>     while( true ) {
>>       mem_proj = mem->fast_out(i);               // Throw out-of-bounds if proj not found
>>       assert( mem_proj->is_Proj(), "only projections here" );
>>       ProjNode *proj = (ProjNode*)mem_proj;
>>       if( proj->_con == TypeFunc::Memory &&
>>           !C->node_arena()->contains(mem_proj) ) // Unmatched old-space only
>>         break;
>>       i++;
>>     }
>>
>>     bool found_membar = false;
>>     for( DUIterator_Fast jmax, j = mem_proj->fast_outs(jmax); j < jmax; j++ ) {
>>       Node *x = mem_proj->fast_out(j);
>>       // Proj might have an edge to a store or load node which precedes the membar
>>       if (x->is_Mem()) return false;
>>
>>       // We can only allow xop == Op_MemBarVolatile when we traverse TypeFunc::Memory instead of TypeFunc::Control
>>       // because a LoadNode might get scheduled before the MemBarVolatile when considering TypeFunc::Control only.
>>       int xop = x->Opcode();
>>       if (xop == Op_MemBarVolatile) {
>>         // Make sure we're not missing Call/Phi/MergeMem by checking control edges.
>>         // That means we only set found_membar=true when x can be found via control proj AND memory proj.
>>         Node *in = x->in(0);
>>         if (in->is_Proj() && in->in(0)==vmb) found_membar = true;
>>       }
>>     }
>>
>>     return found_membar;
>>
>>
>>
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Donnerstag, 11. Juli 2013 05:39
>> To: hotspot compiler
>> Cc: Aleksey Shipilev; Doerr, Martin; Lindenmaier, Goetz
>> Subject: RFR (S): 8007898: Incorrect optimization of Memory Barriers in Matcher::post_store_load_barrier()
>>
>> http://cr.openjdk.java.net/~kvn/8007898/webrev/
>>
>> The problem is caused by complicated (and questionable) barriers code
>> generated for volatile stores. It is questionable from performance view
>> because second volatile membar is for bottom memory (AliasIdxBot). So
>> any memory operations can not pass it.
>> The fix is to generate one "fat" membar instead of set of barriers.
>>
>> I also cleaned/fixed corresponding code in Matcher which tries to
>> optimize out those multiply barriers. The problem with it was it did not
>> take into account biased locking when cas could be not executed.
>> Also code for FastUnlock ('If' check) was not removed before because of,
>> I think, incorrect comment "upcoming fastlock block" - it should be
>> "fastUnlock block".
>>
>> Added regression test which shows the problem with -XX:+StressLCM[GCM]
>> flags added by Aleksey some time ago.
>>
>> Aleksey is testing performance of these changes and no regression is
>> observed so far.
>>
>> Thanks,
>> Vladimir
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>


More information about the hotspot-compiler-dev mailing list