RFR(M): 8080289: Intermediate writes in a loop not eliminated by optimizer

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Jun 17 19:03:01 UTC 2015


 > http://gee.cs.oswego.edu/dl/jmm/cookbook.html
 >
 > it’s allowed to reorder normal stores with normal stores

If we can guarantee that all passed stores are normal (I assume we will 
have barriers otherwise in between) then I agree. I am not sure why we 
didn't do it before, there could be a counterargument for that which I 
don't remember. To make sure, ask John.

 >> We need to call Ideal() again if store inputs are changed. So if st2 
is removed then inputs of st1 are changed so we need to rerun Ideal(). 
This allow to avoid having your new loop in the Ideal().
 >
 > Sorry, I don’t understand this. Are you saying there’s no need for a 
loop at all? Or are you saying that as soon as there’s a similar store 
that is found we should return from Ideal that will be called again to 
maybe find other similar stores?

Yes, it may simplify the code of Ideal. You may still need a loop to 
look for previous store which could be eliminated but you don't need to 
have 'prev'. As soon you remove one node, you exit Ideal returning 
'this' and it will be called again so you can search for another 
previous store.

 >> BOTTOM (all slices) Phi?
 >
 > Wouldn’t there be a MergeMem between the store and the Phi then?

Yes. Okay, you can keep the check as assert we will see if Nightly 
testing hit it it or not.

Thanks,
Vladimir

On 6/17/15 1:35 AM, Roland Westrelin wrote:
>
>>> That’s what I think the code does. That is if you have:
>>>
>>> st1->st2->st3->st4
>>
>> I assume st4 is first store and st1 is last. Right?
>
> Program order is:
> st4
> st3
> st2
> st1
>
>>> and st3 is redundant with st1, the chain should become:
>>>
>>> st1->st2->st4
>>
>> I am not sure it is correct optimization. On some machines result of st3 could be visible before result of st2. And you change it.
>> I am suggesting not do that. Do you need that for stores move from loop?
>
> It’s not required. It cleans up the graph in some cases like this:
>
>      static void test_after_5(int idx) {
>          for (int i = 0; i < 1000; i++) {
>              array[idx] = i;
>              array[idx+1] = i;
>              array[idx+2] = i;
>              array[idx+3] = i;
>              array[idx+4] = i;
>              array[idx+5] = i;
>          }
>      }
>
> all stores are sunk out of the loop but that happens after iteration splitting and so there are multiple redundant copies of each store that are not collapsed.
>
> This said, we currently reorder the stores even if it’s less aggressive than what I’m proposing. With program:
>
> st4
> st3
> st2
> st1
>
> If st1, st3 and st4 are on one slice and st2 is on another and if st1 and st3 store to the same address we optimize st3 out:
>
> st4
> st2
> st1
>
> so st3=st1 may only be visible after st2.
>
> Also, the way I read the first table in this:
>
> http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>
> it’s allowed to reorder normal stores with normal stores
>
>>> so we need to change the memory input of st2 when we find st3 can be removed. In the code, at that point, this=st1, st = st3 and prev=st2.
>>
>> In this case the code should be:
>>
>>    if (st->in(MemNode::Address)->eqv_uncast(address) &&
>>      ...
>>    } else {
>>      prev = st;
>>    }
>>
>> to update 'prev' with 'st' only if 'st' is not removed.
>
> You’re right.
>
>> Also, I think, st->in(MemNode::Memory) could be put in local var since it is used several times in this code.
>>
>>>
>>>> You need to set improved = true since 'this' will not change. We also use 'make_progress' variable's name in such cases.
>>>
>>> In the example above, if we remove st2, we modify this, right?
>>
>> We need to call Ideal() again if store inputs are changed. So if st2 is removed then inputs of st1 are changed so we need to rerun Ideal(). This allow to avoid having your new loop in the Ideal().
>
> Sorry, I don’t understand this. Are you saying there’s no need for a loop at all? Or are you saying that as soon as there’s a similar store that is found we should return from Ideal that will be called again to maybe find other similar stores?
>
>>> We’ll find a path from the head that doesn’t go through the store and that exits the loop. What the comment doesn’t say is that with example 2 below:
>>>
>>> for (int i = 0; i < 10; i++) {
>>>    if (some_condition) {
>>>      uncommon_trap();
>>>    }
>>>    array[idx] = 999;
>>> }
>>>
>>> my verification code would find the early exit as well.
>>>
>>> It’s verification code only because if we have example 1 above, then we have a memory Phi to merge both branches of the if. So the pattern that we look for in PhaseIdealLoop::try_move_store_before_loop() won’t match: the loop’s memory Phi backedge won’t be the store. If we have example 2 above, then the loop’s memory Phi doesn’t have a single memory use. So I don’t think we need to check that the store post dominate the loop head in product. That’s my reasoning anyway and the verification code is there to verify it.
>>
>> I missed 'mem->in(LoopNode::LoopBackControl) == n' condition. Which reduce cases only to one store to this address in the loop - good.
>>
>> How you check in product VM that there are no other exists from a loop (your example 2)? Is it guarded by mem->outcnt() == 1 check?
>
> Yes.
>
>>>> Should you check phi == NULL instead of assert to make sure you have only one Phi node?
>>>
>>> Can there be more than one memory Phi for a particular slice that has in(0) == n_loop->_head?
>>> I would have expected that to be impossible.
>>
>> BOTTOM (all slices) Phi?
>
> Wouldn’t there be a MergeMem between the store and the Phi then?
>
> For the record, the webrev:
>
> http://cr.openjdk.java.net/~roland/8080289/webrev.00/
>
> Roland.
>


More information about the hotspot-compiler-dev mailing list