review for 7032963: StoreCM shouldn't participate in store elimination

Sat Apr 2 01:07:58 UTC 2011

Yes, this looks right.

Thanks,
Vladimir

Tom Rodriguez wrote:
> It turns out that the code I wrote is safe from this bug because the CastP2X has control so the card mark addresses don't appear to be the same since the CastP2X's are different.  Picking a higher control is the basis of the other idea I had for improving the barrier code since it improves the sharing of card mark computations.  If I enable that logic then the bug in my code shows up.  I've fixed it by checking for outcnt() == 1 in the main loop control.
> 
> tom
> 
> On Apr 1, 2011, at 5:15 PM, Vladimir Kozlov wrote:
> 
>> And, it seems, your current code covers this case already. So my false assumption helped you to find the real problem ;)
>>
>> Thanks,
>> Vladimir
>>
>> Vladimir Kozlov wrote:
>>> Actually I thought about slightly different case:
>>> a.f = x
>>> if (test) {
>>>  a.b = y;
>>> }
>>> But StoreCM for a.f should have several users (StoreCM for a.b and mergemem) so your condition (stop serch if multiple users) stays true.
>>> Vladimir
>>> Tom Rodriguez wrote:
>>>> On Apr 1, 2011, at 4:37 PM, Vladimir Kozlov wrote:
>>>>
>>>>> An other problem if n is on a branch and you could eliminate dominated StoreCM which above the split point resulting in not having StoreCM on opposite branch.
>>>> You mean:
>>>>
>>>> a.f = x
>>>> b.f = y;
>>>> if (test)
>>>>  return
>>>> a.b = c;
>>>>
>>>> The StoreCM for a.f has a single user but it's used by the StoreCM of b.f which has multiple users.  So I think the search needs to stop when it encounters multiple users of a StoreCM since that represents a split of control flow.  Thanks for catching that.
>>>>
>>>> Sounds like a job for partial redundancy elimination.
>>>>
>>>> tom
>>>>
>>>>> Vladimir
>>>>>
>>>>> Tom Rodriguez wrote:
>>>>>> On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote:
>>>>>>> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also you need to kill the node explicitly otherwise it still be connected to its inputs:
>>>>>>>
>>>>>>> +         // Eliminate the previous StoreCM
>>>>>>> +         prev->set_req(MemNode::Memory, mem->in(MemNode::Memory));
>>>>>>> +         assert(mem->outcnt() == 0, "should be dead");
>>>>>>> +         mem->disconnect_inputs(NULL);
>>>>>> I'll have to rework the mem traversal a little.  Actually I think there might have been a bug with the old code since it always updated prev.  I believe this is correct:
>>>>>>       // Eliminate the previous StoreCM                                                                                                                     prev->set_req(MemNode::Memory, mem->in(MemNode::Memory));
>>>>>>       assert(mem->outcnt() == 0, "should be dead");
>>>>>>       mem->disconnect_inputs(NULL);
>>>>>>     } else {                                                                                                                                               prev = mem;                                                                                                                                         }
>>>>>>     mem = prev->in(MemNode::Memory);
>>>>>>   }
>>>>>> I think I'll put together a little test case to make sure this is working correctly.
>>>>>> tom
>>>>>>> Vladimir
>>>>>>>
>>>>>>> Tom Rodriguez wrote:
>>>>>>>> I could push this to hotspot-gc so it gets more CMS testing .
>>>>>>>> http://cr.openjdk.java.net/~never/7032963
>>>>>>>> 7032963: StoreCM shouldn't participate in store elimination
>>>>>>>> Reviewed-by:
>>>>>>>> StoreCM shouldn't participate in redundant store elimination since
>>>>>>>> that could violate the requirement that a StoreCM must be strictly
>>>>>>>> after a field update.  This results in a large number of redundant
>>>>>>>> StoreCMs being emitted for blocks of fields updates, so I added an
>>>>>>>> optimization to fold them up safely.  Previously the extra dependence
>>>>>>>> was converted into a precedence edge just before register allocation
>>>>>>>> but I moved this logic into final_graph_reshape.  I then added logic
>>>>>>>> to search through chains of StoreCMs to eliminate earlier redundant
>>>>>>>> ones and transfer their precedence edges to the one that is kept.
>>>>>>>> This ensures that they are scheduled properly.  This actually
>>>>>>>> eliminates duplicates that were previously missed so the code quality
>>>>>>>> is slightly better.  Tested by inspecting code generation with script
>>>>>>>> to identify duplicates.  Also ran CTW with -XX:+UseCondCardMark and
>>>>>>>> -XX:+UseG1GC.
>