[9] RFR(M): 8038348: Instance field load is replaced by wrong data Phi

Mon Aug 22 18:30:31 UTC 2016

Very nice! Thank you for finding this solution.
In jdk 10 we need to change all value types related to Node::_idx to uint (or node_idx_t). We very inconsistent about it in C2 code.

Thanks,
Vladimir

On 8/19/16 2:40 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> as you suggested, I tried to record more information during Phi creation. Instead of using a CloneMap, I added a _inst_mem_id field to the PhiNode to store the node index of the memory Phi node through which we split the LoadNode. This is similar to _inst_id which records the node index of the allocation node.
>
> In LoadNode::Identity() we can now pass the index of the memory input to PhiNode::is_same_inst_field() to check if the data Phi represents the instance field value corresponding to the same memory state. This solves the case where we accidentally select a data Phi that represents an old value of the same instance field.
>
> Since we store the node index of a memory Phi, we need to be careful if this node is replaced or removed. To take care of replacement, I added code to PhaseIterGVN::subsume_node() to update the inst_mem_id of "linked" data Phi nodes to the index of the new node. We could also add such code to Node::clone() but then we would have to maintain a list of ids and it's not guaranteed that the cloned Phi remains "valid" because the node may be changed and used in a different context.
> If a referenced memory Phi is removed, it's only a problem if its node index is reused by another memory Phi because then data Phis may wrongly point to the new node. Currently, this is not a problem because PhaseRenumberLive is executed before EA. I added an assert for sanity.
>
> New webrev:
> http://cr.openjdk.java.net/~thartmann/8038348/webrev.01/
>
> I verified that this fixes the original problem reported by the Lucene team with JDK 7. Performance runs look good (see the link in the comment section). I'm currently running RBT.
>
> Thanks,
> Tobias
>
> On 04.08.2016 15:29, Tobias Hartmann wrote:
>> Hi Vladimir,
>>
>> On 04.08.2016 00:49, Vladimir Kozlov wrote:
>>> On 8/3/16 11:34 AM, Tobias Hartmann wrote:
>>>> Hi Vladimir,
>>>>
>>>> thanks for the review!
>>>>
>>>> On 03.08.2016 03:19, Vladimir Kozlov wrote:
>>>>> Thank you Tobias for looking on this old problem.
>>>>>
>>>>> So we have to find a data Phi which gives the same result as LoadNode::split_through_phi() for split through memory Phi3. Identity() code is short cut for that.
>>>>
>>>> Yes, exactly.
>>>>
>>>>> Unfortunately I forgot what the case was when I added this code.
>>>>
>>>> There is a comment that suggests that this is to avoid creating "infinite" Phis in a loop:
>>>>     // Search for an existing data phi which was generated before for the same
>>>>     // instance's field to avoid infinite generation of phis in a loop.
>>>>
>>>> You also described this in JDK-6673473 (see [1], [2]) which introduced that code. It's not clear to me how creation of infinite Phis can happen, do you remember the exact problem?
>>>
>>> More like this one:
>>>
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/de93acbb64fc
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6682236?focusedCommentId=12388989&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12388989
>>>
>>> It is the case where memory Phi's back edge input points back to this Phi (may be though nodes which does not modify the instance). As result Load node (clone on back branch) stays in graph and it will be split again because Ideal(), which does the split, called before Identity() before the fix.
>>
>> Right, thanks for the pointers.
>>
>>>>> It is definitely incorrect in this case when you have multiple data Phi nodes.
>>>>> Since in general case it will be difficult to find correct Phi may be we should not do this optimization. What happened if we do not do replacement if the region has several Phis? Can you check?
>>>>
>>>> I also thought about removing that code entirely. It does fix the problem but I was worried that we may miss optimization opportunities if equal Phis are not merged or new Phis are created for existing data paths, respectively. I'll do a performance another run without that code.
>>>
>>> Also Identity() for new data Phi will not IGVNed with previous generated data Phi since back edge input is different.
>>>
>>> Based on the above we can't skip Identity() code even if we have several Phi data nodes. Otherwise we can get infinite Phi nodes generation again.
>>>
>>> We back to the problem :(
>>
>> I did some correctness and performance runs with the Phi optimization in LoadNode::Identity() and PhaseMacroExpand::value_from_mem_phi() removed. Turns out we don't hit any asserts/problems with JDK 9 but performance degrades by up to 9% (see comment in the bug). Looks like we don't generate infinite Phis but we miss optimization opportunities.
>>
>>> How about this - record more information to Phi during split_through_Phi(). There is NodeCloneInfo and CloneMap for vectorization optimization. We can record split_through_Phi information in own CloneMap and in Identity code check that Load and data Phi were created before from the same Load node.
>>
>> Right, we could do this but that only works for the case where the data PhiNode was created from the same LoadNode. It would solve the problem with generation of infinite Phis but there may be cases where we would like to re-use a data PhiNode that was created by splitting another LoadNode (for the same field) through the same or a different memory PhiNode.
>>
>> I'll investigate if this approach is good enough.
>>
>> Best regards,
>> Tobias
>>
>>>>> Also what happens with LoadNode::split_through_phi() call which should happened before calling Identity()?
>>>>
>>>> Well, LoadNode::split_through_phi() invokes LoadNode::Identity() and if Identity does not return an existing Phi, we keep the LoadNode and remove it later. Or what do you mean exactly?
>>>
>>> Yes, I missed that Identity() call.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-6673473
>>>> [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2008-March/000070.html
>>>>
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/29/16 4:42 AM, Tobias Hartmann wrote:
>>>>>> Hi,
>>>>>>
>>>>>> please review the following patch:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8038348
>>>>>> http://cr.openjdk.java.net/~thartmann/8038348/webrev.00/
>>>>>>
>>>>>> In C2, instance field loads may be replaced by a corresponding, already existing data Phi in LoadNode::identity():
>>>>>>      // Search for an existing data phi which was generated before for the same
>>>>>>      // instance's field to avoid infinite generation of phis in a loop.
>>>>>>
>>>>>> The problem is, that we only check if the Phi is from the same region and corresponds to the same instance field. We don't check if it contains the actual "up to date" value from the last store to the input memory:
>>>>>>      if (phi->is_Phi() && phi != mem &&
>>>>>>          phi->as_Phi()->is_same_inst_field(this_type, this_iid, this_index, this_offset)) {
>>>>>>        return phi;
>>>>>>      }
>>>>>>
>>>>>> In rare cases, it may happen that the same region contains multiple data Phis for the same instance field. For example:
>>>>>>
>>>>>>              AndI    AndI
>>>>>>              /  \    / \
>>>>>>             /  __\___| |
>>>>>>             | |  |     |
>>>>>>     Region  | |  AddI  AddI
>>>>>>       |  \  | |  | |   |  \_________
>>>>>>       |   \ | /  | |___|_____       |
>>>>>>       |   Phi1   |     |     |      |
>>>>>>       |________  |  ___|  Store1 Store2
>>>>>>                \ | /          \   /
>>>>>>                 Phi2          Phi3
>>>>>>                                 |
>>>>>>                               LoadI
>>>>>>
>>>>>> In this case, Phi3 is the instance field memory and Phi1/Phi2 are data Phis containing the value of the field. LoadINode::Identity() now tries to replace the load by a data Phi representing the field values corresponding to Store1 and Store2. Because Phi1Node::is_same_inst_field() returns true, the load is replaced by Phi1 which contains an "old" field value. We should use Phi2 instead.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> I added PhiNode::mem_contains_value() to check if the given 'value' Phi contains the values from stores on the corresponding input paths of this memory Phi by walking up the memory and data input chains and checking for consistency. To keep complexity low, the implementation is conservative:
>>>>>> - If true, 'value' is guaranteed to contain the values from stores on the memory Phi input paths.
>>>>>> - If false, either 'value' is not the correct Phi or we were not able to prove this due to optimizations of the memory chain that are not (yet) reflected by the value chain nodes or vice versa.
>>>>>>
>>>>>> This problem was caught by the Apache Lucene team some time ago [1] because it causes "impossible" assertions in one of their tests (see my investigation in the bug comments). I was able to reproduce the bug with an old version of their test suite (with JDK 7, 8 and 9) but unfortunately, I was not able to write a regression test. This is because the problem depends on the order in which nodes are being processed by IGVN (field info needs to propagate to Phis) and the order of nodes in the out array of the Region.
>>>>>>
>>>>>> I verified that the fix solves the problem reported by Apache Lucene with JDK 7, 8 and 9. I also executed the hs-comp RBT. Performance numbers look okay (still running).
>>>>>>
>>>>>> Thanks,
>>>>>> Tobias
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/LUCENE-5168
>>>>>>