[9] RFR(M): 8038348: Instance field load is replaced by wrong data Phi
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Aug 3 22:49:24 UTC 2016
On 8/3/16 11:34 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> thanks for the review!
>
> On 03.08.2016 03:19, Vladimir Kozlov wrote:
>> Thank you Tobias for looking on this old problem.
>>
>> So we have to find a data Phi which gives the same result as LoadNode::split_through_phi() for split through memory Phi3. Identity() code is short cut for that.
>
> Yes, exactly.
>
>> Unfortunately I forgot what the case was when I added this code.
>
> There is a comment that suggests that this is to avoid creating "infinite" Phis in a loop:
> // Search for an existing data phi which was generated before for the same
> // instance's field to avoid infinite generation of phis in a loop.
>
> You also described this in JDK-6673473 (see [1], [2]) which introduced that code. It's not clear to me how creation of infinite Phis can happen, do you remember the exact problem?
More like this one:
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/de93acbb64fc
https://bugs.openjdk.java.net/browse/JDK-6682236?focusedCommentId=12388989&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12388989
It is the case where memory Phi's back edge input points back to this Phi (may be though nodes which does not modify the instance). As result Load node (clone on back branch) stays in graph and it
will be split again because Ideal(), which does the split, called before Identity() before the fix.
>
>> It is definitely incorrect in this case when you have multiple data Phi nodes.
>> Since in general case it will be difficult to find correct Phi may be we should not do this optimization. What happened if we do not do replacement if the region has several Phis? Can you check?
>
> I also thought about removing that code entirely. It does fix the problem but I was worried that we may miss optimization opportunities if equal Phis are not merged or new Phis are created for existing data paths, respectively. I'll do a performance another run without that code.
Also Identity() for new data Phi will not IGVNed with previous generated data Phi since back edge input is different.
Based on the above we can't skip Identity() code even if we have several Phi data nodes. Otherwise we can get infinite Phi nodes generation again.
We back to the problem :(
How about this - record more information to Phi during split_through_Phi(). There is NodeCloneInfo and CloneMap for vectorization optimization. We can record split_through_Phi information in own
CloneMap and in Identity code check that Load and data Phi were created before from the same Load node.
>
>> Also what happens with LoadNode::split_through_phi() call which should happened before calling Identity()?
>
> Well, LoadNode::split_through_phi() invokes LoadNode::Identity() and if Identity does not return an existing Phi, we keep the LoadNode and remove it later. Or what do you mean exactly?
Yes, I missed that Identity() call.
Thanks,
Vladimir
>
> Thanks,
> Tobias
>
> [1] https://bugs.openjdk.java.net/browse/JDK-6673473
> [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2008-March/000070.html
>
>>
>> Thanks,
>> Vladimir
>>
>> On 7/29/16 4:42 AM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch:
>>> https://bugs.openjdk.java.net/browse/JDK-8038348
>>> http://cr.openjdk.java.net/~thartmann/8038348/webrev.00/
>>>
>>> In C2, instance field loads may be replaced by a corresponding, already existing data Phi in LoadNode::identity():
>>> // Search for an existing data phi which was generated before for the same
>>> // instance's field to avoid infinite generation of phis in a loop.
>>>
>>> The problem is, that we only check if the Phi is from the same region and corresponds to the same instance field. We don't check if it contains the actual "up to date" value from the last store to the input memory:
>>> if (phi->is_Phi() && phi != mem &&
>>> phi->as_Phi()->is_same_inst_field(this_type, this_iid, this_index, this_offset)) {
>>> return phi;
>>> }
>>>
>>> In rare cases, it may happen that the same region contains multiple data Phis for the same instance field. For example:
>>>
>>> AndI AndI
>>> / \ / \
>>> / __\___| |
>>> | | | |
>>> Region | | AddI AddI
>>> | \ | | | | | \_________
>>> | \ | / | |___|_____ |
>>> | Phi1 | | | |
>>> |________ | ___| Store1 Store2
>>> \ | / \ /
>>> Phi2 Phi3
>>> |
>>> LoadI
>>>
>>> In this case, Phi3 is the instance field memory and Phi1/Phi2 are data Phis containing the value of the field. LoadINode::Identity() now tries to replace the load by a data Phi representing the field values corresponding to Store1 and Store2. Because Phi1Node::is_same_inst_field() returns true, the load is replaced by Phi1 which contains an "old" field value. We should use Phi2 instead.
>>
>>
>>
>>
>>
>>>
>>> I added PhiNode::mem_contains_value() to check if the given 'value' Phi contains the values from stores on the corresponding input paths of this memory Phi by walking up the memory and data input chains and checking for consistency. To keep complexity low, the implementation is conservative:
>>> - If true, 'value' is guaranteed to contain the values from stores on the memory Phi input paths.
>>> - If false, either 'value' is not the correct Phi or we were not able to prove this due to optimizations of the memory chain that are not (yet) reflected by the value chain nodes or vice versa.
>>>
>>> This problem was caught by the Apache Lucene team some time ago [1] because it causes "impossible" assertions in one of their tests (see my investigation in the bug comments). I was able to reproduce the bug with an old version of their test suite (with JDK 7, 8 and 9) but unfortunately, I was not able to write a regression test. This is because the problem depends on the order in which nodes are being processed by IGVN (field info needs to propagate to Phis) and the order of nodes in the out array of the Region.
>>>
>>> I verified that the fix solves the problem reported by Apache Lucene with JDK 7, 8 and 9. I also executed the hs-comp RBT. Performance numbers look okay (still running).
>>>
>>> Thanks,
>>> Tobias
>>>
>>> [1] https://issues.apache.org/jira/browse/LUCENE-5168
>>>
More information about the hotspot-compiler-dev
mailing list