[jdk17] RFR: 8269820: C2 PhaseIdealLoop::do_unroll get wrong opaque node [v2]

Hui Shi hshi at openjdk.java.net
Mon Jul 19 04:16:52 UTC 2021


On Mon, 12 Jul 2021 12:59:15 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> @rwestrel sorry for late update, I did detail checks about how opaque1 get cloned in loop optimization. Preventing split-thru-phi with an opaque input doesn't work in my investigation.
>> 
>> ![image](https://user-images.githubusercontent.com/70356247/125274735-d4a61b80-e340-11eb-9b78-ed2d87987c43.png)
>> IdealGraph in http://cr.openjdk.java.net/~hshi/8269820/loop_unroll.xml
>> 
>> 1. Opaque1 node is not split in split_thru_phi, it is split when split if in split_up.
>> 2. Entire split is triggered by Split if on if node 415
>> 3. Split starts at region node 210 and phi node 213, PhaseIdealLoop::split_up from 213->492->497(recursive to 516)->516->499->458
>> 4. If disable split 458 in PhaseIdealLoop::split_up, it will crash with other assertions, because "do_split_if" expects "have no instructions in the block containing the IF".
>
> @huishi-hs Thanks for exploring this fix. I still think blocking split thru phi is the way to go. Otherwise there's a risk that the Opaque1 ends up behind a Phi which would break the pattern entirely. This would require something like this:
> 
> https://github.com/rwestrel/jdk17/commit/e6d6ebecba478d0a080954494b927d4243ca2ee4

@rwestrel 
Testing https://github.com/rwestrel/jdk17/commit/e6d6ebecba478d0a080954494b927d4243ca2ee4, testing with -Xcomp -XX:-TieredCompilation (adding elapsedTimer for PhaseIdealLoop::split_if_with_blocks_post and merge_point_safe),  5% - 10% overhead for this extra check in entier split_if_with_blocks_post.  

[Total extra check            , 0.0000035 secs]
[Total after can_split_if time, 0.0000646 secs]
[Total extra check            , 0.0000032 secs]
[Total after can_split_if time, 0.0000649 secs]
[Total extra check            , 0.0000023 secs]
[Total after can_split_if time, 0.0000604 secs]
[Total extra check            , 0.0000023 secs]
[Total after can_split_if time, 0.0000214 secs]
[Total extra check            , 0.0000020 secs]
[Total after can_split_if time, 0.0000198 secs]
[Total after can_split_if time, 0.0000010 secs]
[Total extra check            , 0.0000030 secs]
[Total after can_split_if time, 0.0000383 secs]
[Total extra check            , 0.0000026 secs]
[Total after can_split_if time, 0.0000682 secs]


This also disables split if might lost some optimization and might have unexpected degradation.  Could you please review new fix? New fix is disabling swapping compare node's input in BoolNode.Ideal, when its second input is phi of opaque1 node.

Before transformation, opaque1 is performed on comve.
![image](https://user-images.githubusercontent.com/70356247/126101433-10a584f6-6fda-483d-ae75-29e0fc6c3a84.png)

After transformation, canonical loop limit is constant.
![image](https://user-images.githubusercontent.com/70356247/126101513-6e60981e-9a43-4f59-9d4b-8315850298b9.png)

-------------

PR: https://git.openjdk.java.net/jdk17/pull/208


More information about the hotspot-compiler-dev mailing list