RFR: 8278228: C2: Improve identical back-to-back if elimination [v2]

Wed Jan 5 13:14:15 UTC 2022

On Fri, 17 Dec 2021 15:55:10 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> C2 has had the ability to optimize:
>> 
>> (1)
>> 
>> if (some_condition) {
>>   // body 1
>> } else {
>>   // body 2
>> }
>> if (some_condition) {
>>   // body 3
>> } else {
>>   // body 4
>> }
>> 
>> into:
>> 
>> (4)
>> 
>> if (some_condition) {
>>   // body 1
>>   // body 3
>> } else {
>>   // body 2
>>   // body 4
>> }
>> 
>> for a while.
>> 
>> This is achieved by the intermediate step:
>> 
>> (2)
>> 
>> if (some_condition) {
>>   // body 1
>>   some_condition2 = true;
>> } else {
>>   // body 2
>>   some_condition2 = false;
>> }
>> if (some_condition2) {
>>   // body 3
>> } else {
>>   // body 4
>> }
>> 
>> which then allows the use of the exiting split if optimization. As a
>> result, the graph is transformed to:
>> 
>> (3)
>> 
>> if (some_condition) {
>>   // body 1
>>   some_condition2 = true;
>>   if (some_condition2) {
>>     body3: // a Region here
>>     // body3
>>   } else {
>>     goto body4;
>>   }
>> } else {
>>   // body 2
>>   some_condition2 = false;
>>   if (some_condition2) {
>>     goto body3;
>>   } else {
>>     body4:  // another Region here
>>     // body4;
>>   }
>> }
>> 
>> and finally to (4) above.
>> 
>> Recently, 8275610 has shown that this can break if some_condition is a
>> null check. If, say, body 3 has a control dependent CastPP, then when
>> body 1 and body 3 are merged, the CastPP of body 3 doesn't become
>> control dependent on the dominating if (because in step (2), the
>> CastPP hides behind a Region). As a result, the CastPP loses its
>> dependency on the null check.
>> 
>> After discussing this with Christian, it seemed this was caused by the
>> way this transformation relies on split if: having custom code that
>> wouldn't create Regions at body3 and body4 that are then optimized out
>> would solve the problem. Anyway, after looking at the split if code,
>> trying to figure out how to tease it apart in smaller steps and
>> reusing some of them to build a new transformation, it seemed too
>> complicated. So instead, I propose reusing split if in a slightly
>> different way:
>> 
>> skip step (2) but perform split if anyway to obtain:
>> 
>> if (some_condition) {
>>   // body 1
>>   if (some_condition) {
>>     body3: // Region1
>>     // CastPP is here, control dependent on Region1
>>     // body3
>>   } else {
>>     goto body4;
>>   }
>> } else {
>>   // body 2
>>  if (some_condition) {
>>     goto body3;
>>   } else {
>>     body4:  // Region2
>>     // body4;
>>   }
>> }
>> 
>> - A CastPP node would still be behind a Region. So next step, is to push
>> control dependent nodes through Region1 and Region2:
>> 
>> if (some_condition) {
>>   // body 1
>>   if (some_condition) {
>>     // A CastPP here
>>     body3: // Region1
>>     // body3
>>   } else {
>>     goto body4;
>>   }
>> } else {
>>   // body 2
>>  if (some_condition) {
>>     // A CastPP here
>>     goto body3;
>>   } else {
>>     body4:  // Region2
>>     // body4;
>>   }
>> }
>> 
>> - And then call dominated_by() to optimize the dominated:
>> 
>> if (some_condition) {
>> 
>> in both branches of the dominating if (some_condition) {. That also
>> causes the CastPP to become dependent on the dominating if.
>
> Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:
> 
>   only for nodes that depends_only_on_test()

That's a nice solution to use split if like that! Looks good.

Maybe you want to add the example (1) and the final result in the last snippet as comment to the optimization as well to provide a visual hint.

src/hotspot/share/opto/loopopts.cpp line 1360:

> 1358:     // Now split the IF
> 1359:     Node* new_false;
> 1360:     Node* new_true;

I suggest to rename these to `new_false_region` and `new_true_region` (same name as method parameter names) which makes it easier to follow the code. 

Could the parameter types of `do_split_if()` also directly be changed to `RegionNode`? While looking at the code, this would, however, require some more updates from `Node` to `RegionNode` and some `as_RegionNode()` + sanity assertions. But it could improve the readability of the split if code. For example that `new_iff` is actually a region and not an If node:
https://github.com/openjdk/jdk/blob/8b5de27ce1e4fe664e08879c2ca89d08db710c9d/src/hotspot/share/opto/split_if.cpp#L494
But that `RegionNode` cleanup could also be done in a separate RFE.

src/hotspot/share/opto/loopopts.cpp line 1369:

> 1367:     }
> 1368: #endif
> 1369:     assert(new_false->in(1)->in(0)->in(1) == dom_if->in(1), "");

Maybe add an explicit message like "must still be same bool after split".

-------------

Marked as reviewed by chagedorn (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6882