RFR: JDK-8287284: C2: loop optimization performs split_thru_phi infinitely many times [v4]

Mon Oct 30 10:20:34 UTC 2023

On Mon, 30 Oct 2023 09:47:49 GMT, Tobias Holenstein <tholenstein at openjdk.org> wrote:

>> ### Node* phi = PhaseIdealLoop::split_thru_phi(Node* n, Node* region, int policy) 
>> The function `PhaseIdealLoop::split_thru_phi` splits Node `n` through a merge point (phi) if there is enough "win". First a clone x<sub>i</sub> of `n` is created for each input of the merge point. The general idea is that we want to replace enough of the x<sub>i</sub> clones with existing nodes or constants. The `policy` indicates how many "wins" are needed for success. A "win" is defined as one of the following:
>> - Identity(x<sub>i</sub>) has singleton type (i.e. a constant)
>> - Identity(x<sub>i</sub>) returns an already existing node
>> - x<sub>i</sub> commons up with an existing node (i.e. hash_find(x<sub>i</sub> != null)
>> 
>> _Example_
>> Looking at the phi (119 Phi) of `n` (135 LoadI) a clone of `n` is created for each input to the phi: 
>> - x<sub>1</sub> (140 LoadI) for the input [1] (127 Phi)
>> - x<sub>2</sub> (141 LoadI) for the input [2] (90 StoreI)
>> 
>> If there are enough "wins" (win $\geq$ policy) the wins are applied and a new merge point `phi` (139 Phi) is returned. 
>> `n` (135 LoadI) is replaced with `phi` (139 Phi) outside of `PhaseIdealLoop::split_thru_phi` if successful.
>> 
>> Example with **n =** `135 LoadI`, **region =** `129 Loop`, **policy =** `1` returns **phi =** `139 Phi`
>> <img width="286" alt="split_phi" src="https://github.com/openjdk/jdk/assets/71546117/4a1bcd8d-bbfd-4267-9d2b-5ba4e057df72">
>> Success because 1 "win" occurs: `88 AddI` = Identity(`141 LoadI`)
>> <img width="186" alt="88_identity" src="https://github.com/openjdk/jdk/assets/71546117/af3cb37a-c575-445a-b8d4-967013dd3fb0">
>> 
>> 
>> 
>> ### run() has infinite loop of split_thru_phi
>> 
>> // java -Xbatch -XX:-PartialPeelLoop -XX:CompileCommand=compileonly,Test::run Test.java
>> 
>> public class Test {
>> 
>>     public static int cnt = 1;
>> 
>>     public static void run() {
>>         int j = 0;
>>         do {
>>             j = cnt;
>>             for (int k = 0; k < 20000; k++) {
>>                 cnt += 2;
>>             }
>> 
>>         } while (++j < 10);
>>     }
>> 
>>     public static void main(String[] args) {
>>         for (int i = 0; i < 10; i++ ) {
>>             run();
>>         }
>>     }
>> }
>> 
>> `Compile::optimize_loops` ends up in an endless loop when optimizing run().
>> In each `PhaseIdealLoop iteration` a `LoadI` (e.g. 135) is split through a `Phi` (e.g. 119) and the newly created `LoadI` (e.g. 140) is again split through a `Phi` (e.g. 127) creating an endless repeating pattern of `L...
>
> Tobias Holenstein has updated the pull request incrementally with one additional commit since the last revision:
> 
>   added comments and only check if Load node

src/hotspot/share/opto/loopopts.cpp line 235:

> 233: // BUT it can also return true and the clone is in the outer loop
> 234: bool PhaseIdealLoop::can_move_to_inner_loop(Node* n, Node* region, Node* x) {
> 235:   assert(region->is_Loop(), "region should be a loop");

make argument a `LoopNode`, then you can drop the assert.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15536#discussion_r1375973614