RFR: JDK-8287284: C2: loop optimization performs split_thru_phi infinitely many times

Tue Oct 24 10:29:33 UTC 2023

On Fri, 1 Sep 2023 13:13:20 GMT, Tobias Holenstein <tholenstein at openjdk.org> wrote:

> ### Node* phi = PhaseIdealLoop::split_thru_phi(Node* n, Node* region, int policy) 
> The function `PhaseIdealLoop::split_thru_phi` splits Node `n` through a merge point (phi) if there is enough "win". First a clone x<sub>i</sub> of `n` is created for each input of the merge point. The general idea is that we want to replace enough of the x<sub>i</sub> clones with existing nodes or constants. The `policy` indicates how many "wins" are needed for success. A "win" is defined as one of the following:
> - Identity(x<sub>i</sub>) has singleton type (i.e. a constant)
> - Identity(x<sub>i</sub>) returns an already existing node
> - x<sub>i</sub> commons up with an existing node (i.e. hash_find(x<sub>i</sub> != null)
> 
> _Example_
> Looking at the phi (119 Phi) of `n` (135 LoadI) a clone of `n` is created for each input to the phi: 
> - x<sub>1</sub> (140 LoadI) for the input [1] (127 Phi)
> - x<sub>2</sub> (141 LoadI) for the input [2] (90 StoreI)
> 
> If there are enough "wins" (win $\geq$ policy) the wins are applied and a new merge point `phi` (139 Phi) is returned. 
> `n` (135 LoadI) is replaced with `phi` (139 Phi) outside of `PhaseIdealLoop::split_thru_phi` if successful.
> 
> Example with **n =** `135 LoadI`, **region =** `129 Loop`, **policy =** `1` returns **phi =** `139 Phi`
> <img width="286" alt="split_phi" src="https://github.com/openjdk/jdk/assets/71546117/4a1bcd8d-bbfd-4267-9d2b-5ba4e057df72">
> Success because 1 "win" occurs: `88 AddI` = Identity(`141 LoadI`)
> <img width="186" alt="88_identity" src="https://github.com/openjdk/jdk/assets/71546117/af3cb37a-c575-445a-b8d4-967013dd3fb0">
> 
> 
> 
> ### run() has infinite loop of split_thru_phi
> 
> // java -Xbatch -XX:-PartialPeelLoop -XX:CompileCommand=compileonly,Test::run Test.java
> 
> public class Test {
> 
>     public static int cnt = 1;
> 
>     public static void run() {
>         int j = 0;
>         do {
>             j = cnt;
>             for (int k = 0; k < 20000; k++) {
>                 cnt += 2;
>             }
> 
>         } while (++j < 10);
>     }
> 
>     public static void main(String[] args) {
>         for (int i = 0; i < 10; i++ ) {
>             run();
>         }
>     }
> }
> 
> `Compile::optimize_loops` ends up in an endless loop when optimizing run().
> In each `PhaseIdealLoop iteration` a `LoadI` (e.g. 135) is split through a `Phi` (e.g. 119) and the newly created `LoadI` (e.g. 140) is again split through a `Phi` (e.g. 127) creating an endless repeating pattern of `LoadI` (e.g. 144) always connected to the circular dependent `Phis` (119, 127). This results in an ...

Nice analysis, Toby! I added a few suggestions/comments.

src/hotspot/share/opto/loopnode.hpp line 1739:

> 1737:   void update_addp_chain_base(Node* x, Node* old_base, Node* new_base);
> 1738: 
> 1739:   bool has_moved_to_inner_loop(Node* n, Node* region, Node* x);

Suggestion:

  bool moved_to_inner_loop(Node* n, Node* region, Node* x);

src/hotspot/share/opto/loopopts.cpp line 164:

> 162:     } else if (region->is_Loop() && i == LoopNode::LoopBackControl) {
> 163:       // it is not a win if work has moved from an outer to an inner loop
> 164:       if (has_moved_to_inner_loop(n, region, x)) {

Suggestion:

      if (moved_to_inner_loop(n, region, x)) {

src/hotspot/share/opto/loopopts.cpp line 167:

> 165:         wins = 0;
> 166:         break;
> 167:       }

Suggestion:

    } else if (region->is_Loop() && i == LoopNode::LoopBackControl &&
               moved_to_inner_loop(n, region, x)) {
      // it is not a win if 'x' moved from an outer to an inner loop
      wins = 0;
      break;

src/hotspot/share/opto/loopopts.cpp line 232:

> 230: }
> 231: 
> 232: // Check if Node 'x' has moved to inner loop compared to Node 'n'

Suggestion:

// Check if node 'x' moved to an inner loop relative to node 'n'

src/hotspot/share/opto/loopopts.cpp line 233:

> 231: 
> 232: // Check if Node 'x' has moved to inner loop compared to Node 'n'
> 233: bool PhaseIdealLoop::has_moved_to_inner_loop(Node* n, Node* region, Node* x) {

Suggestion:

bool PhaseIdealLoop::moved_to_inner_loop(Node* n, Node* region, Node* x) {

src/hotspot/share/opto/loopopts.cpp line 236:

> 234:   assert(region->is_Loop(), "region should be a loop");
> 235:   IdealLoopTree* n_loop_tree = get_loop(region);
> 236:   for (uint j = 1; j < n->req(); j++) {

Why do you need this loop at all? Wouldn't it be sufficient to just check `x_loop_tree->is_member(n_loop_tree)`?

-------------

PR Review: https://git.openjdk.org/jdk/pull/15536#pullrequestreview-1694406585
PR Review Comment: https://git.openjdk.org/jdk/pull/15536#discussion_r1369882151
PR Review Comment: https://git.openjdk.org/jdk/pull/15536#discussion_r1369882309
PR Review Comment: https://git.openjdk.org/jdk/pull/15536#discussion_r1369878335
PR Review Comment: https://git.openjdk.org/jdk/pull/15536#discussion_r1369878665
PR Review Comment: https://git.openjdk.org/jdk/pull/15536#discussion_r1369882683
PR Review Comment: https://git.openjdk.org/jdk/pull/15536#discussion_r1369939990