RFR: JDK-8287284: C2: loop optimization performs split_thru_phi infinitely many times
Tobias Holenstein
tholenstein at openjdk.org
Mon Oct 23 11:09:46 UTC 2023
### Node* phi = PhaseIdealLoop::split_thru_phi(Node* n, Node* region, int policy)
The function `PhaseIdealLoop::split_thru_phi` splits Node `n` through a merge point (phi) if there is enough "win". First a clone x<sub>i</sub> of `n` is created for each input of the merge point. The general idea is that we want to replace enough of the x<sub>i</sub> clones with existing nodes or constants. The `policy` indicates how many "wins" are needed for success. A "win" is defined as one of the following:
- Identity(x<sub>i</sub>) has singleton type (i.e. a constant)
- Identity(x<sub>i</sub>) returns an already existing node
- x<sub>i</sub> commons up with an existing node (i.e. hash_find(x<sub>i</sub> != null)
_Example_
Looking at the phi (119 Phi) of `n` (135 LoadI) a clone of `n` is created for each input to the phi:
- x<sub>1</sub> (140 LoadI) for the input [1] (127 Phi)
- x<sub>2</sub> (141 LoadI) for the input [2] (90 StoreI)
If there are enough "wins" (win $\geq$ policy) the wins are applied and a new merge point `phi` (139 Phi) is returned.
`n` (135 LoadI) is replaced with `phi` (139 Phi) outside of `PhaseIdealLoop::split_thru_phi` if successful.
Example with **n =** `135 LoadI`, **region =** `129 Loop`, **policy =** `1` returns **phi =** `139 Phi`
<img width="286" alt="split_phi" src="https://github.com/openjdk/jdk/assets/71546117/4a1bcd8d-bbfd-4267-9d2b-5ba4e057df72">
Success because 1 "win" occurs: `88 AddI` = Identity(`141 LoadI`)
<img width="186" alt="88_identity" src="https://github.com/openjdk/jdk/assets/71546117/af3cb37a-c575-445a-b8d4-967013dd3fb0">
### run() has infinite loop of split_thru_phi
// java -Xbatch -XX:-PartialPeelLoop -XX:CompileCommand=compileonly,Test::run Test.java
public class Test {
public static int cnt = 1;
public static void run() {
int j = 0;
do {
j = cnt;
for (int k = 0; k < 20000; k++) {
cnt += 2;
}
} while (++j < 10);
}
public static void main(String[] args) {
for (int i = 0; i < 10; i++ ) {
run();
}
}
}
`Compile::optimize_loops` ends up in an endless loop when optimizing run().
In each `PhaseIdealLoop iteration` a `LoadI` (e.g. 135) is split through a `Phi` (e.g. 119) and the newly created `LoadI` (e.g. 140) is again split through a `Phi` (e.g. 127) creating an endless repeating pattern of `LoadI` (e.g. 144) always connected to the circular dependent `Phis` (119, 127). This results in an every growing chain of new `Phis` (142, 139, 133). The process is only stopped because `LoopOptsCount` is reached after 43 iterations. The long chain of `Phis` stays throughout code generation.
<img width="596" alt="endless" src="https://github.com/openjdk/jdk/assets/71546117/5d0353a2-3212-403d-9f78-5bf60ac8df4e">
Lets look at the steps of `split_thru_phi` in more detail:
1. The graph of `run()` before the first `PhaseIdealLoop iterations` looks like the following:
![1 - init graph - split 135](https://github.com/openjdk/jdk/assets/71546117/b8663eb6-e10d-4450-9a48-e366c4b47f6e)
2. Running `build_and_optimize()`→`split_if_with_blocks()`→`split_if_with_blocks_pre(use)`→`split_thru_phi(Node* n, Node* region, int policy)` with **n =** `135 LoadI`, **region =** `139 Loop`.
- `135 LoadI` is split through `119 Phi` if it results in a "win".
- As a first step a clone of `135 LoadI` is created for each input of `119 Phi`:
- clone `140 LoadI` for input `127 Phi`
- `141 LoadI` for input `90 StoreI`
![2 - clone 140 141](https://github.com/openjdk/jdk/assets/71546117/e024e264-0b51-45da-9027-385d9750a8a3)
3. In this case we have a "win" because there exists an identity of `141 LoadI`:
- `88 AddI` = identity(`141 LoadI`)
![3 - 88=identity(141)](https://github.com/openjdk/jdk/assets/71546117/bb8a2e3e-1ccb-4ea7-b0fa-812436bd4cec)
4. As a final step of `split_thru_phi` (and only because we have a "win") `135 LoadI` is replaced with a new `139 Phi` that merges the clone `140 LoadI` and `88 AddI`, the identity of `141 LoadI` which gets cleaned up:
![4 - replace(135 with 139)](https://github.com/openjdk/jdk/assets/71546117/b861967e-9a2d-4056-bf9d-611377800a62)
5. Next `split_thru_phi(Node* n, Node* region, int policy)` is called with the previously created clone **n =** `140 LoadI`.
- `140 LoadI` is split through `127 Phi`
- Clone `143 LoadI` is created for `7 Parm`, the first input to `127 Phi`
![5 - split 140 and clone 143](https://github.com/openjdk/jdk/assets/71546117/ebbe989c-ffb9-4ad4-8f79-c1cb321c886a)
6. We have again a "win" for the clone `143 LoadI` because:
- `24 Loadl` = hash(`143 LoadI`)
![6 - 24=hash(143)](https://github.com/openjdk/jdk/assets/71546117/7bb9587e-8b28-4788-8a2f-cc1da97c2f14)
7. Next, clone `144 LoadI` is created for `119 Phi`, the second input to `127 Phi`. And finally, `140 LoadI` is replaced with a new `142 Phi`
![7 - replace(140 with 142)](https://github.com/openjdk/jdk/assets/71546117/72f0fc45-5898-42d8-a506-0ca08fd1c068)
If `split_thru_phi` was called on a LoadI, region is a Loop and we have a "win" , we do `set_major_progress()` (at the end of `PhaseIdealLoop::split_if_with_blocks_pre`). Therefore we do another `PhaseIdealLoop iteration`.
https://github.com/openjdk/jdk/blob/903b9e8dd966fbb61222c59048b752ed8b42b608/src/hotspot/share/opto/loopopts.cpp#L1171-L1174
8. In the second `PhaseIdealLoop iteration` we call `split_thru_phi(Node* n, Node* region, int policy)` with **n =** `144 LoadI`. This is the same as in the first iteration: `44 LoadI` has the same inputs as `135 LoadI` in the first iteration. We find the exact same wins again because the involved nodes have not changed in the meantime. We repeat the work of the first iteration and `set major_progress` to true. For each following loop opts round, we repeat the same steps again and keep setting `major_progress` to true until we reach the maximum number of loop opts iterations.
=> **We are stuck in a loop!**
![8 - split 144](https://github.com/openjdk/jdk/assets/71546117/ba340d10-5031-4b30-b97b-63793fc65118)
9. After another call to `split_thru_phi` we have a chain of 4 phis that grows until `PhaseIdealLoop iterations` runs out of iterations.
![9 - chain of phis](https://github.com/openjdk/jdk/assets/71546117/c282b4d7-57af-4fa4-a90d-6d69afba52ed)
# Proposed Fix
The problem is that `split_thru_phi` finds a win when in reality it is not a win: it does not make much sense to move the LoadI back into the inner loop - Because that means adding extra work to the most common path. The cost function in `split_thru_phi` is now adjusted to exclude such false wins. This automatically also prevents the infinite loop.
---------
### Progress
- [ ] Change must be properly reviewed (1 review required, with at least 1 [Reviewer](https://openjdk.org/bylaws#reviewer))
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
### Reviewing
<details><summary>Using <code>git</code></summary>
Checkout this PR locally: \
`$ git fetch https://git.openjdk.org/jdk.git pull/15536/head:pull/15536` \
`$ git checkout pull/15536`
Update a local copy of the PR: \
`$ git checkout pull/15536` \
`$ git pull https://git.openjdk.org/jdk.git pull/15536/head`
</details>
<details><summary>Using Skara CLI tools</summary>
Checkout this PR locally: \
`$ git pr checkout 15536`
View PR using the GUI difftool: \
`$ git pr show -t 15536`
</details>
<details><summary>Using diff file</summary>
Download this PR as a diff file: \
<a href="https://git.openjdk.org/jdk/pull/15536.diff">https://git.openjdk.org/jdk/pull/15536.diff</a>
</details>
-------------
Commit messages:
- moved logic to a function
- remove comment
- Update loopnode.hpp
- Never move a LoadI from outer to inner loop
- revert
- JDK-8287284
Changes: https://git.openjdk.org/jdk/pull/15536/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15536&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8287284
Stats: 33 lines in 2 files changed: 31 ins; 1 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/15536.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/15536/head:pull/15536
PR: https://git.openjdk.org/jdk/pull/15536
More information about the hotspot-compiler-dev
mailing list