RFR: 8364757: Missing Store nodes caused by bad wiring in PhaseIdealLoop::insert_post_loop

Roland Westrelin roland at openjdk.org
Fri Sep 12 09:24:28 UTC 2025


On Thu, 11 Sep 2025 13:05:21 GMT, Benoît Maillard <bmaillard at openjdk.org> wrote:

> This PR introduces a fix for wrong results caused by missing `Store` nodes in C2 IR due to incorrect wiring in `PhaseIdealLoop::insert_post_loop`.
> 
> ### Context
> 
> The issue was initially found by the fuzzer. After some trial and error, and with the help of @chhagedorn I was able to reduce the reproducer to something very simple. After being compiled by C2, the execution of the following method led to the last statement (`x = 0`) to be ignored:
> 
> 
>     static public void test() {
>         x = 0;
>         for (int i = 0; i < 20000; i++) {
>             x += i;
>         }
>         x = 0;
>     }
> 
> 
> After some investigation and discussions with @robcasloz and @chhagedorn, it appeared that this issue is linked to how safepoints are inserted into long running loops, causing the loop to be transformed into a nested loop with an `OuterStripMinedLoop` node. `Store` node are moved out of the inner loop when encountering this pattern, and the associated `Phi` nodes are removed in order to avoid inhibiting loop optimizations taking place later. This was initially adressed in [JDK-8356708](https://bugs.openjdk.org/browse/JDK-8356708) by making the necessary corrections in macro expansion. As explained in the next section, this is not enough here as macro expansion happens too late.
> 
> This PR aims at addressing the specific case of the wrong wiring of `Store` nodes in _post_ loops, but on the longer term further investigations into the missing `Phi` node issue are necessary, as they are likely to cause other issues (cf. related JBS issues).
> 
> ### Detailed Analysis
> 
> In `PhaseIdealLoop::create_outer_strip_mined_loop`, a simple `CountedLoop` is turned into a nested loop with an `OuterStripMinedLoop`. The body of the initial loop remains in the inner loop, but the safepoint is moved to the outer loop. Later, we attempt to move `Store` nodes after the inner loop in `PhaseIdealLoop::try_move_store_after_loop`.  When the `Store` node is moved to the outer loop, we also get rid of its input `Phi` node in order not to confuse loop optimizations happening later.
> 
> This only becomes a problem in `PhaseIdealLoop::insert_post_loop`, where we clone the body of the inner/outer loop for the iterations remaining after unrolling. There, we use `Phi` nodes to do the necessary rewiring between the original body and the cloned one. Because we do not have `Phi` nodes for the moved `Store` nodes, their memory inputs may end up being incorrect.
> 
> This is what the IR looks like after the creation of the post lo...

Not a review but a comment on the missing Phis. Your description makes it sound like if the `OuterStripMinedLoop` was created with `Phis` from the start,  there would be no issue. That's no true AFAICT. The current logic for pre/main/post loops creation would simply not work because it doesn't expect the `Phis` and it would need to be extended so things are rewired correctly with the outer loop `Phis`. The inner loop would still have no `Phi` for the sunk store. So the existing logic, once fixed, would not find it either and you would need some new logic to find it maybe using the outer loop `Phis`. The current shape of the outer loop (without the Phis) is very simple and there's only one location where the Store can be (on the exit projection of the inner loop right above the safepoint which is right below the exit of the inner loop and can't be anywhere else). So you added logic to find the Store relying on the current shape of the outer loop. If the outer loop had `Phis`, some alt
 ernate version of that logic could be used. They seem like 2 ways of doing the same thing to me and nothing tells us one is better than the other.  In short, I don't find this bug a good example of something that would work better if we had `Phi`s on the outer loop. I wouldn't say the root cause is that we don't have `Phi`s on the outer loop either.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27225#issuecomment-3284472055


More information about the hotspot-compiler-dev mailing list