RFR: 8364757: Missing Store nodes caused by bad wiring in PhaseIdealLoop::insert_post_loop [v7]

Manuel Hässig mhaessig at openjdk.org
Wed Sep 24 08:35:06 UTC 2025


On Tue, 23 Sep 2025 14:52:19 GMT, Benoît Maillard <bmaillard at openjdk.org> wrote:

>> This PR introduces a fix for wrong results caused by missing `Store` nodes in C2 IR due to incorrect wiring in `PhaseIdealLoop::insert_post_loop`.
>> 
>> ### Context
>> 
>> The issue was initially found by the fuzzer. After some trial and error, and with the help of @chhagedorn I was able to reduce the reproducer to something very simple. After being compiled by C2, the execution of the following method led to the last statement (`x = 0`) to be ignored:
>> 
>> 
>>     static public void test() {
>>         x = 0;
>>         for (int i = 0; i < 20000; i++) {
>>             x += i;
>>         }
>>         x = 0;
>>     }
>> 
>> 
>> After some investigation and discussions with @robcasloz and @chhagedorn, it appeared that this issue is linked to how safepoints are inserted into long running loops, causing the loop to be transformed into a nested loop with an `OuterStripMinedLoop` node. `Store` node are moved out of the inner loop when encountering this pattern, and the associated `Phi` nodes are removed in order to avoid inhibiting loop optimizations taking place later. This was initially adressed in [JDK-8356708](https://bugs.openjdk.org/browse/JDK-8356708) by making the necessary corrections in macro expansion. As explained in the next section, this is not enough here as macro expansion happens too late.
>> 
>> This PR aims at addressing the specific case of the wrong wiring of `Store` nodes in _post_ loops, but on the longer term further investigations into the missing `Phi` node issue are necessary, as they are likely to cause other issues (cf. related JBS issues).
>> 
>> ### Detailed Analysis
>> 
>> In `PhaseIdealLoop::create_outer_strip_mined_loop`, a simple `CountedLoop` is turned into a nested loop with an `OuterStripMinedLoop`. The body of the initial loop remains in the inner loop, but the safepoint is moved to the outer loop. Later, we attempt to move `Store` nodes after the inner loop in `PhaseIdealLoop::try_move_store_after_loop`.  When the `Store` node is moved to the outer loop, we also get rid of its input `Phi` node in order not to confuse loop optimizations happening later.
>> 
>> This only becomes a problem in `PhaseIdealLoop::insert_post_loop`, where we clone the body of the inner/outer loop for the iterations remaining after unrolling. There, we use `Phi` nodes to do the necessary rewiring between the original body and the cloned one. Because we do not have `Phi` nodes for the moved `Store` nodes, their memory inputs may end up being incorrect.
>> 
>> This is wh...
>
> Benoît Maillard has updated the pull request incrementally with one additional commit since the last revision:
> 
>   More comments

Thank you for working on this and for the clear analysis of this tricky issue, @benoitmaillard!

Your solution seems good, but I have a few coding suggestions below.

src/hotspot/share/opto/loopTransform.cpp line 1672:

> 1670: 
> 1671: Node* PhaseIdealLoop::find_last_store_in_outer_loop(Node* store, IdealLoopTree* outer_loop) {
> 1672:   Node* out = store;

Since you want a store, you should probably assert that `store` is not null and actually a store.

src/hotspot/share/opto/loopTransform.cpp line 1694:

> 1692:     }
> 1693:     out = unique_next;
> 1694:   }

I found the loop a bit hard to read. Below is a proposal for a restructured loop. If you like it, take it, otherwise leave it.

Suggestion:

  Node* unique_next = store;
  do {
    out = unique_next;
    for (DUIterator_Fast imax, l = out->fast_outs(imax); l < imax; l++) {
      Node* use = out->fast_out(l);
      if (use->is_Mem() && use->in(MemNode::Memory) == out) {
        IdealLoopTree* use_loop = get_loop(get_ctrl(use));
        if (outer_loop->is_member(use_loop)) {
          assert(unique_next == out, "memory node should only have one usage in the loop body");
          unique_next = use;
        }
      }
    }
  } while (out != unique_next);

src/hotspot/share/opto/loopnode.hpp line 1384:

> 1382: 
> 1383:   // Find the last store in the body of an OuterStripMinedLoop when following memory uses
> 1384:   Node *find_last_store_in_outer_loop(Node* store, IdealLoopTree* outer_loop);

If I am not mistaken, this could be `const` since you are only using `PhaseIdealLoop::get_loop()`, which is also `const`.
Suggestion:

  Node *find_last_store_in_outer_loop(Node* store, IdealLoopTree* outer_loop) const;

-------------

Changes requested by mhaessig (Committer).

PR Review: https://git.openjdk.org/jdk/pull/27225#pullrequestreview-3261328829
PR Review Comment: https://git.openjdk.org/jdk/pull/27225#discussion_r2374645703
PR Review Comment: https://git.openjdk.org/jdk/pull/27225#discussion_r2374958370
PR Review Comment: https://git.openjdk.org/jdk/pull/27225#discussion_r2374676449


More information about the hotspot-compiler-dev mailing list