RFR: 8364757: Missing Store nodes caused by bad wiring in PhaseIdealLoop::insert_post_loop
Benoît Maillard
bmaillard at openjdk.org
Mon Sep 22 15:44:33 UTC 2025
On Tue, 16 Sep 2025 08:46:21 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> This PR introduces a fix for wrong results caused by missing `Store` nodes in C2 IR due to incorrect wiring in `PhaseIdealLoop::insert_post_loop`.
>>
>> ### Context
>>
>> The issue was initially found by the fuzzer. After some trial and error, and with the help of @chhagedorn I was able to reduce the reproducer to something very simple. After being compiled by C2, the execution of the following method led to the last statement (`x = 0`) to be ignored:
>>
>>
>> static public void test() {
>> x = 0;
>> for (int i = 0; i < 20000; i++) {
>> x += i;
>> }
>> x = 0;
>> }
>>
>>
>> After some investigation and discussions with @robcasloz and @chhagedorn, it appeared that this issue is linked to how safepoints are inserted into long running loops, causing the loop to be transformed into a nested loop with an `OuterStripMinedLoop` node. `Store` node are moved out of the inner loop when encountering this pattern, and the associated `Phi` nodes are removed in order to avoid inhibiting loop optimizations taking place later. This was initially adressed in [JDK-8356708](https://bugs.openjdk.org/browse/JDK-8356708) by making the necessary corrections in macro expansion. As explained in the next section, this is not enough here as macro expansion happens too late.
>>
>> This PR aims at addressing the specific case of the wrong wiring of `Store` nodes in _post_ loops, but on the longer term further investigations into the missing `Phi` node issue are necessary, as they are likely to cause other issues (cf. related JBS issues).
>>
>> ### Detailed Analysis
>>
>> In `PhaseIdealLoop::create_outer_strip_mined_loop`, a simple `CountedLoop` is turned into a nested loop with an `OuterStripMinedLoop`. The body of the initial loop remains in the inner loop, but the safepoint is moved to the outer loop. Later, we attempt to move `Store` nodes after the inner loop in `PhaseIdealLoop::try_move_store_after_loop`. When the `Store` node is moved to the outer loop, we also get rid of its input `Phi` node in order not to confuse loop optimizations happening later.
>>
>> This only becomes a problem in `PhaseIdealLoop::insert_post_loop`, where we clone the body of the inner/outer loop for the iterations remaining after unrolling. There, we use `Phi` nodes to do the necessary rewiring between the original body and the cloned one. Because we do not have `Phi` nodes for the moved `Store` nodes, their memory inputs may end up being incorrect.
>>
>> This is wh...
>
> src/hotspot/share/opto/loopTransform.cpp line 1793:
>
>> 1791: for (DUIterator j = if_false->outs(); if_false->has_out(j); j++) {
>> 1792: Node* store = if_false->out(j)->isa_Store();
>> 1793: // We don't make changes if the memory input is in the loop body as well
>
> Why? I suppose that is because there must be a Phi in the loop then, right? Maybe state that in the comment here.
Having a memory input that is outside of the loop body is the situation where we would normally expect a `Phi`, and this is where we would like to intervene.
If the memory input is in the loop body as well, we can safely assume it is still correct as the whole body get cloned as a unit.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27225#discussion_r2369078466
More information about the hotspot-compiler-dev
mailing list