RFR: 8361608: C2: assert(opaq->outcnt() == 1 && opaq->in(1) == limit) failed [v7]

Thu Oct 23 09:18:46 UTC 2025

On Wed, 22 Oct 2025 15:35:08 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

>> Loop peeling works by cloning the loop body, which implies to replace the uses of the data in the loop to be replaced by a phi between the original loop and the clone. This is done by `PhaseIdealLoop::fix_data_uses` and can create a maze of phis. Multiple users of the same original data will get a fresh `PhiNode`, there is no logic trying to reuse them, or simplify. That's IGVN's job.
>> 
>> When we have something like
>> 
>> // any loop
>> while (...) { /* something involving limit */ }
>> // counted loop with zero trip guard
>> if (i < limit) {
>>     for (int i = init; i < limit; i++) { ... }
>> }
>> 
>> and we peel the first loop, the limits in the zero trip guard and in the counted loop condition are not the same node anymore but a fresh `PhiNode`.
>> 
>> But the method `PhaseIdealLoop::do_unroll` has the assert
>> 
>> https://github.com/openjdk/jdk/blob/444007fc234aeff75025831c2d1b5538c87fa8f1/src/hotspot/share/opto/loopTransform.cpp#L1929-L1930
>> 
>> requiring that both `limit` are the same node. But as explained, it might not be the case after peeling the first loop.
>> 
>> This situation doesn't happen if IGVN happens between peeling the first loop and unrolling the second. While there is no formal invariant that this must always be true, I couldn't reproduce the same situation without stress peeling: either peeling happens too early, or not at all, or something else happens so that major progress is set before unrolling, which always saves the day. I've tried to hack on an example to make the peeling decision happen "naturally" (using the normal heuristic), but in the right situation, not too early or too late. At this point it was so hardcoded that it's not significantly different than a run with stress peeling.
>> 
>> But with stress peeling, this situation seems to happen, rarely, but sometimes. What should we do?
>> 
>> By creating many `PhiNode`s `PhaseIdealLoop::fix_data_uses` is doing exactly what we expect. We could make it a lot smarter to try to reuse the `PhiNode`s previously constructed, but that would be hard because the inputs of the fresh phis are recursively adjusted, so we can't share ahead of time when inputs are the same. Duplicating when inputs start to differ would also lead to too many copies since phis look indeed different and some more top down clean up can actually collapse them all.
>> 
>> We could run IGVN to clean up the thing after each peeling: it was deemed not desirable as many things are expected to happen immediately after peeling.
>>...
>
> Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review

I agree with the current solution, thanks Marc for considering alternative approaches and thanks Christian for putting in the work to find non-stress reproducers!

Regarding the tests, please consider
- merging all test cases into a single file (after all, we are testing different variations of the same scenario);
- removing all outdated references to stress peeling (i.e. we now know better than "It seems to happen only with stress peeling"), including file names; and
- removing redundant test summaries.

If the tests are slow to run (are they?) and you want to avoid running all of them under all different configurations after merging them into a single file, you can simply pass as an argument to `main` which test you want to run and use a switch statement or similar to run only that test, see e.g. https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/codegen/TestGCMStorePlacement.java.

Also, please make sure to run extended testing before integration (e.g. additional rounds of fuzzing), it is easy to miss corner cases when we introduce new assertions etc.

src/hotspot/share/opto/loopTransform.cpp line 1929:

> 1927:       return;
> 1928:     }
> 1929:     // Zero-trip test uses an 'opaque' node which is not shared, otherwise bailout.

Suggestion:

    // Zero-trip test uses an 'opaque' node which is not shared, otherwise bail out.

src/hotspot/share/opto/loopnode.cpp line 4770:

> 4768: #ifdef ASSERT
> 4769:         // See PhaseIdealLoop::do_unroll
> 4770:         // This property is desirable, but it maybe not hold after cloning a loop.

Suggestion:

        // This property is desirable, but it may not hold after cloning a loop.

src/hotspot/share/opto/loopnode.cpp line 4771:

> 4769:         // See PhaseIdealLoop::do_unroll
> 4770:         // This property is desirable, but it maybe not hold after cloning a loop.
> 4771:         // In such a case, we bail out from unrolling, and rely on IGVN to cleanup stuff.

Suggestion:

        // In such a case, we bail out from unrolling, and rely on IGVN to clean up stuff.

src/hotspot/share/opto/loopnode.cpp line 4776:

> 4774:         // On the other hand, if this assert passes, bailing out in do_unroll means that
> 4775:         // this property was broken in the current round of loop optimization, which is
> 4776:         // acceptable.

This comment would be more useful if it was a bit more precise: could you clarify (in the comment) what do you mean by "desirable", "stuff", "mess", and "bad"?

-------------

Changes requested by rcastanedalo (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27586#pullrequestreview-3368880961
PR Review Comment: https://git.openjdk.org/jdk/pull/27586#discussion_r2454403689
PR Review Comment: https://git.openjdk.org/jdk/pull/27586#discussion_r2454404879
PR Review Comment: https://git.openjdk.org/jdk/pull/27586#discussion_r2454405504
PR Review Comment: https://git.openjdk.org/jdk/pull/27586#discussion_r2454421215