RFR: 8361608: C2: assert(opaq->outcnt() == 1 && opaq->in(1) == limit) failed [v4]

Thu Oct 16 12:48:24 UTC 2025

On Thu, 16 Oct 2025 09:28:49 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

>> Loop peeling works by cloning the loop body, which means that every use of data defined in the loop must be replaced by a phi between the original loop and the clone. This is done by `PhaseIdealLoop::fix_data_uses` and can create a maze of phis. Each user of the same original data gets its own fresh `PhiNode`; there is no logic that tries to reuse or simplify them. That's IGVN's job.
>> 
>> When we have something like
>> 
>> // any loop
>> while (...) { /* something involving limit */ }
>> // counted loop with zero trip guard
>> if (init < limit) {
>>     for (int i = init; i < limit; i++) { ... }
>> }
>> 
>> and we peel the first loop, the limits in the zero trip guard and in the counted loop condition are no longer the same node: each one is a fresh `PhiNode`.
>> 
>> But the method `PhaseIdealLoop::do_unroll` has the assert
>> 
>> https://github.com/openjdk/jdk/blob/444007fc234aeff75025831c2d1b5538c87fa8f1/src/hotspot/share/opto/loopTransform.cpp#L1929-L1930
>> 
>> requiring that both `limit`s are the same node. But as explained, this might not be the case after peeling the first loop.
>> 
>> This situation doesn't arise if IGVN runs between peeling the first loop and unrolling the second. While there is no formal invariant that this must always be true, I couldn't reproduce the situation without stress peeling: either peeling happens too early, or not at all, or something else sets major progress before unrolling, which always saves the day. I've tried to hack on an example to make the peeling decision happen "naturally" (using the normal heuristic) at just the right moment, neither too early nor too late. By that point the example was so hardcoded that it was not significantly different from a run with stress peeling.
>> 
>> But with stress peeling, this situation does seem to happen, if only rarely. What should we do?
>> 
>> By creating many `PhiNode`s, `PhaseIdealLoop::fix_data_uses` is doing exactly what we expect. We could make it a lot smarter and try to reuse previously constructed `PhiNode`s, but that would be hard because the inputs of the fresh phis are adjusted recursively, so we can't know ahead of time which phis will end up with the same inputs and could be shared. Duplicating only when inputs start to differ would also lead to too many copies, since the phis do look different at that point and only a later top-down cleanup can collapse them all.
>> 
>> We could run IGVN to clean things up after each peeling, but this was deemed undesirable because many things are expected to happen immediately after peeling.
>>...
>
> Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision:
> 
>   driver -> main

It'd be nice to have such a test, but a lot of other transformations happen successfully before that point, including some unrolling. I don't think it is possible, or at least not so directly, to write such a test with the current capabilities of the IR framework.
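
For readers following along, the mechanism in the quoted description can be illustrated with a toy model in plain Java. This is only a sketch and not HotSpot code: `Node`, `freshPhi` and `ValueNumbering` are hypothetical names invented for the illustration. `freshPhi` mimics the "one new phi per use" behaviour attributed to `PhaseIdealLoop::fix_data_uses`, and `ValueNumbering` stands in, very loosely, for the cleanup that IGVN would perform.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PhiSharingSketch {
    // Hypothetical stand-in for an IR node: an opcode name plus inputs.
    // equals/hashCode are deliberately NOT overridden, so nodes compare by identity.
    static class Node {
        final String op;
        final List<Node> inputs;

        Node(String op, Node... inputs) {
            this.op = op;
            this.inputs = List.of(inputs);
        }
    }

    // Assumption: models "each use gets a fresh phi, nothing is reused".
    static Node freshPhi(Node fromOriginalLoop, Node fromPeeledCopy) {
        return new Node("Phi", fromOriginalLoop, fromPeeledCopy);
    }

    // Toy value numbering: nodes with the same opcode and the same
    // (identical) inputs are mapped to a single canonical node.
    static class ValueNumbering {
        private final Map<List<Object>, Node> table = new HashMap<>();

        Node canonical(Node n) {
            // The key compares the opcode by value and the inputs by identity.
            return table.computeIfAbsent(List.of(n.op, n.inputs), k -> n);
        }
    }

    public static void main(String[] args) {
        Node limit = new Node("Limit");        // limit as seen in the original loop
        Node limitClone = new Node("Limit");   // its copy in the peeled iteration

        Node guardLimit = freshPhi(limit, limitClone); // used by the zero trip guard
        Node exitLimit = freshPhi(limit, limitClone);  // used by the loop exit test

        // Before cleanup: two distinct nodes, so a "same node" check fails.
        System.out.println(guardLimit == exitLimit);

        // After toy value numbering: both uses see one canonical phi.
        ValueNumbering vn = new ValueNumbering();
        System.out.println(vn.canonical(guardLimit) == vn.canonical(exitLimit));
    }
}
```

In this model the identity check between the two limits only succeeds once the value-numbering step has run, which is the same ordering dependency the assert in `PhaseIdealLoop::do_unroll` is sensitive to.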

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27586#issuecomment-3410733313