[lworld] RFR: 8367242: [lworld] C2 compilation asserts with "dead loop detected"
Marc Chevalier
mchevalier at openjdk.org
Tue Sep 30 08:06:32 UTC 2025
On Tue, 30 Sep 2025 06:19:40 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:
>> ## Act I: dead loops
>>
>> [JDK-8336003: [lworld] TestLWorld::test151 triggers "Should have been buffered" assert](https://bugs.openjdk.org/browse/JDK-8336003), allows to replace a Phi node by a single input recursively through phis. This change was only added to Valhalla. In mainline, a Phi can be simplified to its single direct input, not through other phis. The valhalla version will work on loops, while the mainline version will reduce all the phis into a single input only if the blob of Phis is acyclic.
>>
>> When a loop is dead, it is possible that the said single input is also an output of the phi. After applying the identity, the phi's unique input becomes its own output (or input). For most nodes, that is not allowed. This is probably relatively harmless as the whole code around is dead and will be removed, but it makes some verification fail. It is also possible that some other idealization are not protected against data loops without phis on the way, and would not terminate. It is interesting to notice that applying [JDK-8336003](https://bugs.openjdk.org/browse/JDK-8336003) to mainline is enough to reproduce the bug there too.
>>
>> We could detect when it's about to happen, and handle the situation differently if we are about to create a very small loop that would be caught by the dead loop detection. It would be possible to make big dead data loop. How annoying that is? Immediately, there is the non-termination problem mentioned above. But also, maybe some nodes can be optimized away by IGVN and end up with a small loop and then the assert would strike again. Is the dead loop check too weak then? It depends what we think is the purpose of this check. During IGVN, we cannot clean up eagerly every dead loop since it would be too expensive to traverse everything. Avoiding dead data loop would also need a lot of traversal. My understanding is that it's rather a sanity check, to make sure that one doesn't mess up the graph surgery and create dead loops accidentally, when something else was meant, and detecting as soon as possible is helpful. So I'm not sure it's worth strengthening the check.
>>
>> A way to avoid the creation of dead data loop is to simply limit the simplification allowed by [JDK-8336003](https://bugs.openjdk.org/browse/JDK-8336003) to constant nodes: since they don't have input, they can't make a cycle! And it seems enough for the bug initially reported.
>>
>> Yet, that makes `test151` in `compiler/valhalla/inlinetypes/TestLWorld.java...
>
> src/hotspot/share/opto/inlinetypenode.cpp line 171:
>
>> 169:
>> 170: // Merges 'this' with 'top' by updating the input PhiNodes added by 'clone_with_phis'
>> 171: InlineTypeNode* InlineTypeNode::merge_with_top(PhaseGVN* gvn, int pnum, bool transform) {
>
> I don't like the code duplication with `InlineTypeNode::merge_with` here. Would it be feasible to merge the two and have `InlineTypeNode::merge_with` handle `top` for `other`? Maybe the `PhiNode` inputs and `val2`can be initialized to `top` by default (maybe already in `clone_with_phis`) such that we only need to guard the `set_req` calls by `!other->is_top()`?
If we initialize the inputs of the `PhiNodes` above `this` to top, we don't even need to merge with anything. We can just skip the call to `merge_with`, no? It seems mostly harmless to change `clone_with_phis` this way, except for the callsite in `Parse::ensure_phi` where I'm not sure of the consequences.
-------------
PR Review Comment: https://git.openjdk.org/valhalla/pull/1640#discussion_r2390241738
More information about the valhalla-dev
mailing list