[lworld] RFR: 8367242: [lworld] C2 compilation asserts with "dead loop detected"

Marc Chevalier mchevalier at openjdk.org
Fri Sep 26 14:13:35 UTC 2025


## Act I: dead loops

[JDK-8336003: [lworld] TestLWorld::test151 triggers "Should have been buffered" assert](https://bugs.openjdk.org/browse/JDK-8336003), allows to replace a Phi node by a single input recursively through phis. This change was only added to Valhalla. In mainline, a Phi can be simplified to its single direct input, not through other phis. The valhalla version will work on loops, while the mainline version will reduce all the phis into a single input only if the blob of Phis is acyclic.

When a loop is dead, it is possible that the said single input is also an output of the phi. After applying the identity, the phi's unique input becomes its own output (or input). For most nodes, that is not allowed. This is probably relatively harmless as the whole code around is dead and will be removed, but it makes some verification fail. It is also possible that some other idealization are not protected against data loops without phis on the way, and would not terminate. It is interesting to notice that applying [JDK-8336003](https://bugs.openjdk.org/browse/JDK-8336003) to mainline is enough to reproduce the bug there too.

We could detect when it's about to happen, and handle the situation differently if we are about to create a very small loop that would be caught by the dead loop detection. It would be possible to make big dead data loop. How annoying that is? Immediately, there is the non-termination problem mentioned above. But also, maybe some nodes can be optimized away by IGVN and end up with a small loop and then the assert would strike again. Is the dead loop check too weak then? It depends what we think is the purpose of this check. During IGVN, we cannot clean up eagerly every dead loop since it would be too expensive to traverse everything. Avoiding dead data loop would also need a lot of traversal. My understanding is that it's rather a sanity check, to make sure that one doesn't mess up the graph surgery and create dead loops accidentally, when something else was meant, and detecting as soon as possible is helpful. So I'm not sure it's worth strengthening the check.

A way to avoid the creation of dead data loop is to simply limit the simplification allowed by [JDK-8336003](https://bugs.openjdk.org/browse/JDK-8336003) to constant nodes: since they don't have input, they can't make a cycle! And it seems enough for the bug initially reported.

Yet, that makes `test151` in `compiler/valhalla/inlinetypes/TestLWorld.java` fail in another way: CCP does not reach a fixpoint.

## Act II: soundness, precision and monotony

The situation is a bit more messy, and doesn't make a great screenshot, but it boils down to:

`SomeNode` type:`MyValue2:NotNull` -> [... a lot of `PhiNode`s, with loops ...] -> `InlineTypeNode`

During IGVN, all the `PhiNode`s have type `MyValue2` (maybe null), since they always have at least one input that is a `PhiNode` of this maybe null type, and so does the `InlineTypeNode`. During CCP, because of

https://github.com/openjdk/valhalla/blob/7afbf295143dae458213513c30fb52245d75d7b4/src/hotspot/share/opto/inlinetypenode.cpp#L1640-L1643

when the inputs are all top at the beginning, the returned value for the `InlineTypeNode` is `MyValue2` (maybe null), which is the bottom type of this node. Yet, later the types of all the `PhiNode`s are increased from top to `MyValue2:NotNull`, which would allow to find a better type for the `InlineTypeNode`, but it is too late! `MyValue2` cannot be increased into `MyValue2:NotNull`. Replacing all the `PhiNode`s with the `SomeNode` obviously fixes it (but it introduces the dead loop that we are trying to solve here). The lack of monotonicity of `InlineTypeNode::Value` is dooming CCP.

Since starting CCP with the wrong `_type` in `InlineTypeNode` leads to the overapproximation, a solution could be to be more precise before CPP and rather refine during IGVN. It is possible to refine in many ways, inspired by [JDK-8336003](https://bugs.openjdk.org/browse/JDK-8336003) but they all come to more and more complicated failures, all similar to the original one. This really seems not to be a way to follow.

What I propose is to allow `InlineTypeNode` to give a top value, but no other constant. Indeed, if top, everything is dead anyway and on its way of dying, it is not the same as having a concrete, existing, constant value like having a constant buffer. Yet some care need to be taken in `InlineTypeNode* PhiNode::push_inline_types_down(PhaseGVN* phase, bool can_reshape, ciInlineKlass* inline_klass)`. This function changes a `Phi(InlineType(a, b, c), InlineType(d, e, f))` into `InlineType(Phi(a, d), Phi(b, e), Phi(c, f))`. This assumes that all the input of the original `Phi` are all `InlineTypeNode`, which is not the case after IGVN if `InlineTypeNode::Value` can return top. Thus, I extend `push_inline_types_down` to also accept top inputs to the phi, changing `Phi(InlineType(a, b, c), top)` into `InlineType(Phi(a, top), Phi(b, top), Phi(c, top))`, which is then cleaned up by IGVN. This is exercised by `test103` in `compiler/valhalla/inlinetypes/TestNullableInlineTypes.java`.

It is interesting to notice that does not prevent the same situation to happen again. Before, the scenario was:
1. input to the `InlineTypeNode` has type top
2. `InlineTypeNode::Value` is keeping itself from returning top, returns `MyValue` (from `_type`)
3. input is updated to `MyValue:NotNull`
4. `InlineTypeNode::Value` could return `MyValue:NotNull` but it's already bottom, too late.

Now, at step 2, `InlineTypeNode::Value` would return top, avoiding this to happen. But the slightly different scenario is still possible:
1. input to the `InlineTypeNode` has type top
2. `InlineTypeNode::Value` returns top
3. input is updated to some non-top constant
4. `InlineTypeNode::Value` is keeping itself from returning a constant, returns `MyValue` (from `_type`)
5. input is updated to `MyValue:NotNull`
6. `InlineTypeNode::Value` could return `MyValue:NotNull` but it's already bottom, too late.

It is not clear to me how likely this whole thing is, but I don't see how to avoid it without returning constant values from `InlineTypeNode::Value` or treating IGVN and CCP differently. For instance, it would be fine to return constant values for CCP, but not IGVN. It would also be fine, during CPP, to return the next immediately higher abstract value that we are allowed to return. For instance, if we have a lattice such that:
- `top` < all the constants < `MyValue:NotNull` < `MyValue`
we could return `MyValue:NotNull` instead of a non-top constant.

## Act III: ignoring the comment's wisdom

alright, so we get `InlineTypeNode::Value` possibly return top. Then what? The ominous

https://github.com/openjdk/valhalla/blob/7afbf295143dae458213513c30fb52245d75d7b4/src/hotspot/share/opto/inlinetypenode.cpp#L1641

is not there for no reason! It makes a couple of places assert, where they are doing some unprotected `Node::as_InlineType()`. While a constant buffer would be annoying, a constant top is relatively sensible to handle there. 

## Epilogue

I haven't added a special test as `test_004` in TestDeadIrreducibleLoopsMain.java does that very well, and I don't have a better idea of how to reproduce it without just duplicating the test. This case is pretty small already, there isn't much to reduce. All the other problems are also well covered with existing tests. I see little value in just copy pasting `test151` out of `TestLWorld.java` with no significant change or simplification.

Finally, tests seems passing not worse than on `lworld`.

Thanks,
Marc

-------------

Commit messages:
 - handle top in field_value_by_offset
 - Only simplify loop of phis into a constant + allow top for InlineTypeNode::Value

Changes: https://git.openjdk.org/valhalla/pull/1640/files
  Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=1640&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8367242
  Stats: 73 lines in 5 files changed: 62 ins; 0 del; 11 mod
  Patch: https://git.openjdk.org/valhalla/pull/1640.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/1640/head:pull/1640

PR: https://git.openjdk.org/valhalla/pull/1640


More information about the valhalla-dev mailing list