[lworld] RFR: 8367245: [lworld] C2 compilation fails with "Missed optimization opportunity in PhaseIterGVN"

Marc Chevalier mchevalier at openjdk.org
Tue Nov 4 14:26:04 UTC 2025


# Analysis
## Obervationally
### IGVN
During IGVN, in `PhiNode::Value`, a `PhiNode` has 2 inputs. Their types are:

in(1): java/lang/Object * (speculative=compiler/valhalla/inlinetypes/MyValue2 (compiler/valhalla/inlinetypes/MyInterface):exact * (inline_depth=4))
in(2): java/lang/Object * (speculative=null)

We compute the join (HS' meet):
https://github.com/openjdk/valhalla/blob/412ec882767d3ee1792d1e0f98da54ff800c60ce/src/hotspot/share/opto/cfgnode.cpp#L1310-L1317

t=java/lang/Object * (speculative=compiler/valhalla/inlinetypes/MyValue2 (compiler/valhalla/inlinetypes/MyInterface):exact *)

But the current `_type` (of the `PhiNode` as a `TypeNode`) is

_type=java/lang/Object * (speculative=compiler/valhalla/inlinetypes/MyValue3 (compiler/valhalla/inlinetypes/MyInterface):exact *)

We filter `t` by `_type`
https://github.com/openjdk/valhalla/blob/412ec882767d3ee1792d1e0f98da54ff800c60ce/src/hotspot/share/opto/cfgnode.cpp#L1332
and we get

ft=java/lang/Object *

which is what we return. After the end of `Value`, the returned becomes the new `PhiNode`'s `_type`.
https://github.com/openjdk/valhalla/blob/412ec882767d3ee1792d1e0f98da54ff800c60ce/src/hotspot/share/opto/phaseX.cpp#L2150-L2164
and
https://github.com/openjdk/valhalla/blob/412ec882767d3ee1792d1e0f98da54ff800c60ce/src/hotspot/share/opto/node.cpp#L1127-L1133


### Verification
On verification, `in(1)`, `in(2)` have the same value, so does `t`. But this time

_type=java/lang/Object *

and so after filtering `t` by (new) `_type` and we get

ft=java/lang/Object * (speculative=compiler/valhalla/inlinetypes/MyValue2 (compiler/valhalla/inlinetypes/MyInterface):exact *)

which is retuned. Verification gets angry because the new `ft` is not the same as the previous one.

## But why?!
### Details on type computation
In short, we are doing

t = typeof(in(1)) / typeof(in(2))
ft  = t /\ _type (* IGVN *)
ft' = t /\ ft    (* Verification *)

and observing that `ft != ft'`. It seems our lattice doesn't ensure `(a /\ b) /\ b = a /\ b` which is problematic for this kind of verfication that will just "try again and see if something change".

To me, the surprising fact was that the intersection

java/lang/Object * (speculative=compiler/valhalla/inlinetypes/MyValue2 (compiler/valhalla/inlinetypes/MyInterface):exact *)
/\
_type=java/lang/Object * (speculative=compiler/valhalla/inlinetypes/MyValue3 (compiler/valhalla/inlinetypes/MyInterface):exact *)
~>
java/lang/Object *

What happened to the speculative type? Both `MyValue2` and `MyValue3` are inheriting `MyAbstract` (and implementing `MyInterface`). So the code correctly find that the intersection of these speculative type is

compiler/valhalla/inlinetypes/MyAbstract (compiler/valhalla/inlinetypes/MyInterface):AnyNull * (flat in array),iid=top

The interesting part is that it's `AnyNull`: indeed, what else is a `MyValue2` and `MyValue3` at the same time? And then, `above_centerline` decides it's not useful enough (too precise, too clone from HS' top/normal bottom) and remove the speculative type.
https://github.com/openjdk/valhalla/blob/412ec882767d3ee1792d1e0f98da54ff800c60ce/src/hotspot/share/opto/type.cpp#L2886-L2888
But on the verification run, `compiler/valhalla/inlinetypes/MyValue2 (compiler/valhalla/inlinetypes/MyInterface):exact *` is intersected with the speculative type of `java/lang/Object *`, which is unknown (HS' bottom/normal top), so we are simply getting `MyValue2`. If we did not discard `AnyNull` using `above_centerline`, we would have the intersection of `MyValue2` and `AnyNull`, giving `AnyNull`, which is indeed stable.

## Ok, but the types are weird?

Indeed, they are! How can we get a speculative type `MyValue3` on the `PhiNode` when inputs are both `Object`, and one is speculated to be a `MyValue2`? This comes from incremental inlining. It seems that we have some profiling information on the returned type of a callee, that happens to be `MyValue3`, which propagate to the `PhiNode`. Later, the callee is inlined, and we get new type information (`MyValue2`) from its body (from the returned type of a callee of our callee, if I remember well), that reaches the input of our `PhiNode`.

# Reproducing

## In Valhalla

This crash is quite rare because:
1. it needs a specific speculative type setup, which depends heavily on timing
2. if `PhiNode::Value` is called a second time, it will stabilize the `_type` field before verification.

To limitate the influence of 2., I've tested with an additional assert that would immediately do

const Type* ft_ = t->filter_speculative(ft);

in `PhiNode::Value` and compare `ft` and `ft_`. Indeed, we are never sure a run of `Value` is not the last one: it should always be legal to stop anywhere (even if in a particular case), it was going to run further.

With this extra check, the crash a bit more common, but still pretty rare. Tests that have been witness to crash then at least once:
- `compiler/valhalla/inlinetypes/TestCallingConvention.java`
- `compiler/valhalla/inlinetypes/TestIntrinsics.java`
- `compiler/valhalla/inlinetypes/TestArrays.java`
- `compiler/valhalla/inlinetypes/TestBasicFunctionality.java`

All in `compiler/valhalla/inlinetypes` while I was also testing with mainline tests. Suspicious, uh.

## In mainline

With the aforementioned extra check, I've tried to see if it could happen on mainline since the involved code seems not to be valhalla-specific. As we could expect given that only valhalla tests have been seen to crash, no such crash at all.

## Crafting an example

I've tried to craft an example that would create a similar situation, but without luck. I never managed to reach a correct setup of incremental inlining, conflicting type profiling...

# Fixing

I think changing the type system would be quite risky: it is all over the place. Also, fixing would require not to drop the speculative type when `above_centerline`. This is not like something missing or a corner case that one didn't think of, it's rather removing a feature that is explicitly here, so probably on purpose.

As a first approach, one could simply run `filter_speculative` twice, that should be enough as the second filter will simply select the non empty speculative type if there is only one, and this one won't be `above_centerline`, or it would not exist as a speculative type already.

To try to be a bit less aggressive, we can rather do that in case where we know it cannot be useful. If `ft` obtained from `filter_speculative` has no speculative type, but `t` has, we can know that it might be because it has been dropped, and computing `t->filter_speculative(ft)` could pick the speculative type of `t`. The speculative type can still be removed if the non-speculative type of `ft` is exact and non null for instance, but we've still reached a fixpoint, so it's correct, but a little bit too much work.

That being said, I'm not claiming it's a great solution: it seems that many parts do their job as expected, but the result is unfortunate. Ignoring speculative types in verification (or maybe of `PhiNode`s only) would also work. Anyway, the problem is an unprecise type, but it is still sound: working with it shouldn't be an issue.

I was told that maybe @rwestrel would have an opinion, or an idea to do differently.

Thanks,
Marc

-------------

Commit messages:
 - fix assert message
 - Assert after each filter
 - without the assert
 - With instrumentation
 - Tentative fix

Changes: https://git.openjdk.org/valhalla/pull/1717/files
  Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=1717&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8367245
  Stats: 76 lines in 1 file changed: 76 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/valhalla/pull/1717.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/1717/head:pull/1717

PR: https://git.openjdk.org/valhalla/pull/1717


More information about the valhalla-dev mailing list