RFR: 8371716: C2: Phi node fails Value()'s verification when speculative types clash [v2]

Wed Nov 19 17:07:45 UTC 2025

On Wed, 19 Nov 2025 11:22:09 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

>> Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Fix bug number
>
> I think reading the code and the comments to understand the situation might not be as good as reading the description of this PR. I regret I gave a reproducer and proposed a solution.
> 
> Given the very obvious lack of consensus on the Valhalla PR, it is clear that this issue might evolve radically and that the proposed solution may not be the final one. Therefore, I will not do any cleanup before agreeing on the way to go, as it might very well be erased and it would be a very poor use of everybody's time.

@marc-chevalier @merykitty @rwestrel I only just realized that the conversation from https://github.com/openjdk/valhalla/pull/1717 still had relevance here. I had a very nice call with Marc, and he graciously explained things to me. I'll summarize my thoughts below.

It seems to be possible that the local speculative type of the `phi` (`_type`) and the speculative type of its inputs (`t`) do not have an intersection that is either `null` or `top`. Initially surprising: `t->filter_speculative(_type)` does not actually produce an intersection of the speculative type in this case, but removes the speculative type in `cleanup_speculative` because the type is `above_centerline`. Marc told me that Roland added this, to avoid "over speculation" because the type is now too narrow, and more likely to be wrong.

My question was if it is sane that we get this kind of inconsistent speculation: it must mean that our profiling has delivered somewhat inconsistent results. We would have to dig a bit deeper into the reproducer now, but it seems that an inlined (inner) method found one type during profiling, but the caller (outer method) found another at the phi. It would be interesting to understand a bit more what this implies. I could see these options:
- We assume both assumptions (speculations are correct). i.e. if there is a path for the type of the inner method to the place of the outer method, then it must be a subtype of what we profiled at both places. And it happens that this is only null or top, so we conclude it is impossible for something non-null to come through this path. It would be reasonable to speculate on this.
- Maybe profiling is incomplete: we see one type in the inner, and another type in the outer context. So it might be likely that both show up eventually at the phi.
- One more thought: the more speculative assumptions are "intersected", the more likely this combined speculation is to be incorrect at runtime. Conceptually, each speculation has a probability of failure, and if you apply many of them, the success probability goes lower and lower.

We could now dig deep into if Roland's `cleanup_speculative` logic that prevents "over speculation" is reasonable, or if we should extend it. But that goes out of scope for this bug fix here, in my opinion.

The question is now what we should do here. There are at least 3 options:
- The speculative type should be `null`, the natural intersection of the two speculative types. To me, that sounds like the most consistent solution. But it would require reconsidering the `cleanup_speculative` logic, a lot of effort. But maybe @merykitty has the desire to tackle such a project?
- The speculative type should be `Object` (or the union of the two speculative types). That is the same as removing the speculative type. But currently we don't mark the type as "widened", and so it is possible to later narrow it again once we filter again - so we were not able to reach a fixpoint yet, and that is a problem.
- Just pick one of the two speculative types. It's an arbitrary choice. But it's not incorrect. Marc's approach of filtering again acheives exactly that: first step removes the speculative type, the second step picks the speculative type of the incoming type. It seems to be the smallest code change, and so that may be best for a relative edge case like this.

There is a bit of a question how much of an edge case this really is. Probably quite. But we also cannot rely on our fuzzers here, as they don't really produce Object Classes at all.

Anyway: I'm the least qualified compared to @merykitty and @rwestrel , but I would vote for Marc's current proposal. Plus the required code cleanups, of course.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28331#issuecomment-3553781174