RFR: 8370914: C2: Reimplement Type::join [v4]

Fri Nov 21 09:17:09 UTC 2025

On Thu, 20 Nov 2025 10:14:07 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> Hi,
>> 
>> Currently, `Type::join` is implemented using `Type::dual`. The idea seems to be that the dual of a join would be the meet of the duals of the operands. This helps us avoid the need to implement a separate join operation. The comments also discuss the symmetry of the join and the meet operations, which seems to refer to the various fundamental laws of set union and intersection.
>> 
>> However, it requires us to find a representation of a `Type` class that is symmetric, which may not always be possible without outright dividing its value set into the normal set and the dual set, and effectively implementing join and meet separately (e.g. `TypeInt` and `TypeLong`).
>> 
>> In other cases, the existence of dual types introduces additional values into the value set of a `Type` class. For example, a pointer can be a nullable pointer (`BotPTR`), a not-null pointer (`NotNull`), a not-null constant (`Constant`), a null constant (`Null`), an impossible value (`TopPTR`), and `AnyNull`? This is really hard to conceptualize even when we know that `AnyNull` is the dual of `NotNull`. It also does not really work, which leads to us sprinkling `above_centerline` checks all over the place. Additionally, the number of combinations in a meet increases quadratically with respect to the number of instances of a `Type`. This makes the already hard problem of meeting 2 complicated sets a nightmare to understand.
>> 
>> This PR reimplements `Type::join` as a separate operation and removes most of the `dual` concept from the `Type` class hierachy. There are a lot of benefits of this:
>> 
>> - It makes the operation much more intuitive, and changes to `Type` classes can be made easier. Instead of thinking about type lattices and how the representation places the `Type` objects on the lattices, it is much easier to conceptualize a join when we think a `Type` as a set of possible values.
>> - It tightens the invariants of the classes in the hierachy. Instead of having 5 possible `ptr()` value when a `TypeInstPtr` participating in a meet/join operation, there are only 3 left (`AnyNull` is non-sensical and `TopPTR` must be an `AnyPtr`). This, in turns, reduces the number of combinations in a meet/join from 25 to 9, making it much easier to reason about.
>> 
>> This PR also tries to limit the interaction between unrelated types. For example, meeting and joining of a float and an int seem to happen only when we try to do those operations on the array types of those types. Limiting these p...
>
> Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits:
> 
>  - Merge branch 'master' into typejoin
>  - Move dual to ASSERT only
>  - Keep old version for verification
>  - whitespace
>  - Reimplement Type::join

Honestly, I need proofs here.

Using normal lattice vocabulary there. As far as I understand CCP is a standard analysis that starts with bottom and join and widen until reaching an approximation of a fixpoint, hopefully, the least one. A widening is guaranteed to eventually be stationary (even if it means defaulting to top), which guarantee jumping above a fixpoint in finite time.

IGVN is the dual strategy: we start from top, and we refine using meet and narrowing, which is not a dual widening: we must stay on the same side of the fixpoint (since we started on the safe one).

I don't understand how the structure of the lattice would give any guarantee about the respective results of those. It seems to me that it entirely depends on the quality of our widening. For instance, a widening that would always map to top, and converge in one iteration, would make CCP terminates very fast, and be sound, while being provable worse (or equally bad) than any sound analysis. I'm not even convinced that the symmetry hypothesis is necessary to have the mentioned result. (It is clearly not sufficient from my previous example).

We probably need associativity, but then, if we have an abstract domain that is a lattice, that should be rather straightforward. If we start having simple posets instead, we can be very sound, but associativity might require more care...

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28051#issuecomment-3562088359
PR Comment: https://git.openjdk.org/jdk/pull/28051#issuecomment-3562094419