RFR: 8370914: C2: Reimplement Type::join

Thu Oct 30 11:00:52 UTC 2025

On Thu, 30 Oct 2025 10:23:00 GMT, Manuel Hässig <mhaessig at openjdk.org> wrote:

>> Hi,
>> 
>> Currently, `Type::join` is implemented using `Type::dual`. The idea seems to be that the dual of a join would be the meet of the duals of the operands. This helps us avoid the need to implement a separate join operation. The comments also discuss the symmetry of the join and the meet operations, which seems to refer to the various fundamental laws of set union and intersection.
>> 
>> However, it requires us to find a representation of a `Type` class that is symmetric, which may not always be possible without outright dividing its value set into the normal set and the dual set, and effectively implementing join and meet separately (e.g. `TypeInt` and `TypeLong`).
>> 
>> In other cases, the existence of dual types introduces additional values into the value set of a `Type` class. For example, a pointer can be a nullable pointer (`BotPTR`), a not-null pointer (`NotNull`), a not-null constant (`Constant`), a null constant (`Null`), an impossible value (`TopPTR`), and `AnyNull`? This is really hard to conceptualize even when we know that `AnyNull` is the dual of `NotNull`. It also does not really work, which leads to us sprinkling `above_centerline` checks all over the place. Additionally, the number of combinations in a meet increases quadratically with respect to the number of instances of a `Type`. This makes the already hard problem of meeting 2 complicated sets a nightmare to understand.
>> 
>> This PR reimplements `Type::join` as a separate operation and removes most of the `dual` concept from the `Type` class hierachy. There are a lot of benefits of this:
>> 
>> - It makes the operation much more intuitive, and changes to `Type` classes can be made easier. Instead of thinking about type lattices and how the representation places the `Type` objects on the lattices, it is much easier to conceptualize a join when we think a `Type` as a set of possible values.
>> - It tightens the invariants of the classes in the hierachy. Instead of having 5 possible `ptr()` value when a `TypeInstPtr` participating in a meet/join operation, there are only 3 left (`AnyNull` is non-sensical and `TopPTR` must be an `AnyPtr`). This, in turns, reduces the number of combinations in a meet/join from 25 to 9, making it much easier to reason about.
>> 
>> This PR also tries to limit the interaction between unrelated types. For example, meeting and joining of a float and an int seem to happen only when we try to do those operations on the array types of those types. Limiting these p...
>
> Impressive work @merykitty!
> 
> It would be good if you could explain how your new lattice definition compares to our current definition. Especially, in regard to the completeness of join, which we currently get for free from symmetry and a complete meet, and associativity and distributivity and how we have to implement the relation in Value() to keep those properties.
> 
> While I see that your definition is potentially easier to reason about, I think that you first need to convince everyone that your definition has no significant drawbacks to what we currently have simply because this change is so fundamental.

@mhaessig Thanks for taking a look.

> It would be good if you could explain how your new lattice definition compares to our current definition.

The thing is that it is easier and more intuitive to think of a `Type` object as a set of possible values. For example, a `TypeInt` represents a set of `int` values, and a node (e.g. an `AddI`) at runtime must take values that are elements of the `TypeInt` of that node. Thinking of a `Type` as a lattice point is both more unintuitive and easier to misstep.

> Especially, in regard to the completeness of join, which we currently get for free from symmetry and a complete meet

It is not free, though. It requires us to prove that the type representation is symmetric and implement `xmeet` in a way that satisfies such symmetry. If the symmetry is not trivial, such as in the case of `TypeInstPtr`, it results in a meet that is harder to understand than implementing meet and join separately. It is because instead of reasoning about sets and their elements, we have to keep track of the core symmetry, and how our lattice points work with such symmetry, definitely not free. The core issue is that the existence of dual types doubles the set of values that can participate in a set. For example, in `TypeInstPtr`, instead of implementing a meet and a join, each with 9 possible combinations, all of which are intuitive (because they represent real sets), we have to implement a meet that takes into consideration 25 input combinations.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28051#issuecomment-3467357470