[code-reflection] RFR: Add support for more types [v3]

Fri Apr 26 10:00:54 UTC 2024

On Thu, 25 Apr 2024 10:56:06 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> This PR adds support for type-variables and wildcard type arguments in the code model `JavaType`'s hierarchy.
>> 
>> This allows the code model to reflect the source types much more accurately, as we no longer need to erase the source type at the first sign of a non-denotable type. Instead, we can use the a modified (see below) version of the `Types::upwards` function (type projection) to compute the closest **denotable** upper bound to the type found in the source code. This means that the type associated with every op in the model is a (denotable) supertype of the type in the javac AST. The fact that such type is denotable has three important consequences:
>> 
>> * the type can be expressed in the source code (in case the code model needs to be lifted back into Java source)
>> * the type must be expressible in the syntax of bytecode signature attributes (this is important e.g. for the local variable type attribute)
>> * the type can be resolved to its runtime counterpart in `j.l.r.Type` (not implemented in this PR), as explained below
>> 
>> Some parser changes were required to support this, so that we can serialize and deserialize the new types accordingly.
>> 
>> A new method has been added to `JavaType`, namely `JavaType::erasure`, which computes the erasure of a `JavaType`. This might come in handy when lowering the model into bytecode. Since supporting erasure is crucial, modelling of types has been carefully chosen as to facilitate that operation as much as possible: that is why, for example, `TypeVariableRef` contains the "principal" type-variable bound (so that we can define erasure for type-variables in a straightforward fashion, as the erasure of the primary bound).
>> 
>> #### Denotable projections
>> 
>> The code model type associated with an op result is computed by applying a modified version of `Types::upwards` - that is, the function that implements type projections as specified in [JLS 4.10.5](https://docs.oracle.com/javase/specs/jls/se22/html/jls-4.html#jls-4.10.5). The original projection algorithm is designed to leave intersection types in place - while this is handy, as it maximizes the applicability of the type inferred for local variables declared with `var`, for the code model use this is not suitable, as we'd like to get to a denotable type in the end (jshell has a similar problem, which was addressed in a more ad-hoc way).
>> 
>> It is generally possible to project an intersection type using only one of its bounds, e.g.
>> 
>> 
>> List<A & B>
>> ...
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove whitespaces

I gave this a try, but I don't think we should pursue this, at least not as part of this patch. Here's some code I put together:

https://github.com/mcimadamore/babylon/compare/projections...mcimadamore:babylon:java_type_argument?expand=1

I think I got the parser working, but then we're greeted with a death-by-thousands cuts situation where most classes use `JavaType` to mean "type argument" (and they use that to call the `JavaType.type` factory for parameterized types). If we tweak that factory to take `JavaType.Argument` instead (as I did in that branch), then several calls to the factory start failing, and we need to add casts instead. The situation is not helped by the fact that the `JavaType` factories are not always sharp (e.g. the factory `type` returns just `JavaType`. 

This is aggravated by the fact that there's no type to say "a JavaType that is also a type argument". As a result, `JavaType` casts too wide a net (because of primitive types), but `TypeArgument` is too sharp, as it contains stuff (wildcards) that are not `JavaType`.

Overall it wasn't clear to me that doing this refactoring would be beneficial, especially as part of a PR that is already relatively big - given that the refactoring doesn't seem the "slam dunk" we were looking for.

> > Yes, `TypeDefinition` is identifier plus `List<TypeDefinition>`. So we have some flexibility in there. I was assuming we wanted 1-1 relationship between TypeDefinition and JavaTypes, but that doesn't need to be the case.
> 
> Right, this enables us to serialize and parse non-Java-based code models with some non-Java-like type descriptions.

-------------

PR Comment: https://git.openjdk.org/babylon/pull/51#issuecomment-2079047507