[code-reflection] RFR: JavaRef extends TypeElement

Maurizio Cimadamore mcimadamore at openjdk.org
Wed Apr 30 08:41:56 UTC 2025


On Tue, 29 Apr 2025 22:12:34 GMT, Paul Sandoz <psandoz at openjdk.org> wrote:

> Each implementation of `JavaRef` supports an external form, using the sigil prefix `&` to indicate a reference of some kind.
> 
> The externalized form of `TypeVarRef` is changed to be uniformly structured. This required updating the code model text of a few tests. Arguably we are taking a step back in terms of readability, but we can address that later - it was easier to update a few tests rather than preserve the existing encoding in a few places.
> 
> We still rely on the bespoke string form for java refs, which is used as the value of an externalized attribute. Ideally such attribute values are either instances of `JavaRef` or `ExternalizedTypeElement` to be transformed into the appropriate `JavaRef`. We can address that later to further separate out parsing.
> 
> I believe we have what we need to further enhance the code model builder to not rely on bespoke parsing logic of types and refs. We can either construct `ExternalizedTypeElement` tree instances explicitly or parse from a very simple grammar. If we are careful I believe we can share the results of nested type elements if reused, e.g. as in `List<Double>` and `Set<Double>`.
> 
> More generally it now means we can generate a simple s-expression-like tree for the whole code model, e.g., a string where the `(` and `)` characters represent tree structure and say `L` represents a leaf node, and a list of leaf node values in topological order.
> 
> --
> 
> Below is the type grammar, which I believe is consistent with what we have implemented.
> 
> # Type element grammar
> 
> ## General structure
> 
> 
> identifier
>   string
> 
> name
>   identifier
> 
> sigil
>   # | . | & | + | - | [
> 
> type
>   sigil
>   identifier
>   sigil identifier
>   identifier < types >
>   sigil identifier < types >
> 
> types
>    type
>    type , types
> 
> 
> ## Core types
> 
> 
> varType
>   "var" < type >
> 
> tupleType
>   "tuple"
>   "tuple" < types >
> 
> funcType
>   "func" < types >
> 
> 
> ## Java types
> 
> 
> javaType
>   primitiveType | classType | wildcardType | arrayType | typeVariableType
> 
> javaType-no-wildCardType
>   primitiveType | classType | arrayType | typeVariableType
> 
> primitiveType
>   boolean | byte | ... | void
> 
> classType
>   name
>   name < paramTypes >
>   . < enclosingType , innerType >
> paramTypes
>   javaType
>   javaType , paramTypes
> enclosingType
>   classType
> innerType
>   classType
> 
> wildcardType
>   + < wildcardTypeBound >
>   - < wildcardTypeBound >
> wildcardTypeBound
>   javaType-no-wildCardType
> 
> arrayType
>   dims < javaType >
> dims
>  [
>  [ dims
> 
> typeVariableType
>   "#" name < typeVariableTypeOwner , typeVariableTypeBound >
> typeVa...

Re s-expressions, it struck me that an alternative, more verbose representation would drop the sigils altogether and use function application instead, where the function name determines the type being constructed. E.g. instead of:

`Map<String, Integer>`

we could do:

`class<Map, String, Integer>`

Instead of:

`#T<List, Object>`

we could do:

`tvar<T, List, Object>`

There's not much between the two. But if we're after uniformity, the latter feels more uniform, as it treats the "name" as a true type factory, and gives all types a more uniform shape:


type
  identifier
  identifier < types >


I think we started off where we are now because we were attracted by the similarity with generic types. But it's becoming increasingly clear that we will need to display these types in a more human-readable fashion anyway -- at which point, do we still care if the grammar of a (class) type element is different from that of a generic Java type? (In fact, I'd consider replacing `<>` with `()`).
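
As a rough sketch of what "display in a more human-readable fashion" could mean once the external form is uniform, the following renders factory-style trees back into Java-like text. Everything here is an assumption for illustration: `ReadableTypes`, `Node` and `display` are hypothetical names, only the `class` and `tvar` factories from the examples above are handled, and showing a type variable by its name alone is just one possible choice.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only: ReadableTypes, Node and display are hypothetical names,
// not Babylon API. Only the "class" and "tvar" factories are given special treatment.
final class ReadableTypes {

    /** Uniform tree node: a factory name (or leaf identifier) plus its arguments. */
    record Node(String name, List<Node> args) {
        static Node of(String name, Node... args) {
            return new Node(name, List.of(args));
        }
    }

    /** Renders a uniform tree as Java-like text, e.g. class<Map, String, Integer> as Map<String, Integer>. */
    static String display(Node n) {
        switch (n.name()) {
            case "class": {
                if (n.args().isEmpty()) {
                    throw new IllegalArgumentException("class needs at least a name argument");
                }
                String head = display(n.args().get(0));
                if (n.args().size() == 1) return head;
                return head + n.args().stream()
                        .skip(1)
                        .map(ReadableTypes::display)
                        .collect(Collectors.joining(", ", "<", ">"));
            }
            case "tvar": {
                if (n.args().isEmpty()) {
                    throw new IllegalArgumentException("tvar needs at least a name argument");
                }
                // Assumption: readers only want the variable name; owner and bound are dropped.
                return display(n.args().get(0));
            }
            default: {
                // Leaves print as-is; unknown factories keep the uniform shape.
                if (n.args().isEmpty()) return n.name();
                return n.name() + n.args().stream()
                        .map(ReadableTypes::display)
                        .collect(Collectors.joining(", ", "<", ">"));
            }
        }
    }
}
```

The point of the sketch is that readability becomes a concern of the printer, so the external grammar is free to stay uniform.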

-------------

PR Comment: https://git.openjdk.org/babylon/pull/416#issuecomment-2841240483

