Evolving CONSTANT_Class
Brian Goetz
brian.goetz at oracle.com
Mon Jun 15 19:28:39 UTC 2020
>
> Here's a table listing all the type-flavored uses (where "X" means
> "allowed here" and "~" means "maybe not essential, but the semantics
> would be clear"):
More specifically, in the first two columns X means "allowed now", and
in the later columns, X means "proposed." Note too that the proposed
Species column is identical to the proposed Class name column.
The primitive column is interesting: we are probably going to translate
all of these away to some sort of `Qint` type when they appear in these
places, so they are probably not needed in the JVM.
> Another way to handle it is to distinguish between a *species*, which
> is a class-like entity, and a *species type*. It's helpful to remember
> that there may be inline types of species (that is, a "Q envelope" of
> a species).
I think this is a fruitful direction; I can have `ArrayList[T] extends
List[T]` where it is a class-like use, and I can have `Foo[T].x` where
it is a type-like use.
> 1) Treat everything in the class/interface table as a degenerate use
> of a type. A class name is always interpreted as an L type.
Given that a specializable class Foo<T> gives rise to species Foo[x] and
Foo[y], _and_ a class type Foo such that Foo[t] <: Foo for all t, the
duality between class and type here seems inevitable.
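To make the relationship concrete, here is a minimal sketch that models a species as a class name plus type arguments and treats the bare class type as the supertype of all of its species. The names `SpeciesModel`, `Type`, and `isSubtype` are hypothetical, and the model deliberately ignores ordinary inheritance between different classes; it only illustrates the rule Foo[t] <: Foo.

```java
import java.util.List;

// Hypothetical model, not a proposed VM data structure.
final class SpeciesModel {
    // Either the class type ("Foo", no arguments) or a species ("Foo[x]").
    static final class Type {
        final String className;
        final List<String> typeArgs; // empty for the class type

        Type(String className, List<String> typeArgs) {
            this.className = className;
            this.typeArgs = List.copyOf(typeArgs);
        }

        boolean isSpecies() {
            return !typeArgs.isEmpty();
        }

        @Override
        public String toString() {
            return isSpecies()
                    ? className + "[" + String.join(",", typeArgs) + "]"
                    : className;
        }
    }

    // Foo[t] <: Foo for every t; two species are related only if identical.
    static boolean isSubtype(Type sub, Type sup) {
        if (!sub.className.equals(sup.className)) {
            return false; // inheritance between classes is out of scope here
        }
        return !sup.isSpecies() || sub.typeArgs.equals(sup.typeArgs);
    }
}
```

Under this model, Foo[x] and Foo[y] are both subtypes of Foo but unrelated to each other, which is the class/type duality described above.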
> - When a Class constant is viewed as a type (for (1) that's always,
> for (2) that's for type-flavored references), the implicit L envelope
> is a historical wart. Do we also support explicit L descriptors? Do we
> try to migrate the world away from the implicit envelopes?
I would love to migrate away, but I suspect the cost/benefit isn't
there. Historical warts are OK.
> - Should we add primitive types? How are they spelled? (The standard
> descriptor syntax for primitives is already interpreted as a bare
> class name.)
Given the way we are thinking for translation, where there is going to
be some Q type that stands in for primitives when used in class-y
contexts (if for no other reason than the double-slot thing), I don't
think this is needed.
> - How do we handle type variables, both top-level and nested? Either
> we embed constant pool pointers in Utf8 entries (yuck!), or we need to
> extend Class constants to support references both to Utf8 entries and
> to [some new thing].
This is the stringy-vs-tree problem we've been wrestling with for a long
time. The solution to this problem seems to hinge on the solution to
that one.
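As a rough illustration of the "tree" side of that choice, the sketch below represents a descriptor as a small tree and renders it back to the stringy spelling used elsewhere in this thread ("LList[QVal;];" and "LList[$T];"). Everything here is an assumption for illustration: the class names are hypothetical, and concatenating multiple type arguments without a separator is a guess, since no syntax has been settled.

```java
import java.util.List;

// Hypothetical tree form of a descriptor; rendering recovers the string form.
final class DescriptorTree {
    interface Node {
        String render();
    }

    // A class or species under an envelope character ('L' or 'Q').
    static final class ClassNode implements Node {
        final char envelope;
        final String name;
        final List<Node> args; // empty when not specialized

        ClassNode(char envelope, String name, List<Node> args) {
            this.envelope = envelope;
            this.name = name;
            this.args = List.copyOf(args);
        }

        @Override
        public String render() {
            StringBuilder sb = new StringBuilder().append(envelope).append(name);
            if (!args.isEmpty()) {
                sb.append('[');
                for (Node arg : args) {
                    sb.append(arg.render()); // assumed: no separator between args
                }
                sb.append(']');
            }
            return sb.append(';').toString();
        }
    }

    // A type variable; in a tree encoding this could point at a live constant
    // instead of being spelled inside a Utf8 string.
    static final class TypeVarNode implements Node {
        final String name;

        TypeVarNode(String name) {
            this.name = name;
        }

        @Override
        public String render() {
            return "$" + name;
        }
    }
}
```

In the tree form, the type variable is a node that can carry a constant pool reference directly; in the string form, the same information has to be smuggled into the Utf8 text.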
> - Should we revisit "naked" descriptor references, allowing them to
> point to either bare Utf8 entries or Class constants and
> MethodType/[something else] constants? Do we try to migrate the world
> away from naked descriptor references?
I think this may well fall out of the "trees vs strings" discussion.
> I'm appealing here to a design principle that seems to have driven the
> original constant pool design: Class constants are for things that get
> resolved (and can be cached); descriptor strings are little more than
> fancy names. This principle doesn't always get followed: the verifier
> sometimes loads classes named by descriptors; array type class
> constants resolve their element types without a separate entry; more
> recently, StackMapTables use Class constants to represent types, and
> MethodTypes resolve method descriptors "as if" there were class
> constants for all of the parameter types. But I think these,
> especially the recent ones, are mistakes, and I still think the
> original notion is a useful separation of concerns that we should try
> to follow in our design.
The tension that comes up here is that we want to be able to match
descriptors between clients and declarations. I don't want to invent
one way to describe class constants for species, and another way to
embed species in descriptors.
Now, it may be possible (depending on our translation strategy) that we
don't need to embed species in descriptors, because we're just going to
erase descriptors, and put the specialization information somewhere
else, for the VM to use opportunistically. That would make the
splitting strategy more appealing.
> - For bare descriptors (type of a field), it's fine to use something
> like "LList[QVal;];". Or maybe it's useful to describe descriptors in
> terms of Class/Species constants. In any case, there's still a need to
> figure out how to parameterize a descriptor with live constants
> ("LList[$T];"), but I think this can be set aside as a separate
> problem.
This is the one I'm alluding to above.
> So I think we need CONSTANT_SpecializedMethodref, which has 1) a
> pointer to a Methodref constant, and 2) pointers to some resolvable
> constants (typically, but maybe not exclusively, representing types).
> (Caveat: there are some details about the interaction between type
> arguments, overriding, and method resolution that I'm hand-waving
> about. Maybe the encoding will be stacked a little differently.)
We've been around this merry-go-round a few times too, going back and
forth between cramming stuff into the descriptor string and putting the
method types somewhere else. Again, the translation story (can we leave
descriptors alone?) impinges on this.
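For the "put it somewhere else" option, a sketch of the shape being discussed might look like the following: the Methodref keeps its plain descriptor string, and a separate entry points at it plus a list of resolvable constants. All class names, fields, and the `isWellFormed` check are hypothetical, the pool is a zero-based Java list rather than a real constant pool, and the actual encoding may well be stacked differently.

```java
import java.util.List;

// Hypothetical constant pool model; none of these layouts are settled.
final class CpSketch {
    interface Entry {}

    // A bare string: "little more than a fancy name".
    static final class Utf8 implements Entry {
        final String value;
        Utf8(String value) { this.value = value; }
    }

    // Stand-in for any constant that gets resolved (Class, species, ...).
    static final class Resolvable implements Entry {
        final String description;
        Resolvable(String description) { this.description = description; }
    }

    static final class Methodref implements Entry {
        final int ownerIndex;    // index of the owning class constant
        final String name;
        final String descriptor; // left alone as an ordinary descriptor string
        Methodref(int ownerIndex, String name, String descriptor) {
            this.ownerIndex = ownerIndex;
            this.name = name;
            this.descriptor = descriptor;
        }
    }

    static final class SpecializedMethodref implements Entry {
        final int methodrefIndex;       // 1) pointer to a Methodref
        final List<Integer> argIndexes; // 2) pointers to resolvable constants
        SpecializedMethodref(int methodrefIndex, List<Integer> argIndexes) {
            this.methodrefIndex = methodrefIndex;
            this.argIndexes = List.copyOf(argIndexes);
        }
    }

    // Checks only the structural shape: a Methodref plus resolvable arguments.
    static boolean isWellFormed(List<Entry> pool, SpecializedMethodref ref) {
        if (!(entryAt(pool, ref.methodrefIndex) instanceof Methodref)) {
            return false;
        }
        for (int index : ref.argIndexes) {
            if (!(entryAt(pool, index) instanceof Resolvable)) {
                return false; // bare Utf8 strings are not resolvable constants
            }
        }
        return true;
    }

    private static Entry entryAt(List<Entry> pool, int index) {
        return (index >= 0 && index < pool.size()) ? pool.get(index) : null;
    }
}
```

The point of the split is visible in `Methodref.descriptor`: it stays an ordinary string that clients and declarations can match on, while the specialization information lives in the separate entry.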
Don't forget that when you have a local generic class nested in a
generic method, the method args implicitly parameterize the nested
class. Which means that when we refer to a species of the local class,
we have to supply the type arguments for both the method and for the
local class (and any other enclosing classes). Again, there is a
lump/split choice here; we can smoosh together the arguments, or provide
a trail of witnesses to the enclosing arguments. If we choose the
latter, then it might be a mix of C_SMRef and C_Species.
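For readers who have not run into this corner of the language, here is a small source-level example of the situation. It is ordinary Java as it exists today (so everything is erased); the class and method names are made up for illustration.

```java
// A generic local class inside a generic method: Pair mentions both its own
// type variable U and the enclosing method's type variable T.
final class LocalGenericDemo {
    static <T> String describe(T outer, String inner) {
        class Pair<U> {
            final T first;  // uses the method's type variable
            final U second; // uses the local class's own type variable

            Pair(T first, U second) {
                this.first = first;
                this.second = second;
            }

            @Override
            public String toString() {
                return "(" + first + ", " + second + ")";
            }
        }
        // Naming one particular Pair requires knowing both T and U.
        return new Pair<String>(outer, inner).toString();
    }
}
```

A call such as `LocalGenericDemo.describe(1, "a")` instantiates `Pair` with T = Integer (from the method) and U = String (from the class). A reference to a species of `Pair` would have to carry both, either lumped into one argument list or split into a chain that first names the specialized method and then the class within it.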