Model 3 classfile design document
Brian Goetz
brian.goetz at oracle.com
Wed Feb 17 17:09:03 UTC 2016
Having discussed the classfile representation and sketched out some
plausibility arguments about how the VM can efficiently manage
specialization, let's step back and look at what this means for the
language (both Java and other languages).
Type -> Class mapping. With erased generics, all parameterizations
Foo<T> map to a single class Foo. In Model 3, the classfile
for Foo is essentially a template; we can request parameterizations of
Foo via the ParamType constant (the Class constant Class[Foo] becomes
retconned to mean ParamType[Foo, erased].)
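As a point of reference, the erased mapping described above can be observed in today's Java. This is a minimal sketch of current (erased) behavior only; the class and method names are made up for illustration, and it does not show the ParamType mechanism itself.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasedClassMapping {
    // Returns true when two different parameterizations share one runtime class.
    public static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        // With erased generics both lists are plain java.util.ArrayList.
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```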
Reflection. In the current prototype, Foo<int> and Foo<String> are
distinct classes; each will respond with distinct .getClass() results.
We don't yet have a means to express that Foo<int> and Foo<String> are
different "species" of Foo; instead each gets its own class mirror.
Reflective operations like Foo<int>.class.getName() currently yield ugly
results. Lots of open questions here.
Reification. The question on everyone's mind will be: are we "finally
getting reified generics"? And the answer is: sort of. (This question
also comes with a lot of baggage; there are a lot of people who assume
that erasure is somehow "smelly" and therefore bad, and so of course
reification must be better. But erasure is a pragmatic compromise, and
the alternative is not always better. Let's try and leave the baggage
at the door for now.)
To add to the confusion, not everyone means the same thing by "reified
generics". To some, reification means "types are checked at runtime";
to others, it may merely mean "types are reflectively available at
runtime." Even within the first category, there's a range of what sort
of type checking we might mean, since the VM type system may not be
exactly the same type system as the language-level type system -- and
for good reason. (What if we ask for a reified ArrayList<? extends
List<? extends Foo> & Serializable>? Do we get runtime subtype checking
for wildcards and intersections every time we try to put something in
this List? Would we even want that? Are we sure such checks are
decidable?)
In Model 3, specialization is clearly a form of reification; when we
specialize ArrayList to E=int, the backing store is an int[], and
therefore we get all the type checking that entails. We can clearly
layer additional support for reflectively exposing the bindings of type
parameters in a number of ways.
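To make the E=int case concrete, here is a hand-written analogy of what a list specialized to int looks like at the source level. It is only a sketch: IntListSketch is a hypothetical name, and this is not the output of the specializer or anything the VM generates.

```java
import java.util.Arrays;

public class IntListSketch {
    // The backing store is an int[], so only ints can ever be stored in it.
    private int[] elements = new int[4];
    private int size;

    public void add(int value) {
        if (size == elements.length) {
            // Grow the array when it is full.
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = value;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return elements[index];
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        IntListSketch list = new IntListSketch();
        for (int i = 1; i <= 5; i++) {
            list.add(i * 10);
        }
        System.out.println(list.get(4)); // prints "50"
    }
}
```

The element type is enforced by the representation itself, which is the sense in which specialization is a form of reification.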
The Model 3 classfile design explicitly admits both reified and erased
generics at the VM level, by allowing a concrete type descriptor *or*
the 'erased' token as a type parameter to a ParameterizedType. (Note
that 'erased' is not a type, it is merely an allowed type
parameterization -- similar to wildcards in the Java language.)
There is nothing in the classfile design that encodes the rule
"reference parameterizations are erased"; that's the choice of the
language compiler. In this way, we can consider any non-erased
parameterization to be reified; a ParamType[ArrayList, LString] will
throw ArrayStoreException at runtime if you try to cram something other
than a String into it.
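The difference between a checked (reified) store and an erased one can be seen in today's Java by comparing arrays with generic collections. The sketch below is an analogy using existing language features, not a demonstration of ParamType; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ReifiedVersusErased {
    // Arrays are reified: each store is checked against the runtime element type.
    public static boolean arrayStoreIsChecked() {
        Object[] objects = new String[1];
        try {
            objects[0] = Integer.valueOf(42);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Erased generics: the same mistake is not detected at insertion time.
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static boolean erasedStoreIsChecked() {
        List<String> strings = new ArrayList<>();
        List raw = strings;
        try {
            raw.add(Integer.valueOf(42));
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreIsChecked());  // prints "true"
        System.out.println(erasedStoreIsChecked()); // prints "false"
    }
}
```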
So, does that mean generics are reified? Sort of... For multiple
reasons (including, but not limited to, compatibility), the current plan
is for the Java language to continue to use erasure for reference
parameterizations of generics. But other languages are free to use full
reification where it suits them (and if their Java interop requirements
let them). If someone uses reflection to inspect a List<String>
and asks for its type parameter, it will come back as "erased"
(reflection has to support this answer anyway, if only for compatibility
with legacy code).
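For comparison, this is what the existing reflection API reports for an instance of a reference parameterization. The sketch uses only java.lang.reflect as it exists today; how "erased" would be surfaced under Model 3 is not specified here, and the class and method names are made up for illustration.

```java
import java.lang.reflect.TypeVariable;
import java.util.ArrayList;
import java.util.List;

public class ErasedReflection {
    // Returns the name of the first declared type variable of the object's class,
    // which is all that today's reflection can recover from an instance.
    public static String firstTypeParameterOf(Object o) {
        TypeVariable<?>[] vars = o.getClass().getTypeParameters();
        return vars.length == 0 ? "<none>" : vars[0].getName();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        // The binding E=String is not recorded on the instance.
        System.out.println(firstTypeParameterOf(strings)); // prints "E"
    }
}
```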
So the punchline is, at the Java language level, generics are erased *and*
reified; generics over references are erased (as they are today) and
generics over values are reified. I suspect people will be about as
jarred by this as they were by erasure in the first place; I expect
we'll get some degree of "You idiots, you ran 99 yards only to fumble
the ball on the 1 yard line." But looking past this (which is mostly
the above-mentioned baggage), the model seems sound enough; existing
reference generics work as they always have, and new value generics work
"better" (in that there are additional things you can do with them).
In fact, it gives us a chance to be more honest about erasure, because
"erased" can appear as a first-class member of the programming model. I
believe many of the complaints about erasure stem from the fact that it
is inevitably a surprise when you first discover it.
On 1/22/2016 11:52 AM, Brian Goetz wrote:
> Please find a document here:
>
> http://cr.openjdk.java.net/~briangoetz/valhalla/eg-attachments/model3-01.html
>
>
> that describes our current thinking for evolving the classfile format
> to clearly and efficiently represent parametric polymorphism. The
> early concepts of this approach were outlined in my talk at JVMLS last
> year; this represents a refinement of those ideas, and a reasonable
> "stake in the ground" description of what seems the most sensible way
> to balance preserving parametric information in the classfile without
> imposing excessive runtime costs for loading specializations.