"Model 2" prototype status
Brian Goetz
brian.goetz at oracle.com
Fri Jul 31 19:50:32 UTC 2015
Over the past few months, we've been building what we call our "Model 2"
prototype, and incrementally checking it in. It now seems stable enough
for brave early adopters to check out. (You'll need to build the JDK
from source.)
The previous version ("Model 1") was focused on exploring the
practicality of specialization on the JVM architecture we have today.
The compiler augmented classes with type metadata that otherwise would
be erased away, which could be ignored by the JVM but consumed by the
specializer. Specialized classes were identified using a name mangling
scheme (strictly an expedient for prototyping, not a long-term plan).
The class loader recognized the mangled names and, if a class was not
found on the class path, invoked the specializer to generate the
specialized class on the fly.
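To make the Model 1 loading path concrete, here is a minimal sketch in
plain Java of a class loader that recognizes a mangled name and hands it
to a specializer. The "$$" separator, the MangledName and
SpecializingLoader classes, and the Specializer interface are all
hypothetical, invented for illustration; the prototype's actual mangling
scheme and specializer API are not described in this message.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical mangled name: "<template>$$<arg>_<arg>...", e.g. "Box$$int". */
final class MangledName {
    static final String SEPARATOR = "$$"; // assumption, not the real scheme

    final String template;
    final List<String> typeArgs;

    private MangledName(String template, List<String> typeArgs) {
        this.template = template;
        this.typeArgs = typeArgs;
    }

    /** Returns null when the name does not follow the mangling scheme. */
    static MangledName parse(String className) {
        int i = className.indexOf(SEPARATOR);
        if (i <= 0 || i + SEPARATOR.length() >= className.length()) {
            return null;
        }
        String template = className.substring(0, i);
        String args = className.substring(i + SEPARATOR.length());
        return new MangledName(template, Arrays.asList(args.split("_")));
    }
}

/** Hypothetical specializer: produces class-file bytes for one instantiation. */
interface Specializer {
    byte[] specialize(String template, List<String> typeArgs);
}

final class SpecializingLoader extends ClassLoader {
    private final Specializer specializer;

    SpecializingLoader(ClassLoader parent, Specializer specializer) {
        super(parent);
        this.specializer = specializer;
    }

    // loadClass() asks the parent (the class path) first; findClass() is
    // only reached when that lookup fails.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        MangledName mangled = MangledName.parse(name);
        if (mangled == null) {
            throw new ClassNotFoundException(name);
        }
        byte[] bytes = specializer.specialize(mangled.template, mangled.typeArgs);
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

The sketch leaves the specializer itself abstract; only the control flow
(look on the class path first, specialize on a miss) is taken from the
description above.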
With these many hacks (name mangling, abuse of class loaders), the
result was mixed. On the good side, it worked! It was possible to
write specializable generic classes and run them on an
only-lightly-hacked JVM. On the bad side, the resulting language had a
significant usability issue: the lack of a nontrivial common supertype
between Foo<int> and Foo<String>.
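The supertype problem can be approximated in today's Java with
hand-written specializations. In this sketch, RefBox<T> and IntBox are
hypothetical stand-ins for Foo<String> and Foo<int>; the only type that
describes both is Object, so code that wants to accept either one has
to fall back on instanceof checks.

```java
/** Hypothetical stand-in for a reference instantiation such as Foo<String>. */
final class RefBox<T> {
    private final T value;
    RefBox(T value) { this.value = value; }
    T get() { return value; }
}

/** Hypothetical stand-in for a primitive instantiation such as Foo<int>. */
final class IntBox {
    private final int value;
    IntBox(int value) { this.value = value; }
    int get() { return value; }
}

final class CommonSupertypeDemo {
    /** Walks up from a until it reaches a class that is also a supertype of b. */
    static Class<?> commonSuperclass(Class<?> a, Class<?> b) {
        Class<?> c = a;
        while (!c.isAssignableFrom(b)) {
            c = c.getSuperclass();
        }
        return c;
    }

    /** With only Object in common, callers must dispatch by hand. */
    static String describe(Object box) {
        if (box instanceof IntBox) {
            return "int:" + ((IntBox) box).get();
        }
        if (box instanceof RefBox) {
            return "ref:" + ((RefBox<?>) box).get();
        }
        throw new IllegalArgumentException("not a box: " + box);
    }
}
```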
Of course, we didn't pursue this approach because we thought
half-killing wildcards was a great idea; we pursued it because it was
what worked on the JVM we had. So with the Model 1 prototype in hand,
we set out to see what could be done about providing a reasonable common
supertype for all instantiations of an any-generic type. (We explored a
number of possible mechanisms and approaches, including several that are
more radical than where we landed. Hopefully we will find time to write
up some of these roads-not-taken.)
Type variables. Type variables are divided into two categories: "ref"
(ordinary) and "any". Any tvars ("avars") are identified by the keyword
"any" at their declaration site (for both generic classes and generic
methods). If a type variable is not modified by "any", it is implicitly
an ordinary tvar, and treated just as in Java 8.
    class Foo<T> { ... }       // T is an ordinary tvar
    class Bar<any T> { ... }   // T is an avar
Class hierarchies can be any-fied from the top down. So it's OK to do:
    class A<any T> { ... }
    class B<T> extends A<T> { ... }   // T is an ordinary tvar here
but not OK to do:
    class A<T> { ... }
    class B<any T> extends A<T> { ... }
The rationale for this should be clear enough; specializing a class
entails specializing its superclasses, and if the superclass is not
specializable, this won't work. (Alternately, you can interpret "any T"
as a union bound ("T extends Object | value"), and it's OK to use a
narrower bound than your supertype, but not a wider one.)
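The bound-based reading has a direct analogue in Java 8, where a
subclass may declare a narrower bound than its superclass but not a
wider one. The class names below are hypothetical, and the sketch uses
ordinary bounds (Number vs. Object) in place of the "any" bound, which a
standard javac does not accept.

```java
/** Implicit bound: T extends Object. */
class Top<T> {
    T value;
}

// OK: Number is narrower than Object, so every legal T here is also a
// legal type argument for Top<T>.
class Narrow<T extends Number> extends Top<T> {
    double asDouble() {
        return value.doubleValue();
    }
}

class NumTop<T extends Number> {
}

// Not OK (rejected by javac): T could be String, which NumTop's bound
// does not allow.
// class Wide<T> extends NumTop<T> { }
```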
Restrictions on avars. Some operations that are allowed on ordinary
tvars are not allowed on avars, such as assigning a T to an Object,
assigning null to a T, etc. These have not changed from Model 1.
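One way to see why these operations are suspect for avars is to look at
how primitives behave in Java 8: a primitive has no null value, and it
only becomes an Object through boxing. The sketch below is plain Java 8
with hypothetical names; it illustrates the underlying mismatch and is
not prototype code.

```java
final class AvarRestrictionsDemo {
    /** Legal for an ordinary tvar: every reference type admits null. */
    static <T> T nothing() {
        return null;
    }

    /** Returns true if unboxing a null Integer into an int fails. */
    static boolean nullHasNoIntValue() {
        Integer boxed = nothing();
        try {
            int i = boxed;   // unboxing null: there is no int to produce
            return false;    // not reached
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** Assigning an int to Object silently converts it to a boxed Integer. */
    static Class<?> runtimeClassOf(int i) {
        Object o = i;
        return o.getClass();
    }
}
```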
Wildcards. The big change in Model 2 is the addition of support for
wildcards over avars. The problem with wildcards has two facets:
translational (how do we represent a wildcard type over avars in
bytecode?) and interpretive (Foo<?> has always been shorthand for
Foo<? extends Object>; on the other hand, the "intuitive" interpretation
of Foo<?> is "any instantiation of Foo"). The translational issues
require some help from the JVM to solve (not yet implemented in the
prototype). The interpretive issues are subtle. While we explored
trying to automatically interpret Foo<?> according to the common user
intuition, this ran afoul of numerous compatibility issues.
So, where we landed is: just as one must specify any-ness at the
declaration site for a type variable, one must do the same for a
wildcard (which is essentially declaring an anonymous type variable).
So there are two forms of wildcard:
    Foo<ref> -- describes any reference instantiation
    Foo<any> -- describes any instantiation
and Foo<?> will be retconned to mean Foo<ref>.
Raw types. Raw types have not changed at all from Java 8. As such, they
are limited to reference instantiations.
The upshot of this is we can achieve perfect source compatibility; the
raw type Foo and the wildcard type Foo<?> continue to mean exactly what
they always did.
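As a reminder of what is being preserved, this is ordinary Java 8 code
using an unbounded wildcard and a raw type. Under the scheme described
above it would keep its current meaning, covering reference
instantiations only. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SourceCompatDemo {
    /** List<?> is shorthand for List<? extends Object>: elements read as Object. */
    static int countNonNull(List<?> list) {
        int n = 0;
        for (Object o : list) {
            if (o != null) {
                n++;
            }
        }
        return n;
    }

    /** The two spellings are interchangeable today, so this needs no cast. */
    static List<? extends Object> widen(List<?> list) {
        return list;
    }

    /** Raw types: legal, unchecked, and limited to reference elements. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List rawCopy(List source) {
        List copy = new ArrayList();
        copy.addAll(source);
        return copy;
    }
}
```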
As a proof of concept, I've checked in a limited version of Streams
(java.anyutil.Stream) that has been ported to the new model. (Some
things are still not hooked up yet, but the basic functionality for
Stream<any T> works.)
The Model 2 approach needs some (targeted) help from the VM, which is
not yet in place. (Specifically, protected/package methods in
interfaces, and some help for access to private members across
specializations of the same class.) Until we have this,
protected/package methods in anyfied classes will be problematic.
There are also some compiler limitations, the most significant of which
is that inner classes don't yet work as instantiations of anyfied
classes; you have to refactor the inner class to a static named class.
(There's no fundamental issue here, it's just not done yet.)
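The workaround amounts to the standard inner-to-static-nested
refactoring, with the enclosing instance passed explicitly. Here it is
in plain Java 8, with hypothetical names and without the "any" modifier
(which a standard javac does not accept); in the prototype the type
variables involved would be avars.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class Bag<T> implements Iterable<T> {
    private final List<T> items = new ArrayList<>();

    void add(T item) {
        items.add(item);
    }

    // Before: an inner class with an implicit reference to Bag.this.
    //   class Cursor implements Iterator<T> { ... items.get(index++) ... }

    // After: a static named class that receives the outer instance.
    static final class Cursor<T> implements Iterator<T> {
        private final Bag<T> bag;
        private int index;

        Cursor(Bag<T> bag) {
            this.bag = bag;
        }

        @Override
        public boolean hasNext() {
            return index < bag.items.size();
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return bag.items.get(index++);
        }
    }

    @Override
    public Iterator<T> iterator() {
        return new Cursor<>(this);
    }
}
```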
Most helpfully, the latest IntelliJ has some early support for anyfied
generics! (It's not perfect, but it's been really helpful -- thanks
guys.) Select language level "X" (for experimental) and it will mostly
deal with any-tvars and any-wildcards, instead of painting your whole
program red.
We'd like to get feedback from people *using the prototype* to anyfy
their own code. (I'm sure people have a zillion "why did you / why
didn't you" questions; we'll try to get to those in a writeup.) I'll
also be putting together a writeup on the translation strategy (though
brave explorers can and surely will reverse engineer this with javap
before this happens.)