"Model 2" prototype status

Fri Jul 31 19:50:32 UTC 2015

Over the past few months, we've been building what we call our "Model 2" 
prototype, and incrementally checking it in.  It now seems stable enough 
for brave early adopters to check out.  (You'll need to build the JDK 
from source.)

The previous version ("Model 1") was focused on exploring the 
practicality of specialization on the JVM architecture we have today. 
The compiler augmented classes with type metadata that otherwise would 
be erased away, which could be ignored by the JVM but consumed by the 
specializer.  Specialized classes were identified using a name mangling 
scheme (strictly an expedient for prototyping, not a long-term plan). 
The class loader recognized the name mangling scheme and, if the class 
was not found on the class path, invoked the specializer to generate 
the specialized class on the fly.
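
To make the Model 1 loading flow concrete, here is a rough sketch in plain Java. Everything in it is an assumption for illustration: the "$$" mangling marker, the class name SpecializingLoaderSketch, and the specialize() placeholder are hypothetical and are not what the prototype actually uses. It relies only on the standard ClassLoader behavior that findClass() is consulted after the parent loader has failed to find the class.

```java
// Hypothetical sketch: a loader that falls back to a specializer for mangled names.
class SpecializingLoaderSketch extends ClassLoader {
    // Assumed mangling scheme: "Box$$int" stands for "Box specialized to int".
    static final String MARKER = "$$";

    static boolean isMangled(String name) {
        return name.contains(MARKER);
    }

    static String baseName(String mangled) {
        return mangled.substring(0, mangled.indexOf(MARKER));
    }

    static String typeArg(String mangled) {
        return mangled.substring(mangled.indexOf(MARKER) + MARKER.length());
    }

    // loadClass() delegates to the parent first; we only get here when the
    // class was not found through the normal class path lookup.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        if (!isMangled(name)) {
            throw new ClassNotFoundException(name);
        }
        byte[] bytes = specialize(baseName(name), typeArg(name));
        return defineClass(name, bytes, 0, bytes.length);
    }

    // Placeholder: a real specializer would rewrite the template class's
    // bytecode using the extra type metadata. This sketch just gives up.
    byte[] specialize(String base, String arg) throws ClassNotFoundException {
        throw new ClassNotFoundException(
            "sketch has no specializer for " + base + "<" + arg + ">");
    }

    public static void main(String[] args) {
        try {
            new SpecializingLoaderSketch().loadClass("Box$$int");
        } catch (ClassNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only the control flow: ordinary lookup first, then name recognition, then on-the-fly generation.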

With these many hacks (name mangling, abuse of class loaders), the 
result was mixed.  On the good side, it worked!  It was possible to 
write specializable generic classes and run them on an 
only-lightly-hacked JVM.  On the bad side, the resulting language had a 
significant usability issue -- the lack of a nontrivial common supertype 
between Foo<int> and Foo<String>.
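
The usability problem can be approximated in today's Java with hand-written stand-ins for the specialized classes. This is only an analogy: RefBox and IntBox are hypothetical names, not generated code, but they show what it is like when two "instantiations" share nothing more useful than Object.

```java
// Hand-written stand-ins for two unrelated specializations (hypothetical names).
final class RefBox<T> {
    private final T value;
    RefBox(T value) { this.value = value; }
    T get() { return value; }
}

final class IntBox {
    private final int value;
    IntBox(int value) { this.value = value; }
    int get() { return value; }
}

class CommonSupertypeDemo {
    // Object is the only type that accepts both boxes, so code that wants
    // to handle "any box" has to fall back to instanceof tests and casts.
    static String describe(Object box) {
        if (box instanceof RefBox) {
            return "ref:" + ((RefBox<?>) box).get();
        } else if (box instanceof IntBox) {
            return "int:" + ((IntBox) box).get();
        }
        throw new IllegalArgumentException("not a box: " + box);
    }

    public static void main(String[] args) {
        Object[] boxes = { new RefBox<>("hello"), new IntBox(42) };
        for (Object box : boxes) {
            System.out.println(describe(box));
        }
    }
}
```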

Of course, we didn't pursue this approach because we thought 
half-killing wildcards was a great idea; we pursued it because it was 
what worked on the JVM we had.  So with the Model 1 prototype in hand, 
we set out to see what could be done about providing a reasonable common 
supertype for all instantiations of an any-generic type.  (We explored a 
number of possible mechanisms and approaches, including several that are 
more radical than where we landed.  Hopefully we will find time to write 
up some of these roads-not-taken.)

Type variables.  Type variables are divided into two categories: "ref" 
(ordinary) and "any".  Any tvars ("avars") are identified by the keyword 
"any" at their declaration site (for both generic classes and generic 
methods).  If a type variable is not modified by "any", it is implicitly 
an ordinary tvar, and treated just as in Java 8.

     class Foo<T> { ... }      // T is an ordinary tvar
     class Bar<any T> { ... }  // T is an avar

Class hierarchies can be any-fied from the top down.  So it's OK to do:

     class A<any T> { ... }
     class B<T> extends A<T> { ... } // T is an ordinary tvar here

but not OK to do:

     class A<T> { ... }
     class B<any T> extends A<T> { ... }

The rationale for this should be clear enough; specializing a class 
entails specializing its superclasses, and if the superclass is not 
specializable, this won't work.  (Alternatively, you can interpret "any T" 
as a union bound ("T extends Object | value"), and it's OK to use a 
narrower bound than your supertype, but not a wider one.)
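
The bound analogy already holds for ordinary bounded type variables in Java 8, which may make the rule feel less arbitrary. The sketch below uses Number and Integer as stand-ins for the wider and narrower bounds; the class names are made up for illustration.

```java
// The superclass constrains its type variable to Number.
class Base<T extends Number> {
    double twice(T value) {
        return 2 * value.doubleValue();
    }
}

// OK: Integer is a narrower bound than Number, so every T that is valid
// here is also a valid argument for Base.
class Narrow<T extends Integer> extends Base<T> { }

// Not OK: an unbounded T is wider than Number, so javac rejects this
// because T is not within the bounds Base requires.
// class Wide<T> extends Base<T> { }

class BoundDemo {
    public static void main(String[] args) {
        System.out.println(new Narrow<Integer>().twice(21)); // prints 42.0
    }
}
```

In the same way, a subclass with an ordinary T can extend a superclass declared with "any T", but not the other way around.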

Restrictions on avars.  Some operations that are allowed on ordinary 
tvars are not allowed on avars, such as assigning a T to an Object, 
assigning null to a T, etc.  These have not changed from Model 1.

Wildcards.  The big change in Model 2 is the addition of support for 
wildcards over avars.  The problem with wildcards has two facets: 
translational (how do we represent a wildcard type over avars in 
bytecode?) and interpretive (Foo<?> has always been a shorthand for 
Foo<? extends Object>; on the other hand, the "intuitive" interpretation 
of Foo<?> is "any instantiation of Foo").  The translational issues 
require some help from the JVM to solve (not yet implemented in the 
prototype).  The interpretive issues are subtle.  While we explored 
trying to automatically interpret Foo<?> according to the common user 
intuition, this ran afoul of numerous compatibility issues.

So, where we landed is: just as one must specify any-ness at the 
declaration site for a type variable, one must do the same for a 
wildcard (which is essentially declaring an anonymous type variable). 
So there are two forms of wildcard:

    Foo<ref> -- describes any reference instantiation
    Foo<any> -- describes any instantiation

and Foo<?> will be retconned to mean Foo<ref>.
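
For reference, here is the existing Java 8 meaning that Foo<?> keeps. The example uses java.util.List and an illustrative method name; it shows that List<?> and List<? extends Object> are interchangeable and that elements are read back as Object, which is why the existing wildcard only describes reference instantiations.

```java
import java.util.Arrays;
import java.util.List;

class WildcardDemo {
    static int countNonNull(List<?> items) {
        // The two wildcard spellings denote the same type, so assignment
        // works in both directions without casts or warnings.
        List<? extends Object> same = items;
        List<?> back = same;

        int count = 0;
        for (Object item : back) { // elements are only known to be Objects
            if (item != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNonNull(Arrays.asList("a", null, "c"))); // prints 2
        System.out.println(countNonNull(Arrays.asList(1, 2, 3)));        // prints 3
    }
}
```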

Raw types. Raw types have not changed at all from Java 8. As such, they 
are limited to reference instantiations.

The upshot of this is that we can achieve perfect source compatibility; 
the raw type Foo and the wildcard type Foo<?> continue to mean exactly 
what they always did.

As a proof of concept, I've checked in a limited version of Streams 
(java.anyutil.Stream) that has been ported to the new model.  (Some 
things are still not hooked up yet, but the basic functionality for 
Stream<any T> works.)

The Model 2 approach needs some (targeted) help from the VM, which is 
not yet in place.  (Specifically, protected/package methods in 
interfaces, and some help for access to private members across 
specializations of the same class.)  Until we have this, 
protected/package methods in anyfied classes will be problematic.

There are also some compiler limitations, the most significant of which 
is that inner classes don't yet work as instantiations of anyfied 
classes; you have to refactor the inner class to a static named class. 
(There's no fundamental issue here, it's just not done yet.)

Most helpfully, the latest IntelliJ has some early support for anyfied 
generics!  (It's not perfect, but it's been really helpful -- thanks 
guys.)  Select language level "X" (for experimental) and it will mostly 
deal with any-tvars and any-wildcards, instead of painting your whole 
program red.

We'd like to get feedback from people *using the prototype* to anyfy 
their own code.  (I'm sure people have a zillion "why did you / why 
didn't you" questions; we'll try to get to those in a writeup.)  I'll 
also be putting together a writeup on the translation strategy (though 
brave explorers can and surely will reverse engineer this with javap 
before this happens).