Classfile representation of specializable classes

Fri Jul 11 16:58:32 UTC 2014

We can think of a specializable class:

class Box<any T> {
     T t;

     public Box(T t) { this.t = t; }

     public T get() { return t; }
}

as being both a class (we're used to compiling the above into an erased 
class for reference instantiations of T) and a template (for specialized 
instantiations).
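
To make the two roles concrete, here is a rough sketch of what the erased form and an int-specialized form of Box would look like if written by hand. The class names ErasedBox and IntBox are made up for illustration; nothing here says what names or shapes the real mechanism would generate.

```java
// Hand-written stand-ins for the two forms derivable from Box<any T>.
// ErasedBox and IntBox are hypothetical names, not part of the proposal.

// Erased form: T is replaced by Object; shared by all reference instantiations.
class ErasedBox {
    Object t;

    ErasedBox(Object t) { this.t = t; }

    Object get() { return t; }
}

// Specialized form for T = int: the field and signatures use int directly.
class IntBox {
    int t;

    IntBox(int t) { this.t = t; }

    int get() { return t; }
}

class BoxFormsDemo {
    public static void main(String[] args) {
        ErasedBox s = new ErasedBox("hello");
        String str = (String) s.get();  // cast at the use site, as with erasure today
        IntBox i = new IntBox(42);
        int v = i.get();                // no boxing on the way in or out
        System.out.println(str + " " + v);
    }
}
```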

We have a range of options for what javac could produce for this class:

  - A template file only, from which both specialized instantiations 
(Box<int>) and erased instantiations are derived;

  - An erased class file (like we do today) plus a template file (where 
the erased version is technically unnecessary, but statically generated 
as a startup-time optimization)

  - An erased class file with a template embedded in it, accessible via 
reflection

  - An erased class file with attributes identifying how to transform it 
into a specialized class (this was the example given in the writeup.)
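
As a very simplified illustration of the last option, the sketch below models the "attributes" as a per-member flag saying that occurrences of Object in the descriptor are really the erasure of T, and specializes by rewriting descriptors. The Member class, the flag, and the per-member granularity are all assumptions made for this example; a real encoding would need to mark individual occurrences and also rewrite the bytecode itself.

```java
// Toy model of "erased classfile + specialization metadata".
// Everything here (Member, the erasedT flag) is hypothetical.
class DescriptorSpecializer {
    static final String ERASED_T = "Ljava/lang/Object;";

    // A field or method of the erased class, plus one bit of metadata:
    // does Object in this descriptor stand for the erased T?
    static final class Member {
        final String name;
        final String descriptor;
        final boolean erasedT;

        Member(String name, String descriptor, boolean erasedT) {
            this.name = name;
            this.descriptor = descriptor;
            this.erasedT = erasedT;
        }
    }

    // Rewrites the descriptor for a given primitive (e.g. "I" for int).
    // Members without the flag are left exactly as in the erased class.
    static String specialize(Member m, String primitiveDescriptor) {
        return m.erasedT
                ? m.descriptor.replace(ERASED_T, primitiveDescriptor)
                : m.descriptor;
    }

    public static void main(String[] args) {
        Member[] box = {
            new Member("t", "Ljava/lang/Object;", true),
            new Member("<init>", "(Ljava/lang/Object;)V", true),
            new Member("get", "()Ljava/lang/Object;", true),
            new Member("equals", "(Ljava/lang/Object;)Z", false),
        };
        for (Member m : box) {
            System.out.println(m.name + " " + m.descriptor + " -> " + specialize(m, "I"));
        }
    }
}
```

Ignoring the flag gives back the erased class unchanged, which is the sense in which one artifact can serve as both class and template.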

Further, the notion of "template" itself ranges over many possible 
options.  (Note that it is not a goal to produce a feature like 
Expression Trees.)

The example in the writeup has many pleasing properties: there is one 
artifact, so the class and template can't get out of sync; it encodes 
only the minimal additional metadata needed to efficiently transform an 
erased classfile into a specialized classfile; and the "template" can be 
used as an erased classfile as is, just by ignoring the specialization 
metadata attributes.

It also has an unpleasant property: the bytecode for an already loaded 
class is not easily available through its classloader or reflection, 
meaning that a runtime specialization mechanism would have to jump 
through hoops (e.g., retransform-classes) to get at the bytecode.  Since 
specializing a class under this scheme requires, as input, the bytecode 
of the original class, this could become problematic for classes that 
are themselves generated (i.e., not findable with 
ClassLoader.getResource("Foo.class")).
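
A small sketch of the problem, using only existing JDK APIs: a helper that tries to recover a class's bytes as a resource from its defining loader, applied to a class generated at runtime by java.lang.reflect.Proxy. The helper name ClassBytes.find is made up for this example, and the proxy is just a convenient stand-in for "a class that is itself generated".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.reflect.Proxy;

class ClassBytes {
    // Resource name a classfile would have, e.g. "java/lang/String.class".
    static String resourceName(Class<?> c) {
        return c.getName().replace('.', '/') + ".class";
    }

    // Returns the classfile bytes if the loader can locate them as a
    // resource, or null if no such resource exists.
    static byte[] find(Class<?> c) {
        ClassLoader cl = c.getClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader(); // bootstrap-loaded class
        }
        try (InputStream in = cl.getResourceAsStream(resourceName(c))) {
            return in == null ? null : in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A class spun at runtime: there is no classfile on disk behind it.
        Object proxy = Proxy.newProxyInstance(
                ClassLoader.getSystemClassLoader(),
                new Class<?>[] { Runnable.class },
                (p, m, a) -> null);
        Class<?> generated = proxy.getClass();
        byte[] bytes = find(generated);
        System.out.println(generated.getName() + ": "
                + (bytes == null ? "bytecode not findable" : bytes.length + " bytes"));
    }
}
```

A specializer that needs the original bytes as input has nothing to work from in the generated-class case, short of the hoops mentioned above.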

The factors being traded off here are:
  - classfile size -- the classfile+template approach would be ~2x the 
size of the classfile alone; other approaches might be as little as a 
few percent bigger
  - startup cost -- deriving the erased classfile from a template costs 
us startup time, and the "load erased classfile" use case isn't going 
away any time soon
  - specialization cost -- some template forms are cheaper than others 
to generate bytecode from
  - security -- exposing the bytecode directly may have security 
consequences

For the initial prototype, which will operate by offline classfile 
transformation, any of these will do, so for that we should do the 
simplest thing that gets us to a prototype.  But we should also continue 
to evolve the representation story.