Classes, specializations, and statics

Mon Feb 15 18:11:16 UTC 2016

Model 3 (M3) leaves us in a position to check off one of the outstanding issues, which is that of specialization-specific statics.

Members of Java classes have historically been divided into static and instance members; static members are associated with a class, and instance members with an instance of a class.  

When Java 5 extended the type system to support multiple TYPES that are represented by the same CLASS (e.g., ArrayList<String> and ArrayList<Number>), we had a choice: treat static members as belonging to the CLASS, or as belonging to the TYPE.  We chose the former, as that was consistent with the translation strategy of erasure, and also maximized compatibility with existing code.  (When .NET did reified generics, they chose the opposite; Foo<Number>.staticMember and Foo<String>.staticMember refer to different variables.  That’s also a valid choice.)

The following program would be sensitive to this distinction:

class Foo<any T> {
    static int count;

    public Foo() { ++count; }
}

new Foo<String>();
new Foo<Number>();

In Java, this program would increment the common counter twice; under the alternative interpretation, the counters Foo<String>.count and Foo<Number>.count would each be incremented once.  
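
The shared-counter behavior can be observed in today's Java with a minimal sketch.  It drops the any modifier (which current compilers do not accept) and wraps the two constructor calls in a hypothetical SharedStaticDemo class; otherwise it mirrors the program above.

```java
// Erased generics: the static field belongs to the class Foo,
// not to Foo<String> or Foo<Number>.
class Foo<T> {
    static int count;

    Foo() { ++count; }
}

// Hypothetical driver class, added only so the sketch can be run.
public class SharedStaticDemo {
    public static void main(String[] args) {
        new Foo<String>();
        new Foo<Number>();
        // Both constructors incremented the same field.
        System.out.println(Foo.count); // prints 2
    }
}
```

Under the per-TYPE interpretation there would be no single Foo.count to print; each instantiation would carry its own.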

Statics in Java work the way they do, and we’re not proposing we change that.  However, once we break the assumption that all instantiations of a parameterized type are reference instantiations, we run into some issues with existing code idioms.  What follows is a proposed generalization of static members in the spirit of Model 3.  

The Problem
-----------

Java code frequently uses tricks like the following, which exploit the assumption of erasure:

// Cached instance of an empty collection
private static final Collection<?> c = new EmptyCollection<Object>();

// Factory method that dispenses the cached empty collection, suitably cast
public static <T> Collection<T> emptyCollection() { return (Collection<T>) c; }

The above trick works because of erasure; a Collection<?> has the same representation as a Collection<String>, Collection<Integer>, etc., so we can freely cast it about with no loss of type safety.  But once we anyfy emptyCollection(), we’re now hosed; we can’t cast a Collection<?> to a Collection<int>, because a primitive instantiation no longer shares the erased representation.  This leaves us without a means of coding this common idiom, because static members currently are per-class, not per-instantiation.
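
For concreteness, here is a self-contained sketch of the erased idiom as it works today.  The body of EmptyCollection and the holder class Empties are assumptions made for illustration; only the cached field and the casting factory come from the snippet above.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical immutable empty collection (body assumed for illustration).
final class EmptyCollection<E> extends AbstractCollection<E> {
    @Override public Iterator<E> iterator() { return Collections.emptyIterator(); }
    @Override public int size() { return 0; }
}

// Hypothetical holder for the cached instance and its factory.
final class Empties {
    // One cached instance per class, whatever T the caller asks for.
    private static final Collection<?> c = new EmptyCollection<Object>();

    private Empties() {}

    // The unchecked cast is harmless only because every reference
    // instantiation of Collection<T> erases to the same representation.
    @SuppressWarnings("unchecked")
    static <T> Collection<T> emptyCollection() { return (Collection<T>) c; }
}

public class ErasedCacheDemo {
    public static void main(String[] args) {
        Collection<String> strings = Empties.emptyCollection();
        Collection<Integer> ints = Empties.emptyCollection();
        // Compare through Object; the two typed views are one and the same object.
        Object a = strings;
        Object b = ints;
        System.out.println(a == b); // prints true
    }
}
```

The single shared object is exactly what stops working once one of the instantiations is Collection<int>.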

Obviously, the above code must continue to mean what it means today.  But we’d also like a means of extending the above idiom more broadly than erased generics.  

Extending Statics to Specializations
------------------------------------

Our current model treats parameterizations of template classes like classes; anywhere in the bytecode that one can refer to a Constant_Class, one can refer to a Constant_ParameterizedType.  (Whether they are actually classes, or more like “species”, is an open question, but whatever they are, there’s a way to write their name in the classfile.)  

The existing prototype places static members of Foo<any T> on the erased species Class[Foo], and translates access to static member m of Foo<any T> as:

    xxxstatic Class[Foo].m

However, we are free to assign meaning to xxxstatic as applied to a member reference whose owner is a parameterized type.  Suppose we extend the current set of member ownerships from { instance, static } to { instance, static, specialization }.  We could then access a per-specialization member using xxxstatic on a member reference whose owner is a specialization.  

The syntactic story is mostly a bikeshed; we’ll need some token to indicate “per-specialization”; we’ll use the silly token __SpecializationStatic (__SS for short) for now.

The access story is simple: static members continue to only be able to reference other static members (and not class type variables); __SS members can access static members and other __SS members, as well as class type variables; instance members can reference static, __SS, and instance members.  

The translation / classfile story is simple.  Assume we have a spare flag bit (we can synthesize one) for ACC_SPECIALIZATION_STATIC (ACC_SS for short).  Static members are marked with ACC_STATIC; __SS members are marked with ACC_SS.  Accesses to static members continue to be translated as xxxstatic Class[Foo].m; accesses to __SS members are translated as xxxstatic ParamType[Foo,params].m.

The specialization / runtime story is simple.  Static members are treated as if they are restricted to the erased species (this is a natural choice, since Class[Foo] and ParamType[Foo, erased] describe the same class).  __SS members become static members on each parameterization.  (Both of these are one-line changes to the existing specializer prototype.)  TypeVar constants used in the signature / bodies of __SS members are specialized as usual, and just work.

Example:

class Collection<any T> {
   private __SS Collection<T> emptyCollection = …
   // ACC_SS field emptyCollection : ParamType[Collection, TypeVar[T]]

   private __SS Collection<T> emptyCollection() { return emptyCollection; }
   // ACC_SS emptyCollection()ParamType[Collection, TypeVar[T]] {
   //     getstatic ParamType[Collection, TypeVar[T]].emptyCollection : ParamType[Collection, TypeVar[T]]
   //     areturn
   // }
}

When we specialize Collection<int>, the field type, method return type, etc., will all collapse to Collection<int> by the existing mechanisms.
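
To get a feel for the difference between static and per-specialization members, the behavior can be roughly emulated in today's Java.  This is an emulation under stated assumptions, not the proposed mechanism: a Class token passed to the constructor stands in for the specialization, a map keyed by that token stands in for the __SS field, and the names TokenCounted, countFor, and PerSpecializationDemo are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Emulation only: a Class token plays the role of ParamType[Foo, T].
final class TokenCounted<T> {
    // Ordinary static: one counter for the whole class, as today.
    static final AtomicInteger count = new AtomicInteger();

    // Stand-in for an __SS field: one counter per type token.
    private static final Map<Class<?>, AtomicInteger> perToken = new ConcurrentHashMap<>();

    TokenCounted(Class<T> token) {
        count.incrementAndGet();
        perToken.computeIfAbsent(token, k -> new AtomicInteger()).incrementAndGet();
    }

    // Reads the emulated per-specialization counter; 0 if never touched.
    static int countFor(Class<?> token) {
        AtomicInteger c = perToken.get(token);
        return c == null ? 0 : c.get();
    }
}

public class PerSpecializationDemo {
    public static void main(String[] args) {
        new TokenCounted<>(String.class);
        new TokenCounted<>(Number.class);
        System.out.println(TokenCounted.count.get());            // prints 2
        System.out.println(TokenCounted.countFor(String.class)); // prints 1
        System.out.println(TokenCounted.countFor(Number.class)); // prints 1
    }
}
```

With real __SS members the map and the explicit token would disappear; the specializer would give each parameterization its own copy of the field.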