Conditional members

Tue Mar 29 19:52:58 UTC 2016

Yet another in a series of disconnected, bottom-up (starting at the VM) 
memos laying the groundwork for the enhanced generics model.

Basic Problem
=============

It may be desirable, for purposes of expressiveness or migration 
compatibility, to declare class members that are only members of a 
specific subset of parameterizations of a generic class.  Examples include:

  - Reference-specific API assumptions.  In our analysis of the 
Collection classes, we identified various methods that fail to make the 
jump to any-generics for various reasons.  These include methods like 
Collection.toArray(), whose signature makes no sense for primitive 
parameterizations, or Map.get(), which uses `null` (not in the domain of 
primitives) to indicate "not present."  We can't take these methods away 
from reference instantiations, but we don't want to propagate them into 
primitive instantiations.

  - Better implementations enabled by known type parameters.  Generic 
classes will provide generic implementations, but sometimes better 
implementations are possible when concrete types are known.  In this 
case, a class would provide a generic implementation and zero or more 
implementations that are restricted to more specific 
parameterizations.

  - Functionality available only on specific parameterizations.  For 
example, List<int> could have a sum() method even though sum() does not 
make sense on all instantiations.  (This is the declaration-site version 
of what C# enables at the use site with extension methods -- allowing 
methods to be injected into types, rather than classes.)

We've not yet spent a lot of time identifying the proper way to surface 
this in the language.  For methods, one possibility is to use receiver 
parameters (added in Java SE 8) to qualify the receiver type:

     int sum(List<int> this) { ... }

This gets the point across clearly enough (and is analogous to how C# 
does extension methods), but has several drawbacks: it doesn't scale to 
fields, nor does it scale well to a conditional-membership model that is 
anything other than "I am a member of parameterization X".  (Where this 
might fall down, for example, would be when we want members declared as 
"I am *not* a member of parameterization X".)

Note that in the second motivating example, there will be two members 
with the same name and signature; we want one to take precedence over 
the other.

We call these "conditional" or "restricted" members.
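
For comparison, here is the receiver-parameter syntax mentioned above as 
it exists in Java SE 8 today.  The receiver's type must be exactly the 
enclosing class type, so it cannot yet narrow the parameterization; the 
List<int> form would be a new use of the same syntax.  This is only a 
sketch, and Bag is a hypothetical class invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container, used only to show the existing syntax.
class Bag<T> {
    private final List<T> items = new ArrayList<>();

    // Explicit receiver parameter: 'this' is written as the first formal
    // parameter. Callers do not pass it, and it does not change the
    // method's signature.
    void add(Bag<T> this, T item) {
        this.items.add(item);
    }

    // In Java SE 8 the receiver type has to be Bag<T>; writing
    // Bag<Integer> here is rejected by the compiler.
    int size(Bag<T> this) {
        return this.items.size();
    }
}
```

Callers invoke these as ordinary instance methods, e.g. bag.add("a") 
followed by bag.size().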

Classfile Strawman
==================

Here's a strawman of how we might represent this at the VM level.

We define a new attribute, `Where`, which can be applied to instance 
fields, instance methods, and constructors:

     Where {
         u2 name_index;
         u4 length;
         u2 restrictionDomain;  // refers to a ParamType constant
     }

The restriction domain indicates the parameterization to which this 
member is restricted; in the absence of a Where attribute, it is assumed 
to be ThisClass<any, any, ...>.
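
To make the layout concrete, here is a minimal sketch of decoding such 
an attribute from raw classfile bytes.  The WhereAttribute class and its 
parse method are hypothetical names, and the sketch assumes that, as 
with existing classfile attributes, `length` counts only the bytes that 
follow the length field (here, the single u2):

```java
import java.nio.ByteBuffer;

// Hypothetical in-memory form of the strawman Where attribute.
final class WhereAttribute {
    final int nameIndex;          // u2: constant-pool index of the attribute's name
    final long length;            // u4: number of bytes after the length field
    final int restrictionDomain;  // u2: constant-pool index of a ParamType constant

    private WhereAttribute(int nameIndex, long length, int restrictionDomain) {
        this.nameIndex = nameIndex;
        this.length = length;
        this.restrictionDomain = restrictionDomain;
    }

    // Decodes one Where attribute from big-endian classfile bytes.
    static WhereAttribute parse(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);  // ByteBuffer is big-endian by default
        int nameIndex = buf.getShort() & 0xFFFF;
        long length = buf.getInt() & 0xFFFFFFFFL;
        // Assumption: the payload is exactly the u2 restrictionDomain.
        if (length != 2) {
            throw new IllegalArgumentException("unexpected Where length: " + length);
        }
        int restrictionDomain = buf.getShort() & 0xFFFF;
        return new WhereAttribute(nameIndex, length, restrictionDomain);
    }
}
```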

When loading a parameterization of a generic class, we perform an 
applicability check for each member as we encounter it; in the model 
outlined here, this is a straight subtyping check of the current 
parameterization against the restriction domain.
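
The applicability check can be sketched as follows.  This is a toy 
model, not VM code: a parameterization is represented as a list of 
type-argument names, the string "any" stands for the `any` wildcard, and 
the only subtyping modeled is "exact match or any" at each position.  
The class and method names are hypothetical.

```java
import java.util.List;

final class Applicability {
    static final String ANY = "any";

    // Returns true if a member restricted to 'domain' is a member of the
    // parameterization 'current'. Each domain position must be 'any' or
    // name the same type argument as the current parameterization.
    static boolean isApplicable(List<String> current, List<String> domain) {
        if (current.size() != domain.size()) {
            return false;
        }
        for (int i = 0; i < current.size(); i++) {
            String d = domain.get(i);
            if (!d.equals(ANY) && !d.equals(current.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Under this model, a member with no Where attribute has the domain 
[any, any, ...] and is applicable to every parameterization.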

It is possible there could be duplicate applicable methods; this arises 
when we have a specialization-specific "override", as in:

     class Foo<any T> {
          // total method m(T)
          void m(T t) { }

          // Specialization of m(T) for T=int
          void m(Foo<int> this, int i) { ... }
     }

When we find a duplicate applicable member, we perform a "more specific" 
check comparing the restriction domains; in this case, the second method 
has a restriction domain of Foo<int>, which is more specific than the 
(implicit) Foo<any> restriction domain of the generic method, so we 
prefer the second member.

This procedure is strictly linear; as each member is read from the 
classfile, we can make a quick determination as to whether to keep or 
discard it; if we keep it, we might replace it later with a more 
specific one as we find it.  Modulo cases where there are multiple 
applicable overloads that are equally specific, it is also 
deterministic; whether we find the generic version of m() or its 
specialization first, we'll end up with the same set of members.
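
The single pass described above, including the "more specific" 
replacement step, might look like the sketch below.  It uses the same 
toy model of restriction domains (lists of type-argument names, with 
"any" as the wildcard) and, as a simplifying assumption, identifies 
duplicate members by name alone.  All names here are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MemberSelection {
    static final String ANY = "any";

    // One classfile member: its name and its restriction domain.
    static final class Member {
        final String name;
        final List<String> domain;

        Member(String name, String... domain) {
            this.name = name;
            this.domain = Arrays.asList(domain);
        }
    }

    // True if everything matched by 'inner' is also matched by 'outer'.
    static boolean within(List<String> inner, List<String> outer) {
        if (inner.size() != outer.size()) {
            return false;
        }
        for (int i = 0; i < inner.size(); i++) {
            String o = outer.get(i);
            if (!o.equals(ANY) && !o.equals(inner.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Single pass over the members in classfile order.
    static Map<String, Member> select(List<Member> members, List<String> current) {
        Map<String, Member> kept = new LinkedHashMap<>();
        for (Member m : members) {
            if (!within(current, m.domain)) {
                continue;  // not applicable to this parameterization: discard
            }
            Member prev = kept.get(m.name);
            boolean strictlyNarrower = prev != null
                    && within(m.domain, prev.domain)
                    && !m.domain.equals(prev.domain);
            if (prev == null || strictlyNarrower) {
                kept.put(m.name, m);  // first applicable member, or a more specific one
            }
            // Otherwise keep 'prev'. When neither domain is more specific,
            // this keeps whichever was seen first, an arbitrary choice.
        }
        return kept;
    }
}
```

Feeding the generic m and its int specialization in either order yields 
the specialization for Foo<int> and the generic version for any other 
parameterization, which is the order-independence claimed above.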

If there are duplicate applicable members in a classfile where neither's 
restriction domain is more specific than the other's, then the VM is 
permitted to make an arbitrary choice (as they are both applicable and 
equally specific).  The static compiler can work to filter out such 
situations, if desired, for example by imposing a "meet rule"; if we had:

     void foo(Foo<int,any> this)
     void foo(Foo<any,int> this)

a meet rule would require the additional overload

     void foo(Foo<int,int> this)
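
One way a static compiler might check such a meet rule is sketched 
below, again modeling restriction domains as lists of type-argument 
names with "any" as the wildcard.  The names are hypothetical, and the 
check is pairwise only; a full rule might also need to consider meets 
of meets when three or more overloads are involved.

```java
import java.util.ArrayList;
import java.util.List;

final class MeetRule {
    static final String ANY = "any";

    // Componentwise meet of two restriction domains of equal length, or
    // null if they are disjoint (some position names two different types).
    static List<String> meet(List<String> a, List<String> b) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < a.size(); i++) {
            String x = a.get(i);
            String y = b.get(i);
            if (x.equals(ANY)) {
                result.add(y);
            } else if (y.equals(ANY) || y.equals(x)) {
                result.add(x);
            } else {
                return null;
            }
        }
        return result;
    }

    // Given the domains declared for one overloaded member, returns the
    // domains a pairwise meet rule would additionally require.
    static List<List<String>> missingMeets(List<List<String>> declared) {
        List<List<String>> missing = new ArrayList<>();
        for (int i = 0; i < declared.size(); i++) {
            for (int j = i + 1; j < declared.size(); j++) {
                List<String> m = meet(declared.get(i), declared.get(j));
                if (m != null && !declared.contains(m) && !missing.contains(m)) {
                    missing.add(m);
                }
            }
        }
        return missing;
    }
}
```

For the two foo overloads above, the only missing domain reported is 
[int, int]; once that overload is declared, nothing is missing.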