Species-static members vs singletons

Mon May 23 10:05:32 UTC 2016

Hi Peter,
are you sure we need special treatment for 'it = st'? After all, the 
compiler will issue an unchecked warning every time you try to access a 
species static through a non-reifiable type, i.e.

Foo<String>.st = ""; //warn
Foo<int>.st = 42; //no warn

In other words, can we put the burden of heap pollution on the 
client and be happy?
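
For comparison, the same shape of problem can be reproduced in today's Java, where all reference instantiations already share one erased representation. The sketch below is only an analogy, not part of the proposal: the class ErasedSharedSlot and its put/get helpers are hypothetical names, and a plain static Object field stands in for an erased species static such as Foo<String>.st. The unchecked cast in get() is what javac would warn about, and the failure only surfaces later, at the client's use site.

```java
// Hypothetical analogy in current Java; none of these names come from the proposal.
final class ErasedSharedSlot {
    // One slot shared by every reference "instantiation", as an erased species field would be.
    private static Object slot;

    static <T> void put(T value) {
        slot = value;
    }

    @SuppressWarnings("unchecked")
    static <T> T get() {
        return (T) slot; // unchecked: erasure means nothing is verified here
    }

    // Returns true if reading the slot at a different type argument fails at the use site.
    static boolean pollutes() {
        ErasedSharedSlot.<String>put("");
        try {
            // The compiler inserts a cast to Integer at this assignment.
            Integer i = ErasedSharedSlot.<Integer>get();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pollutes());
    }
}
```

The write through one type argument and the read through another both compile, so the only line of defense is the unchecked warning that the client chose to accept.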

Maurizio

On 22/05/16 23:58, Peter Levart wrote:
> Hi Brian,
>
> I agree that "species" placement is a better, less verbose option. But 
> how to solve the language problem of having "species" and "instance" 
> members of the same "type-variable" type be assignable to one another? 
> For example:
>
> class Foo<any T> {
>     species T st;
>     T it;
>
>     void m() {
>         it = st; // this can not be allowed
>         st = it; // this can be allowed
>
>         // maybe this could be allowed?
>         @SuppressWarnings("unchecked")
>         it = (T) st;
>     }
> }
>
> Singleton abstraction has the same problem.
>
> So while technically possible, it would be weird to have 'T' sometimes 
> not be assignable to 'T'. Can we live with that?
>
> Regards, Peter
>
> On 05/19/2016 04:36 PM, Brian Goetz wrote:
>> We discussed two primary means to surface species-specific members in 
>> the language: a "species" placement (name TBD) as distinct from 
>> static and instance, or a "singleton" abstraction (a la Scala's 
>> "object" abstraction, as Peter L suggested).  We've done some 
>> experiments comparing the two approaches.
>>
>> Separately, we discussed two strategies for handling this at the VM 
>> level: having three separate placements (ACC_STATIC, ACC_SPECIES, and 
>> instance) or retconning ACC_STATIC to mean "species" and using 
>> compiler trickery to simulate traditional statics.  In recent 
>> discussions with Oracle and IBM VM folks, they seemed happy enough 
>> with having a new placement (and possibly new bytecodes, 
>> {get,put,invoke}species, or overloading these onto *static with 
>> ParamTypes in the owner field of the various XxxRef constants.)
>>
>>
>> There are several places where the language itself can take advantage 
>> of species members:
>>
>> 1.  Reifying type variables.  For an any-generic class Foo<T,U>, the 
>> compiler can generate public static final reflection-thingie-valued 
>> fields called "T" and "U", which means that "aFoo.T" (as an ordinary 
>> field ref!) would evaluate to the reflective mirror for the reified T 
>> -- if present, otherwise it would evaluate to the reflective mirror 
>> for 'erased'.
>>
>> 2.  Representation of generic methods.  The current translation 
>> strategy has us translating any-generic methods to classes; a static 
>> method
>>
>>     static<any T> void foo(T t) { }
>>
>> translates to a class (plus an erased bridge):
>>
>>     bridge static foo(Object o) { ... invoke erased specialization ... }
>>
>>     static class Xxx$foo<any T> {
>>         void foo(T t) { ... }
>>     }
>>
>> This means that an instance of Xxx$foo is needed to invoke the method 
>> -- but serves solely to carry the type variables -- which is 
>> unfortunate.  If instead we translate as:
>>
>>     static class Xxx$foo<any T> {
>>         species-static void foo(T t) { ... }
>>     }
>>
>> then we can invoke this method via invokespecies:
>>
>>     invokespecies ParamType[Xxx$foo, T_inf].foo(T_inf)
>>
>> where T_inf is the erasure-normalized type inferred for T (reified if 
>> T is a value type, `erased` if it is a reference type).  No fake 
>> receiver is required.
>>
>> The translation for generic instance methods is still somewhat 
>> messier (will post separately), but still less messy than if we also 
>> had to manage / cache a receiver.
>>
>>
>> We also drafted some examples of how such a facility would be used, 
>> writing them both with species-static and with singleton.  Examples 
>> and notes below; the summary is that in all cases, the species-static 
>> version is either better or about as good.
>>
>>
>>
>> 1.  The old favorite, caching an instantiated instance.
>>
>> Species (left):
>>
>>     class Collections {
>>         private static class Holder<any T> {
>>             private species List<T> empty = new EmptyList<T>();
>>         }
>>
>>         static<any T> List<T> emptyList() { return Holder<T>.empty; }
>>     }
>>
>> Singleton (right):
>>
>>     class Collections {
>>         private singleton Holder<any T> {
>>             private empty = new EmptyList<T>();
>>         }
>>
>>         static<any T> List<T> emptyList() { return Holder<T>.empty; }
>>     }
>>
>>
>> Note that in this case, species by itself isn't enough -- we still 
>> need a holder class, and it's a bit ugly.  Arguably we could merge 
>> Holder into EmptyList (if that's under our control) but because 
>> Collections is an old-style "static bag" class (aka "sin bin"), we 
>> would still need a holder class for state.  (Collections could share 
>> a single holder for multiple things; empty list, empty set, etc.)
>>
>> Neither the left nor the right seems particularly better than the 
>> other here.  (If we were putting this method on Collection, where it 
>> would likely go in new code since now interfaces can have statics, 
>> the species approach would win, since we'd not need the holder class 
>> any more.)
>>
>>
>> 2.  Instantiation tracking.
>>
>> Species (left):
>>
>>     class Foo<any T> {
>>         private species int count;
>>         private species List<Foo<T>> foos;
>>
>>         public Foo() {
>>             ++count;
>>             foos.add(this);
>>         }
>>     }
>>
>> Singleton (right):
>>
>>     class Foo<any T> {
>>         private singleton FooStuff<T> {
>>             private int count;
>>             private List<Foo<T>> foos;
>>         }
>>
>>         public Foo() {
>>             ++FooStuff<T>.count;
>>             FooStuff<T>.foos.add(this);
>>         }
>>     }
>>
>>
>> Because the state is directly tied to the instantiation, the left 
>> seems more attractive -- doesn't require an extra artifact, and the 
>> constructor body seems more straightforward.
>>
>>
>> 3.  Implicit-like associations.  Here, we're caching type 
>> associations.  For example, suppose we have a Box<T>, and we want to 
>> cache the associated class for List<T>.
>>
>>
>> Species (left):
>>
>>     class Box<any T> {
>>         private species Class<List<T>> listClass
>>             = Class.forSpecialization(List, T.crass);
>>     }
>>
>> Singleton (right):
>>
>>     class Box<any T> {
>>         private singleton ListBuddy<any T> {
>>             Class<List<T>> clazz
>>                 = Class.forSpecialization(List, T.crass);
>>         }
>>     }
>>
>>
>> The extra singleton declaration feels like "noise" here, because 
>> again the association is with the full set of type args for the class.
>>
>>
>> 4.  Static factories.  Arguably, it makes sense to move factories to 
>> the types they describe.
>>
>> Species (left):
>>
>>     interface List<any T> {
>>         private species List<T> empty = new EmptyList<>();
>>         species List<T> emptyList() { return empty; }
>>     }
>>
>> Singleton (right):
>>
>>     interface List<any T> {
>>         private singleton Stuff<any T> {
>>             List<T> empty = new EmptyList<>();
>>         }
>>         species List<T> emptyList() { return Stuff<T>.empty; }
>>     }
>>
>>
>> In this model, you'd get an empty list with
>>
>>     List<T> aList = List<T>.emptyList();
>>
>> rather than
>>
>>     List<T> aList = Collections.<T>emptyList();
>>
>> In the latter, the type witnesses can be omitted; in the former they 
>> probably can be as well but that's something new.
>>
>>
>> 5.  Typevar shredding.  Here, we have separate state for different 
>> subsets of variables.  This should be the place where the singleton 
>> approach shines.
>>
>>
>> Species (left):
>>
>>     class HashMap<any K, any V> {
>>         private static class Keys<any K> {
>>             species Set<K> allKeys = ...
>>         }
>>
>>         private static class Vals<any V> {
>>             species Set<V> allVals = ...
>>         }
>>
>>         void put(K k, V v) {
>>             Keys<K>.allKeys.add(k);
>>             Vals<V>.allVals.add(v);
>>         }
>>     }
>>
>> Singleton (right):
>>
>>     class HashMap<any K, any V> {
>>         private singleton Keys<any K> {
>>             Set<K> allKeys = ...
>>         }
>>
>>         private singleton Vals<any V> {
>>             Set<V> allVals = ...
>>         }
>>
>>         void put(K k, V v) {
>>             Keys<K>.allKeys.add(k);
>>             Vals<V>.allVals.add(v);
>>         }
>>     }
>>
>>
>>
>> But, it doesn't really shine that much; the left is not really much 
>> worse than the right, just a little more fussy.
>>
>> In cases where the singleton approach is more natural, the 
>> corresponding "species in static class" idiom isn't so bad either.  
>> But in cases where the species approach is more natural, there's 
>> something unappealing about creating classes (both in source and 
>> runtime footprint) in cases 2/3/4 when we don't need one. The only 
>> place where the singleton approach seems to win big is when there are 
>> multiple variables in the same scope bound by invariants -- here, the 
>> singleton having a ctor is a big win -- but how often does this happen?
>>
>>
>> So our conclusion is that the species-placement is as good or better 
>> for the identified use cases -- and it also fits cleanly into the 
>> existing model for member placement.
>
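
To make the instantiation-tracking case (example 2 in Brian's message) concrete, here is a rough approximation in today's Java. It is a sketch under stated assumptions, not the proposed translation: the class name TrackedFoo, the explicit Class token passed to the constructor, and the count helper are all hypothetical. The token stands in for the species that Foo<T> would identify on its own, and a map keyed by that token stands in for the per-species fields.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical emulation: the caller-supplied Class token plays the role of
// the species; erased Java has no way to recover it from T itself.
final class TrackedFoo<T> {
    // One list of live instances per token, emulating 'species List<Foo<T>> foos'.
    private static final Map<Class<?>, List<TrackedFoo<?>>> FOOS = new ConcurrentHashMap<>();

    TrackedFoo(Class<T> species) {
        // Like the original sketch, this publishes 'this' from the constructor.
        FOOS.computeIfAbsent(species, k -> new CopyOnWriteArrayList<>()).add(this);
    }

    // Emulates reading 'species int count' for a given instantiation.
    static int count(Class<?> species) {
        List<TrackedFoo<?>> foos = FOOS.get(species);
        return foos == null ? 0 : foos.size();
    }

    public static void main(String[] args) {
        new TrackedFoo<>(String.class);
        new TrackedFoo<>(String.class);
        new TrackedFoo<>(Integer.class);
        System.out.println(count(String.class) + " " + count(Integer.class));
    }
}
```

The extra token parameter and the map lookup are the bookkeeping that a species placement would let the language and VM handle directly.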