Species-static members vs singletons

Mon May 23 14:24:22 UTC 2016

Also, can you think of cases where the first parameter will be something 
other than the receiver class? I.e. do we want to encourage a more 
OO-like style where you can ask a Complex things like compareTo etc. 
(rather than calling a static helper somewhere to do the job)?

Maurizio

On 23/05/16 15:18, Maurizio Cimadamore wrote:
> Sorry - I now realize that the point I made in my earlier email was 
> unclear.
>
> What I'm suggesting is to have a single rule for generating unchecked 
> warnings that goes like this:
>
> "If the qualifier of a species static access is not reifiable, an 
> unchecked warning should occur".
>
> In the example Peter sent, the only thing worth mentioning is that the 
> qualifier is 'implicit' (i.e. can be omitted and be assumed to be the 
> current class Foo<T>); now since Foo<T> is not reifiable, every 
> unqualified access to 'st' from Foo<T> will get a warning - excluding, 
> of course, accesses occurring in a context where T is restricted (i.e. 
> __WhereVal(T)).
>
> Maurizio
>
> On 23/05/16 14:56, Brian Goetz wrote:
>> Note that we have this same problem with unchecked warnings today in 
>> many of the use cases.  For example, in the “cached empty list” case, 
>> we always have to use an unchecked cast to cast the cached list to 
>> the desired type.  When we use species-static to do the same, and it 
>> is possible that the species could correspond to more than one T, we 
>> still get the same unchecked warning (and as you mention, the 
>> singleton form has the same problem.)  I think it's an inescapable 
>> consequence of erasure, but one we’re already sort of comfortable with.
>>
>> If you use a more constrained type selector (e.g., List<int>), you 
>> won’t get a warning, as the compiler will know that st is exactly int.
>>
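
For reference, the "cached empty list" idiom mentioned above can be sketched in today's erased Java roughly as follows. This is only an illustration of where the unchecked cast comes from; the class names EmptyListCache and EmptyListCacheDemo are made up for this sketch and are not part of any proposal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder, named for this sketch only.
final class EmptyListCache {
    // One immutable instance shared by every instantiation of T.
    // Sharing is safe only because the list never holds an element.
    private static final List<?> EMPTY =
            Collections.unmodifiableList(new ArrayList<Object>());

    private EmptyListCache() { }

    // The cast cannot be checked at run time because T is erased,
    // so the compiler reports it as unchecked.
    @SuppressWarnings("unchecked")
    static <T> List<T> emptyList() {
        return (List<T>) EMPTY;
    }
}

final class EmptyListCacheDemo {
    public static void main(String[] args) {
        List<String> strings = EmptyListCache.emptyList();
        List<Integer> ints = EmptyListCache.emptyList();
        System.out.println(strings.isEmpty());
        // Both views are the same object underneath.
        System.out.println((Object) strings == (Object) ints);
    }
}
```

The same cast would be needed wherever one cached object has to stand in for more than one T, which is the situation described above for erased species.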
>>> On May 23, 2016, at 3:05 AM, Maurizio Cimadamore 
>>> <maurizio.cimadamore at oracle.com 
>>> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>>
>>> Hi Peter,
>>> are you sure we need special treatment for 'it = st' ? After all, 
>>> the compiler will issue unchecked warnings every time you try to 
>>> access a species static from a non-reifiable type, i.e.
>>>
>>> Foo<String>.st = ""; //warn
>>> Foo<int>.st = 42; //no warn
>>>
>>> In other words, can we put the burden of heap pollution-ness on the 
>>> client and be happy?
>>>
>>> Maurizio
>>>
>>> On 22/05/16 23:58, Peter Levart wrote:
>>>> Hi Brian,
>>>>
>>>> I agree that "species" placement is a better, less verbose option. 
>>>> But how to solve the language problem of having "species" and 
>>>> "instance" members of the same "type-variable" type be assignable 
>>>> to one another? For example:
>>>>
>>>> class Foo<any T> {
>>>>     species T st;
>>>>     T it;
>>>>
>>>>     void m() {
>>>>         it = st; // this can not be allowed
>>>>         st = it; // this can be allowed
>>>>
>>>>         // maybe this could be allowed?
>>>>         @SuppressWarnings("unchecked")
>>>>         it = (T) st;
>>>>     }
>>>> }
>>>>
>>>>
>>>> Singleton abstraction has the same problem.
>>>>
>>>> So while technically possible, it would be weird to have 'T' 
>>>> sometimes not be assignable to 'T'. Can we live with that?
>>>>
>>>> Regards, Peter
>>>>
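
To make the 'it = st' concern concrete, here is a rough emulation in today's erased Java. It assumes that all reference instantiations share a single species, which is modelled here by an ordinary static field; the names Slot, store, load and SlotDemo are invented for the sketch.

```java
// Hypothetical class, named for this sketch only.
final class Slot<T> {
    // Stand-in for "species T st": under erasure, Slot<String> and
    // Slot<Integer> share this single field.
    private static Object shared;

    T it;

    // Corresponds to "st = it": always safe.
    void store(T value) {
        shared = value;
    }

    // Corresponds to "it = (T) st": unchecked, may pollute the heap.
    @SuppressWarnings("unchecked")
    void load() {
        it = (T) shared;
    }
}

final class SlotDemo {
    public static void main(String[] args) {
        Slot<String> strings = new Slot<>();
        Slot<Integer> ints = new Slot<>();
        strings.store("hello");
        ints.load(); // no failure here, the cast to T is erased
        try {
            Integer i = ints.it; // the compiler-inserted cast fails here
            System.out.println(i);
        } catch (ClassCastException e) {
            System.out.println("heap pollution surfaced at the use site");
        }
    }
}
```

The failure shows up at the use site rather than at the assignment, which is why a warning on the assignment (or on the access to the species static) is the only place the compiler can flag it.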
>>>> On 05/19/2016 04:36 PM, Brian Goetz wrote:
>>>>> We discussed two primary means to surface species-specific members 
>>>>> in the language: a "species" placement (name TBD) as distinct from 
>>>>> static and instance, or a "singleton" abstraction (a la Scala's 
>>>>> "object" abstraction, as Peter L suggested).  We've done some 
>>>>> experiments comparing the two approaches.
>>>>>
>>>>> Separately, we discussed two strategies for handling this at the 
>>>>> VM level: having three separate placements (ACC_STATIC, 
>>>>> ACC_SPECIES, and instance) or retconning ACC_STATIC to mean 
>>>>> "species" and using compiler trickery to simulate traditional 
>>>>> statics.  In recent discussions with Oracle and IBM VM folks, they 
>>>>> seemed happy enough with having a new placement (and possibly new 
>>>>> bytecodes, {get,put,invoke}species, or overloading these onto 
>>>>> *static with ParamTypes in the owner field of the various XxxRef 
>>>>> constants.)
>>>>>
>>>>>
>>>>> There are several places where the language itself can take 
>>>>> advantage of species members:
>>>>>
>>>>> 1.  Reifying type variables.  For an any-generic class Foo<T,U>, 
>>>>> the compiler can generate public static final 
>>>>> reflection-thingie-valued fields called "T" and "U", which means 
>>>>> that "aFoo.T" (as an ordinary field ref!) would evaluate to the 
>>>>> reflective mirror for the reified T -- if present, otherwise it 
>>>>> would evaluate to the reflective mirror for 'erased'.
>>>>>
>>>>> 2.  Representation of generic methods.  The current translation 
>>>>> strategy has us translating any-generic methods to classes; a 
>>>>> static method
>>>>>
>>>>>     static<any T> void foo(T t) { }
>>>>>
>>>>> translates to a class (plus an erased bridge):
>>>>>
>>>>>     bridge static foo(Object o) { ... invoke erased specialization ... }
>>>>>
>>>>>     static class Xxx$foo<any T> {
>>>>>         void foo(T t) { ... }
>>>>>     }
>>>>>
>>>>> This means that an instance of Xxx$foo is needed to invoke the 
>>>>> method -- but serves solely to carry the type variables -- which 
>>>>> is unfortunate.  If instead we translate as:
>>>>>
>>>>>     static class Xxx$foo<any T> {
>>>>>         species-static void foo(T t) { ... }
>>>>>     }
>>>>>
>>>>> then we can invoke this method via invokespecies:
>>>>>
>>>>>     invokespecies ParamType[Xxx$foo, T_inf].foo(T_inf)
>>>>>
>>>>> where T_inf is the erasure-normalized type inferred for T (reified 
>>>>> if a value type, `erased` if a reference type).  No fake receiver 
>>>>> required.
>>>>>
>>>>> The translation for generic instance methods is still somewhat 
>>>>> messier (will post separately), but still less messy than if we 
>>>>> also had to manage / cache a receiver.
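
The difference in call shape between the two translations can be approximated in plain Java, with the caveat that erased Java has no specialization and no invokespecies, so this only shows the "fake receiver" cost and nothing about the actual bytecode. FooViaReceiver and FooViaStatic are hypothetical names used only here.

```java
// Receiver-based shape: the method lives on a generic class, so a
// throwaway instance is needed just to carry the type variable.
final class FooViaReceiver<T> {
    String foo(T t) {
        return "foo(" + t + ")";
    }
}

// Receiver-free shape: nothing has to be allocated or cached in order
// to make the call.
final class FooViaStatic {
    private FooViaStatic() { }

    static <T> String foo(T t) {
        return "foo(" + t + ")";
    }
}

final class TranslationDemo {
    public static void main(String[] args) {
        // A receiver is created only to be discarded after the call.
        String viaReceiver = new FooViaReceiver<Integer>().foo(42);
        // No receiver at all.
        String viaStatic = FooViaStatic.foo(42);
        System.out.println(viaReceiver.equals(viaStatic));
    }
}
```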
>>>>>
>>>>>
>>>>> We also drafted some examples of how such a facility would be 
>>>>> used, writing them both with species-static and with singleton. 
>>>>> Examples and notes below; the summary is that in all cases, the 
>>>>> species-static version is either better or about as good.
>>>>>
>>>>>
>>>>>
>>>>> 1.  The old favorite, caching an instantiated instance.
>>>>>
>>>>> Species (left):
>>>>>
>>>>> class Collections {
>>>>>     private static class Holder<any T> {
>>>>>         private species List<T> empty = new EmptyList<T>();
>>>>>     }
>>>>>
>>>>>     static<any T> List<T> emptyList() { return Holder<T>.empty; }
>>>>> }
>>>>>
>>>>> Singleton (right):
>>>>>
>>>>> class Collections {
>>>>>     private singleton Holder<any T> {
>>>>>         private List<T> empty = new EmptyList<T>();
>>>>>     }
>>>>>
>>>>>     static<any T> List<T> emptyList() { return Holder<T>.empty; }
>>>>> }
>>>>>
>>>>>
>>>>> Note that in this case, species by itself isn't enough -- we still 
>>>>> need a holder class, and it's a bit ugly.  Arguably we could merge 
>>>>> Holder into EmptyList (if that's under our control) but because 
>>>>> Collections is an old-style "static bag" class (aka "sin bin"), we 
>>>>> would still need a holder class for state.  (Collections could 
>>>>> share a single holder for multiple things; empty list, empty set, 
>>>>> etc.)
>>>>>
>>>>> Neither the left nor the right seems particularly better than the 
>>>>> other here. (If we were putting this method on Collection, where 
>>>>> it would likely go in new code since now interfaces can have 
>>>>> statics, the species approach would win, since we'd not need the 
>>>>> holder class any more.)
>>>>>
>>>>>
>>>>> 2.  Instantiation tracking.
>>>>>
>>>>> Species (left):
>>>>>
>>>>> class Foo<any T> {
>>>>>     private species int count;
>>>>>     private species List<Foo<T>> foos;
>>>>>
>>>>>     public Foo() {
>>>>>         ++count;
>>>>>         foos.add(this);
>>>>>     }
>>>>> }
>>>>>
>>>>> Singleton (right):
>>>>>
>>>>> class Foo<any T> {
>>>>>     private singleton FooStuff<T> {
>>>>>         private int count;
>>>>>         private List<Foo<T>> foos;
>>>>>     }
>>>>>
>>>>>     public Foo() {
>>>>>         ++FooStuff<T>.count;
>>>>>         FooStuff<T>.foos.add(this);
>>>>>     }
>>>>> }
>>>>>
>>>>>
>>>>> Because the state is directly tied to the instantiation, the left 
>>>>> seems more attractive -- doesn't require an extra artifact, and 
>>>>> the constructor body seems more straightforward.
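
For comparison, here is roughly what instantiation tracking costs in today's Java, where there is no per-species state and the usual workaround is a static map keyed by an explicit type token. Everything in this sketch (Tracked, the Class<T> constructor argument, count) is an assumption made for illustration and not something proposed in this thread.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical class, named for this sketch only.
final class Tracked<T> {
    // Stand-in for species state: one list per type token, since erased
    // Java cannot key anything on the instantiation itself.
    private static final Map<Class<?>, List<Tracked<?>>> INSTANCES =
            new ConcurrentHashMap<>();

    private final Class<T> type;

    Tracked(Class<T> type) {
        this.type = type;
        // Mirrors "++count; foos.add(this);" from the species version.
        INSTANCES.computeIfAbsent(type, k -> new CopyOnWriteArrayList<Tracked<?>>())
                 .add(this);
    }

    Class<T> type() {
        return type;
    }

    // Number of instances created so far for the given token.
    static int count(Class<?> type) {
        List<Tracked<?>> list = INSTANCES.get(type);
        return list == null ? 0 : list.size();
    }
}

final class TrackedDemo {
    public static void main(String[] args) {
        new Tracked<>(String.class);
        new Tracked<>(String.class);
        new Tracked<>(Integer.class);
        System.out.println(Tracked.count(String.class));
        System.out.println(Tracked.count(Integer.class));
    }
}
```

The token, the map and the lookup are all bookkeeping that the species version expresses with two field declarations.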
>>>>>
>>>>>
>>>>> 3.  Implicit-like associations.  Here, we're caching type 
>>>>> associations.  For example, suppose we have a Box<T>, and we want 
>>>>> to cache the associated class for List<T>.
>>>>>
>>>>>
>>>>> Species (left):
>>>>>
>>>>> class Box<any T> {
>>>>>     private species Class<List<T>> listClass
>>>>>         = Class.forSpecialization(List, T.crass);
>>>>> }
>>>>>
>>>>> Singleton (right):
>>>>>
>>>>> class Box<any T> {
>>>>>     private singleton ListBuddy<any T> {
>>>>>         Class<List<T>> clazz
>>>>>             = Class.forSpecialization(List, T.crass);
>>>>>     }
>>>>> }
>>>>>
>>>>>
>>>>> The extra singleton declaration feels like "noise" here, because 
>>>>> again the association is with the full set of type args for the 
>>>>> class.
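
The closest existing analogue to this kind of cached association is probably java.lang.ClassValue, which lazily computes and caches one value per Class. The sketch below caches the array class for a component type, as a loose stand-in for "the class of List<T> for a given T"; Class.forSpecialization from the example above does not exist today, and ArrayClassCache is a name made up for this sketch.

```java
import java.lang.reflect.Array;

// Hypothetical helper, named for this sketch only.
final class ArrayClassCache {
    // ClassValue computes the associated value on first use and caches
    // it per Class after that.
    private static final ClassValue<Class<?>> ARRAY_CLASS =
            new ClassValue<Class<?>>() {
                @Override
                protected Class<?> computeValue(Class<?> component) {
                    // Create a zero-length array to obtain its class.
                    return Array.newInstance(component, 0).getClass();
                }
            };

    private ArrayClassCache() { }

    static Class<?> arrayClassOf(Class<?> component) {
        return ARRAY_CLASS.get(component);
    }
}

final class ArrayClassCacheDemo {
    public static void main(String[] args) {
        System.out.println(ArrayClassCache.arrayClassOf(int.class) == int[].class);
        System.out.println(ArrayClassCache.arrayClassOf(String.class) == String[].class);
    }
}
```

Note that this is keyed by a run-time Class object passed in by the caller, whereas the species field is keyed by the instantiation itself and needs no explicit lookup.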
>>>>>
>>>>>
>>>>> 4.  Static factories.  Arguably, it makes sense to move factories 
>>>>> to the types they describe.
>>>>>
>>>>> Species (left):
>>>>>
>>>>> interface List<any T> {
>>>>>     private species List<T> empty = new EmptyList<>();
>>>>>     species List<T> emptyList() { return empty; }
>>>>> }
>>>>>
>>>>> Singleton (right):
>>>>>
>>>>> interface List<any T> {
>>>>>     private singleton Stuff<any T> {
>>>>>         List<T> empty = new EmptyList<>();
>>>>>     }
>>>>>     species List<T> emptyList() { return Stuff<T>.empty; }
>>>>> }
>>>>>
>>>>>
>>>>> In this model, you'd get an empty list with
>>>>>
>>>>>     List<T> aList = List<T>.emptyList();
>>>>>
>>>>> rather than
>>>>>
>>>>>     List<T> aList = Collections.<T>emptyList();
>>>>>
>>>>> In the latter, the type witnesses can be omitted; in the former 
>>>>> they probably can be as well but that's something new.
>>>>>
>>>>>
>>>>> 5.  Typevar shredding.  Here, we have separate state for different 
>>>>> subsets of variables.  This should be the place where the 
>>>>> singleton approach shines.
>>>>>
>>>>>
>>>>> Species (left):
>>>>>
>>>>> class HashMap<any K, any V> {
>>>>>     private static class Keys<any K> {
>>>>>         species Set<K> allKeys = ...
>>>>>     }
>>>>>
>>>>>     private static class Vals<any V> {
>>>>>         species Set<V> allVals = ...
>>>>>     }
>>>>>
>>>>>     void put(K k, V v) {
>>>>>         Keys<K>.allKeys.add(k);
>>>>>         Vals<V>.allVals.add(v);
>>>>>     }
>>>>> }
>>>>>
>>>>> Singleton (right):
>>>>>
>>>>> class HashMap<any K, any V> {
>>>>>     private singleton Keys<any K> {
>>>>>         Set<K> allKeys = ...
>>>>>     }
>>>>>
>>>>>     private singleton Vals<any V> {
>>>>>         Set<V> allVals = ...
>>>>>     }
>>>>>
>>>>>     void put(K k, V v) {
>>>>>         Keys<K>.allKeys.add(k);
>>>>>         Vals<V>.allVals.add(v);
>>>>>     }
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>> But, it doesn't really shine that much; the left is not really 
>>>>> much worse than the right, just a little more fussy.
>>>>>
>>>>> In cases where the singleton approach is more natural, the 
>>>>> corresponding "species in static class" idiom isn't so bad 
>>>>> either.  But in cases where the species approach is more natural, 
>>>>> there's something unappealing about creating classes (both in 
>>>>> source and runtime footprint) in cases 2/3/4 when we don't need 
>>>>> one. The only place where the singleton approach seems to win big 
>>>>> is when there are multiple variables in the same scope bound by 
>>>>> invariants -- here, the singleton having a ctor is a big win -- 
>>>>> but how often does this happen?
>>>>>
>>>>>
>>>>> So our conclusion is that the species-placement is as good or 
>>>>> better for the identified use cases -- and it also fits cleanly 
>>>>> into the existing model for member placement.
>>>>
>>>
>>
>