Nullable ValueType, Generics and other vexations

Thu Aug 30 23:36:39 UTC 2018

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, August 30, 2018 23:51:43
> Subject: Re: Nullable ValueType, Generics and other vexations

> The emotional types (T? and T!) are tempting here.  But they are also
> intrusive.  Once you let them in the house, they want to go everywhere.
> And not just on the main floor (language), they want to roam the
> dungeons too (JVM).  I am a little wary of inviting them into the house
> just as a way of talking about nullability.

Agreed!

> 
> An alternate approach, which I don't yet have a proposal for but am
> thinking about, is to denote these slightly differently, without putting
> nulls quite so prominently.  Which is, doubling down on an existing
> precedent: boxes.  Primitives have boxes, but they're crappy boxes --
> they have accidental identity, they are spelled weirdly, they are
> ad-hoc.  Call them "prototype boxes" -- the hand made ones that you did
> before you could automate.  The machine-crafted, automated boxes could
> be sleeker and more uniform.  Then in LW10 you just generify over the
> box and you're good, since it's nullable.  (For LW100, more help is needed.)

This kind of box is really close to a reified java/lang/Box container.

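class Box<any T> {
  private final T value;

  private Box(T value) { this.value = value; }

  T unwrap() { return value; }
  // wrap is static, so it has to declare its own type parameter
  static <any T> Box<T> wrap(T value) { return new Box<T>(value); }
}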

So in LW100, you will have trouble explaining the difference between V+Box and Box<V> if V+Box is a magic construction from the VM.
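To make the comparison concrete, here is a minimal sketch of such a box in current Java. It is only an approximation: it uses an ordinary erased type parameter instead of the `any T` above, and BoxDemo and describe are hypothetical names introduced for illustration. What it shows is that a reference to a Box can be null even when the payload itself is never null.

```java
// Sketch only: an erased Box in current Java, not the specialized Box<any T>.
final class Box<T> {
    private final T value;

    private Box(T value) { this.value = value; }

    static <T> Box<T> wrap(T value) { return new Box<>(value); }

    T unwrap() { return value; }
}

final class BoxDemo {
    // Hypothetical helper: a null Box reference stands in for "no value".
    static String describe(Box<Integer> box) {
        return box == null ? "absent" : "present: " + box.unwrap();
    }

    public static void main(String[] args) {
        System.out.println(describe(Box.wrap(42)));
        System.out.println(describe(null));
    }
}
```

Nothing here is known to the VM, which is the point: a user-written Box<V> and a VM-generated V+Box would look the same from the outside.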

[...]

Another idea: if we restrict ourselves to the surface language, you may not have to introduce T? or T+Box at all because, as I said at the end of my previous message, you can force users to handle null locally and not allow it to escape. Complex? would then be a type that exists internally in the compiler but cannot be written by a user.

So
 Map<String,Complex> map = ...
 Complex complex = map.get("a-key-that-does-not-exist");
will be rejected by the compiler, because Complex is a value type and so does not accept null, but
 var complex = map.get("a-key-that-does-not-exist");
is OK, and if the compiler has special code to handle Optional correctly, the code below will be valid:
 var complexOrNull = map.get("a-key-that-does-not-exist");
 Complex complex = Optional.ofNullable(complexOrNull).orElseThrow();
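The same pattern can be tried in current Java, as a rough sketch. The assumptions are loud ones: Complex is an ordinary class here (so nothing is rejected at compile time), LocalNullDemo and lookup are hypothetical names, and the unboxing of a null Integer is used only as today's closest analogue of assigning null to a value type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Stand-in for a value type: an ordinary final class in current Java.
final class Complex {
    final double re;
    final double im;

    Complex(double re, double im) { this.re = re; this.im = im; }
}

final class LocalNullDemo {
    // The possibly-null result is held in a 'var' and never leaves this method as null.
    static Optional<Complex> lookup(Map<String, Complex> map, String key) {
        var complexOrNull = map.get(key);
        return Optional.ofNullable(complexOrNull);
    }

    // Analogue of the failing case: unboxing a missing Integer throws at the conversion.
    static boolean unboxingMissingKeyThrows(Map<String, Integer> counts) {
        try {
            int count = counts.get("a-key-that-does-not-exist");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Complex> map = new HashMap<>();
        map.put("i", new Complex(0.0, 1.0));
        System.out.println(lookup(map, "i").isPresent());
        System.out.println(lookup(map, "a-key-that-does-not-exist").isPresent());
        System.out.println(unboxingMissingKeyThrows(new HashMap<>()));
    }
}
```

The design choice being illustrated is that the nullable type only ever appears as an inferred local, so callers see either a Complex or an Optional, never a Complex?.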

And the compiler will erase Complex? to Object (or an interface if Complex implements that interface).
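For comparison, this is how the existing erasure of T can be observed through reflection today. It is a sketch of the precedent only, not of how Complex? would be implemented; ErasureDemo and its method names are hypothetical. The erased return type of Map.get is Object, while the generic information survives separately for the compiler and for reflection.

```java
import java.lang.reflect.Method;
import java.util.Map;

final class ErasureDemo {
    private static Method mapGet() {
        try {
            return Map.class.getMethod("get", Object.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // What the VM sees in the method descriptor: the erased type.
    static Class<?> erasedReturnType() {
        return mapGet().getReturnType();
    }

    // What the compiler and reflection see: the type variable V.
    static String genericReturnType() {
        return mapGet().getGenericReturnType().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(erasedReturnType().getName());
        System.out.println(genericReturnType());
    }
}
```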

Rémi

> 
> On 8/29/2018 8:02 PM, Remi Forax wrote:
>> Hi all,
>> just to formalize a little more my thinking about the interaction between
>> (nullable) value types and erased generics.
>>
>> With LW1, the VM enforces that a value type cannot be null, so each time there
>> is a conversion from Object to a value type, a null check is inserted.
>> This works great until you start to use already existing code that gives a
>> meaning to null, like j.u.Map.get, which is specified to return a V or null,
>> so code like this (with Complex a value type)
>>
>>    Map<String,Complex> map = ...
>>    Complex complex = map.get("a-key-that-does-not-exist");
>>
>> throws an NPE before a user can even check whether complex is null or not.
>>
>> From the point of view of Java the language, a solution is to have a way to
>> express that a value type can be nullable, by requiring users to write
>>    Complex? complex = map.get("a-key-that-does-not-exist");
>> and to teach javac how to do a null analysis (guaranteeing that you cannot call
>> a method on a Type? without a null test first).
>>
>> The question is how to translate to bytecode something like Complex?.
>> We have two choices: one is to teach the VM what Complex? is by adding a bit
>> alongside the field/method descriptor; the other is to erase Complex? like we
>> erase T (to Object or an interface).
>>
>> I believe that we should choose the latter solution:
>> - reifying the null information for the VM means solving the null problem of
>> Java not only for value types but for every type, because if we come up with a
>> partial solution now, it will hamper our design choices if we want to extend it
>> later. And solving the nullability problem is not one of the goals of Valhalla;
>> Valhalla is hard enough, and making it twice as hard makes no sense.
>> - having nullable value types reified in the VM is not enough to allow the
>> migration from a reference type to a value type; programs will still choke
>> on identity, synchronized, etc. It does help for value-based classes, yes, but
>> it's the tail wagging the dog. There are few value-based classes defined in the
>> JDK, and we know from the whole module 'experience' that a contract defined in
>> the javadoc and not enforced by javac or the VM means nothing.
>> - erasure works because either a value type comes from a type parameter, so it is
>> already erased, or it comes from new code, so it can be erased: if there
>> is a signature clash due to erasure, it's in new code, so the user can figure
>> out a way to work around it.
>>
>> Note1: because of separate compilation, the '?' information needs to be available
>> in a new attribute (or in an old one by extending it), but only the compiler and
>> reflection will consume it, not the VM.
>>
>> Note2: we can also further restrict the use of '?' by disallowing it in
>> method parameters unless it's either on a type variable or in a method
>> overriding another one that uses T?, but that's a language issue.
>>
>> Rémi