value type hygiene

Tue May 8 10:06:53 UTC 2018

----- Original Message -----
> From: "John Rose" <john.r.rose at oracle.com>
> To: "Paul Sandoz" <paul.sandoz at oracle.com>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Tuesday, May 8, 2018 06:25:53
> Subject: Re: value type hygiene

> On May 7, 2018, at 6:06 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>> 
>> Thanks for sharing this!
>> 
>> I like the null containment approach. It recognizes that nulls (for better or
>> worse) are a thing in the ref world but stops the blighters from infecting the
>> value world at the borders.
>> 
>> We will need to extend this hygiene to javac and the libraries.
> 
> For any particular null-rejecting behavior, we have a choice of
> doing it in the VM only, doing it in javac only (Objects.requireNN),
> or doing it in both.  I think this is true for all of the null checks
> proposed.  (Fourth choice:  Neither; don't reject nulls and hope
> they don't sneak in.)
> 
> A good example is checkcast.  javac knows each value type
> that is the subject of a cast, so the checkcast bytecode doesn't
> necessarily have to include a null check; javac could follow each
> checkcast by a call to O.rNN.  A few considerations tip me towards
> putting it into the instruction rather than letting javac have the job:
> 
> - When wouldn't we add O.rNN after checkcast Q?  If it's always
> there, isn't it safer to fold it into the checkcast?
> - No temptation to bytecode generators to fool around with
> "optimizing away" the null check.
> - The JVM has more complete information about APIs and schemas,
> so it can better optimize away the checks than javac can.
> - The amended checkcast corresponds better to the source-level
> operation.
> - Code complexity is better (smaller methods) if the bytecode has
> a higher-level behavior.
> 
> These same considerations apply to all the cases of dynamic null
> rejection:
> - checkcast
> - null check of incoming parameters
> - null check of received return types (after invoke* to legacy code)
> - null check of field reads (from legacy fields, which might be null).
> 
> Although it's easy to imagine javac putting O.rNN in all of
> these places, it will become annoying enough that we'll want
> to let the JVM do it.  The JVM will have on-line information about
> some things, such as which methods or fields might be legacy
> code, which allow it to omit checks many times.
> 
> If the bytecodes conspire to prevent nulls from getting into
> verifiable Q-types, then the verifier can add robustness by
> excluding mixes of aconst_null with Q-types, which is a value
> add.  This checking is inherently after javac, so it's not possible
> if we require javac to do some or all of the null checking explicitly.
> 
>> Javac could fail to compile when it knows enough, and in other cases place in
>> explicit null checks if not otherwise performed by existing instructions so as
>> to fail fast.
> 
> I think it's the case that if we adopt all of the null-checking proposals
> I wrote, then javac won't need any explicit ones.  I think that's one
> possible sweet spot for the design.  I currently don't see a sweeter
> spot, in fact.
> 
>> Certain APIs that rely on null as a signal will need careful reviewing and
>> possible adaption if the prevention has some side effects, and maybe
>> errors/warnings from javac. The poster child being Map.get, but others like
>> Map.compute are problematic too (if a value is not present for a key, then a
>> null value is passed to the remapping function).
> 
> Yep.  The sooner we implement (a) a JVM that has a clean upward
> model for value types, and (b) a javac with a null-sniffing lint mode,
> the sooner we can begin experimenting with this stuff.

I agree.

> 
> We can even experiment with Map, List, Function, etc., in their current
> erased forms.  The idea would be "values work like ints, so use these
> APIs with values just like you would with nullable Integer wrappers".
> 
> For List.of (and any other null-rejecting API) this will just work out
> of the box.
> 
> For Map we'll have to train ourselves to avoid Map.get and use methods
> like computeIfAbsent.  We might want to add more methods to make
> it easier to avoid nulls, such as Map.getOrElse(K,V) which returns V
> if the map entry is not present.

Map.getOrDefault() already exists :)
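
To make the point concrete, here is a minimal sketch of null-free lookups using the existing Map.getOrDefault and Map.computeIfAbsent. Integer stands in for a value type (in the spirit of "values work like ints"); the class and method names are hypothetical and only serve the illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class NullFreeLookup {
    // Returns the mapped value, or the fallback when the key is absent.
    // No null ever reaches the caller as long as stored values are non-null.
    static int lookup(Map<String, Integer> map, String key, int fallback) {
        return map.getOrDefault(key, fallback);
    }

    // Returns the mapped value, inserting a computed one if the key is absent.
    // The key length is an arbitrary stand-in for a real default computation.
    static int lookupOrInsert(Map<String, Integer> map, String key) {
        return map.computeIfAbsent(key, k -> k.length());
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("pi", 3);
        System.out.println(lookup(m, "pi", -1));
        System.out.println(lookup(m, "e", -1));
        System.out.println(lookupOrInsert(m, "tau"));
    }
}
```

Both calls avoid the "null means absent" signal of Map.get, so nothing has to be null-checked after the erased cast.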

> 
> We can also define an interface type which is somehow related to
> concrete value types (this should be a template but it also works
> in the erased world):
> 
> interface ValueRef<V extends ValueRef>  {
>  default V byValue() { return (V) this; }
> }
> 
> The idea would be to make every value type explicitly (via javac)
> or implicitly (via JVM magic) implement this interface, instantiated
> to itself, of course.  This interface would then stand for a nullable
> version of the value type itself, and could be used safely with
> Map.
> 
> Map<String, ValueRef<ComplexDouble>> m = …;
> val ref = m.get("pi");
> if (ref == null)  return "no pi for you";
> val pi = ref.byValue();  // NPE if we didn't check first!
> return "pi="+pi;
> 
> Here, ComplexDouble to ValueRef<ComplexDouble> is like
> int is to Integer.  …Pretty much, although the conversion between
> CD and VR<CD> is one way only while int and Integer convert
> both ways.  Also VR still has no object identity, which is fine.
> More importantly, in L-world erased generics will accept both
> ComplexDouble and ValueRef<ComplexDouble>.
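
The ValueRef idea quoted above can be tried out in today's Java, as a rough sketch only. Assumptions: an ordinary final class stands in for the value type (real value types do not exist yet), `val` is replaced by explicit types, the bound is tightened to `V extends ValueRef<V>` to avoid a raw type, and the names ValueRefSketch and describe are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ValueRefSketch {
    // Nullable "reference view" of a value type, mirroring the quoted interface.
    interface ValueRef<V extends ValueRef<V>> {
        @SuppressWarnings("unchecked")
        default V byValue() { return (V) this; }
    }

    // Plain final class standing in for a value type (an assumption of this sketch).
    static final class ComplexDouble implements ValueRef<ComplexDouble> {
        final double re, im;
        ComplexDouble(double re, double im) { this.re = re; this.im = im; }
        @Override public String toString() { return re + "+" + im + "i"; }
    }

    // Map.get may return null; the null check happens on the ValueRef side,
    // before converting to the (never-null) value side.
    static String describe(Map<String, ValueRef<ComplexDouble>> m, String key) {
        ValueRef<ComplexDouble> ref = m.get(key);
        if (ref == null) return "no " + key + " for you";
        ComplexDouble value = ref.byValue();
        return key + "=" + value;
    }

    public static void main(String[] args) {
        Map<String, ValueRef<ComplexDouble>> m = new HashMap<>();
        m.put("pi", new ComplexDouble(3.14, 0.0));
        System.out.println(describe(m, "pi"));
        System.out.println(describe(m, "e"));
    }
}
```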

It's not clear to me that we have to do anything. When you call List.add or Map.put, those methods take an Object, so the value type will be buffered;
then, when it is stored in an array (ArrayList) or a field (HashMap), it is boxed.
Now when you call List.get() or Map.get(), the boxed object is returned and transformed back into a value type when it hits the cast inserted by the compiler because of erasure.
If someone calls Map.get() with an unknown key, null is returned and the cast fails (with the extended semantics for cast) because null is not a valid value for a value type.
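
Today's closest analogue is unboxing. In the sketch below, Integer/int stand in for the boxed and unboxed forms of a value type, and the implicit unboxing plays the role of the null-rejecting cast. The class and method names are hypothetical, and this only approximates the proposed checkcast semantics.

```java
import java.util.HashMap;
import java.util.Map;

public class ErasedGetDemo {
    // Reads a key and converts the erased result back to the "value" form.
    // For an unknown key, Map.get returns null and the conversion throws.
    static String read(Map<String, Integer> map, String key) {
        try {
            int value = map.get(key); // unboxing: NPE if get returned null
            return "value=" + value;
        } catch (NullPointerException e) {
            return "NPE at the conversion";
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("pi", 3);
        System.out.println(read(m, "pi"));
        System.out.println(read(m, "e"));
    }
}
```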

It's more problematic for APIs that pass null as an argument to a lambda: if the lambda expects a value type, the user will get an NPE in the bridge at runtime.
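
Map.compute is the case mentioned earlier in the thread: for an absent key it passes null to the remapping function. A small sketch with Integer standing in for a value type shows the failure, and shows Map.merge as one existing alternative that never hands null to the function. The class and method names are hypothetical; with real value types the NPE would come from the bridge rather than from unboxing.

```java
import java.util.HashMap;
import java.util.Map;

public class NullToLambdaDemo {
    // Increment via compute: for an absent key the remapping function
    // receives null as the old value, and unboxing it throws.
    static String bumpWithCompute(Map<String, Integer> counts, String key) {
        try {
            int updated = counts.compute(key, (k, old) -> old + 1);
            return "count=" + updated;
        } catch (NullPointerException e) {
            return "NPE in lambda";
        }
    }

    // Increment via merge: the function only runs when an old value exists,
    // so it never sees null.
    static String bumpWithMerge(Map<String, Integer> counts, String key) {
        int updated = counts.merge(key, 1, Integer::sum);
        return "count=" + updated;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        System.out.println(bumpWithCompute(counts, "a"));
        System.out.println(bumpWithMerge(counts, "a"));
        System.out.println(bumpWithCompute(counts, "a"));
    }
}
```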

So the question of whether the cast should do a null check, or whether a null check should be introduced implicitly, reduces to this: is sending null to a lambda a common case or not?

And BTW, another solution is to introduce a new opcode, vcheckcast, that does nullcheck+checkcast while checkcast doesn't. The compiler would then use vcheckcast in caller code (because of erasure) and checkcast in the bridge code, so that a user is able to do the null check themselves. This model is more complex for javac because it means that a value type taken as a parameter may be null while a value type that comes from a return value cannot be null.

> 
>> How we proceed might depend on whether specialized generics progresses at a
>> slower rate than value types.
> 
> Indeed.  And I'm pretty sure we will be ready to ship a workable value
> type system before we have finished figuring out the specialized generics.

I agree.

> 
> — John

Rémi