value type hygiene

John Rose john.r.rose at oracle.com
Tue May 8 04:25:53 UTC 2018


On May 7, 2018, at 6:06 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> 
> Thanks for sharing this!
> 
> I like the null containment approach. It recognizes that nulls (for better or worse) are a thing in the ref world but stops the blighters from infecting the value world at the borders.
> 
> We will need to extend this hygiene to javac and the libraries. 

For any particular null-rejecting behavior, we have a choice of
doing it in the VM only, doing it in javac only (Objects.requireNN),
or doing it in both.  I think this is true for all of the null checks
proposed.  (Fourth choice:  Neither; don't reject nulls and hope
they don't sneak in.)

A good example is checkcast.  javac knows each value type
that is the subject of a cast, so the checkcast bytecode doesn't
necessarily have to include a null check; javac could follow each
checkcast by a call to O.rNN.  A few considerations tip me towards
putting it into the instruction rather than letting javac have the job:

 - When wouldn't we add O.rNN after checkcast Q?  If it's always
there, isn't it safer to fold it into the checkcast?
 - No temptation for bytecode generators to fool around with
"optimizing away" the null check.
 - The JVM has more complete information about APIs and schemas,
so it can better optimize away the checks than javac can.
 - The amended checkcast corresponds better to the source-level
operation.
 - Code complexity is better (smaller methods) if the bytecode has
a higher-level behavior.

These same considerations apply to all the cases of dynamic null
rejection:
 - checkcast
 - null check of incoming parameters
 - null check of received return values (after invoke* to legacy code)
 - null check of field reads (from legacy fields, which might be null)
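The four sites above can be sketched in today's Java, with Objects.requireNonNull standing in for the checks javac would otherwise have to emit. (ComplexDouble here is a hypothetical plain final class standing in for a value type, since none of this needs L-world to illustrate.)

```java
import java.util.Objects;

// Hypothetical stand-in for a value type -- an ordinary final class,
// so this sketch runs on any current JVM.
final class ComplexDouble {
    final double re, im;
    ComplexDouble(double re, double im) { this.re = re; this.im = im; }
}

public class NullRejectionSites {
    // Incoming parameter of declared value type: javac (or the JVM)
    // would reject null on entry.
    static double magnitudeSquared(ComplexDouble z) {
        Objects.requireNonNull(z);          // parameter null check
        return z.re * z.re + z.im * z.im;
    }

    // checkcast plus the null check a null-rejecting checkcast would
    // fold into one operation; also covers a value received back from
    // legacy (nullable) code.
    static ComplexDouble fromLegacy(Object maybeNull) {
        ComplexDouble z = (ComplexDouble) maybeNull;  // today's checkcast passes null through
        return Objects.requireNonNull(z);             // so a separate check is needed
    }

    public static void main(String[] args) {
        Object legacy = new ComplexDouble(3, 4);
        System.out.println(magnitudeSquared(fromLegacy(legacy)));  // prints 25.0
    }
}
```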

Although it's easy to imagine javac putting O.rNN in all of
these places, it will become annoying enough that we'll want
to let the JVM do it.  The JVM will have on-line information about
some things, such as which methods or fields might be legacy
code, which allows it to omit the checks in many cases.

If the bytecodes conspire to prevent nulls from getting into
verifiable Q-types, then the verifier can add robustness by
excluding mixes of aconst_null with Q-types, which is a value
add.  This checking inherently happens after javac runs, so it's not
possible if we require javac to do some or all of the null checking explicitly.

> Javac could fail to compile when it knows enough, and in other cases place in explicit null checks if not otherwise performed by existing instructions so as to fail fast.

I think it's the case that if we adopt all of the null-checking proposals
I wrote, then javac won't need any explicit ones.  I think that's one
possible sweet spot for the design.  I currently don't see a sweeter
spot, in fact.

> Certain APIs that rely on null as a signal will need careful reviewing and possible adaptation if the prevention has some side effects, and maybe errors/warnings from javac. The poster child being Map.get, but others like Map.compute are problematic too (if a value is not present for a key, then a null value is passed to the remapping function).

Yep.  The sooner we implement (a) a JVM that has a clean upward
model for value types and (b) a javac with a null-sniffing lint mode,
the sooner we can begin experimenting with this stuff.

We can even experiment with Map, List, Function, etc., in their current
erased forms.  The idea would be "values work like ints, so use these
APIs with values just like you would with nullable Integer wrappers".

For List.of (and any other null-rejecting API) this will just work out
of the box.
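List.of really does reject nulls eagerly, throwing NullPointerException at construction time, so a quick check of the out-of-the-box behavior:

```java
import java.util.List;

public class ListOfNulls {
    public static void main(String[] args) {
        List<Integer> ok = List.of(1, 2, 3);       // null-free: constructed fine
        System.out.println(ok);                    // prints [1, 2, 3]
        try {
            List.of(1, null, 3);                   // any null element is rejected
        } catch (NullPointerException e) {
            System.out.println("null rejected");   // prints "null rejected"
        }
    }
}
```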

For Map we'll have to train ourselves to avoid Map.get and use methods
like computeIfAbsent.  We might want to add more methods to make
it easier to avoid nulls, such as Map.getOrElse(K,V), which returns the
given default V when the map entry is not present.
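In fact, Map.getOrDefault (present since Java 8) already has roughly the shape described for getOrElse; together with computeIfAbsent it lets callers stay out of null's way:

```java
import java.util.HashMap;
import java.util.Map;

public class NullFreeMapUse {
    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();

        // Map.get would return null for a missing key; getOrDefault
        // returns the supplied fallback instead.
        int missing = counts.getOrDefault("absent", 0);     // 0, never null

        // computeIfAbsent installs and returns a non-null value, so
        // the caller never observes null either.
        int hits = counts.computeIfAbsent("hits", k -> 1);  // 1

        System.out.println(missing + " " + hits);           // prints "0 1"
    }
}
```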

We can also define an interface type which is somehow related to
concrete value types (this should be a template but it also works
in the erased world):

interface ValueRef<V extends ValueRef<V>>  {
  default V byValue() { return (V) this; }  // unchecked, but safe when V is the implementing type
}

The idea would be to make every value type explicitly (via javac)
or implicitly (via JVM magic) implement this interface, instantiated
to itself, of course.  This interface would then stand for a nullable
version of the value type itself, and could be used safely with
Map.

Map<String, ValueRef<ComplexDouble>> m = …;
var ref = m.get("pi");
if (ref == null)  return "no pi for you";
var pi = ref.byValue();  // NPE if we didn't check first!
return "pi="+pi;
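The whole pattern can be exercised today in erased form; here is a runnable sketch, again with ComplexDouble as a hypothetical final class standing in for a value type:

```java
import java.util.HashMap;
import java.util.Map;

// Erased sketch of the ValueRef idea on a current JVM.
interface ValueRef<V extends ValueRef<V>> {
    @SuppressWarnings("unchecked")
    default V byValue() { return (V) this; }
}

// Hypothetical stand-in for a value type, instantiating the
// interface to itself.
final class ComplexDouble implements ValueRef<ComplexDouble> {
    final double re, im;
    ComplexDouble(double re, double im) { this.re = re; this.im = im; }
}

public class ValueRefDemo {
    // Map.get may return null here, exactly as with a nullable Integer;
    // byValue() crosses back to the "value" side only after the null test.
    static String lookup(Map<String, ValueRef<ComplexDouble>> m, String key) {
        ValueRef<ComplexDouble> ref = m.get(key);
        if (ref == null) return "no pi for you";
        ComplexDouble pi = ref.byValue();
        return "pi=" + pi.re;
    }

    public static void main(String[] args) {
        Map<String, ValueRef<ComplexDouble>> m = new HashMap<>();
        m.put("pi", new ComplexDouble(Math.PI, 0));
        System.out.println(lookup(m, "pi"));
        System.out.println(lookup(m, "e"));   // prints "no pi for you"
    }
}
```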

Here, ComplexDouble is to ValueRef<ComplexDouble> as
int is to Integer.  …Pretty much, although the conversion between
CD and VR<CD> is one way only while int and Integer convert
both ways.  Also VR still has no object identity, which is fine.
More importantly, in L-world erased generics will accept both
ComplexDouble and ValueRef<ComplexDouble>.

> How we proceed might depend on whether specialized generics progresses at a slower rate than value types.

Indeed.  And I'm pretty sure we will be ready to ship a workable value
type system before we have finished figuring out the specialized generics.

— John




More information about the valhalla-spec-experts mailing list