a new contract for reference types

Wed May 8 17:16:22 UTC 2019

On May 7, 2019, at 11:37 PM, Peter Levart <peter.levart at gmail.com> wrote:
> 
> On 5/2/19 12:29 AM, John Rose wrote:
>> Regarding subtyping, I don't see (from these considerations)
>> a firm reason to declare that V? is a super of V.  The value
>> set of V? *might* have one more point than that of V,
>> or it *might not*.  The reason we are doing V? is not the
>> value set, but the whole contract, which includes the
>> value set as an obvious, but ultimately non-determinative part.
> 
> Just one observation...
> 
> If inline class V was declared to support a "kind" of null (default, sentinel) value by itself, then how would such a value be denoted?
> 
> Is this a way?
> 
> V v = null;

Yes, because there's only one null in the whole world.

(The possibility of having many kinds of null, perhaps
one per type, is something we are trying to avoid.
Null is a costly feature, sometimes referred to as a
billion dollar mistake[1].  We have to deal with it
in Java.  Adding more of them would be even more
costly IMO.)

[1]: https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare

> If this was possible, what would be the distinction between the following two?
> 
> V? vInd1 = null;
> V? vInd2 = v;
> 
> Would vInd1 and vInd2 represent the same "null" value?

Yes, please please please.

A related question is, if V is nullable, then what is the
possible use of `V?`?  If V's value set already has null
in it, then `V?` can't add another null, or re-add the
existing (unique please please) null.

A decent answer to this is, "V?" means "a classic indirection"
as well as "can be null".  I.e., it's the New Contract I've been
writing about.  Usually the user of a JVM doesn't know or care
about where the JVM is using indirections or not, but for
those corner cases where indirections *might* show up
(somehow) "V?" might be a useful way to say, "make
indirections show up here" where just "V" means "use
inline layouts here".  One place where users might
plausibly wish to make the choice is in array layouts.
As Doug pointed out, some array algorithms work better
one way and others another way.  The concurrency effects
are different with the two versions, also.

So if V is a nullable inline class (perhaps migrated from a
classic identity class) you almost never say V?, but you
might do this:

   V[] a1 = new V[1000];  // inline layout
   V[] a2 = new V?[1000];  // indirect layout

You wouldn't even need to make `V?[]` be a user visible type
in order for `new V?[]` to be valid syntax, although it would
be a stretch to design things this way.
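
The difference between the two array layouts above can be imitated by hand
in today's Java.  The following is only an analogy, not the proposed
feature: `Point` is a hypothetical two-field class standing in for V, the
reference array plays the role of the indirect layout, and a packed int[]
plays the role of the inline layout.  All names here are assumptions made
for illustration.

```java
final class LayoutSketch {
    /** Hypothetical two-field class; stands in for an inline class V. */
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    /** Indirect layout: one reference per element; an unset slot reads as null. */
    static Point[] newIndirect(int n) {
        return new Point[n];
    }

    /** Hand-flattened layout: the fields of element i live at 2*i and 2*i+1. */
    static int[] newFlat(int n) {
        return new int[2 * n];
    }

    /** Copies the fields into the flat array. The two stores are not atomic as a pair. */
    static void set(int[] flat, int i, Point p) {
        flat[2 * i] = p.x;
        flat[2 * i + 1] = p.y;
    }

    /** Rebuilds an element from its fields; an unset slot reads as (0, 0), never null. */
    static Point get(int[] flat, int i) {
        return new Point(flat[2 * i], flat[2 * i + 1]);
    }

    public static void main(String[] args) {
        Point[] indirect = newIndirect(4);
        int[] flat = newFlat(4);
        System.out.println(indirect[0] == null);   // unset reference slot
        Point p = get(flat, 0);
        System.out.println(p.x + "," + p.y);       // unset flat slot
    }
}
```

In this sketch a store into the reference array is a single pointer write,
while a store into the flat array touches two words, which is one way the
concurrency effects of the two layouts can differ.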

My bottom line opinion:  We want to make `V?` be a real type
for users even if V is nullable.  But we want to make it a rare
occasion that a user would want to reach for `V?`.  Certainly
the `?` token must *not* become a new "register" keyword,
to be sprinkled superstitiously over our programs.

— John

P.S. Back to Doug's array example:  I would be tempted to
refactor sort algorithms to work on a temporary array of type
int[].  Even for classic object references this might go faster
if it's profitable to eliminate GC barrier effects absent on int[]
but present on Object[].  For multi-word inline array elements,
working on int[] as a proxy for the larger flattened array of
inlines might reduce data movement overall, without too much
extra burden on cache.
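
A minimal sketch of the int[] proxy idea from the P.S., under stated
assumptions: each element is reduced to a long sort key, a stable merge
sort runs over an int[] of indices only, and the larger elements are moved
once at the end.  The class and method names are hypothetical, and the
merge sort is just one possible choice of algorithm.

```java
import java.util.Arrays;

final class IndexSortSketch {
    /** Returns the indices of keys in ascending key order (stable). */
    static int[] sortedOrder(long[] keys) {
        int n = keys.length;
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        mergeSort(keys, order, new int[n], 0, n);
        return order;
    }

    /** Sorts order[lo, hi) by key; only ints are moved during the sort. */
    private static void mergeSort(long[] keys, int[] order, int[] scratch, int lo, int hi) {
        if (hi - lo < 2) return;
        int mid = (lo + hi) >>> 1;
        mergeSort(keys, order, scratch, lo, mid);
        mergeSort(keys, order, scratch, mid, hi);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            // Take from the right half only when strictly smaller, to stay stable.
            scratch[k++] = keys[order[j]] < keys[order[i]] ? order[j++] : order[i++];
        }
        while (i < mid) scratch[k++] = order[i++];
        while (j < hi) scratch[k++] = order[j++];
        System.arraycopy(scratch, lo, order, lo, hi - lo);
    }

    /** Moves each element exactly once into its final position in dst. */
    static <T> void permute(T[] src, int[] order, T[] dst) {
        for (int i = 0; i < order.length; i++) dst[i] = src[order[i]];
    }

    public static void main(String[] args) {
        long[] keys = {30, 10, 20};
        String[] payload = {"c", "a", "b"};
        int[] order = sortedOrder(keys);
        String[] sorted = new String[payload.length];
        permute(payload, order, sorted);
        System.out.println(Arrays.toString(order) + " " + Arrays.toString(sorted));
    }
}
```

Whether this actually runs faster than sorting the elements in place
depends on element size, GC barriers, and cache behavior, so it would
need to be measured.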