Null support for inline class reference projections through bitmaps

Sat Dec 5 13:44:31 UTC 2020

> Since there are types like e.g. LocalDate which would like to be inline classes but don't have a good default value, I was thinking about how the default value for value projections could be avoided and null could be used as default for reference projections efficiently.

For a reference projection, null is a valid value, and is the default.  But, there’s no free lunch; if we have to represent them as references on the stack, then there’s going to be some cost (usually to the calling conventions.)  

The key thing to realize is that null is a property of reference-ness.  And while it is possible for the VM to still scalarize in this case (inflate an extra boolean channel for nullity), this is starting to approach heroics.  
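
To make the “extra boolean channel” idea a bit more concrete, here is a hand-written emulation in plain Java of what passing a nullable value in scalarized form might look like. The names Vec2 and NullChannelSketch are made up for illustration, and this is only a sketch of the shape of the problem, not a description of what the VM does or would do.

```java
// Hypothetical stand-in for an inline class with two int fields.
record Vec2(int x, int y) {}

final class NullChannelSketch {
    // Reference-style entry point: the argument may be null.
    static int sumOrZero(Vec2 v) {
        return v == null ? 0 : v.x() + v.y();
    }

    // Hand-scalarized equivalent: the fields travel as separate arguments,
    // plus one extra boolean that carries the "is null" information.
    static int sumOrZeroScalarized(boolean isNull, int x, int y) {
        return isNull ? 0 : x + y;
    }

    // Bridges from the reference form to the scalarized form, which is
    // roughly the work every call site would have to take on.
    static int call(Vec2 v) {
        return v == null
                ? sumOrZeroScalarized(true, 0, 0)
                : sumOrZeroScalarized(false, v.x(), v.y());
    }
}
```

Every method that accepts such a value grows an extra argument and an extra branch, which is where the cost to the calling convention shows up.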

> 1.2. After value projection array creation, array elements are initialized with default values.
> 
> What if the creation of value projection arrays is only allowed for inline classes that have a default constructor(in the newarray bytecode the VM initializes the cached default value as created by the default constructor) and through an intrinsic method e.g. <any T> T.val[] Arrays.unbox(T.ref[]) that does null checks?

The questions of whether there is _any_ good default, and whether all zero bits is a good default, are frequently conflated.  The “no-arg constructor” direction is only helpful in the case where there is a good default, but all zeroes isn’t it.  And that’s the easier of the two problems to solve (though already not easy.)

Consider a tuple of (String, String) which represents a name.  Is there any good default?  (null, null) is clearly crappy, but (“”, “”) isn’t much better, nor is (“name unavailable”, “name unavailable”).  So there’s no no-arg constructor that could help here.  But, do we want to say “sorry, no flattened arrays of names?”  

The approach of having separate types for the reference and value projections gives users a choice, where cost and safety can be balanced with an understanding of the context.  

> 2. Support nulls for reference projection (efficiently?)
> 
> Maybe you already work on something like that, but so far, I was of the impression that reference projections for inline classes wouldn't be that much better than normal reference types, so I thought about how reference projections could be modeled efficiently.

It’s in the middle.  Since there are pervasive implicit conversions between the two, and the JIT can more easily remove these conversions than it can with boxing, you can use Foo.val in the heap (arrays, fields) and Foo.ref in APIs.  This gets you flattening and density, but not the full benefit of scalarization in calls.  (As mentioned earlier, it might be possible to rescue scalarization in the presence of nulls, but that’s a long way off, if ever.)  

> The general idea I have is to introduce a bitmap to track the nullness of all (transitive) reference projections within a container like an array or object.

Essentially, if you want to adjoin an additional value to the value set of a type, you have two basic choices:

 - find an unused bit pattern (usually requires help from the programmer)
 - use more space, either contiguously (jam in an extra boolean field) or on the side (like your bitmap)

Languages that support an “undefined” value (e.g., Perl) have the same problem.  
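
As a rough illustration of the “on the side” option, here is a minimal sketch in today's Java that emulates a flattened array of nullable values by keeping the payload in primitive arrays and the nullity in a separate bitmap. Point and NullablePointArray are hypothetical names invented for this example; a real VM implementation would look nothing like library code, so treat this purely as a model of the space trade-off.

```java
// Hypothetical stand-in for an inline class with two int fields.
record Point(int x, int y) {}

// Emulates a flattened array of nullable Points: the fields are stored
// inline in int arrays, and nullity is tracked in a side bitmap.
final class NullablePointArray {
    private final int[] xs;
    private final int[] ys;
    private final long[] present; // bit i set => element i is non-null

    NullablePointArray(int length) {
        xs = new int[length];
        ys = new int[length];
        // One bit per element, rounded up to whole 64-bit words.
        // All bits start at zero, so every element starts out as null.
        present = new long[(length + 63) >>> 6];
    }

    int length() {
        return xs.length;
    }

    Point get(int i) {
        java.util.Objects.checkIndex(i, xs.length);
        boolean isPresent = (present[i >>> 6] & (1L << (i & 63))) != 0;
        return isPresent ? new Point(xs[i], ys[i]) : null;
    }

    void set(int i, Point p) {
        java.util.Objects.checkIndex(i, xs.length);
        long mask = 1L << (i & 63);
        if (p == null) {
            present[i >>> 6] &= ~mask; // clear the bit: element is null again
            xs[i] = 0;
            ys[i] = 0;
        } else {
            xs[i] = p.x();
            ys[i] = p.y();
            present[i >>> 6] |= mask;  // set the bit: element is non-null
        }
    }
}
```

Note that a stored Point(0, 0) and a null element are distinguishable only because of the extra bit; the payload alone cannot tell them apart, and reads and writes now touch two places instead of one.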

Stepping back, let’s remember how we got here: primitive classes (note the new name!) are intended to model _values_.  This kind of suggests that all their bits should be able to travel together.  

I think the reality is that if you want nullity in the sense of “nothing there”, then what you want is a reference type, because nulls only make sense when we’re talking about references to objects.  (And saying “I want nullity, but I want a primitive class because it’s faster” is understandable, but mostly wishful thinking.)

I think the more important question here is the other one, which is: what do we do about primitive classes that _are_ values, but which have no good default, such as Date.  Jan 1 1970 is not a good default; countless bugs have resulted from pretending it is.  (Using -1 to initialize dates just moves the problem to Dec 31.)  So there’s really no good default here, any more than (“not a name”, “not a name”) is a good default for a Name tuple of (String, String).  So, what do we do?

There’s a range of options here, trading off runtime cost, intrusiveness to the implementor, and intrusiveness to the client for cases where there is no good default.  

1.  Just use an identity class.  Null is the sensible default.  This isn’t a terrible choice.  Performance of Java today for business applications is pretty darn good. Easy for everyone, but we pay for it.  

2.  Use the val projection in implementation (fields, arrays) and ref projection in APIs.  This ensures safety at the boundary while offering the opportunity to get more flattening internally.  We get most of the performance benefits, and it is relatively unintrusive, except that someone has to stare at that ugly “.ref” token.

3.  Code your primitive class to (a) have no exposed fields and (b) have its methods be defensive about checking for zero on entry.  This gives the user everything they want: fail-fast in the event they accidentally used an uninitialized value, and you get all the flatness, density, scalarization, etc.  Higher intrusiveness to the implementor, no externally visible differences, full runtime cost savings.
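
For what option 3 might look like in practice, here is a minimal sketch using an ordinary final class as a stand-in for a primitive class. SimpleDate, its ZERO constant, and requireInitialized are all hypothetical names; ZERO models the all-zero value that a freshly created flattened array or field would contain, and the sketch relies on the assumption that the factory never produces a value with month 0.

```java
// Ordinary final class standing in for a primitive class (an assumption;
// today's Java has no primitive classes).
final class SimpleDate {
    // Models the all-zero default that a fresh flattened array element
    // or field would hold before anyone assigns a real value.
    static final SimpleDate ZERO = new SimpleDate(0, 0, 0);

    // (a) No exposed fields.
    private final int year;
    private final int month;
    private final int day;

    private SimpleDate(int year, int month, int day) {
        this.year = year;
        this.month = month;
        this.day = day;
    }

    // The only public way to make a value. It rejects month 0, so the
    // all-zero bit pattern can never be a legitimately constructed date.
    // (Day validation is deliberately crude; this is only a sketch.)
    static SimpleDate of(int year, int month, int day) {
        if (month < 1 || month > 12 || day < 1 || day > 31) {
            throw new IllegalArgumentException("invalid date");
        }
        return new SimpleDate(year, month, day);
    }

    // (b) Every method checks for the zero value on entry and fails fast.
    private void requireInitialized() {
        if (month == 0) {
            throw new IllegalStateException("uninitialized SimpleDate");
        }
    }

    int year()  { requireInitialized(); return year; }
    int month() { requireInitialized(); return month; }
    int day()   { requireInitialized(); return day; }

    boolean isBefore(SimpleDate other) {
        requireInitialized();
        other.requireInitialized();
        if (year != other.year) return year < other.year;
        if (month != other.month) return month < other.month;
        return day < other.day;
    }
}
```

Clients never see the check; they just get an exception at the first use of a value nobody initialized, rather than a silently wrong date.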