Default-zeroness as a special case of non-nullability

Wed May 4 16:46:13 UTC 2022

Reading April's thread on user model stacking, the proposed splitting
of the three buckets into individual "knobs" does sound like a promising
solution, although I wonder whether some options might sound scary (or
not scary enough!) to us developers who prefer not to think about the
intricacies of memory layout in the JVM. Factoring out the atomicity
decision is a good idea; tearing could lead to some nasty surprises.

However, I want to address the topic of zero-defaultness of (former?) B3 
primitives, since it introduces an enticing property previously denied 
to user types: non-nullability. People have wanted it from Java for
years, and its absence makes them look jealously towards other languages.
Primitive classes will not only be considered for their performance 
improvements, but also for their null safety. Some people *will* ignore 
all the red flags (all-zero default, non-atomicity in the old user 
model) and declare primitives just because of this property, even though 
their default makes no sense or even violates the class's invariant. But 
one uninitialized field later, instead of a NullPointerException, they 
will get a good old January 1st, 1970 for their CustomInstant, which is 
not that much better.
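
To make that failure mode concrete, here is a minimal sketch in today's
Java. It only emulates the all-zero default with a plain long field; it
uses no Valhalla syntax, and CustomInstant, DEFAULT and Meeting are
hypothetical names picked for illustration.

```java
import java.time.Instant;

// Hypothetical stand-in for a zero-default primitive class: its only
// state is a long, so the "all-zero" instance is epoch second 0.
final class CustomInstant {
    // Emulates the implicit default a zero-default type would have.
    static final CustomInstant DEFAULT = new CustomInstant(0L);

    private final long epochSecond;

    CustomInstant(long epochSecond) {
        this.epochSecond = epochSecond;
    }

    Instant toInstant() {
        return Instant.ofEpochSecond(epochSecond);
    }
}

final class Meeting {
    // Assumption: with a zero-default type this field would silently hold
    // the all-zero value if nobody assigned it; emulated here explicitly.
    CustomInstant start = CustomInstant.DEFAULT;
}
```

Reading new Meeting().start.toInstant() yields the epoch, and no
exception points at the missing assignment.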

So: why not offer full choice of nullability at the use site, where we
can force the user to pick a value?

Obviously, this is out of scope of Valhalla, but I'd like to ask whether 
the proposed user model can leave that door open for the future, seeing 
that Brian Goetz has thrown 'T!' into the ring as a possible spelling of 
.val/.zero (I know, bikeshedding!).

In a world where non-nullability and zero-default are separate things,
'Instant!' would just mean the former; the only difference would be how 
fields and arrays of this type are created. Zero-default types get away 
with not initializing any of them. On the other hand, mere non-nullable 
type declarations could require the user to specify a value following 
the same rules as final fields. Trying to assign null at any point in
time to a _non-null field or variable would fail. The tricky part is
arrays; but here Kevin Bourrillion proposed requiring a fill value:

On Wed, Apr 27, 2022 at 9:36 PM Kevin Bourrillion <kevinb at google.com> wrote:

> By the way, as we talk about this zero problem, these are the example
> cases that go through my head: <draft, please help refine this>

> (Type R) e.g. Rational, EmployeeId: the default value is illegal; can't
> even construct it on purpose. Every method on it *should* call
> `checkValid()` first. Might as well repurpose it as a pseudo-null. Bugs
> could be prevented by some analogue of aftermarket nullness analysis.

> (Type I) e.g. Instant: the default value is legal, but it's a bad default
> value (while moderately guessable, it's arbitrary/meaningless). This makes
> the strongest case for being reference-only. Or it has to add a `boolean
> isValid` field (always set to true) to join Type R above.

> (Type C) e.g. Complex: the default value is a decent choice -- guessable,
> but probably not the identity of the *most* common reduction op (which I
> would guess is multiplication).

> (Type O) e.g. Optional, OptionalInt, UnsignedLong: the default value is the
> best possible kind -- guessable, and the identity of the presumably most
> common reduction operation.

> For type I, we would probably ban nonempty array instance creation
> expressions! This would force the arrays to be created by
> `Collection.toArray()` or by new alternative value-capable versions of
> `Arrays.fill()` and `Arrays.setAll()` which accept a size instead of a
> premade array. Actually, if the new Arrays.fill() could short-circuit when
> passed `TheType.default` then we might want to do this for types C and O
> too; why not make users be explicit.
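
A rough sketch of what such a size-accepting fill could look like in
today's Java. FilledArrays.filled is a hypothetical helper, not an
existing or proposed JDK method, and the short-circuit for
`TheType.default` mentioned in the quote is not modeled here.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.function.IntFunction;

final class FilledArrays {
    private FilledArrays() {}

    // Hypothetical size-accepting variant of Arrays.fill(): allocates the
    // array and fills every slot before the caller can observe it, so no
    // element is ever seen in its uninitialized state.
    static <T> T[] filled(IntFunction<T[]> allocator, int size, T fillValue) {
        Objects.requireNonNull(fillValue, "fillValue");
        T[] array = allocator.apply(size);
        Arrays.fill(array, fillValue);
        return array;
    }
}
```

A call such as FilledArrays.filled(String[]::new, 3, "x") returns an
array whose three slots all hold the given value, and passing null as
the fill value is rejected up front.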

(Personally, I would prefer _zero-ok types to always require explicitly
assigning the default, not only for arrays. It always felt wrong that
linters and IDEs would yell at me for making "0" and "false" explicit.)

There is, of course, the topic of serialization. While fields of
primitive types that are missing from the serialized stream fall back to
their zero default, arbitrary _non-null types have no such fallback.

Summarizing the proposal:

* a _non-null B1 does not differ from a nullable B1, except for
  additional constraints on declaration and assignment
* a _non-null B2 does not need a null channel if the compiler can prove
  that access to an uninitialized value is impossible (no leaking 'this'
  before assignment, no subclass/static access), increasing the chance
  of being flattened on the heap
* a _non-null B3 is a _non-null B2 with an optimized default value that
  may be accessed before initialization; atomicity is a separate knob
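
For comparison, this is roughly how the "same rules as final fields"
constraint can be approximated in today's Java. Event is a hypothetical
class; the definite-assignment check on the final field is enforced by
the compiler, while the null check is still only a run-time one, which
is exactly the gap a _non-null declaration would have to close.

```java
import java.time.Instant;
import java.util.Objects;

final class Event {
    // A final field must be definitely assigned in every constructor;
    // the compiler rejects the class otherwise.
    private final Instant start;

    Event(Instant start) {
        // Today the null check happens at run time; a _non-null type
        // would presumably turn it into a compile-time guarantee.
        this.start = Objects.requireNonNull(start, "start");
    }

    Instant start() {
        return start;
    }
}
```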