Consolidating the user model
Brian Goetz
brian.goetz at oracle.com
Tue Nov 2 21:18:46 UTC 2021
We've been grinding away, and we think we have a reduced-complexity user
model. This is all very rough, and there's lots we need to write up
more carefully first, but I'm sharing this as a preview of how we can
simplify past where JEPs 401 and 402 currently stand.
# Consolidating the user model
As the mechanics of primitive classes have taken shape, it is time to take
another look at the user model.
Valhalla started with the goal of providing user-programmable classes which
could be flat and dense in memory. Numerics are one of the motivating use
cases, but adding new primitive types directly to the language has a
very high
barrier. As we learned from [Growing a Language][growing], there are
infinitely
many numeric types we might want to add to Java, but the proper way to
do that
is as libraries, not as language features.
In the Java language as we have today, objects and primitives are
different in
almost every way: objects have identity, primitives do not; objects are
referred
to through references, primitives are not; object references can be null,
primitives cannot; objects can have mutable state, primitives cannot;
classes
can be extended, primitive types cannot; loading and storing of object
references is atomic, but loading and storing of large primitives is
not. For
obvious reasons, the design center has revolved around the
characteristics of
primitives, but the desire to have it both ways is strong; developers
continue
to ask for variants of primitive classes that have a little more in
common with
traditional classes in certain situations. These include:
- **Nullability.** By far the most common concern raised about primitive
classes, which "code like a class", is the treatment of null; many
developers
want the benefits of flattening but want at least the option to have
`null`
as the default value, and to get an exception when an uninitialized
instance
is used.
- **Classes with no sensible default.** Prior to running the
constructor, the
JVM initializes all memory to zero. Since primitive classes are
routinely
stored directly rather than via reference, it is possible that users
might be
exposed to instances in this initial, all-zero state, without a
constructor
having run. For numeric classes such as complex numbers, zero is a fine
default, and indeed a good default. But for some classes, not only
is zero
not the best default, but there _is no good default_. Storing dates as
seconds-since-epoch would mean uninitialized dates are interpreted
as Jan 1,
1970, which is more likely to be a bug than the desired behavior.
Classes
may try to reject bad values in their constructor, but if a class has no
sensible default, its author would rather have a default that behaves
more
like null, where you get an error if you dereference it. And if the
default
is going to behave like null, it's probably best if the default _is_
null.
- **Migration**. Classes like `Optional` and `LocalDate` today are
_value-based_, meaning they already disavow the use of object
identity and
therefore are good candidates for being primitive classes. However,
since
these classes exist today and are used in existing APIs and client
code, they
would have additional compatibility constraints. They would have to
continue
to be passed by object references to existing API points (otherwise the
invocation would fail to link) and these types are already nullable.
- **Non-tearability.** 64-bit primitives (`long` and `double`) risk
_tearing_
when accessed under race unless they are declared `volatile`. However,
objects with final fields offer special initialization-safety guarantees
under the JMM, even under race. So should primitive classes be more
like
primitives (risking being seen to be in impossible states), or more like
classes (consistent views for immutable objects are guaranteed, even
under
race)? Tear-freedom has potentially significant costs, and tearing has
significant risks, so it is unlikely one size fits all.
- **Direct control over flattening.** In some cases, flattening is
counterproductive. For example, if we have a primitive class with many
fields, sorting a flattened array may be more expensive than sorting
an array
of references; while we don't pay the indirection costs, we do pay for
increased footprint, as well as increased memory movement when swapping
elements. Similarly, if we want to permute an array with a side
index, it
may well be cheaper to maintain an array of references rather than
copying
all the data into a separate array.
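The "no sensible default" hazard can be seen with today's Java. Below is a minimal sketch (the `EpochDate` class is hypothetical, invented for illustration) of how an all-zero instance of a seconds-since-epoch date silently decodes as January 1, 1970:

```java
import java.time.Instant;

public class ZeroDefaultDemo {
    // A hypothetical date class storing seconds-since-epoch. Before any
    // constructor runs, the VM's all-zero default is in place.
    static class EpochDate {
        long seconds; // defaults to 0L

        Instant toInstant() {
            return Instant.ofEpochSecond(seconds);
        }
    }

    public static void main(String[] args) {
        EpochDate uninitialized = new EpochDate(); // field left at its zero default
        // The all-zero state silently decodes as Jan 1, 1970 rather than
        // failing fast -- more likely a bug than the intended value.
        System.out.println(uninitialized.toInstant()); // prints 1970-01-01T00:00:00Z
    }
}
```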
These requests are all reasonable when taken individually; it's easy to
construct
use cases where one would want it both ways for any given
characteristic. But
having twelve knobs (and 2^12 possible settings) on primitive classes is
not a
realistic option, nor does it result in a user model that is easy to reason
about.
In the current model, a primitive class is really like a primitive -- no
nulls,
no references, always flattened, tearable when large enough. Each primitive
class `P` comes with a companion reference type (`P.ref`), which behaves
much as
boxes do today (except without identity.) There is also, for migration, an
option (`ref-default`) to invert the meaning of the unqualified name, so
that by
default `Optional` means `Optional.ref`, and flattening must be explicitly
requested (which, in turn, is the sole motivation for the `P.val`
denotation). We
would like for the use of the `.ref` and `.val` qualifiers to be rare, but
currently they are not rare enough for comfort.
Further, we've explored but have not committed to a means of declaring
primitive
classes which don't like their zero value, for primitive classes with no
good
default, so that dereferencing a zero value would result in some sort of
exception. (The nullability question is really dominated by the
initialization
safety question.) This would be yet another variant of primitive class.
A serious challenge to this stacking is the proliferation of options;
there are
knobs for nullability, zero-hostility, migration, tear-resistance, etc.
Explaining when to use which at the declaration site is already
difficult, and
there is also the challenge of when to use `ref` or `val` at the use
site. The
current model has done well at enumerating the requirements (and helping us
separate the real ones from the wannabes), so it is now time to consolidate.
## Finding the buckets
Intuitively, we sense that there are three buckets here; traditional
identity
classes in one bucket, traditional primitives (coded like classes) in
another,
and a middle bucket that offers some "works like an int" benefits but
with some
of the affordances (e.g., nullability, non-tearability) of the first.
Why have multiple buckets at all? Project Valhalla has two main goals:
better
performance (enabling more routine flattening and better density), and
unifying
the type system (healing the rift between primitives and objects.) It's
easy to
talk about flattening, but there really are at least three categories of
flattening, and different ones may be possible in different situations:
- **Heap flattening.** Inlining the layout of one object into another
object
(or array) layout; when class `C` has a field of type `D`, rather than
indirecting to a `D`, we inline `D`'s layout directly into `C`.
- **Calling convention flattening.** Shredding a primitive class into its
fields in (out-of-line) method invocations on the call stack.
- **IR flattening.** When calling a method that allocates a new
instance and
returns it, eliding the allocation and shredding it into its fields
instead.
This only applies when we can inline through from the allocation to the
consumption of its fields. (Escape analysis also allows this form of
flattening, but only for provably non-escaping objects. If we know the
object is identity free, we can optimize in places where EA would fail.)
#### Nullability
Variables in the heap (fields and array elements) must have a default
value; for
all practical purposes it is a forced move that this default value is the
all-zero-bits value. This zero-bits value is interpreted as `null` for
references, zero for numerics, and `false` for booleans today.
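This interpretation of the zero bits is directly observable today; a minimal sketch:

```java
// A minimal sketch of how the VM's zero bits are read back through each
// field's type: zero for numerics, false for booleans, null for references.
public class DefaultsDemo {
    static int anInt;          // zero bits read as 0
    static double aDouble;     // zero bits read as 0.0
    static boolean aBoolean;   // zero bits read as false
    static Object aReference;  // zero bits read as null

    public static void main(String[] args) {
        System.out.println(anInt);      // prints 0
        System.out.println(aDouble);    // prints 0.0
        System.out.println(aBoolean);   // prints false
        System.out.println(aReference); // prints null
    }
}
```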
If primitives are to "code like a class", the constructor surely must be
able to
reject bad proposed states. But what if the constructor thinks the default
value is a bad state? The desire to make some primitive classes
nullable stems
from the reality that for some classes, we'd like a "safe" default --
one that
throws if you try to use it before it is initialized.
But, the "traditional" primitives are not nullable, and for good reason;
zero is
a fine default value, and the primitives we have today typically use all
their
bit patterns, meaning that arranging for a representation of null
requires at
least an extra bit, which in reality means longs would take at least 65 bits
(which in reality means 128 bits most of the time.)
So we see nullability is a tradeoff; on the one hand, it gives us protection
from uninitialized variables, but also has costs -- extra footprint, extra
checks. We experimented with a pair of modifiers `null-default` and
`zero-default`, which would determine how the zero value is
interpreted. But
this felt like solving the problem at the wrong level.
#### Tearing
The Java Memory Model includes special provisions for visibility of final
fields, even when the reference to their container object is shared via
a data
race. These initialization safety guarantees are the bedrock of the Java
security model; a String being seen to change its value -- or to not respect
invariants established by its constructor -- would make it nearly
impossible to
reason about security.
On the other hand, longs and doubles permit tearing when shared via data
races.
This isn't great, but preventing tearing has a cost, and the whole
reason we got
primitives in 1995 was driven by expectations and tradeoffs around
arithmetical
performance. Preventing tearing is still quite expensive; above 64
bits, atomic
instructions have a significant tax, and often the best way to manage
tearing is
via an indirection when stored in the heap (which is precisely what
flattening
is trying to avoid.)
When we can code primitives "like a class", which should they be more
like? It
depends! Classes that are more like numerics may be willing to tolerate
tearing
for the sake of improved performance; classes that are more like
"traditional
classes" will want the initialization safety afforded to immutable objects
already.
So we see tearability is a tradeoff; on the one hand, it protects invariants
from data races, but also has costs -- expensive atomic instructions, or
reduced
heap flattening. We experimented with a modifier that marks classes as
non-tearable, but this would require users to keep track of which primitive
classes are tearable and which aren't. This felt like solving the
problem at
the wrong level.
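The hazard of "being seen in an impossible state" can be sketched in today's Java. This is an illustration, not a demonstration of long/double tearing itself: the writer maintains an invariant across two plain fields, and an unsynchronized reader *may* observe it broken mid-update, which is analogous to observing torn halves of a flattened multi-word value. The observation count is inherently nondeterministic.

```java
// The writer keeps the invariant x == y at every logical update; without
// synchronization, a racing reader may observe x != y -- an "impossible"
// intermediate state, analogous to tearing a multi-word flattened value.
public class RaceDemo {
    static long x = 0, y = 0; // writer's intended invariant: x == y

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            for (long i = 1; i <= 1_000_000; i++) {
                x = i; // between these two writes the invariant is broken...
                y = i; // ...and an unsynchronized reader can see it broken
            }
        });
        Thread reader = new Thread(() -> {
            int inconsistent = 0;
            for (int i = 0; i < 1_000_000; i++) {
                long a = x, b = y;
                if (a != b) inconsistent++;
            }
            // Nondeterministic: may be zero, may be large.
            System.out.println("inconsistent observations: " + inconsistent);
        });
        writer.start(); reader.start();
        writer.join(); reader.join();
        // After both threads are joined, the invariant holds again.
        System.out.println(x == y);
    }
}
```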
#### Migration
There are some classes -- such as `java.lang.Integer`, or
`java.util.Optional`
-- that meet all the requirements to be declared as (nullable) primitive
classes, but which exist today as identity classes. We would like to
be able
to migrate these to primitives to get the benefits of flattening, but are
constrained that (at least for non-private API points) they must be
represented
as `L` descriptors for reasons of binary compatibility. Our existing
interpretation of `L` descriptors is that they represent references as
pointers;
this means that even if we could migrate these types, we'd still give up
on some
forms of flattening (heap and stack), and our migration would be less than
ideal.
Worse, the above interpretation of migration suggests that sometimes a
use of
`P` is translated as `LP`, and sometimes as `QP`. To the degree that
there is
uncertainty in whether a given source type translates to an `L` or `Q`
descriptor, this flows into either uncertainty of how to use reflection
(users
must guess as to whether a given API point using `P` was translated with
`LP` or
`QP`), or uncertainty on the part of reflection (the user calls
`getMethod(P.class)`, and reflection must consider methods that accept
both `LP`
and `QP` descriptors.)
## Restacking for simplicity
The various knobs on the user model (which may flow into translation and
reflection) risk being death by 1000 cuts; they not only add complexity
to the
implementation, but they add complexity for users. This prompted a
rethink of
assumptions at every layer.
#### Nullable primitives
The first part of the restacking involved relaxing the assumption that
primitive
classes are inherently non-nullable. We shied away from this for a long
time,
knowing that there would be significant VM complexity down this road,
but in the
end concluded that the complexity is better spent here than elsewhere.
These
might be translated as `Q` descriptors, or might be translated as `L`
descriptors with a side channel for preloading metadata -- stay tuned for a
summary of this topic.
> Why Q? The reason we have `Q` descriptors at all is that we need to know
things about classes earlier than we otherwise would, in order to make
decisions
that are hard to unmake later (such as layout and calling convention.)
Rather
than interpreting `Q` as meaning "value type" (as the early prototypes
did), `Q`
acquired the interpretation "go and look." When the JVM encounters a
field or
method descriptor with a `Q` in it, rather than deferring classloading
as long
as possible (as is the case with `L` descriptors), we load the class
eagerly, so
we can learn all we need to know about it. From classloading, we might
not only
learn that it is a primitive class, but whether it should be nullable or
not.
(Since primitive classes are monomorphic, carrying this information
around on a
per-linkage basis is cheap enough.)
So some primitive classes are marked as "pure" primitives, and others as
supporting null; when the latter are used as receivers, `invokevirtual`
does a
null check prior to invocation (and NPEs if the receiver is null). When
moving
values between the heap and the stack (`getfield`, `aastore`, etc), these
bytecodes must check for the "flat null" representation in the heap and
a real
null on the stack. The VM needs some help from the classfile to
choose a
bit pattern for the flat null; the most obvious strategy is to inject a
synthetic boolean, but there are others that don't require additional
footprint
(e.g., flow analysis that proves a field is assigned a non-default
value; using
low-order bits in pointers; using spare bits in booleans; using pointer
colors;
etc.) The details are for another day, but we would like for this to not
intrude on the user model.
#### L vs Q
The exploration into nullable primitives prompted a reevaluation of the
meaning
of L vs Q. Historically we had interpreted L vs Q as being "pointer vs
flat"
(though the VM always has the right to unflatten if it feels like it.)
But over
time we've been moving towards Q mostly being about earlier loading (so
the VM
can learn what it needs to know before making hard-to-reverse decisions,
such as
layout.) So let's go there fully.
A `Q` descriptor means that the class must be loaded eagerly (Q for "quick")
before resolving the descriptor; an `L` descriptor means it _must not
be_ (L for
"lazy"), consistent with current JVMS treatment. Since an `L` descriptor is
lazily resolved, we have to assume conservatively that it is nullable; a Q
descriptor might or might not be nullable (we'll know once we load the
class,
which we do eagerly.)
What we've done is wrest control of flatness away from the language,
and ceded
it to the VM, where it belongs. The user/language expresses semantic
requirements (e.g., nullability) and the VM chooses a representation.
That's
how we like it.
#### It's all about the references
The rethink of L vs Q enabled a critical restack of the user model.
With this
reinterpretation, Q descriptors can (based on what is in the classfile)
still be
reference types -- and these reference types can still be flattened;
alternatively,
with side-channels for preload metadata on `L` descriptors, we may be
able to get
to non-flat references under `L` descriptors.
Returning to the tempting user knobs of nullability and tearability, we
can now
put these where they belong: nullability is a property of _reference
types_ --
and some primitive classes can be reference types. Similarly, the
initialization safety of immutable objects derives from the fact that object
references are loaded atomically (with respect to stores of the same
reference.)
Non-tearability is also a property of reference types. (Similarly, references
can break layout circularities.) So rather than the
user choosing nullability and non-tearability as ad-hoc choices, we
treat them
as affordances of references, and let users choose between reference-only
primitive classes, and the more traditional primitive classes, that come
in both
reference and value flavors.
> This restack allows us to eliminate `ref-default` completely (we'll
share more
> details later), which in turn allows us to eliminate `.val` completely.
> Further, the use cases for `.ref` become smaller.
#### The buckets
So, without further ado, let's meet the new user model. The names may
change,
but the concepts seem pretty sensible. We have identity classes, as before;
let's call that the first bucket. These are unchanged; they are always
translated with L descriptors, and there is only one usable `Class`
literal for
these.
The second bucket are _identity-free reference classes_. They come with the
restrictions on identity-free classes: no mutability and limited
extensibility.
Because they are reference types, they are nullable and receive tearing
protection. They are flattenable (though, depending on layout size and
hardware
details, we may choose to get tearing protection by maintaining the
indirection.) These might be translated with Q descriptors, or with modified L
descriptors, but there is no separate `.ref` form (they're already
references)
and there is only one usable `Class` literal for these.
The third bucket are the _true primitives_. These are also identity-free
classes, but further give rise to both value and reference types, and
the value
type is the default (we denote the reference type with the familiar
`.ref`.)
Value types are non-nullable, and permit tearing just as existing
primitives do.
The `.ref` type has all the affordances of reference types --
nullability and
tearing protection. The value type is translated with Q; the reference
type is
translated with L. There are two mirrors (`P.class` and `P.ref.class`) to
reflect the difference in translation and semantics.
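In deliberately hypothetical syntax -- every modifier below is a placeholder, since the names may well change -- the three buckets might be sketched as:

```
// Bucket 1: identity class (today's classes, unchanged)
class Account { ... }              // identity; always an L-translated reference

// Bucket 2: identity-free reference class
value-class LocalDate { ... }      // no identity; nullable, tear-free reference;
                                   // one Class literal, flattening up to the VM

// Bucket 3: true primitive
primitive-class Complex { ... }    // no identity; Complex is the non-nullable
                                   // value type, Complex.ref its reference
                                   // companion; two mirrors, P.class and P.ref.class
```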
A valuable aspect of this translation strategy is that there is a
deterministic,
1:1 correspondence between source types and descriptors.
How we describe the buckets is open to discussion; there are several
possible
approaches. One possible framing is that the middle bucket gives up
identity,
and the third further gives up references (which can be clawed back with
`.ref`), but there are plenty of ways we might express it. If these are
expressed as modifiers, then they can be applied to records as well.
Another open question is whether we double down, or abandon, the
terminology of
boxing. On the one hand, users are familiar with it, and the new
semantics are
the same as the old semantics; on the other, the metaphor of boxing is
no longer
accurate, and users surely have a lot of mental baggage that says "boxes are
slow." We'd like for users to come to a better understanding of the
difference
between value and reference types.
#### Goodbye, direct control over flattening
In earlier explorations, we envisioned using `X.ref` as a way to explicitly
ask for no flattening. But in the proposed model, flattening is entirely
under the control of the VM -- where we think it belongs.
#### What's left for .ref?
A pleasing outcome here is that many of the use cases for `X.ref` are
subsumed
into more appropriate mechanisms, leaving a relatively small set of
corner-ish
cases. This is what we'd hoped `.ref` would be -- something that stays
in the
corner until summoned. The remaining reasons to use `X.ref` at the use site
include:
- Boxing. Primitives have box objects; strict value-based classes need
companion reference types for all the same situations as today's
primitives
do. It would be odd if the box were non-denotable.
- Null-adjunction. Some methods, like `Map::get`, return null to
indicate no
mapping was present. But if in `Map<K,V>`, `V` is not nullable,
then there
is no way to express this method. We envision that such methods
would return
`V.ref`, so that strict value-based classes would be widened to their
"box" on
return, and null would indicate no mapping present.
- Cycle-breaking. Primitives that are self-referential (e.g., linked
list node
classes that have a next node field) would have layout
circularities; using a
reference rather than a value allows the circularity to be broken.
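Today's `Map::get` already relies on null-adjunction, which is exactly the signal that a non-nullable `V` could no longer express; a minimal sketch:

```java
import java.util.HashMap;
import java.util.Map;

// Map::get uses null as the out-of-band "no mapping" signal. If V were a
// non-nullable value type, a method like this would need to return the
// nullable V.ref to keep that signal expressible.
public class MapGetDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("present", 42);
        System.out.println(m.get("present")); // prints 42
        System.out.println(m.get("absent"));  // prints null (no mapping)
    }
}
```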
This list is (finally!) as short as we would like it to be, and devoid of
low-level control over representation; users use `X.ref` when they need
references (either for interop with reference types, or to require
nullability).
Our hope all along was that `.ref` was mostly "break glass in case of
emergency"; I think we're finally there.
#### Migration
The topic of migration is a complex one, and I won't treat it fully here
(the
details are best left until we're fully agreed on the rest.) Earlier
treatments
of migration were limited, in that even with all the complexity of
`ref-default`, we still didn't get all the flattening we wanted, because the
laziness of `L` descriptors kept us from knowing about potential
flattenability
until it was too late. Attempts to manage "preload lists" or "side preload
channels" in previous rounds foundered due to complexity or corner
cases, but
the problem has gotten simpler, since we're only choosing representation
rather
than value sets now -- which means that the `L*` types might work out here.
Stay tuned for more details.
## Reflection
Earlier designs all included some non-intuitive behavior around reflection.
What we'd like to do is align the user-visible types, reflection literals,
and descriptors, following the invariant that

    new X().getClass() == X.class
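For today's classes this invariant already holds, and the existing pair of mirrors for `int` and `Integer` previews the two-mirror pattern a true primitive would follow; a small illustration using only current Java:

```java
// The invariant new X().getClass() == X.class holds for today's classes,
// and int.class vs Integer.class already shows two distinct mirrors for a
// value type and its reference companion.
public class ReflectionDemo {
    public static void main(String[] args) {
        Object s = new String("hello");
        System.out.println(s.getClass() == String.class); // prints true

        // The primitive mirror and its reference companion are distinct today,
        // just as P.class and P.ref.class would be for a true primitive.
        System.out.println(int.class == Integer.class);   // prints false
    }
}
```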
## TBD
Stay tuned for some details on managing null encoding and detection,
reference types under either Q or modified L descriptors, and some
thoughts on painting the bikeshed.
[growing]: https://dl.acm.org/doi/abs/10.1145/1176617.1176621