<div dir="ltr"><div>I got through half of it, maybe more, so far.</div><div><br></div><div>Several of my suggestions are of a similar form, "I would also point out X here and now", because in those places I suspect a nontrivial number of readers may have "but wait a minute" reactions that will be distracting.<br></div><div><br></div><div>Of course, I am happy if this is the end of "primitive classes". :-)</div><div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jul 26, 2022 at 11:18 AM Brian Goetz <<a href="mailto:brian.goetz@oracle.com" target="_blank">brian.goetz@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<font size="4"><font face="monospace">Yet another attempt at
updating SoV to reflect the current thinking. Please review.<br>
<br>
<br>
# State of Valhalla<br>
## Part 2: The Language Model {.subtitle}<br>
<br>
#### Brian Goetz {.author}<br>
#### July 2022 {.date}<br>
<br>
> _This is the second of three documents describing the
current State of<br>
Valhalla. The first is [The Road to Valhalla](01-background);
the<br>
third is [The JVM Model](03-vm-model)._<br>
<br>
This document describes the directions for the Java _language_
charted by<br>
Project Valhalla. (In this document, we use "currently" to
describe the<br>
language as it stands today, without value classes.)<br>
<br>
Valhalla started with the goal of providing user-programmable
classes which can<br>
be flat and dense in memory. Numerics are one of the motivating
use cases;<br>
adding new primitive types directly to the language has a very
high barrier. As<br>
we learned from [Growing a Language][growing] there are
infinitely many numeric<br>
types we might want to add to Java, but the proper way to do
that is via<br>
libraries, not as a language feature.<br>
<br>
## Primitives and objects today<br>
<br>
Java currently has eight built-in primitive types. Primitives
represent pure<br>
_values_; any `int` value of "3" is equivalent to, and
indistinguishable from,<br>
any other `int` value of "3". Because primitives are "just
their bits" with no<br>
ancillary state such as object identity, they are _freely
copyable_; whether<br>
there is one copy of the `int` value "3", or millions, doesn't
matter to the<br>
execution of the program. With the exception of the unusual
treatment of exotic<br>
floating point values such as `NaN`, the `==` operator on
primitives performs a<br>
_substitutability test_ -- it asks "are these two values the
same value".<br></font></font></div></blockquote><div><br></div><div>I've said this before, but I think both "substitutability" and "sameness" just lead to more questions, and I'm not sure why we don't appeal to distinguishability instead.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">
Java also has _objects_, and each object has a unique _object
identity_. This<br>
means that each object must live in exactly one place (at any
given time), and<br>
this has consequences for how the JVM lays out objects in
memory. Objects in<br>
Java are not manipulated or accessed directly, but instead
through _object<br>
references_. Object references are also a kind of value -- they
encode the<br>
identity of the object to which they refer,</font></font></div></blockquote><div><br></div><div>Do we really want to invoke identity here? That surprises me. That suggests that a `ValueClass.ref` instance will have identity too.</div><div>Isn't it really only about the object being addressable or locatable (some term like that)?</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace"> and the `==`
operator on object<br>
references also performs a substitutability test, asking "do
these two<br>
references refer to the same object." Accordingly, object
_references_ (like<br>
other values) can be freely copied, but the objects they refer
to cannot. <br>
<br>
This dichotomy -- that the universe of values consists of
primitives and object<br>
references -- has long been at the core of Java's design. JVMS
2.2 (Data Types)<br>
opens with:<br>
<br>
> There are two kinds of values that can be stored in
variables, passed as<br>
> arguments, returned by methods, and operated upon:
primitive values and<br>
> reference values.<br>
<br>
Primitives and objects currently differ in almost every
conceivable way:<br>
<br>
| Primitives |
Objects |<br>
| ------------------------------------------ |
---------------------------------- |<br>
| No identity (pure values) |
Identity |<br>
| `==` compares values | `==` compares
object identity |<br>
| Built-in | Declared in
classes |<br>
| No members (fields, methods, constructors) | Members
(including mutable fields) |<br>
| No supertypes or subtypes | Class and
interface inheritance |<br>
| Accessed directly | Accessed via
object references |<br>
| Not nullable |
Nullable |<br>
| Default value is zero | Default value is
null |<br>
| Arrays are monomorphic | Arrays are
covariant |<br>
| May tear under race | Initialization
safety guarantees |<br>
| Have reference companions (boxes) | Don't need
reference companions |<br>
<br>
Primitives embody a number of tradeoffs aimed at maximizing the
performance and<br>
usability of the primitive types. Reference types default to
`null`, meaning<br>
"referring to no object", and must be initialized before use;
primitives default<br>
to a usable zero value (which for most primitives is the
additive identity) and<br>
therefore may be used without initialization. (If primitives
were nullable like<br>
references, not only would this be less convenient in many
situations, but they<br>
would likely consume additional memory footprint to accommodate
the possibility<br>
of nullity, as most primitives already use all their bit
patterns.) Similarly,<br>
reference types provide initialization safety guarantees for
final fields even<br>
under a certain category of data races (this is where we get the
"immutable<br>
objects are always thread-safe" rule from); primitives allow
tearing under race<br>
for larger-than-32-bit values. We could characterize the design
principles<br>
behind these tradeoffs as "make objects safer, make primitives
faster."<br>
<br>
The following figure illustrates the current universe of Java's
types. The<br>
upper left quadrant is the built-in primitives; the rest of the
space is<br>
reference types. In the upper-right, we have the abstract
reference types --<br>
abstract classes, interfaces, and `Object` (which, though
concrete, acts more<br>
like an interface than a concrete class). The built-in
primitives have wrappers<br>
or boxes, which are reference types.<br>
<br>
<figure><br>
<a href="field-type-zoo.pdf" title="Click for PDF"><br>
<img src="field-type-zoo-old.png" alt="Current universe
of Java field types"/><br>
</a><br>
</figure><br>
<br>
Valhalla aims to unify primitives and objects such that both are
declared with<br>
classes, but maintains the special runtime characteristics --
flatness and<br>
density -- that primitives currently enjoy. <br>
<br>
### Primitives and boxes today<br>
<br>
The built-in primitives are best understood as _pairs_ of types:
the primitive<br>
type (`int`) and its reference companion type (`Integer`), with
built-in<br>
conversions between the two. The two types have different
characteristics that<br>
make each more or less appropriate for a given situation.
Primitives are<br>
optimized for efficient storage and access: they are
monomorphic, not nullable,<br>
tolerate uninitialized (zero) values, and larger primitive types
(`long`,<br>
`double`) may tear under racy access. The box types add back
the affordances of<br>
references -- nullity, polymorphism, interoperation with
generics, and<br>
initialization safety -- but at a cost. <br>
<br>
Valhalla generalizes this primitive-box relationship, in a way
that is more<br>
regular and extensible and reduces the "boxing tax".<br>
<br>
## Eliminating unwanted object identity<br>
<br>
Many impediments to optimization stem from _unwanted object
identity_. For many<br>
classes, not only is identity not directly useful, it can be a
source of bugs.<br>
For example, due to caching, `Integer` can be accidentally
compared correctly<br>
with `==` just often enough that people keep doing it.
Similarly, [value-based<br>
classes][valuebased] such as `Optional` have no need for
identity, but pay the<br>
costs of having identity anyway. <br>
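<br>
A minimal sketch of the `Integer` pitfall in today's Java. The class and method names are hypothetical, chosen only for illustration. The JLS requires boxed values in the range -128 to 127 to be cached; outside that range the result of `==` depends on the JVM and its settings, so the sketch does not rely on it.<br>
<br>
<pre>
```java
// Illustrative only: class and method names are hypothetical.
final class IntegerIdentityPitfall {
    // Compares two boxed values by reference, which is what `==` does
    // on today's identity-bearing Integer objects.
    static boolean sameReference(Integer a, Integer b) {
        return a == b;
    }

    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127;    // inside the range the JLS requires to be cached
        Integer large1 = 1000, large2 = 1000;  // outside that range; caching is not guaranteed

        System.out.println(sameReference(small1, small2)); // true: both refer to the cached object
        System.out.println(large1.equals(large2));         // true: equals compares the numeric value
        System.out.println(sameReference(large1, large2)); // usually false with default JVM settings
    }
}
```
</pre>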
<br>
Valhalla allows classes to explicitly disavow identity by
declaring them as<br>
_value classes_. The instances of a value class are called
_value objects_. <br>
<br>
```<br>
value class Point implements Serializable {<br>
int x;<br>
int y;<br>
<br>
Point(int x, int y) { <br>
this.x = x;<br>
this.y = y;<br>
}<br>
<br>
Point scale(int s) { <br>
return new Point(s*x, s*y);<br>
}<br>
}<br>
```<br>
<br>
This says that a `Point` is a class whose instances have no
identity. As a<br>
consequence, it must give up the things that depend on identity;
the class and<br>
its fields are implicitly final. Additionally, operations that
depended on<br>
identity must either be adjusted (`==` on value objects compares
state, not<br>
identity) or disallowed (it is illegal to lock on a value
object.)<br></font></font></div></blockquote><div><br></div><div>Just for broad understandability, you might want to address here "but then how could a reference 'identify' what object it's pointing to?" </div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">Value classes can still have most of the affordances of classes
-- fields,<br>
methods, constructors, type parameters, superclasses (with some
restrictions),<br>
nested classes, class literals, interfaces, etc. The classes
they can extend<br>
are restricted: `Object` or abstract classes with no instance
fields, empty<br>
no-arg constructor bodies, no other constructors, no instance
initializers, no<br>
synchronized methods, and whose superclasses all meet this same
set of<br>
conditions. (`Number` is an example of such an abstract class.)<br>
<br>
Because `Point` has value semantics, `==` compares by state
rather than<br>
identity. This means that value objects, like primitives, are
_freely<br>
copyable_; we can explode them into their fields and
re-aggregate them into<br>
another value object, and we cannot tell the difference. <br></font></font></div></blockquote><div><br></div><div>It feels like if this wants to rest some stuff on "comparing by state" it ought to explain here what that means? Or, I guess at least a forward reference.</div><div>It seems pretty important to understand that it means shallow fieldwise delegation back to `==` again, meaning that fields of identity types are still identity-compared.</div><div>In many contexts "value semantics" and "comparing by state" tend to only make sense if done recursively/deeply.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">So far we've addressed the first two lines in our table of
differences; rather<br>
than all objects having identity, classes can opt into, or out
of, object<br>
identity for their instances. By allowing classes to exclude
unwanted identity,<br>
we free the runtime to make better layout and compilation
decisions.<br>
<br>
### Example: immutable cursors<br>
<br>
Collections today use `Iterator` to facilitate traversal through
the collection,<br>
which stores iteration state in mutable fields. While heroic
optimizations such<br>
as _escape analysis_ can sometimes eliminate the cost associated
with iterators,<br>
such optimizations are fragile and hard to rely on. Value
objects offer an<br>
iteration approach that is more reliably optimized: immutable
cursors. (Without<br>
value objects, immutable cursors would be prohibitively
expensive for<br>
iteration.)<br>
<br>
```<br>
value class ArrayCursor<T> { <br>
T[] array;<br>
int offset;<br>
<br>
public ArrayCursor(T[] array, int offset) { <br>
this.array = array;<br>
this.offset = offset;<br>
}<br>
<br>
public ArrayCursor(T[] array) { <br>
this(array, 0);<br>
}<br>
<br>
public boolean hasNext() { <br>
return offset < array.length;<br>
}<br>
<br>
public T next() { <br>
return array[offset];<br>
}<br>
<br>
public ArrayCursor<T> advance() { <br>
return new ArrayCursor(array, offset+1);<br>
}<br>
}<br>
```<br>
<br>
In looking at this code, we might mistakenly assume it will be
inefficient, as<br>
each loop iteration appears to allocate a new cursor:<br>
<br>
```<br>
for (ArrayCursor<T> c = new ArrayCursor<>(array); <br>
c.hasNext(); <br>
c = c.advance()) {<br>
// use c.next();<br>
}<br>
```<br>
<br>
In reality, we should expect that _no_ cursors are actually
allocated here. An<br>
`ArrayCursor` is just its two fields, and the runtime is free to
scalarize the<br>
object into its fields and hoist them into registers. The
calling convention<br>
for `advance` is optimized so that both receiver and return
value are<br>
scalarized. Even without inlining `advance`, no allocation will
take place,<br>
just some shuffling of the values in registers. And if
`advance` is inlined,<br>
the client code will compile down to having a single register
increment and<br>
compare in the loop header. <br>
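<br>
The cursor idiom itself can be tried in today's Java by approximating the value class with a record. This is only a sketch of the API shape: a record is still an identity class, so avoiding allocation here depends on escape analysis and not on the scalarization described above. The wrapper class `CursorSketch` and the `sum` helper are hypothetical.<br>
<br>
<pre>
```java
// Sketch only: a record stands in for the proposed value class.
final class CursorSketch {
    record ArrayCursor<T>(T[] array, int offset) {
        // Convenience constructor that starts at the first element.
        ArrayCursor(T[] array) { this(array, 0); }

        boolean hasNext() { return offset < array.length; }

        // Returns the current element without moving the cursor.
        T next() { return array[offset]; }

        // Returns a new cursor positioned one element further along.
        ArrayCursor<T> advance() { return new ArrayCursor<>(array, offset + 1); }
    }

    // Hypothetical helper: sums the elements by walking an immutable cursor.
    static int sum(Integer[] values) {
        int total = 0;
        for (ArrayCursor<Integer> c = new ArrayCursor<>(values);
             c.hasNext();
             c = c.advance()) {
            total += c.next();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new Integer[] {1, 2, 3})); // prints 6
    }
}
```
</pre>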
<br>
### Migration<br>
<br>
The JDK (as well as other libraries) has many [value-based
classes][valuebased]<br>
such as `Optional` and `LocalDateTime`. Value-based classes
adhere to the<br>
semantic restrictions of value classes, but are still identity
classes -- even<br>
though they don't want to be. Value-based classes can be
migrated to true value<br>
classes simply by redeclaring them as value classes, which is
both source- and<br>
binary-compatible.</font></font></div></blockquote><div><br></div><div>This gave me a slight "huh, then what's the catch?" reaction. It might make more sense by adding the fact right away that any errant usages (that don't adhere to the VBC requirements) will start failing at runtime, and might cause compilation warnings?</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">
We plan to migrate many value-based classes in the JDK to value
classes.<br>
Additionally, the primitive wrappers can be migrated to value
classes as well,<br>
making the conversion between `int` and `Integer` cheaper; see
"Migrating the<br>
legacy primitives" below. (In some cases, this may be
_behaviorally_<br>
incompatible for code that synchronizes on the primitive
wrappers. [JEP<br>
390][jep390] has supported both compile-time and runtime
warnings for<br>
synchronizing on primitive wrappers since Java 16.) <br></font></font></div></blockquote><div><br></div><div>Putting this in parens under the topic of the primitive wrappers feels like "pulling a fast one". Like it's pretending that this incompatibility problem is somehow unique to those 8 classes, hoping people won't notice "wait a minute, *any* class hopeful of future migration would have the same desire to opt into such warnings in advance." (And for more than just synchronization.) I get that there is no current plan to solve that problem, but we could be more up-front about that?</div><div><br></div><div>(Cross-reference my earlier agitations about this in a thread called "we need help migrating from bucket 1 to 2...", maybe a couple months ago.)</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">
<figure><br>
<a href="field-type-zoo.pdf" title="Click for PDF"><br>
<img src="field-type-zoo-mid.png" alt="Java field types
adding value classes"/><br>
</a><br>
</figure><br>
<br>
### Identity-sensitive operations<br>
<br>
Certain operations are currently defined in terms of object
identity. As we've<br>
already seen, some of these, like equality, can be sensibly
extended to cover<br>
all instances. Others, like synchronization, will become
partial. <br>
Identity-sensitive operations include:<br>
<br>
- **Equality.** We extend `==` on references to include
references to value<br>
objects. Where it currently has a meaning, the new
definition coincides<br>
with that meaning.<br>
<br>
- **System::identityHashCode.** The main use of
`identityHashCode` is in the<br>
implementation of data structures such as
`IdentityHashMap`. We can extend<br>
`identityHashCode` in the same way we extend equality --
deriving a hash on<br>
value objects from the hash of all the fields.<br>
<br>
- **Synchronization.** This becomes a partial operation. If
we can<br>
statically detect that a synchronization will fail at
runtime (including<br>
declaring a `synchronized` method in a value class), we can
issue a<br>
compilation error; if not, attempts to lock on a value
object result in<br>
`IllegalMonitorStateException`. This is justifiable because
it is<br>
intrinsically imprudent to lock on an object for which you
do not have a<br>
clear understanding of its locking protocol; locking on an
arbitrary<br>
`Object` or interface instance is doing exactly that.<br>
<br>
- **Weak, soft, and phantom references.** Capturing an exotic
reference to a<br>
value object becomes a partial operation, as these are
intrinsically tied to<br>
reachability (and hence to identity). However, we will
likely make<br>
enhancements to `WeakHashMap` to support mixed identity and
value keys. <br>
<br>
### Value classes and records<br>
<br>
While records have a lot in common with value classes -- they
are final and<br>
their fields are final -- they are still identity classes.
Records embody a<br>
tradeoff: give up on decoupling the API from the representation,
and in return<br>
get various syntactic and semantic benefits. Value classes
embody another<br>
tradeoff: give up identity, and get various semantic and
performance benefits.<br>
If we are willing to give up both, we can get both sets of
benefits, by<br>
declaring a _value record_. <br>
<br>
```<br>
value record NameAndScore(String name, int score) { }<br>
```<br>
<br>
Value records combine the data-carrier idiom of records with the
improved <br>
scalarization and flattening benefits of value classes. <br>
<br>
In theory, it would be possible to apply `value` to certain
enums as well, but<br>
this is not currently possible because the `java.lang.Enum` base
class that<br>
enums extend does not meet the requirements for superclasses of
value classes (it<br>
has fields and non-empty constructors).<br>
<br>
### Value and reference companion types<br>
<br>
Value classes are generalizations of primitives. Since
primitives have a<br>
reference companion type, value classes actually give rise to
_pairs_ of types:<br>
a value type and a reference type. We've seen the reference
type already; for<br>
the value class `ArrayCursor`, the reference type is called
`ArrayCursor`, just<br>
as with identity classes. The full name for the reference type
is<br>
`ArrayCursor.ref`; `ArrayCursor` is just a convenient alias for
that. (This<br>
aliasing is what allows value-based classes to be compatibly
migrated to value<br>
classes.)</font></font></div></blockquote><div><br></div><div>It's more than just that: it's what unifies all classes together! They all define a reference type, always with the same name as the class. That's nice, unchanging solid ground under our feet while all the Valhalla shifts are going on.</div><div><br></div><div>It would make more sense to me if `ArrayCursor.ref` were the alias to `ArrayCursor`, and it would be appropriate for the reader to wonder "why do we even need that alias?".</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace"> The value type is called `ArrayCursor.val`, and the
two types have the<br>
same conversions between them as primitives do today with their
boxes. The<br>
default value of the value type is the one for which all fields
take on their<br>
default value; the default value of the reference type is, like
all reference<br>
types, null. We will refer to the value type of a value class
as the _value<br>
companion type_.<br></font></font></div></blockquote><div><br></div><div>... because it acts as a companion to the reference type you've always known.</div><div>(At least, *I* still really don't want people to think that both the value type and the reference types are "companions" to the class that defined them.)</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">Just as with today's primitives and their boxes, the reference
and value<br>
companion types of a value class differ in their support for
nullity,<br>
polymorphism, treatment of uninitialized variables, and safety
guarantees under<br>
race. Value companion types, like primitive types, are
monomorphic,<br>
non-nullable, tolerate uninitialized (zero) values, and (under
some<br>
circumstances) may tear under racy access. Reference types are
polymorphic,<br>
nullable, and offer the initialization safety guarantees for
final fields that<br>
we have come to expect from identity objects. <br>
<br>
Unlike with today's primitives, the "boxing" and "unboxing"
conversions between<br>
the reference and value companion types are not nearly as heavy
or wasteful,<br>
because of the lack of identity. A variable of type `Point.val`
holds a "bare"<br>
value object; a variable of type `Point.ref` holds a _reference
to_ a value<br>
object. For many use cases, the reference type will offer good
enough<br>
performance; in some cases, it may be desirable to additionally
give up the<br>
affordances of reference-ness to make further flatness and
footprint gains. See<br>
[Performance Model](05-performance-model) for more details on
the specific<br>
tradeoffs.<br>
<br>
In our diagram, these new types show up as another entity that
straddles the<br>
line between primitives and identity-free references, alongside
the legacy<br>
primitives: <br>
<br>
** UPDATE DIAGRAM **<br>
<br>
<figure><br>
<a href="field-type-zoo.pdf" title="Click for PDF"><br>
<img src="field-type-zoo-new.png" alt="Java field types
with extended primitives"/><br>
</a><br>
</figure><br>
<br>
### Member access<br>
<br>
Both the reference and value companion types have the same
members.</font></font></div></blockquote><div><br></div><div>Maybe worth acknowledging "(even those, like `wait()` inherited from `Object`, that don't make sense and will fail at runtime, for simplicity's sake)".</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace"> Unlike<br>
today's primitives, value companion types can be used as
receivers to access<br>
fields and invoke methods (subject to the usual accessibility
constraints): <br>
<br>
```<br>
Point.val p = new Point(1, 2);<br>
assert p.x == 1;<br>
<br>
p = p.scale(2);<br>
assert p.x == 2;<br>
```<br></font></font></div></blockquote><div><br></div><div>I think it is worth acknowledging that this does lead to `5.toString()` becoming valid and functioning code, which happens just for consistency and not because it was a goal in itself.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">
### Polymorphism<br>
<br>
An identity class `C` that extends `D` sets up a subtyping
(is-a) relationship<br>
between `C` and `D`. For value classes, the same thing happens
between its<br>
_reference type_ and the declared supertypes. (Reference types
are<br>
polymorphic; value types are not.) This means that if we
declare:<br>
<br>
```<br>
value class UnsignedShort extends Number <br>
implements
Comparable<UnsignedShort> { <br>
...<br>
}<br>
```<br>
<br>
then `UnsignedShort` is a subtype of `Number` and
`Comparable<UnsignedShort>`,<br>
and we can ask questions about subtyping using `instanceof` or
pattern matching.<br>
What happens if we ask such a question of the value companion
type?<br>
<br>
```<br>
UnsignedShort.val us = ...<br>
if (us instanceof Number) { ... }<br>
```<br>
<br>
Since subtyping is defined only on reference types, the
`instanceof` operator<br>
(and corresponding type patterns) will behave as if both sides
were lifted to<br>
the appropriate reference type (boxed), and then we can appeal
to subtyping.<br>
(This may trigger fears of expensive boxing conversions, but in
reality no<br>
actual allocation will happen.)<br>
<br>
We introduce a new relationship between types based on `extends`
/ `implements`<br>
clauses, which we'll call "extends": we define `A extends B` as
meaning `A <: B`<br>
when A is a reference type, and `A.ref <: B` when A is a
value companion type.<br>
The `instanceof` relation, reflection, and pattern matching are
updated to use<br>
"extends".<br>
<br>
### Array covariance<br>
<br>
Arrays of reference types are _covariant_; this means that if `A
<: B`, then<br>
`A[] <: B[]`. This allows `Object[]` to be the "top array
type" -- but only for<br>
arrays of references. Arrays of primitives are currently left
out of this<br>
story. We unify the treatment of arrays by defining array
covariance over the<br>
new "extends" relationship; if A _extends_ B, then `A[] <:
B[]`. This means<br>
that for a value class P, `P.val[] <: P.ref[] <:
Object[]`; when we migrate the<br>
primitive types to be value classes, then `Object[]` is finally
the top type for<br>
all arrays. (When the built-in primitives are migrated to value
classes, this<br>
means `int[] <: Integer[] <: Object[]` too.)<br></font></font></div></blockquote><div><br></div><div>I think it's worth addressing that this does mean there will be `Integer[]` and `Object[]` instances that can't store null, failing at runtime, but that this is consistent with the existing quirks of array covariance.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">### Equality<br>
<br>
For values, as with primitives, `==` compares by state rather
than by identity.<br>
Two value objects are `==` if they are of the same type and
their fields are<br>
pairwise equal, where equality is defined by `==` for primitives
(except `float`<br>
and `double`, which are compared with `Float::equals` and
`Double::equals` to<br>
avoid anomalies), `==` for references to identity objects, and
recursively with<br>
`==` for references to value objects. In no case is a value
object ever `==` to<br>
an identity object.<br>
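<br>
The floating-point anomalies that motivate using `Float::equals` and `Double::equals` can be seen in today's Java. In this sketch the class and helper names are hypothetical; it contrasts primitive `==` with `Float::equals` on `NaN` and on signed zeros.<br>
<br>
<pre>
```java
// Illustrative only: class and method names are hypothetical.
final class FloatEqualityAnomalies {
    // Primitive `==` follows IEEE 754 comparison rules.
    static boolean primitiveEquals(float a, float b) {
        return a == b;
    }

    // Float::equals compares the representations given by floatToIntBits.
    static boolean boxedEquals(float a, float b) {
        return Float.valueOf(a).equals(b);
    }

    public static void main(String[] args) {
        System.out.println(primitiveEquals(Float.NaN, Float.NaN)); // false: NaN is never == to itself
        System.out.println(boxedEquals(Float.NaN, Float.NaN));     // true
        System.out.println(primitiveEquals(0.0f, -0.0f));          // true
        System.out.println(boxedEquals(0.0f, -0.0f));              // false: the bit patterns differ
    }
}
```
</pre>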
<br>
When comparing two object _references_ with `==`, they are equal
if they are<br>
both null, or if they are both references to the same identity
object, or they<br>
are both references to value objects that are `==`. (When
comparing a value<br>
type with a reference type, we treat this as if we convert the
value to a<br>
reference, and proceed as per comparing references.) This means
that the<br>
following will succeed: <br>
<br>
```<br>
Point.val p = new Point(3, 4);<br>
Point pr = p;<br>
assert p == pr;<br>
```<br>
<br>
The base implementation of `Object::equals` delegates to `==`,
which is a<br>
suitable default for both reference and value classes.</font></font></div></blockquote><div><br></div><div>This is where you could appeal to the idea that `==` has always meant "strictly indistinguishable by any means" and this preserves that meaning (modulo float/double weirdness).</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">### Serialization<br>
<br>
If a value class implements `Serializable`, this is also really
a statement<br>
about the reference type. Just as with other aspects described
here,<br>
serialization of value companions can be defined by converting
to the<br>
corresponding reference type and serializing that, and reversing
the process at<br>
deserialization time.<br></font></font></div></blockquote><div><br></div><div>It's nonobvious to me why the reference type is being elevated as the primary one here, except that of course a method like `writeObject` is only going to be fed the reference type. I would have expected just that serializability applies equally to both types in the same way, much like invoking some method on both types.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">
Serialization currently uses object identity to preserve the
topology of an<br>
object graph. This generalizes cleanly to objects without
identity, because<br>
`==` on value objects treats two identical copies of a value
object as equal. <br>
So any observations we make about graph topology prior to
serialization with<br>
`==` are consistent with those after deserialization.<br>
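<br>
How serialization preserves graph topology for identity objects today can be sketched as follows. The `Node` and `Pair` classes and the helper methods are hypothetical; the sketch checks only that sharing observed with `==` before serialization is observed again after deserialization.<br>
<br>
<pre>
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative only: all class and method names are hypothetical.
final class TopologyRoundTrip {
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        Node(String label) { this.label = label; }
    }

    static final class Pair implements Serializable {
        private static final long serialVersionUID = 1L;
        final Node left;
        final Node right;
        Pair(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Serializes the pair to a byte array and reads it back.
    static Pair roundTrip(Pair pair) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(pair);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Pair) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // One node referenced twice stays one node after the round trip.
    static boolean sharedStaysShared() {
        Node shared = new Node("n");
        Pair copy = roundTrip(new Pair(shared, shared));
        return copy.left == copy.right;
    }

    // Two distinct nodes with equal state stay distinct after the round trip.
    static boolean distinctStaysDistinct() {
        Pair copy = roundTrip(new Pair(new Node("n"), new Node("n")));
        return copy.left != copy.right;
    }
}
```
</pre>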
<br>
## Refining the value companion<br>
<br>
Value classes have several options for refining the behavior of
the value<br>
companion type and how they are exposed to clients.<br>
<br>
### Classes with no good default value<br>
<br>
For a value class `C`, the default value of `C.ref` is the same
as any other<br>
reference type: `null`. For the value companion type `C.val`,
the default value<br>
is the one where all of its fields are initialized to their
default value (0 for<br>
numbers, false for boolean, null for references.)<br>
<br>
The built-in primitives reflect the design assumption that zero
is a reasonable<br>
default. The choice to use a zero default for uninitialized
variables was one<br>
of the central tradeoffs in the design of the built-in
primitives. It gives us<br>
a usable initial value (most of the time), and requires less
storage footprint<br>
than a representation that supports null (`int` uses all 2^32 of
its bit<br>
patterns, so a nullable `int` would have to either make some 32
bit signed<br>
integers unrepresentable, or use a 33rd bit). This was a
reasonable tradeoff<br>
for the built-in primitives, and is also a reasonable tradeoff
for many other<br>
potential value classes (such as complex numbers, 2D points,
half-floats, etc).<br></font></font></div></blockquote><div><br></div><div>You might not want to go into the following. But I hope that users will understand that the numeric types really do clear a pretty high bar here. They are fortunate that for the *two* most popular reduction operations over those types, zero happens to be the correct identity for one of them, and absolutely destructive to the other (i.e., making it at least easy to detect the bug). If not for *both* of those facts we would have more and worse bugs in the world.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><font size="4"><font face="monospace">But for other potential value classes, such as `LocalDate`,
there simply _is_ no<br>
reasonable default. If we choose to represent a date as the
number of days<br>
since some epoch, there will invariably be bugs that stem
from<br>
uninitialized dates; we've all been mistakenly told by computers
that something<br>
that never happened actually happened on or near 1 January
1970. Even if we<br>
could choose a default other than the zero representation as a
default, an<br>
uninitialized date is still likely to be an error -- there
simply is no good<br>
default date value. <br>
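<br>
The "1 January 1970" failure mode is easy to reproduce in today's Java, assuming a days-since-epoch representation as in the paragraph above. The class and method names are hypothetical, and the zero is a stand-in for an uninitialized field.<br>
<br>
<pre>
```java
import java.time.LocalDate;

// Illustrative only: class and method names are hypothetical.
final class ZeroDefaultDate {
    // Interprets an all-zero (uninitialized) day count as a date.
    static LocalDate fromZeroedRepresentation() {
        long uninitializedEpochDay = 0L; // stand-in for a field that was never assigned
        return LocalDate.ofEpochDay(uninitializedEpochDay);
    }

    public static void main(String[] args) {
        System.out.println(fromZeroedRepresentation()); // prints 1970-01-01
    }
}
```
</pre>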
<br>
For this reason, value classes have the choice of
_encapsulating_ their value<br>
companion type. If the class is willing to tolerate an
uninitialized (zero)<br>
value, it can freely share its `.val` companion with the world;
if uninitialized<br>
values are dangerous (such as for `LocalDate`), the value
companion can be<br>
encapsulated to the class or package, and clients can use the
reference<br>
companion. Encapsulation is accomplished using ordinary access
control. By<br>
default, the value companion is `private` to the value class (it
need not be<br>
declared explicitly); a class that wishes to share its value
companion more<br>
broadly can do so by declaring it explicitly:<br>
<br>
```<br>
public value record Complex(double real, double imag) { <br>
public value companion Complex.val;<br>
}<br>
```<br></font></font></div></blockquote><div><br></div><div>I think you should add that the name `Complex.val` can't be changed here, much like you can't change the name of a constructor even though it *looks* like you could.</div><div><br></div></div>-- <br><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div style="line-height:1.5em;padding-top:10px;margin-top:10px;color:rgb(85,85,85);font-family:sans-serif"><span style="border-width:2px 0px 0px;border-style:solid;border-color:rgb(213,15,37);padding-top:2px;margin-top:2px">Kevin Bourrillion |</span><span style="border-width:2px 0px 0px;border-style:solid;border-color:rgb(51,105,232);padding-top:2px;margin-top:2px"> Java Librarian |</span><span style="border-width:2px 0px 0px;border-style:solid;border-color:rgb(0,153,57);padding-top:2px;margin-top:2px"> Google, Inc. |</span><span style="border-width:2px 0px 0px;border-style:solid;border-color:rgb(238,178,17);padding-top:2px;margin-top:2px"> <a href="mailto:kevinb@google.com" target="_blank">kevinb@google.com</a></span></div></div></div></div></div></div></div></div>