<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body><div style="font-family: sans-serif;"><div class="markdown" style="white-space: normal;">
<p dir="auto">On 26 Jul 2022, at 11:18, Brian Goetz wrote:</p>
</div><div class="plaintext" style="white-space: normal;"><blockquote style="margin: 0 0 5px; padding-left: 5px; border-left: 2px solid #777777; color: #777777;"><p dir="auto">Yet another attempt at updating SoV to reflect the current thinking. Please review.</p>
<p dir="auto"> # State of Valhalla
<br>
## Part 2: The Language Model {.subtitle}</p>
<p dir="auto"> #### Brian Goetz {.author}
<br>
#### July 2022 {.date}</p>
</blockquote></div>
<div class="markdown" style="white-space: normal;">
<p dir="auto">Here’s a big diff on the MD file. (I scraped the MD out of my mailer, which is an iffy proposition.)</p>
<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">> --- a/Users/jrose/Projects/openjdk/valhalla-docs/site/design-notes/state-of-valhalla/02-object-model-take-3.md.~1~
> +++ b/Users/jrose/Projects/openjdk/valhalla-docs/site/design-notes/state-of-valhalla/02-object-model-take-3.md
> @@ -24,7 +24,7 @@ libraries, not as a language feature.
> Java currently has eight built-in primitive types. Primitives represent pure
> _values_; any `int` value of "3" is equivalent to, and indistinguishable from,
> any other `int` value of "3". Because primitives are "just their bits" with no
> -ancillarly state such as object identity, they are _freely copyable_; whether
> +ancillary state such as object identity, they are _freely copyable_; whether
> there is one copy of the `int` value "3", or millions, doesn't matter to the
> execution of the program. With the exception of the unusual treatment of exotic
> floating point values such as `NaN`, the `==` operator on primitives performs a
> @@ -53,10 +53,10 @@ Primitives and objects currently differ in almost every conceivable way:
> | Primitives | Objects |
> | ------------------------------------------ | ---------------------------------- |
> | No identity (pure values) | Identity |
> -| `==` compares values | `==` compares object identity |
> +| Operator `==` compares values | Operator `==` compares object identity <!-- leading `==` looks awkward, like markup --> |
> | Built-in | Declared in classes |
> | No members (fields, methods, constructors) | Members (including mutable fields) |
> -| No supertypes or subtypes | Class and interface inheritance |
> +| No inherited supertypes or subtypes | Class and interface inheritance <!-- sadly, `int` <: `long` --> |
> | Accessed directly | Accessed via object references |
> | Not nullable | Nullable |
> | Default value is zero | Default value is null |
> @@ -64,7 +64,7 @@ Primitives and objects currently differ in almost every conceivable way:
> | May tear under race | Initialization safety guarantees |
> | Have reference companions (boxes) | Don't need reference companions |
> -Primitives embody a number tradeoffs aimed at maximizing the performance and
> +Primitives embody a number of tradeoffs aimed at maximizing the performance and
> usability of the primitive types. Reference types default to `null`, meaning
> "referring to no object", and must be initialized before use; primitives default
> to a usable zero value (which for most primitives is the additive identity) and
> @@ -77,6 +77,7 @@ under a certain category of data races (this is where we get the "immutable
> objects are always thread-safe" rule from); primitives allow tearing under race
> for larger-than-32-bit values. We could characterize the design principles
> behind these tradeoffs are "make objects safer, make primitives faster."
> +<!-- yes, ends with a good strong point -->
> The following figure illustrates the current universe of Java's types. The
> upper left quadrant is the built-in primitives; the rest of the space is
> @@ -140,9 +141,10 @@ value class Point implements Serializable {
> This says that an `Point` is a class whose instances have no identity. As a
> consequence, it must give up the things that depend on identity; the class and
> -its fields are implicitly final. Additionally, operations that depended on
> -identity must either be adjusted (`==` on value objects compares state, not
> -identity) or disallowed (it is illegal to lock on a value object.)
> +its fields are implicitly final. Additionally, operations that depend on
> +identity are adjusted as necessary for value objects. (For example, operator `==` compares state, not
> +identity, and it is illegal to lock on a value object.)
> +<!-- "must either be" seems awkward: it suggests that it is a task to be done later -->
> Value classes can still have most of the affordances of classes -- fields,
> methods, constructors, type parameters, superclasses (with some restrictions),
> @@ -190,7 +192,7 @@ value class ArrayCursor<T> {
> return offset < array.length;
> }
> - public T next() {
> + public T get() {
> return array[offset];
> }
> @@ -199,6 +201,12 @@ value class ArrayCursor<T> {
> }
> }
> ```
> +<!-- My old sketch of cursors cleverly keeps `next`/`hasNext`. I'm doubting
> + this choice a bit. It makes it appear that Cursor<T> <: Iterator<T>
> + since Cursor has both of Iterator's methods, but that's wrong because
> + of behavior. If the behavior is incompatible, then maybe the names
> + should differ. In addition, Cursor<T> <: Supplier<T> is legitimately
> + true and interesting, hence `get`. -->
> In looking at this code, we might mistakenly assume it will be inefficient, as
> each loop iteration appears to allocate a new cursor:
> @@ -224,8 +232,8 @@ compare in the loop header.
> The JDK (as well as other libraries) has many [value-based classes][valuebased]
> such as `Optional` and `LocalDateTime`. Value-based classes adhere to the
> -semantic restrictions of value classes, but are still identity classes -- even
> -though they don't want to be. Value-based classes can be migrated to true value
> +semantic restrictions of value classes, but they still possess identity -- even
> +though they don't want it. Value-based classes can be migrated to true value
> classes simply by redeclaring them as value classes, which is both source- and
> binary-compatible.
> @@ -325,7 +333,7 @@ the reference and value companion types are not nearly as heavy or wasteful,
> because of the lack of identity. A variable of type `Point.val` holds a "bare"
> value object; a variable of type `Point.ref` holds a _reference to_ a value
> object. For many use cases, the reference type will offer good enough
> -performance; in some cases, it may be desire to additionally give up the
> +performance; in some cases, the discerning user may choose to give up the
> affordances of reference-ness to make further flatness and footprint gains. See
> [Performance Model](05-performance-model) for more details on the specific
> tradeoffs.
> @@ -336,6 +344,7 @@ primitives:
> ** UPDATE DIAGRAM **
> +<!-- gack, still working on this; will adjust terms and syntax -->
> <figure>
> <a href="field-type-zoo.pdf" title="Click for PDF">
> <img src="field-type-zoo-new.png" alt="Java field types with extended primitives"/>
> @@ -381,15 +390,15 @@ if (us instanceof Number) { ... }
> Since subtyping is defined only on reference types, the `instanceof` operator
> (and corresponding type patterns) will behave as if both sides were lifted to
> -the appropriate reference type (unboxed), and then we can appeal to subtyping.
> +the appropriate reference type (boxing any bare value), and then we can appeal to subtyping.
> (This may trigger fears of expensive boxing conversions, but in reality no
> actual allocation will happen.)
> We introduce a new relationship between types based on `extends` / `implements`
> -clauses, which we'll call "extends": we define `A extends B` as meaning `A <: B`
> +clauses, which we'll call "`extends`": we define `A extends B` as meaning `A <: B`
> when A is a reference type, and `A.ref <: B` when A is a value companion type.
> The `instanceof` relation, reflection, and pattern matching are updated to use
> -"extends".
> +`extends`.
> ### Array covariance
> @@ -397,24 +406,28 @@ Arrays of reference types are _covariant_; this means that if `A <: B`, then
> `A[] <: B[]`. This allows `Object[]` to be the "top array type" -- but only for
> arrays of references. Arrays of primitives are currently left out of this
> story. We unify the treatment of arrays by defining array covariance over the
> -new "extends" relationship; if A _extends_ B, then `A[] <: B[]`. This means
> +new `extends` relationship; if A `extends` B, then `A[] <: B[]`. This means
> that for a value class P, `P.val[] <: P.ref[] <: Object[]`; when we migrate the
> primitive types to be value classes, then `Object[]` is finally the top type for
> all arrays. (When the built-in primitives are migrated to value classes, this
> means `int[] <: Integer[] <: Object[]` too.)
> +<!-- last two sentences are redundant. Suggest:
> +> When the built-in primitives are migrated to value classes, this
> +> means `int[] <: Integer[] <: Object[]` too. Then `Object[]` will
> +> be the top type for all arrays. -->
> ### Equality
> -For values, as with primitives, `==` compares by state rather than by identity.
> +For values, as with primitives, operator `==` compares by state rather than by identity.
> Two value objects are `==` if they are of the same type and their fields are
> -pairwise equal, where equality is defined by `==` for primitives (except `float`
> -and `double`, which are compared with `Float::equals` and `Double::equals` to
> -avoid anomalies), `==` for references to identity objects, and recursively with
> -`==` for references to value objects. In no case is a value object ever `==` to
> +pairwise the same, where sameness is defined by bitwise equality (operator `==` for primitives except `float`
> +and `double`, which are compared as if by `Float::equals` and `Double::equals` to
> +avoid anomalies), reference equality (operator `==`) for references to identity objects (and for `null`), and recursively with
> +operator `==` for references to value objects. In no case is a value object ever `==` to
> an identity object.
> When comparing two object _references_ with `==`, they are equal if they are
> -both null, or if they are both references to the same identity object, or they
> +both `null`, or if they are both references to the same identity object, or they
> are both references to value objects that are `==`. (When comparing a value
> type with a reference type, we treat this as if we convert the value to a
> reference, and proceed as per comparing references.) This means that the
> @@ -489,6 +502,10 @@ public value record Complex(double real, double imag) {
> public value companion Complex.val;
> }
> ```
> +<!-- Small grumble about repeat of the word `value`: I would certainly
> + rather elide `value` than `.val` in the sample syntax, because
> + `value` just repeats itself, while `.val` makes it crystal clear,
> + even to the casual reader, which companion type is being declared. -->
> ### Atomicity and tearing
> @@ -534,8 +551,9 @@ public value record Complex(double real, double imag) {
> For classes like `Complex`, all of whose bit patterns are valid, this is very
> much like the choice around `long` in 1995. For other classes that might have
> nontrivial representational invariants -- specifically, invariants that relate
> -multiple fields, such as ensuring that a range goes from low to high -- they
> -likely want to stick to the default of atomicity.
> +multiple fields, such as ensuring that a range goes from low to high --
> +the default of atomicity is likely to be a better choice.
> +<!-- (who is this "they"?) -->
> ## Do we really need two types?
> @@ -658,7 +676,7 @@ types:
> | Primitives | Objects |
> | ------------------------------------------ | ---------------------------------- |
> | No identity (pure values) | Identity |
> -| `==` compares values | `==` compares object identity |
> +| Operator `==` compares values | Operator `==` compares object identity |
> | Built-in | Declared in classes |
> | No members (fields, methods, constructors) | Members (including mutable fields) |
> | No supertypes or subtypes | Class and interface inheritance |
> @@ -672,10 +690,21 @@ types:
> The addition of value classes addresses many of these directly. Rather than
> saying "classes have identity, primitives do not", we make identity an optional
> characteristic of classes (and derive equality semantics from that.) Rather
> -than primitives being built in, we derive all types, including primitives, from
> +than primitives being built in, we derive all types, including existing primitives and new primitive-like types, from
> classes, and endow value companion types with the members and supertypes
> declared with the value class. Rather than having primitive arrays be
> monomorphic, we make all arrays covariant under the `extends` relation.
> +<!-- I'm uncomfortable with suddenly extending the meaning of the word
> + primitive on the fly here, and may still be uncomfortable even if
> + it is more properly introduced. After all, a primitive is primitive.
> + We used to say "extended primitive" to be explicit about the new
> + primitive-like types. For the moment, I have simply edited some
> + occurrences of the word "primitives", when evidently carrying the
> + new meaning, to the phrase "primitive-like types". Not because I
> + think that's a great phrase; I could have said "quasi-primitives"
> + or anything other than just "primitives". Let's pick a word for
> + "the value companion types when we view them as new user-defined
> + primitives". -->
> The remaining differences now become differences between reference types and
> value types:
> @@ -687,21 +716,22 @@ value types:
> | Default value is zero | Default value is null |
> | May tear under race, if declared `non-atomic` | Initialization safety guarantees |
> -The current dichotomy between primitives and references morphs to one between
> +The current dichotomy between primitive-like types and references morphs to one between
> value objects and references, where the legacy primitives become (slightly
> special) value objects, and, finally, "everything is an object".
> ## Summary
> -Valhalla unifies, to the extent possible, primitives and objects. The
> +Valhalla unifies, to the extent possible, primitives and objects and introduces
> +primitive-like types as optional companions to classes. The
> following table summarizes the transition from the current world to Valhalla.
> | Current World | Valhalla |
> | ------------------------------------------- | --------------------------------------------------------- |
> | All objects have identity | Some objects have identity |
> -| Fixed, built-in set of primitives | Open-ended set of primitives, declared via classes |
> -| Primitives don't have methods or supertypes | Primitives are classes, with methods and supertypes |
> -| Primitives have ad-hoc boxes | Primitives have regularized reference companions |
> +| Fixed, built-in set of primitives | Open-ended set of primitive-like types, declared via classes |
> +| Primitives don't have methods or supertypes | Primitive-like types are classes, with methods and supertypes |
> +| Primitives have ad-hoc boxes | Primitive-like types have regularized reference companions |
> | Boxes have accidental identity | Reference companions have no identity |
> | Boxing and unboxing conversions | Primitive reference and value conversions, but same rules |
> | Primitive arrays are monomorphic | All arrays are covariant |
</code></pre>
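<p dir="auto">As a small illustration of the "anomalies" mentioned in the Equality hunk, here is a sketch in today's Java of how operator <code>==</code> on <code>float</code> and <code>double</code> differs from comparison as if by <code>Float::equals</code> and <code>Double::equals</code>. The class <code>FloatSameness</code> and its helper methods are hypothetical names chosen for this example; they are not part of any Valhalla API, and the sketch only shows the existing library behavior the text appeals to.</p>

```java
// Illustrative sketch only: FloatSameness, sameFloat and sameDouble are
// hypothetical names, not part of the JDK or of any Valhalla prototype.
final class FloatSameness {

    // Sameness "as if by Float::equals": NaN is the same as NaN,
    // while 0.0f and -0.0f are different.
    static boolean sameFloat(float a, float b) {
        return Float.valueOf(a).equals(Float.valueOf(b));
    }

    // The analogous comparison for double, as if by Double::equals.
    static boolean sameDouble(double a, double b) {
        return Double.valueOf(a).equals(Double.valueOf(b));
    }

    public static void main(String[] args) {
        float nan = Float.NaN;

        // Operator == follows IEEE 754: NaN is never equal to itself,
        // and positive and negative zero compare equal.
        System.out.println(nan == nan);              // false
        System.out.println(0.0f == -0.0f);           // true

        // Comparison as if by Float::equals gives the opposite answers,
        // which makes it a substitutable notion of sameness for fields.
        System.out.println(sameFloat(nan, nan));     // true
        System.out.println(sameFloat(0.0f, -0.0f));  // false
    }
}
```

<p dir="auto">With field-wise comparison defined this way, a value object holding a <code>NaN</code> field would still be <code>==</code> to a copy of itself, which would not hold if the fields were compared with the primitive operator.</p>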
</div></div></body>
</html>