<!DOCTYPE html><html><head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

</head>

<body><div style="font-family: sans-serif;"><div class="markdown" style="white-space: normal;">

<p dir="auto">On 26 Jul 2022, at 11:18, Brian Goetz wrote:</p>

</div><div class="plaintext" style="white-space: normal;"><blockquote style="margin: 0 0 5px; padding-left: 5px; border-left: 2px solid #777777; color: #777777;"><p dir="auto">Yet another attempt at updating SoV to reflect the current thinking.  Please review.</p>

<p dir="auto"> # State of Valhalla

<br>

 ## Part 2: The Language Model {.subtitle}</p>

<p dir="auto"> #### Brian Goetz {.author}

<br>

 #### July 2022 {.date}</p>

</blockquote></div>

<div class="markdown" style="white-space: normal;">

<p dir="auto">Here’s a big diff on the MD file.  (I scraped the MD out of my mailer, which is an iffy proposition.)</p>

<pre style="margin-left: 15px; margin-right: 15px; padding: 5px; background-color: #F7F7F7; border-radius: 5px 5px 5px 5px; overflow-x: auto; max-width: 90vw;"><code style="margin: 0; border-radius: 3px; background-color: #F7F7F7; padding: 0px;">> --- a/Users/jrose/Projects/openjdk/valhalla-docs/site/design-notes/state-of-valhalla/02-object-model-take-3.md.~1~

> +++ b/Users/jrose/Projects/openjdk/valhalla-docs/site/design-notes/state-of-valhalla/02-object-model-take-3.md

> @@ -24,7 +24,7 @@ libraries, not as a language feature.

>  Java currently has eight built-in primitive types.  Primitives represent pure

>  _values_; any `int` value of "3" is equivalent to, and indistinguishable from,

>  any other `int` value of "3".  Because primitives are "just their bits" with no

> -ancillarly state such as object identity, they are _freely copyable_; whether

> +ancillary state such as object identity, they are _freely copyable_; whether

>  there is one copy of the `int` value "3", or millions, doesn't matter to the

>  execution of the program.  With the exception of the unusual treatment of exotic

>  floating point values such as `NaN`, the `==` operator on primitives performs a

> @@ -53,10 +53,10 @@ Primitives and objects currently differ in almost every conceivable way:

>  | Primitives                                 | Objects                            |

>  | ------------------------------------------ | ---------------------------------- |

>  | No identity (pure values)                  | Identity                           |

> -| `==` compares values                       | `==` compares object identity      |

> +| Operator `==` compares values              | Operator `==` compares object identity <!-- leading `==` looks awkward, like markup --> |

>  | Built-in                                   | Declared in classes                |

>  | No members (fields, methods, constructors) | Members (including mutable fields) |

> -| No supertypes or subtypes                  | Class and interface inheritance    |

> +| No inherited supertypes or subtypes        | Class and interface inheritance <!-- sadly, `int` <: `long` --> |

>  | Accessed directly                          | Accessed via object references     |

>  | Not nullable                               | Nullable                           |

>  | Default value is zero                      | Default value is null              |

> @@ -64,7 +64,7 @@ Primitives and objects currently differ in almost every conceivable way:

>  | May tear under race                        | Initialization safety guarantees   |

>  | Have reference companions (boxes)          | Don't need reference companions    |

> -Primitives embody a number tradeoffs aimed at maximizing the performance and

> +Primitives embody a number of tradeoffs aimed at maximizing the performance and

>  usability of the primitive types.  Reference types default to `null`, meaning

>  "referring to no object", and must be initialized before use; primitives default

>  to a usable zero value (which for most primitives is the additive identity) and

> @@ -77,6 +77,7 @@ under a certain category of data races (this is where we get the "immutable

>  objects are always thread-safe" rule from); primitives allow tearing under race

>  for larger-than-32-bit values.  We could characterize the design principles

>  behind these tradeoffs are "make objects safer, make primitives faster."

> +<!-- yes, ends with a good strong point -->

>  The following figure illustrates the current universe of Java's types.  The

>  upper left quadrant is the built-in primitives; the rest of the space is

> @@ -140,9 +141,10 @@ value class Point implements Serializable {

>  This says that an `Point` is a class whose instances have no identity.  As a

>  consequence, it must give up the things that depend on identity; the class and

> -its fields are implicitly final.  Additionally, operations that depended on

> -identity must either be adjusted (`==` on value objects compares state, not

> -identity) or disallowed (it is illegal to lock on a value object.)

> +its fields are implicitly final.  Additionally, operations that depend on

> +identity are adjusted as necessary for value objects.  (For example, operator `==` compares state, not

> +identity, and it is illegal to lock on a value object.)

> +<!-- "must either be" seems awkward: it suggests that it is a task to be done later -->

>  Value classes can still have most of the affordances of classes -- fields,

>  methods, constructors, type parameters, superclasses (with some restrictions),

> @@ -190,7 +192,7 @@ value class ArrayCursor<T> {

>          return offset < array.length;

>      }

> -    public T next() {

> +    public T get() {

>          return array[offset];

>      }

> @@ -199,6 +201,12 @@ value class ArrayCursor<T> {

>      }

>  }

>  ```

> +<!-- My old sketch of cursors cleverly keeps `next`/`hasNext`.  I'm doubting

> +     this choice a bit.  It makes it appear that Cursor<T> <: Iterator<T>

> +     since Cursor has both of Iterator's methods, but that's wrong because

> +     of behavior.  If the behavior is incompatible, then maybe the names

> +     should differ.  In addition, Cursor<T> <: Supplier<T> is legitimately

> +     true and interesting, hence `get`. -->

>  In looking at this code, we might mistakenly assume it will be inefficient, as

>  each loop iteration appears to allocate a new cursor:

> @@ -224,8 +232,8 @@ compare in the loop header.

>  The JDK (as well as other libraries) has many [value-based classes][valuebased]

>  such as `Optional` and `LocalDateTime`.  Value-based classes adhere to the

> -semantic restrictions of value classes, but are still identity classes -- even

> -though they don't want to be.  Value-based classes can be migrated to true value

> +semantic restrictions of value classes, but they still possess identity -- even

> +though they don't want it.  Value-based classes can be migrated to true value

>  classes simply by redeclaring them as value classes, which is both source- and

>  binary-compatible.

> @@ -325,7 +333,7 @@ the reference and value companion types are not nearly as heavy or wasteful,

>  because of the lack of identity.  A variable of type `Point.val` holds a "bare"

>  value object; a variable of type `Point.ref` holds a _reference to_ a value

>  object.  For many use cases, the reference type will offer good enough

> -performance; in some cases, it may be desire to additionally give up the

> +performance; in some cases, the discerning user may choose to give up the

>  affordances of reference-ness to make further flatness and footprint gains.  See

>  [Performance Model](05-performance-model) for more details on the specific

>  tradeoffs.

> @@ -336,6 +344,7 @@ primitives:

>  ** UPDATE DIAGRAM **

> +<!-- gack, still working on this; will adjust terms and syntax -->

>  <figure>

>    <a href="field-type-zoo.pdf" title="Click for PDF">

>      <img src="field-type-zoo-new.png" alt="Java field types with extended primitives"/>

> @@ -381,15 +390,15 @@ if (us instanceof Number) { ... }

>  Since subtyping is defined only on reference types, the `instanceof` operator

>  (and corresponding type patterns) will behave as if both sides were lifted to

> -the appropriate reference type (unboxed), and then we can appeal to subtyping.

> +the appropriate reference type (boxing any bare value), and then we can appeal to subtyping.

>  (This may trigger fears of expensive boxing conversions, but in reality no

>  actual allocation will happen.)

>  We introduce a new relationship between types based on `extends` / `implements`

> -clauses, which we'll call "extends": we define `A extends B` as meaning `A <: B`

> +clauses, which we'll call "`extends`": we define `A extends B` as meaning `A <: B`

>  when A is a reference type, and `A.ref <: B` when A is a value companion type.

>  The `instanceof` relation, reflection, and pattern matching are updated to use

> -"extends".

> +`extends`.

>  ### Array covariance

> @@ -397,24 +406,28 @@ Arrays of reference types are _covariant_; this means that if `A <: B`, then

>  `A[] <: B[]`.  This allows `Object[]` to be the "top array type" -- but only for

>  arrays of references.  Arrays of primitives are currently left out of this

>  story.   We unify the treatment of arrays by defining array covariance over the

> -new "extends" relationship; if A _extends_ B, then `A[] <: B[]`.  This means

> +new `extends` relationship; if A `extends` B, then `A[] <: B[]`.  This means

>  that for a value class P, `P.val[] <: P.ref[] <: Object[]`; when we migrate the

>  primitive types to be value classes, then `Object[]` is finally the top type for

>  all arrays.  (When the built-in primitives are migrated to value classes, this

>  means `int[] <: Integer[] <: Object[]` too.)

> +<!-- last two sentences are redundant.  Suggest:

> +> When the built-in primitives are migrated to value classes, this

> +> means `int[] <: Integer[] <: Object[]` too.  Then `Object[]` will

> +> be the top type for all arrays.  -->

>  ### Equality

> -For values, as with primitives, `==` compares by state rather than by identity.

> +For values, as with primitives, operator `==` compares by state rather than by identity.

>  Two value objects are `==` if they are of the same type and their fields are

> -pairwise equal, where equality is defined by `==` for primitives (except `float`

> -and `double`, which are compared with `Float::equals` and `Double::equals` to

> -avoid anomalies), `==` for references to identity objects, and recursively with

> -`==` for references to value objects.  In no case is a value object ever `==` to

> +pairwise the same, where sameness is defined by bitwise equality (operator `==` for primitives except `float`

> +and `double`, which are compared as if by `Float::equals` and `Double::equals` to

> +avoid anomalies), reference equality (operator `==`) for references to identity objects (and for `null`), and recursively with

> +operator `==` for references to value objects.  In no case is a value object ever `==` to

>  an identity object.

>  When comparing two object _references_ with `==`, they are equal if they are

> -both null, or if they are both references to the same identity object, or they

> +both `null`, or if they are both references to the same identity object, or they

>  are both references to value objects that are `==`.  (When comparing a value

>  type with a reference type, we treat this as if we convert the value to a

>  reference, and proceed as per comparing references.)  This means that the

> @@ -489,6 +502,10 @@ public value record Complex(double real, double imag) {

>      public value companion Complex.val;

>  }

>  ```

> +<!-- Small grumble about repeat of the word `value`: I would certainly

> +     rather elide `value` than `.val` in the sample syntax, because

> +     `value` just repeats itself, while `.val` makes it crystal clear,

> +     even to the casual reader, which companion type is being declared. -->

>  ### Atomicity and tearing

> @@ -534,8 +551,9 @@ public value record Complex(double real, double imag) {

>  For classes like `Complex`, all of whose bit patterns are valid, this is very

>  much like the choice around `long` in 1995.  For other classes that might have

>  nontrivial representational invariants -- specifically, invariants that relate

> -multiple fields, such as ensuring that a range goes from low to high -- they

> -likely want to stick to the default of atomicity.

> +multiple fields, such as ensuring that a range goes from low to high --

> +the default of atomicity is likely to be a better choice.

> +<!-- (who is this "they"?) -->

>  ## Do we really need two types?

> @@ -658,7 +676,7 @@ types:

>  | Primitives                                 | Objects                            |

>  | ------------------------------------------ | ---------------------------------- |

>  | No identity (pure values)                  | Identity                           |

> -| `==` compares values                       | `==` compares object identity      |

> +| Operator `==` compares values              | Operator `==` compares object identity |

>  | Built-in                                   | Declared in classes                |

>  | No members (fields, methods, constructors) | Members (including mutable fields) |

>  | No supertypes or subtypes                  | Class and interface inheritance    |

> @@ -672,10 +690,21 @@ types:

>  The addition of value classes addresses many of these directly.  Rather than

>  saying "classes have identity, primitives do not", we make identity an optional

>  characteristic of classes (and derive equality semantics from that.)  Rather

> -than primitives being built in, we derive all types, including primitives, from

> +than primitives being built in, we derive all types, including existing primitives and new primitive-like types, from

>  classes, and endow value companion types with the members and supertypes

>  declared with the value class.  Rather than having primitive arrays be

>  monomorphic, we make all arrays covariant under the `extends` relation.

> +<!-- I'm uncomfortable with suddenly extending the meaning of the word

> +     primitive on the fly here, and may still be uncomfortable even if

> +     it is more properly introduced.  After all, a primitive is primitive.

> +     We used to say "extended primitive" to be explicit about the new

> +     primitive-like types.  For the moment, I have simply edited some

> +     occurrences of the word "primitives", when evidently carrying the

> +     new meaning, to the phrase "primitive-like types".  Not because I

> +     think that's a great phrase; I could have said "quasi-primitives"

> +     or anything other than just "primitives".  Let's pick a word for

> +     "the value companion types when we view them as new user-defined

> +     primitives".  -->

>  The remaining differences now become differences between reference types and

>  value types:

> @@ -687,21 +716,22 @@ value types:

>  | Default value is zero                         | Default value is null            |

>  | May tear under race, if declared `non-atomic` | Initialization safety guarantees |

> -The current dichotomy between primitives and references morphs to one between

> +The current dichotomy between primitive-like types and references morphs to one between

>  value objects and references, where the legacy primitives become (slightly

>  special) value objects, and, finally, "everything is an object".

>  ## Summary

> -Valhalla unifies, to the extent possible, primitives and objects.   The

> +Valhalla unifies, to the extent possible, primitives and objects and introduces

> +primitive-like types as optional companions to classes.   The

>  following table summarizes the transition from the current world to Valhalla.

>  | Current World                               | Valhalla                                                  |

>  | ------------------------------------------- | --------------------------------------------------------- |

>  | All objects have identity                   | Some objects have identity                                |

> -| Fixed, built-in set of primitives           | Open-ended set of primitives, declared via classes        |

> -| Primitives don't have methods or supertypes | Primitives are classes, with methods and supertypes       |

> -| Primitives have ad-hoc boxes                | Primitives have regularized reference companions          |

> +| Fixed, built-in set of primitives           | Open-ended set of primitive-like types, declared via classes        |

> +| Primitives don't have methods or supertypes | Primitive-like types are classes, with methods and supertypes       |

> +| Primitives have ad-hoc boxes                | Primitive-like types have regularized reference companions          |

>  | Boxes have accidental identity              | Reference companions have no identity                     |

>  | Boxing and unboxing conversions             | Primitive reference and value conversions, but same rules |

>  | Primitive arrays are monomorphic            | All arrays are covariant                                  |

</code></pre>


</div></div></body>


</html>