<html><head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body>

    <font size="4"><font face="monospace">Yet another attempt at

        updating SoV to reflect the current thinking.  Please review.<br>

        <br>

        <br>

        # State of Valhalla<br>

        ## Part 2: The Language Model {.subtitle}<br>

        <br>

        #### Brian Goetz {.author}<br>

        #### July 2022 {.date}<br>

        <br>

        > _This is the second of three documents describing the

        current State of<br>

          Valhalla.  The first is [The Road to Valhalla](01-background);

        the<br>

          third is [The JVM Model](03-vm-model)._<br>

        <br>

        This document describes the directions for the Java _language_

        charted by<br>

        Project Valhalla.  (In this document, we use "currently" to

        describe the<br>

        language as it stands today, without value classes.)<br>

        <br>

        Valhalla started with the goal of providing user-programmable

        classes which can<br>

        be flat and dense in memory.  Numerics are one of the motivating

        use cases;<br>

        adding new primitive types directly to the language has a very

        high barrier.  As<br>

        we learned from [Growing a Language][growing] there are

        infinitely many numeric<br>

        types we might want to add to Java, but the proper way to do

        that is via<br>

        libraries, not as a language feature.<br>

        <br>

        ## Primitives and objects today<br>

        <br>

        Java currently has eight built-in primitive types.  Primitives

        represent pure<br>

        _values_; any `int` value of "3" is equivalent to, and

        indistinguishable from,<br>

        any other `int` value of "3".  Because primitives are "just

        their bits" with no<br>

        ancillary state such as object identity, they are _freely

        copyable_; whether<br>

        there is one copy of the `int` value "3", or millions, doesn't

        matter to the<br>

        execution of the program.  With the exception of the unusual

        treatment of exotic<br>

        floating point values such as `NaN`, the `==` operator on

        primitives performs a<br>

        _substitutability test_ -- it asks "are these two values the

        same value".<br>

        <br>

        Java also has _objects_, and each object has a unique _object

        identity_.  This<br>

        means that each object must live in exactly one place (at any

        given time), and<br>

        this has consequences for how the JVM lays out objects in

        memory.  Objects in<br>

        Java are not manipulated or accessed directly, but instead

        through _object<br>

        references_.  Object references are also a kind of value -- they

        encode the<br>

        identity of the object to which they refer, and the `==`

        operator on object<br>

        references also performs a substitutability test, asking "do

        these two<br>

        references refer to the same object."  Accordingly, object

        _references_ (like<br>

        other values) can be freely copied, but the objects they refer

        to cannot.  <br>
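        <br>
        As a small illustration of the two substitutability tests described here,<br>
        the following sketch runs on any current JDK.  The class and method names<br>
        are ours and purely illustrative.<br>
        <br>
```java
final class EqualityToday {
    // Two int values with the same bits are indistinguishable.
    static boolean sameInts(int a, int b) {
        return a == b;
    }

    // The floating-point exception: NaN is not == to itself.
    static boolean isSelfEqual(double d) {
        return d == d;
    }

    // Two distinct objects with identical state are still different objects.
    static boolean distinctObjectsAreSame() {
        Object first = new StringBuilder("x");
        Object second = new StringBuilder("x");
        return first == second;
    }

    // Copying a reference does not copy the object; both refer to one object.
    static boolean copiedReferenceIsSame() {
        Object original = new Object();
        Object copy = original;
        return original == copy;
    }

    public static void main(String[] args) {
        System.out.println("3 == 3: " + sameInts(3, 3));
        System.out.println("NaN == NaN: " + isSelfEqual(Double.NaN));
        System.out.println("distinct objects: " + distinctObjectsAreSame());
        System.out.println("copied reference: " + copiedReferenceIsSame());
    }
}
```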

        <br>

        This dichotomy -- that the universe of values consists of

        primitives and object<br>

        references -- has long been at the core of Java's design.  JVMS

        2.2 (Data Types)<br>

        opens with:<br>

        <br>

        > There are two kinds of values that can be stored in

        variables, passed as<br>

        > arguments, returned by methods, and operated upon:

        primitive values and<br>

        > reference values.<br>

        <br>

        Primitives and objects currently differ in almost every

        conceivable way:<br>

        <br>

        | Primitives                                 |

        Objects                            |<br>

        | ------------------------------------------ |

        ---------------------------------- |<br>

        | No identity (pure values)                  |

        Identity                           |<br>

        | `==` compares values                       | `==` compares

        object identity      |<br>

        | Built-in                                   | Declared in

        classes                |<br>

        | No members (fields, methods, constructors) | Members

        (including mutable fields) |<br>

        | No supertypes or subtypes                  | Class and

        interface inheritance    |<br>

        | Accessed directly                          | Accessed via

        object references     |<br>

        | Not nullable                               |

        Nullable                           |<br>

        | Default value is zero                      | Default value is

        null              |<br>

        | Arrays are monomorphic                     | Arrays are

        covariant               |<br>

        | May tear under race                        | Initialization

        safety guarantees   |<br>

        | Have reference companions (boxes)          | Don't need

        reference companions    |<br>

        <br>

        Primitives embody a number of tradeoffs aimed at maximizing the

        performance and<br>

        usability of the primitive types.  Reference types default to

        `null`, meaning<br>

        "referring to no object", and must be initialized before use;

        primitives default<br>

        to a usable zero value (which for most primitives is the

        additive identity) and<br>

        therefore may be used without initialization.  (If primitives

        were nullable like<br>

        references, not only would this be less convenient in many

        situations, but they<br>

        would likely consume additional memory footprint to accommodate

        the possibility<br>

        of nullity, as most primitives already use all their bit

        patterns.)  Similarly,<br>

        reference types provide initialization safety guarantees for

        final fields even<br>

        under a certain category of data races (this is where we get the

        "immutable<br>

        objects are always thread-safe" rule from); primitives allow

        tearing under race<br>

        for larger-than-32-bit values.  We could characterize the design

        principles<br>

        behind these tradeoffs as "make objects safer, make primitives

        faster."<br>
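        <br>
        The difference in defaults can be observed directly on a current JDK.  The<br>
        sketch below uses hypothetical names (`DefaultsToday`, `count`, `boxed`);<br>
        it only shows that uninitialized primitive storage is a usable zero while<br>
        uninitialized reference storage is `null`.<br>
        <br>
```java
final class DefaultsToday {
    int count;      // primitive field: implicitly initialized to 0
    Integer boxed;  // reference field: implicitly initialized to null

    // A fresh int[] can be summed immediately; every element is 0.
    static int sumOfFreshIntArray(int length) {
        int sum = 0;
        for (int element : new int[length]) {
            sum += element;
        }
        return sum;
    }

    // A fresh Integer[] holds only nulls; unboxing an element would throw
    // NullPointerException, so we only inspect the references here.
    static boolean freshBoxArrayIsAllNull(int length) {
        for (Integer element : new Integer[length]) {
            if (element != null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DefaultsToday d = new DefaultsToday();
        System.out.println("int field: " + d.count);
        System.out.println("Integer field: " + d.boxed);
        System.out.println("sum of new int[4]: " + sumOfFreshIntArray(4));
        System.out.println("new Integer[4] all null: " + freshBoxArrayIsAllNull(4));
    }
}
```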

        <br>

        The following figure illustrates the current universe of Java's

        types.  The<br>

        upper left quadrant is the built-in primitives; the rest of the

        space is<br>

        reference types.  In the upper-right, we have the abstract

        reference types --<br>

        abstract classes, interfaces, and `Object` (which, though

        concrete, acts more<br>

        like an interface than a concrete class).  The built-in

        primitives have wrappers<br>

        or boxes, which are reference types.<br>

        <br>

        <figure><br>

          <a href="field-type-zoo.pdf" title="Click for PDF"><br>

            <img src="field-type-zoo-old.png" alt="Current universe

        of Java field types"/><br>

          </a><br>

        </figure><br>

        <br>

        Valhalla aims to unify primitives and objects such that both are

        declared with<br>

        classes, but maintains the special runtime characteristics --

        flatness and<br>

        density -- that primitives currently enjoy.  <br>

        <br>

        ### Primitives and boxes today<br>

        <br>

        The built-in primitives are best understood as _pairs_ of types:

        the primitive<br>

        type (`int`) and its reference companion type (`Integer`), with

        built-in<br>

        conversions between the two.  The two types have different

        characteristics that<br>

        make each more or less appropriate for a given situation.

        Primitives are<br>

        optimized for efficient storage and access: they are

        monomorphic, not nullable,<br>

        tolerate uninitialized (zero) values, and larger primitive types

        (`long`,<br>

        `double`) may tear under racy access.  The box types add back

        the affordances of<br>

        references -- nullity, polymorphism, interoperation with

        generics, and<br>

        initialization safety -- but at a cost.  <br>

        <br>

        Valhalla generalizes this primitive-box relationship, in a way

        that is more<br>

        regular and extensible and reduces the "boxing tax".<br>

        <br>

        ## Eliminating unwanted object identity<br>

        <br>

        Many impediments to optimization stem from _unwanted object

        identity_. For many<br>

        classes, not only is identity not directly useful, it can be a

        source of bugs.<br>

        For example, due to caching, `Integer` can be accidentally

        compared correctly<br>

        with `==` just often enough that people keep doing it. 

        Similarly, [value-based<br>

        classes][valuebased] such as `Optional` have no need for

        identity, but pay the<br>

        costs of having identity anyway.  <br>
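        <br>
        The `Integer` pitfall can be reproduced on a current JDK.  In this sketch<br>
        (names are ours), boxing a value in the range -128 to 127 is guaranteed to<br>
        yield the same cached object, so `==` happens to agree with `equals`; for<br>
        larger values the result of `==` depends on the JVM's cache configuration,<br>
        so the example prints it rather than assuming it.<br>
        <br>
```java
final class IntegerCacheDemo {
    // Compares two boxes of the same value by reference.
    static boolean sameReference(int value) {
        Integer a = value; // autoboxing, as if by Integer.valueOf(value)
        Integer b = value;
        return a == b;
    }

    // Compares two boxes of the same value by state.
    static boolean sameValue(int value) {
        Integer a = value;
        Integer b = value;
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("127 by ==: " + sameReference(127));
        System.out.println("100000 by ==: " + sameReference(100_000)); // JVM-dependent
        System.out.println("100000 by equals: " + sameValue(100_000));
    }
}
```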

        <br>

        Valhalla allows classes to explicitly disavow identity by

        declaring them as<br>

        _value classes_.  The instances of a value class are called

        _value objects_.  <br>

        <br>

        ```<br>

        value class Point implements Serializable {<br>

            int x;<br>

            int y;<br>

        <br>

            Point(int x, int y) { <br>

                this.x = x;<br>

                this.y = y;<br>

            }<br>

        <br>

            Point scale(int s) { <br>

                return new Point(s*x, s*y);<br>

            }<br>

        }<br>

        ```<br>

        <br>

        This says that a `Point` is a class whose instances have no

        identity.  As a<br>

        consequence, it must give up the things that depend on identity;

        the class and<br>

        its fields are implicitly final.  Additionally, operations that

        depended on<br>

        identity must either be adjusted (`==` on value objects compares

        state, not<br>

        identity) or disallowed (it is illegal to lock on a value

        object.)<br>

        <br>

        Value classes can still have most of the affordances of classes

        -- fields,<br>

        methods, constructors, type parameters, superclasses (with some

        restrictions),<br>

        nested classes, class literals, interfaces, etc.  The classes

        they can extend<br>

        are restricted: `Object` or abstract classes with no instance

        fields, empty<br>

        no-arg constructor bodies, no other constructors, no instance

        initializers, no<br>

        synchronized methods, and whose superclasses all meet this same

        set of<br>

        conditions.  (`Number` is an example of such an abstract class.)<br>

        <br>

        Because `Point` has value semantics, `==` compares by state

        rather than<br>

        identity.  This means that value objects, like primitives, are

        _freely<br>

        copyable_; we can explode them into their fields and

        re-aggregate them into<br>

        another value object, and we cannot tell the difference.  <br>

        <br>

        So far we've addressed the first two lines in our table of

        differences; rather<br>

        than all objects having identity, classes can opt into, or out

        of, object<br>

        identity for their instances.  By allowing classes to exclude

        unwanted identity,<br>

        we free the runtime to make better layout and compilation

        decisions.<br>

        <br>

        ### Example: immutable cursors<br>

        <br>

        Collections today use `Iterator` to facilitate traversal through

        the collection,<br>

        which stores its iteration state in mutable fields.  While heroic

        optimizations such<br>

        as _escape analysis_ can sometimes eliminate the cost associated

        with iterators,<br>

        such optimizations are fragile and hard to rely on.  Value

        objects offer an<br>

        iteration approach that is more reliably optimized: immutable

        cursors. (Without<br>

        value objects, immutable cursors would be prohibitively

        expensive for<br>

        iteration.)<br>

        <br>

        ```<br>

        value class ArrayCursor<T> { <br>

            T[] array;<br>

            int offset;<br>

        <br>

            public ArrayCursor(T[] array, int offset) { <br>

                this.array = array;<br>

                this.offset = offset;<br>

            }<br>

        <br>

            public ArrayCursor(T[] array) { <br>

                this(array, 0);<br>

            }<br>

        <br>

            public boolean hasNext() { <br>

                return offset < array.length;<br>

            }<br>

        <br>

            public T next() { <br>

                return array[offset];<br>

            }<br>

        <br>

            public ArrayCursor<T> advance() { <br>

                return new ArrayCursor<>(array, offset+1);<br>

            }<br>

        }<br>

        ```<br>

        <br>

        In looking at this code, we might mistakenly assume it will be

        inefficient, as<br>

        each loop iteration appears to allocate a new cursor:<br>

        <br>

        ```<br>

        for (ArrayCursor<T> c = new ArrayCursor<>(array); <br>

             c.hasNext(); <br>

             c = c.advance()) {<br>

            // use c.next();<br>

        }<br>

        ```<br>

        <br>

        In reality, we should expect that _no_ cursors are actually

        allocated here.  An<br>

        `ArrayCursor` is just its two fields, and the runtime is free to

        scalarize the<br>

        object into its fields and hoist them into registers.  The

        calling convention<br>

        for `advance` is optimized so that both receiver and return

        value are<br>

        scalarized.  Even without inlining `advance`, no allocation will

        take place,<br>

        just some shuffling of the values in registers.  And if

        `advance` is inlined,<br>

        the client code will compile down to having a single register

        increment and<br>

        compare in the loop header.  <br>
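        <br>
        To experiment with the cursor idiom on a current JDK, it can be<br>
        approximated with an ordinary final class with final fields.  This is only<br>
        a sketch under that assumption: the hypothetical `IdentityArrayCursor`<br>
        below is an identity class, so each `advance` really does request a new<br>
        object unless escape analysis removes it.  It shows the shape of the API,<br>
        not the Valhalla performance model.<br>
        <br>
```java
final class IdentityArrayCursor<T> {
    private final T[] array;
    private final int offset;

    IdentityArrayCursor(T[] array, int offset) {
        this.array = array;
        this.offset = offset;
    }

    IdentityArrayCursor(T[] array) {
        this(array, 0);
    }

    boolean hasNext() {
        return offset < array.length;
    }

    // Returns the current element without moving the cursor.
    T next() {
        return array[offset];
    }

    // Returns a new cursor positioned one element further along.
    IdentityArrayCursor<T> advance() {
        return new IdentityArrayCursor<>(array, offset + 1);
    }

    // Client loop in the same style as the value-class version.
    static int sum(Integer[] values) {
        int total = 0;
        for (IdentityArrayCursor<Integer> c = new IdentityArrayCursor<>(values);
             c.hasNext();
             c = c.advance()) {
            total += c.next();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new Integer[] { 1, 2, 3 }));
    }
}
```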

        <br>

        ### Migration<br>

        <br>

        The JDK (as well as other libraries) has many [value-based

        classes][valuebased]<br>

        such as `Optional` and `LocalDateTime`.  Value-based classes

        adhere to the<br>

        semantic restrictions of value classes, but are still identity

        classes -- even<br>

        though they don't want to be.  Value-based classes can be

        migrated to true value<br>

        classes simply by redeclaring them as value classes, which is

        both source- and<br>

        binary-compatible.  <br>

        <br>

        We plan to migrate many value-based classes in the JDK to value

        classes.<br>

        Additionally, the primitive wrappers can be migrated to value

        classes as well,<br>

        making the conversion between `int` and `Integer` cheaper; see

        "Migrating the<br>

        legacy primitives" below.  (In some cases, this may be

        _behaviorally_<br>

        incompatible for code that synchronizes on the primitive

        wrappers.  [JEP<br>

        390][jep390] has supported both compile-time and runtime

        warnings for<br>

        synchronizing on primitive wrappers since Java 16.)  <br>

        <br>

        <figure><br>

          <a href="field-type-zoo.pdf" title="Click for PDF"><br>

            <img src="field-type-zoo-mid.png" alt="Java field types

        adding value classes"/><br>

          </a><br>

        </figure><br>

        <br>

        ### Identity-sensitive operations<br>

        <br>

        Certain operations are currently defined in terms of object

        identity.  As we've<br>

        already seen, some of these, like equality, can be sensibly

        extended to cover<br>

        all instances.  Others, like synchronization, will become

        partial.  <br>

        Identity-sensitive operations include:<br>

        <br>

          - **Equality.**  We extend `==` on references to include

        references to value<br>

            objects.  Where it currently has a meaning, the new

        definition coincides<br>

            with that meaning.<br>

        <br>

          - **System::identityHashCode.**  The main use of

        `identityHashCode` is in the<br>

            implementation of data structures such as

        `IdentityHashMap`.  We can extend<br>

            `identityHashCode` in the same way we extend equality --

        deriving a hash on<br>

            value objects from the hash of all the fields.<br>

        <br>

          - **Synchronization.**  This becomes a partial operation.  If

        we can<br>

            statically detect that a synchronization will fail at

        runtime (including<br>

            declaring a `synchronized` method in a value class), we can

        issue a<br>

            compilation error; if not, an attempt to lock on a value

        object results in<br>

            `IllegalMonitorStateException`.  This is justifiable because

        it is<br>

            intrinsically imprudent to lock on an object for which you

        do not have a<br>

            clear understanding of its locking protocol; locking on an

        arbitrary<br>

            `Object` or interface instance is doing exactly that.<br>

        <br>

          - **Weak, soft, and phantom references.**  Capturing an exotic

        reference to a<br>

            value object becomes a partial operation, as these are

        intrinsically tied to<br>

            reachability (and hence to identity).  However, we will

        likely make<br>

            enhancements to `WeakHashMap` to support mixed identity and

        value keys.  <br>
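        <br>
        For reference, this is how the identity-sensitive operations listed here<br>
        behave today on identity objects.  The sketch (hypothetical class name<br>
        `IdentityOpsToday`) uses two distinct `String` objects with equal contents:<br>
        a `HashMap` treats them as one key, an `IdentityHashMap` as two.  It says<br>
        nothing about how value objects will behave beyond what is described above.<br>
        <br>
```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class IdentityOpsToday {
    // Inserts two equal-but-distinct keys and reports how many entries result.
    static int entriesAfterTwoPuts(Map<Object, String> map) {
        String first = new String("key");   // new String(...) forces distinct objects
        String second = new String("key");
        map.put(first, "first");
        map.put(second, "second");
        return map.size();
    }

    static int hashMapEntries() {
        return entriesAfterTwoPuts(new HashMap<>());
    }

    static int identityHashMapEntries() {
        return entriesAfterTwoPuts(new IdentityHashMap<>());
    }

    // Locking on an identity object is always permitted today.
    static boolean canLock(Object o) {
        synchronized (o) {
            return Thread.holdsLock(o);
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap entries: " + hashMapEntries());
        System.out.println("IdentityHashMap entries: " + identityHashMapEntries());
        System.out.println("lock held: " + canLock(new Object()));
    }
}
```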

            <br>

        ### Value classes and records<br>

        <br>

        While records have a lot in common with value classes -- they

        are final and<br>

        their fields are final -- they are still identity classes. 

        Records embody a<br>

        tradeoff: give up on decoupling the API from the representation,

        and in return<br>

        get various syntactic and semantic benefits.  Value classes

        embody another<br>

        tradeoff: give up identity, and get various semantic and

        performance benefits.<br>

        If we are willing to give up both, we can get both sets of

        benefits, by<br>

        declaring a _value record_.  <br>

        <br>

        ```<br>

        value record NameAndScore(String name, int score) { }<br>

        ```<br>

        <br>

        Value records combine the data-carrier idiom of records with the

        improved <br>

        scalarization and flattening benefits of value classes.  <br>
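        <br>
        The fact that today's records are still identity classes is easy to<br>
        observe.  The sketch below (Java 16 or later; the wrapper class name is<br>
        ours) declares `NameAndScore` as an ordinary record: `equals` compares<br>
        state, but `==` still distinguishes two separately created instances.<br>
        <br>
```java
final class RecordIdentityToday {
    // An ordinary (identity) record, as supported by current Java.
    record NameAndScore(String name, int score) { }

    // Record equals() is derived from the components.
    static boolean equalByState() {
        return new NameAndScore("alice", 10).equals(new NameAndScore("alice", 10));
    }

    // == on today's records compares object identity.
    static boolean equalByIdentity() {
        NameAndScore a = new NameAndScore("alice", 10);
        NameAndScore b = new NameAndScore("alice", 10);
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println("equals: " + equalByState());
        System.out.println("==: " + equalByIdentity());
    }
}
```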

        <br>

        In theory, it would be possible to apply `value` to certain

        enums as well, but<br>

        this is not currently possible because the `java.lang.Enum` base

        class that<br>

        enums extend does not meet the requirements for superclasses of

        value classes (it<br>

        has fields and non-empty constructors).<br>

        <br>

        ### Value and reference companion types<br>

        <br>

        Value classes are generalizations of primitives.  Since

        primitives have a<br>

        reference companion type, value classes actually give rise to

        _pairs_ of types:<br>

        a value type and a reference type.  We've seen the reference

        type already; for<br>

        the value class `ArrayCursor`, the reference type is called

        `ArrayCursor`, just<br>

        as with identity classes.  The full name for the reference type

        is<br>

        `ArrayCursor.ref`; `ArrayCursor` is just a convenient alias for

        that.  (This<br>

        aliasing is what allows value-based classes to be compatibly

        migrated to value<br>

        classes.) The value type is called `ArrayCursor.val`, and the

        two types have the<br>

        same conversions between them as primitives do today with their

        boxes.  The<br>

        default value of the value type is the one for which all fields

        take on their<br>

        default value; the default value of the reference type is, like

        all reference<br>

        types, null.  We will refer to the value type of a value class

        as the _value<br>

        companion type_.<br>

        <br>

        Just as with today's primitives and their boxes, the reference

        and value<br>

        companion types of a value class differ in their support for

        nullity,<br>

        polymorphism, treatment of uninitialized variables, and safety

        guarantees under<br>

        race.  Value companion types, like primitive types, are

        monomorphic,<br>

        non-nullable, tolerate uninitialized (zero) values, and (under

        some<br>

        circumstances) may tear under racy access.  Reference types are

        polymorphic,<br>

        nullable, and offer the initialization safety guarantees for

        final fields that<br>

        we have come to expect from identity objects.  <br>

        <br>

        Unlike with today's primitives, the "boxing" and "unboxing"

        conversions between<br>

        the reference and value companion types are not nearly as heavy

        or wasteful,<br>

        because of the lack of identity.  A variable of type `Point.val`

        holds a "bare"<br>

        value object; a variable of type `Point.ref` holds a _reference

        to_ a value<br>

        object.  For many use cases, the reference type will offer good

        enough<br>

        performance; in some cases, it may be desirable to additionally

        give up the<br>

        affordances of reference-ness to make further flatness and

        footprint gains.  See<br>

        [Performance Model](05-performance-model) for more details on

        the specific<br>

        tradeoffs.<br>

        <br>

        In our diagram, these new types show up as another entity that

        straddles the<br>

        line between primitives and identity-free references, alongside

        the legacy<br>

        primitives: <br>

        <br>

        ** UPDATE DIAGRAM **<br>

        <br>

        <figure><br>

          <a href="field-type-zoo.pdf" title="Click for PDF"><br>

            <img src="field-type-zoo-new.png" alt="Java field types

        with extended primitives"/><br>

          </a><br>

        </figure><br>

        <br>

        ### Member access<br>

        <br>

        Both the reference and value companion types have the same

        members. Unlike<br>

        today's primitives, value companion types can be used as

        receivers to access<br>

        fields and invoke methods (subject to the usual accessibility

        constraints): <br>

        <br>

        ```<br>

        Point.val p = new Point(1, 2);<br>

        assert p.x == 1;<br>

        <br>

        p = p.scale(2);<br>

        assert p.x == 2;<br>

        ```<br>

        <br>

        ### Polymorphism<br>

        <br>

        An identity class `C` that extends `D` sets up a subtyping

        (is-a) relationship<br>

        between `C` and `D`.  For a value class, the same thing happens

        between its<br>

         _reference type_ and the declared supertypes.  (Reference types

        are<br>

         polymorphic; value types are not.)  This means that if we

        declare:<br>

        <br>

        ```<br>

        value class UnsignedShort extends Number <br>

                                  implements

        Comparable<UnsignedShort> { <br>

           ...<br>

        }<br>

        ```<br>

        <br>

        then `UnsignedShort` is a subtype of `Number` and

        `Comparable<UnsignedShort>`,<br>

        and we can ask questions about subtyping using `instanceof` or

        pattern matching.<br>

        What happens if we ask such a question of the value companion

        type?<br>

        <br>

        ```<br>

        UnsignedShort.val us = ...<br>

        if (us instanceof Number) { ... }<br>

        ```<br>

        <br>

        Since subtyping is defined only on reference types, the

        `instanceof` operator<br>

        (and corresponding type patterns) will behave as if both sides

        were lifted to<br>

        the appropriate reference type (boxed), and then we can appeal

        to subtyping.<br>

        (This may trigger fears of expensive boxing conversions, but in

        reality no<br>

        actual allocation will happen.)<br>

        <br>

        We introduce a new relationship between types based on `extends`

        / `implements`<br>

        clauses, which we'll call "extends": we define `A extends B` as

        meaning `A <: B`<br>

        when A is a reference type, and `A.ref <: B` when A is a

        value companion type.<br>

        The `instanceof` relation, reflection, and pattern matching are

        updated to use<br>

        "extends".<br>

        <br>

        ### Array covariance<br>

        <br>

        Arrays of reference types are _covariant_; this means that if `A

        <: B`, then<br>

        `A[] <: B[]`.  This allows `Object[]` to be the "top array

        type" -- but only for<br>

        arrays of references.  Arrays of primitives are currently left

        out of this<br>

        story.   We unify the treatment of arrays by defining array

        covariance over the<br>

        new "extends" relationship; if A _extends_ B, then `A[] <:

        B[]`.  This means<br>

        that for a value class P, `P.val[] <: P.ref[] <:

        Object[]`; when we migrate the<br>

        primitive types to be value classes, then `Object[]` is finally

        the top type for<br>

        all arrays.  (When the built-in primitives are migrated to value

        classes, this<br>

        means `int[] <: Integer[] <: Object[]` too.)<br>
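        <br>
        The current gap is visible on any JDK today.  In this sketch (class and<br>
        method names are ours), an `Integer[]` is an `Object[]`, but an `int[]` is<br>
        not; it illustrates only the present-day behavior that the unified rule<br>
        is meant to replace.<br>
        <br>
```java
final class ArrayCovarianceToday {
    // Arrays of references are covariant: Integer[] is a subtype of Object[].
    static boolean integerArrayIsObjectArray() {
        Object array = new Integer[] { 1, 2 };
        return array instanceof Object[];
    }

    // Arrays of primitives are left out: int[] is not a subtype of Object[].
    static boolean intArrayIsObjectArray() {
        Object array = new int[] { 1, 2 };
        return array instanceof Object[];
    }

    public static void main(String[] args) {
        System.out.println("Integer[] instanceof Object[]: " + integerArrayIsObjectArray());
        System.out.println("int[] instanceof Object[]: " + intArrayIsObjectArray());
    }
}
```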

        <br>

        ### Equality<br>

        <br>

        For values, as with primitives, `==` compares by state rather

        than by identity.<br>

        Two value objects are `==` if they are of the same type and

        their fields are<br>

        pairwise equal, where equality is defined by `==` for primitives

        (except `float`<br>

        and `double`, which are compared with `Float::equals` and

        `Double::equals` to<br>

        avoid anomalies), `==` for references to identity objects, and

        recursively with<br>

        `==` for references to value objects.  In no case is a value

        object ever `==` to<br>

        an identity object.<br>
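        <br>
        The anomalies that motivate using `Float::equals` and `Double::equals` for<br>
        floating-point fields can be seen with today's API.  In the sketch below<br>
        (names are ours), primitive `==` says `NaN` differs from itself and that<br>
        `0.0` and `-0.0` are the same, while `Double::equals` gives the opposite<br>
        answer in both cases, which is what a substitutability test needs.<br>
        <br>
```java
final class FloatingFieldEquality {
    // Comparison with the primitive == operator.
    static boolean primitiveEquals(double a, double b) {
        return a == b;
    }

    // Comparison with Double::equals, which compares the bit representations
    // as returned by Double.doubleToLongBits.
    static boolean boxedEquals(double a, double b) {
        return Double.valueOf(a).equals(b);
    }

    public static void main(String[] args) {
        System.out.println("NaN == NaN: " + primitiveEquals(Double.NaN, Double.NaN));
        System.out.println("NaN equals NaN: " + boxedEquals(Double.NaN, Double.NaN));
        System.out.println("0.0 == -0.0: " + primitiveEquals(0.0, -0.0));
        System.out.println("0.0 equals -0.0: " + boxedEquals(0.0, -0.0));
    }
}
```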

        <br>

        When comparing two object _references_ with `==`, they are equal

        if they are<br>

        both null, or if they are both references to the same identity

        object, or they<br>

        are both references to value objects that are `==`.  (When

        comparing a value<br>

        type with a reference type, we treat this as if we convert the

        value to a<br>

        reference, and proceed as per comparing references.)  This means

        that the<br>

        following will succeed: <br>

        <br>

        ```<br>

        Point.val p = new Point(3, 4);<br>

        Point pr = p;<br>

        assert p == pr;<br>

        ```<br>

        <br>

        The base implementation of `Object::equals` delegates to `==`,

        which is a<br>

        suitable default for both identity and value classes.  <br>

        <br>

        ### Serialization<br>

        <br>

        If a value class implements `Serializable`, this is also really

        a statement<br>

        about the reference type.  Just as with other aspects described

        here,<br>

        serialization of value companions can be defined by converting

        to the<br>

        corresponding reference type and serializing that, and reversing

        the process at<br>

        deserialization time.<br>

        <br>

        Serialization currently uses object identity to preserve the

        topology of an<br>

        object graph.  This generalizes cleanly to objects without

        identity, because<br>

        `==` on value objects treats two identical copies of a value

        object as equal.  <br>

        So any observations we make about graph topology prior to

        serialization with<br>

        `==` are consistent with those after deserialization.<br>
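        <br>
        The topology-preserving behavior of serialization for identity objects can<br>
        be checked with a round trip through an in-memory buffer.  This sketch<br>
        (class and method names are ours) uses `int[]` elements as stand-ins for<br>
        arbitrary serializable identity objects: a shared element stays shared,<br>
        and two distinct elements with equal contents stay distinct.<br>
        <br>
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

final class SerializationTopology {
    // Serializes the list to bytes and reads it back as a fresh object graph.
    static List<int[]> roundTrip(List<int[]> original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                List<int[]> copy = (List<int[]>) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // The same object referenced twice is still one object after the round trip.
    static boolean sharedStaysShared() {
        int[] shared = { 1, 2, 3 };
        List<int[]> list = new ArrayList<>();
        list.add(shared);
        list.add(shared);
        List<int[]> copy = roundTrip(list);
        return copy.get(0) == copy.get(1);
    }

    // Two distinct objects with equal contents are still two objects afterwards.
    static boolean distinctStaysDistinct() {
        List<int[]> list = new ArrayList<>();
        list.add(new int[] { 1, 2, 3 });
        list.add(new int[] { 1, 2, 3 });
        List<int[]> copy = roundTrip(list);
        return copy.get(0) != copy.get(1);
    }

    public static void main(String[] args) {
        System.out.println("shared stays shared: " + sharedStaysShared());
        System.out.println("distinct stays distinct: " + distinctStaysDistinct());
    }
}
```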

        <br>

        ## Refining the value companion<br>

        <br>

        Value classes have several options for refining the behavior of

        the value<br>

        companion type and how it is exposed to clients.<br>

        <br>

        ### Classes with no good default value<br>

        <br>

        For a value class `C`, the default value of `C.ref` is the same

        as any other<br>

        reference type: `null`.  For the value companion type `C.val`,

        the default value<br>

        is the one where all of its fields are initialized to their

        default value (0 for<br>

        numbers, false for boolean, null for references.)<br>

        <br>

        The built-in primitives reflect the design assumption that zero

        is a reasonable<br>

        default.  The choice to use a zero default for uninitialized

        variables was one<br>

        of the central tradeoffs in the design of the built-in

        primitives.  It gives us<br>

        a usable initial value (most of the time), and requires less

        storage footprint<br>

        than a representation that supports null (`int` uses all 2^32 of

        its bit<br>

        patterns, so a nullable `int` would have to either make some 32

        bit signed<br>

        integers unrepresentable, or use a 33rd bit).  This was a

        reasonable tradeoff<br>

        for the built-in primitives, and is also a reasonable tradeoff

        for many other<br>

        potential value classes (such as complex numbers, 2D points,

        half-floats, etc).<br>

        <br>

        But for other potential value classes, such as `LocalDate`,

        there simply _is_ no<br>

        reasonable default.  If we choose to represent a date as the

        number of days<br>

        since some epoch, there will invariably be bugs that stem

        from<br>

        uninitialized dates; we've all been mistakenly told by computers

        that something<br>

        that never happened actually happened on or near 1 January

        1970.  Even if we<br>

        could choose a default other than the zero representation, an<br>

        uninitialized date is still likely to be an error -- there

        simply is no good<br>

        default date value.  <br>
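        <br>
        The "1 January 1970" failure mode is easy to reproduce with today's API.<br>
        The sketch below assumes a hypothetical class that stores a date as a day<br>
        count in a `long` field and forgets to assign it; the zero default then<br>
        silently reads as a real date.<br>
        <br>
```java
import java.time.LocalDate;

final class UninitializedDateDemo {
    // Hypothetical representation: days since 1970-01-01.
    // Never assigned, so it keeps the default value 0.
    private long epochDay;

    // Interprets the stored day count as a calendar date.
    LocalDate shippedOn() {
        return LocalDate.ofEpochDay(epochDay);
    }

    public static void main(String[] args) {
        System.out.println(new UninitializedDateDemo().shippedOn());
    }
}
```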

        <br>

        For this reason, value classes have the choice of

        _encapsulating_ their value<br>

        companion type.  If the class is willing to tolerate an

        uninitialized (zero)<br>

        value, it can freely share its `.val` companion with the world;

        if uninitialized<br>

        values are dangerous (such as for `LocalDate`), the value

        companion can be<br>

        encapsulated to the class or package, and clients can use the

        reference<br>

        companion.  Encapsulation is accomplished using ordinary access

        control.  By<br>

        default, the value companion is `private` to the value class (it

        need not be<br>

        declared explicitly); a class that wishes to share its value

        companion more<br>

        broadly can do so by declaring it explicitly:<br>

        <br>

        ```<br>

        public value record Complex(double real, double imag) { <br>

            public value companion Complex.val;<br>

        }<br>

        ```<br>

        <br>

        ### Atomicity and tearing<br>

        <br>

        For the primitive types longer than 32 bits (`long` and `double`),

        it was always<br>

        possible that reads and writes from different threads (without

        suitable<br>

        coordination) were not atomic with respect to each other.  This

        means that, if<br>

        accessed under a data race, a `long` or `double` field or array

        element could be seen<br>

        to "tear", where a read sees the low 32 bits of one write and

        the high 32 bits<br>

        of another.  (Declaring the containing field `volatile` is

        sufficient to restore<br>

        atomicity, as is properly coordinating with locks or other

        concurrency control,<br>

        or not sharing across threads in the first place.)<br>

        <br>

        This was a pragmatic tradeoff given the hardware of the time;

        the cost of 64-bit<br>

        atomicity on 1995 hardware would have been prohibitive, and

        problems only arise<br>

        when the program already has data races -- and most numeric code

        deals entirely<br>

        with thread-local data.  Just like with the tradeoff of nulls vs

        zeros, the<br>

        design of the built-in primitives permits tearing as part of a

        tradeoff between<br>

        performance and correctness, where we chose "as fast as

        possible" for<br>

        primitives, and more safety for reference types.<br>

        <br>

        Today, most JVMs give us atomic loads and stores of 64-bit

        primitives, because<br>

        the hardware already makes them cheap enough.  But value classes

        bring us back<br>

        to 1995; atomic loads and stores of larger-than-64-bit values

        are still<br>

        expensive on many CPUs, leaving us with a choice between making

        operations on value<br>

        types slower and permitting tearing when accessed under race.  <br>

        <br>

        It would not be wise for the language to select a

        one-size-fits-all policy about<br>

        tearing; choosing "no tearing" means that types like `Complex`

        are slower than<br>

        they need to be, even in a single-threaded program; choosing

        "tearing" means<br>

        that classes like `Range` can be seen to not exhibit invariants

        asserted by<br>

        their constructor.  Class authors can choose, with full

        knowledge of their<br>

        domain, whether their types can tolerate tearing.  The default

        is no tearing<br>

        (following the principle of "safe by default"); a class can opt

        for greater<br>

        flattening (at the cost of potential tearing) by declaring the

        value companion<br>

        as `non-atomic`:<br>

        <br>

        ```<br>

        public value record Complex(double real, double imag) { <br>

            public non-atomic value companion Complex.val;<br>

        }<br>

        ```<br>

        <br>

        For classes like `Complex`, all of whose bit patterns are valid,

        this is very<br>

        much like the choice around `long` in 1995.  Other classes

        that might have<br>

        nontrivial representational invariants -- specifically,

        invariants that relate<br>

        multiple fields, such as ensuring that a range goes from low to

        high -- will<br>

        likely want to stick to the default of atomicity.  <br>

        <br>
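        To see why cross-field invariants matter, consider the following sketch.  It<br>
        does not provoke an actual race; it only enumerates the field combinations a<br>
        torn read of a two-`int` range could observe and checks them against the<br>
        constructor's invariant.  The `Range` class here is a hypothetical stand-in,<br>
        written as an ordinary class in current Java:<br>
        <br>

```java
// Illustrative only: simulates the outcomes of a torn read, without real threads.
class TornRange {
    // Hypothetical Range-like class whose constructor asserts low <= high.
    static final class Range {
        final int low;
        final int high;

        Range(int low, int high) {
            if (low > high) {
                throw new IllegalArgumentException("low > high: " + low + " > " + high);
            }
            this.low = low;
            this.high = high;
        }
    }

    // Would this field combination pass the constructor's check?
    static boolean isValid(int low, int high) {
        try {
            new Range(low, high);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Range first = new Range(1, 2);
        Range second = new Range(5, 6);

        // A torn read could pair one field from each write.
        System.out.println(isValid(second.low, first.high));  // false: (5, 2)
        System.out.println(isValid(first.low, second.high));  // true: (1, 6)
    }
}
```

        <br>
        The first combination is one that no constructor call could have produced.<br>
        The second satisfies the invariant but is still a value that no thread ever<br>
        wrote.<br>
        <br>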

        ## Do we really need two types?<br>

        <br>

        It is sensible to ask: why do we need companion types at all? 

        This is analogous<br>

        to the need for boxes in 1995: we'd made one set of tradeoffs

        for primitives<br>

        favoring performance (monomorphic, non-nullable, zero-default,

        tolerant of<br>

        non-initialization, tolerant of tearing under race, unrelated to

        `Object`), and<br>

        another for references, favoring flexibility and safety.  Most

        of the time, we<br>

        ignored the primitive wrapper classes, but sometimes we needed

        to temporarily<br>

        suppress one of these properties, such as when interoperating

        with code that<br>

        expects an `Object` or the ability to express "no value".  The

        reasons we needed<br>

        boxes in 1995 still apply today: sometimes we need the

        affordances of<br>

        references, and in those cases, we appeal to the reference

        companion.  <br>

        <br>

        Reasons we might want to use the reference companion include: <br>

        <br>

         - **Interoperation with reference types.**  Value classes can

        implement<br>

           interfaces and extend classes (including `Object` and some

        abstract classes),<br>

           which means some class and interface types are going to be

        polymorphic over<br>

           both identity and value objects.  This polymorphism is

        achieved through<br>

           object references; a reference to `Object` may be a reference

        to an identity<br>

           object, or a reference to a value object.  <br>

        <br>

         - **Nullability.**  Nullability is an affordance of object

        _references_, not<br>

           objects themselves.  Most of the time, it makes sense that

        value types are<br>

           non-nullable (as the primitives are today), but there may be

        situations where<br>

           null is a semantically important value.  Using the reference

        companion when<br>

           nullability is required is semantically clear, and avoids the

        need to invent<br>

           new sentinel values for "no value."<br>

        <br>

           This need comes up when migrating existing classes; the

        method `Map::get`<br>

           uses `null` to signal that the requested key was not present

        in the map. But,<br>

           if the `V` parameter to `Map` is a value type, `null` is not

        a valid value.<br>

           We can capture the "`V` or null" requirement by changing the

        descriptor of<br>

           `Map::get` to:<br>

        <br>

           ```<br>

           public V.ref get(K key);<br>

           ```<br>

        <br>

           where, whatever type `V` is instantiated as, `Map::get`

        returns the reference<br>

           companion. (For a type `V` that already is a reference type,

        this is just `V`<br>

           itself.) This captures the notion that the return type of

        `Map::get` will<br>

           either be a reference to a `V`, or the `null` reference.

        (This is a<br>

           compatible change, since both erase to the same thing.)<br>

        <br>

         - **Self-referential types.**  Some types may want to directly

        or indirectly<br>

           refer to themselves, such as the "next" field in the node

        type of a linked<br>

           list:<br>

        <br>

           ```<br>

           class Node<T> {<br>

               T theValue;<br>

               Node<T> nextNode;<br>

           }<br>

           ```<br>

        <br>

           We might want to represent this as a value class, but if the

        type of<br>

           `nextNode` were `Node.val<T>`, the layout of `Node`

        would be<br>

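           self-referential, since we would be trying to flatten a

        `Node` into its own<br>

           layout.  Using the reference companion for `nextNode` avoids

        this, since the<br>

           field then holds a reference rather than a flattened `Node`.<br>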

        <br>

         - **Protection from tearing.**  For a value class with a

        non-atomic value<br>

           companion type, we may want to use the reference companion in

        cases where we<br>

           are concerned about tearing; because loads and stores of

        references are<br>

           atomic, `P.ref` is immune to the tearing under race that

        `P.val` might be<br>

           subject to.<br>

        <br>

         - **Compatibility with existing boxing.**  Autoboxing is

        convenient, in that it<br>

           lets us pass a primitive where a reference is required.  But

        boxing affects<br>

           far more than assignment conversion; it also affects method

        overload<br>

           selection.  The rules are designed to prefer overloads that

        require no<br>

           conversions to those requiring boxing (or varargs)

        conversions.  Having both<br>

           a value and reference type for every value class means that

        these rules can<br>

           be cleanly and intuitively extended to cover value classes.<br>

        <br>
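        Two of the points above, nullability and overload selection, can be observed<br>
        with the existing primitives and their boxes.  The following is a sketch in<br>
        current Java, with hypothetical class and method names: `Map::get` signals an<br>
        absent key with `null`, which an `int` cannot hold, and overload selection<br>
        prefers a candidate that needs no boxing conversion:<br>
        <br>

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: the class and method names are hypothetical.
class ReferenceAffordances {
    // Overloads that differ in whether an int argument needs boxing.
    static String describe(long x)    { return "long"; }
    static String describe(Integer x) { return "Integer"; }

    // Unboxing a null reference fails at run time.
    static boolean unboxingFails(Integer maybeNull) {
        try {
            int unboxed = maybeNull;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);

        Integer absent = ages.get("bob");
        System.out.println(absent);                          // null
        System.out.println(unboxingFails(absent));           // true
        System.out.println(unboxingFails(ages.get("alice"))); // false

        // Widening int to long needs no boxing, so it wins over boxing to Integer.
        System.out.println(describe(3));                     // long
        System.out.println(describe(Integer.valueOf(3)));    // Integer
    }
}
```

        <br>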

        ### Choosing which to use<br>

        <br>

        How would we choose between declaring an identity class or a

        value class, and<br>

        the various options on value companions?  Here are some quick

        rules of thumb for<br>

        declaring classes:<br>

        <br>

         - If you need mutability, subclassing, locking, or aliasing,

        choose an identity<br>

           class.  <br>

         - Otherwise, choose a value class.  If uninitialized (zero)

        values are<br>

           unacceptable, leave the value companion encapsulated; if zero

        is a reasonable<br>

           default value, make the value companion `public`.<br>

         - If there are no cross-field invariants and you are willing to

        tolerate<br>

           possible tearing to enable more flattening, make the value

        companion<br>

           `non-atomic`.<br>

        <br>

        ## Migrating the legacy primitives<br>

        <br>

        As part of generalizing primitives, we want to adjust the

        built-in primitives to<br>

        behave as consistently with value classes as possible.  While we

        can't change<br>

        the fact that `int`'s reference companion is the oddly-named

        `Integer`, we can<br>

        give them more uniform aliases (`int.ref` is an alias for

        `Integer`; `int` is an<br>

        alias for `Integer.val`) -- so that we can use a consistent rule

        for naming<br>

        companions.  Similarly, we can extend member access to the

        legacy primitives<br>

        (`3.getClass()`) and adjust `int[]` to be a subtype of

        `Integer[]` (and therefore<br>

        of `Object[]`).<br>

        <br>

        We will redeclare `Integer` as a value class with a public value

        companion:<br>

        <br>

        ```<br>

        value class Integer { <br>

            public value companion Integer.val;<br>

        <br>

            // existing methods<br>

        }<br>

        ```<br>

        <br>

        where the type name `int` is an alias for `Integer.val`.  <br>

        <br>

        ## Unifying primitives with classes<br>

        <br>

        Earlier, we had a chart of the differences between primitive and

        reference<br>

        types: <br>

        <br>

        | Primitives                                 | Objects                            |<br>
        | ------------------------------------------ | ---------------------------------- |<br>
        | No identity (pure values)                  | Identity                           |<br>
        | `==` compares values                       | `==` compares object identity      |<br>
        | Built-in                                   | Declared in classes                |<br>
        | No members (fields, methods, constructors) | Members (including mutable fields) |<br>
        | No supertypes or subtypes                  | Class and interface inheritance    |<br>
        | Accessed directly                          | Accessed via object references     |<br>
        | Not nullable                               | Nullable                           |<br>
        | Default value is zero                      | Default value is null              |<br>
        | Arrays are monomorphic                     | Arrays are covariant               |<br>
        | May tear under race                        | Initialization safety guarantees   |<br>
        | Have reference companions (boxes)          | Don't need reference companions    |<br>

        <br>

        The addition of value classes addresses many of these directly. 

        Rather than<br>

        saying "classes have identity, primitives do not", we make

        identity an optional<br>

        characteristic of classes (and derive equality semantics from

        that).  Rather<br>

        than primitives being built in, we derive all types, including

        primitives, from<br>

        classes, and endow value companion types with the members and

        supertypes<br>

        declared with the value class.  Rather than having primitive

        arrays be<br>

        monomorphic, we make all arrays covariant under the `extends`

        relation.  <br>

        <br>
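        For comparison, here is how arrays behave in current Java, which is the<br>
        status quo this change would adjust.  The sketch uses a hypothetical class<br>
        name.  Reference arrays are already covariant (with a run-time store check),<br>
        while an `int[]` is not an `Object[]`:<br>
        <br>

```java
// Illustrative only: shows array typing in current Java, without value classes.
class ArrayCovariance {
    static boolean isObjectArray(Object candidate) {
        return candidate instanceof Object[];
    }

    // Covariance is checked when storing: a String cannot go into an Integer[].
    static boolean storeRejected(Object[] target, Object element) {
        try {
            target[0] = element;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Integer[] boxes = {1, 2, 3};
        int[] primitives = {1, 2, 3};

        Object[] objects = boxes;   // allowed: Integer[] is a subtype of Object[]

        System.out.println(isObjectArray(boxes));            // true
        System.out.println(isObjectArray(primitives));       // false
        System.out.println(storeRejected(objects, "three")); // true
        System.out.println(storeRejected(objects, 4));       // false
    }
}
```

        <br>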

        The remaining differences now become differences between

        reference types and<br>

        value types:<br>

        <br>

        | Value types                                   | Reference types                  |<br>
        | --------------------------------------------- | -------------------------------- |<br>
        | Accessed directly                             | Accessed via object references   |<br>
        | Not nullable                                  | Nullable                         |<br>
        | Default value is zero                         | Default value is null            |<br>
        | May tear under race, if declared `non-atomic` | Initialization safety guarantees |<br>

        <br>

        The current dichotomy between primitives and references morphs

        to one between<br>

        value objects and references, where the legacy primitives become

        (slightly<br>

        special) value objects, and, finally, "everything is an object".<br>

        <br>

        ## Summary<br>

        <br>

        Valhalla unifies, to the extent possible, primitives and

        objects.   The<br>

        following table summarizes the transition from the current world

        to Valhalla.<br>

        <br>

        | Current World                               | Valhalla                                                  |<br>
        | ------------------------------------------- | --------------------------------------------------------- |<br>
        | All objects have identity                   | Some objects have identity                                |<br>
        | Fixed, built-in set of primitives           | Open-ended set of primitives, declared via classes        |<br>
        | Primitives don't have methods or supertypes | Primitives are classes, with methods and supertypes       |<br>
        | Primitives have ad-hoc boxes                | Primitives have regularized reference companions          |<br>
        | Boxes have accidental identity              | Reference companions have no identity                     |<br>
        | Boxing and unboxing conversions             | Primitive reference and value conversions, but same rules |<br>
        | Primitive arrays are monomorphic            | All arrays are covariant                                  |<br>

        <br>

        <br>

        [valuebased]:

<a class="moz-txt-link-freetext" href="https://docs.oracle.com/javase/8/docs/api/java/lang/doc-files/ValueBased.html">https://docs.oracle.com/javase/8/docs/api/java/lang/doc-files/ValueBased.html</a><br>

        [growing]: <a class="moz-txt-link-freetext" href="https://dl.acm.org/doi/abs/10.1145/1176617.1176621">https://dl.acm.org/doi/abs/10.1145/1176617.1176621</a><br>

        [jep390]: <a class="moz-txt-link-freetext" href="https://openjdk.java.net/jeps/390">https://openjdk.java.net/jeps/390</a><br>

        <br>

        <br>

        <br>

      </font></font>

  </body>

</html>