<html><head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body>

    <br>

    <div class="moz-cite-prefix">On 6/15/2022 12:41 PM, Kevin

      Bourrillion wrote:<br>

    </div>

    <blockquote type="cite" cite="mid:CAGKkBkttZ35rvuZ7Exfe6Ozh1CJS0NSmTFKMrQ-K+sW1N_66Vg@mail.gmail.com">All

      else being equal, the idea to use "inaccessible value type" over

      "value type doesn't exist" feels very good and simplifying, with

      the main problem that the syntax can't help but be gross.</blockquote>

    <br>

    <font size="4"><font face="monospace">A few weeks in, and this

        latest stacking is still feeling pretty good:<br>

        <br>

         - There are no coarse buckets any more; there are just identity

        classes and value classes.  <br>

         - Value classes have ref and val companion types with the

        obvious properties.  (Notably, refs are always atomic.)<br>

         - For `value class C`, C as a type is an alias for `C.ref`.  <br>

         - The bucket formerly known as B2 becomes "value class, whose

        .val type is private."  This is the default for a value class. 

        <br>

         - The bucket formerly known as B3a is denoted by explicitly

        making the val companion public, with a public modifier on a

        "member" of the class.  <br>

         - The bucket formerly known as B3n is denoted by explicitly

        making the </font></font><font size="4"><font face="monospace"><font size="4"><font face="monospace">val companion </font></font>public

        and non-atomic, again using modifiers.  <br>

        <br>

        I went and updated the State of the Values document to use the

        new terminology, test-driving some new syntax.  (Usual rules:

        syntax comments are premature at this time.)  I was very pleased

        with the result, because almost all the changes were small

        changes in terminology (e.g., "value companion type"), and

        eliminating the clumsy distinction between value classes and

        primitive classes.  Overall the structure remains the same, but

        feels more compact and clean.  MD source is below, for review. 

        <br>

        <br>

        Kevin's two questions remain, but I don't think they get in the

        way of refining the model in this way:<br>

        <br>

         - Have we made the right choices around == ?<br>

         - Are we missing a big opportunity by not spelling Complex.val

        with a bang?  <br>

        <br>

        <br>

        <br>

        # State of Valhalla<br>

        ## Part 2: The Language Model {.subtitle}<br>

        <br>

        #### Brian Goetz {.author}<br>

        #### June 2022 {.date}<br>

        <br>

        > _This is the second of three documents describing the

        current State of<br>

          Valhalla.  The first is [The Road to Valhalla](01-background);

        the<br>

          third is [The JVM Model](03-vm-model)._<br>

        <br>

        This document describes the directions for the Java _language_

        charted by<br>

        Project Valhalla.  (In this document, we use "currently" to

        describe the<br>

        language as it stands today, without value classes.)<br>

        <br>

        Valhalla started with the goal of providing user-programmable

        classes which can<br>

        be flat and dense in memory.  Numerics are one of the motivating

        use cases;<br>

        adding new primitive types directly to the language has a very

        high barrier.  As<br>

        we learned from [Growing a Language][growing] there are

        infinitely many numeric<br>

        types we might want to add to Java, but the proper way to do

        that is via<br>

        libraries, not as a language feature.<br>

        <br>

        ## Primitive and reference types in Java today<br>

        <br>

        Java currently has eight built-in primitive types.  Primitives

        represent pure<br>

        _values_; any `int` value of "3" is equivalent to, and

        indistinguishable from,<br>

        any other `int` value of "3".  Primitives are monolithic (their

        bits cannot be<br>

        addressed individually) and have no canonical location, and so

        are _freely<br>

        copyable_. With the exception of the unusual treatment of exotic

        floating point<br>

        values such as `NaN`, the `==` operator performs a

        _substitutability test_ -- it<br>

        asks "are these two values the same value".<br>

        <br>

        Java also has _objects_, and each object has a unique _object

        identity_. Because<br>

        of identity, objects are not freely copyable; each object lives

        in exactly one<br>

        place at any given time, and to access its state we have to go

        to that place.<br>

        But we mostly don't notice this because objects are not

        manipulated or accessed<br>

        directly, but instead through _object references_.  Object

        references are also a<br>

        kind of value -- they encode the identity of the object to which

        they refer, and<br>

        the `==` operator on object references asks "do these two

        references refer to<br>

        the same object."  Accordingly, object _references_ (like other

        values) can be<br>

        freely copied, but the objects they refer to cannot.  <br>

        <br>

        Primitives and objects differ in almost every conceivable way:<br>

        <br>

        | Primitives                                 |

        Objects                            |<br>

        | ------------------------------------------ |

        ---------------------------------- |<br>

        | No identity (pure values)                  |

        Identity                           |<br>

        | `==` compares values                       | `==` compares

        object identity      |<br>

        | Built-in                                   | Declared in

        classes                |<br>

        | No members (fields, methods, constructors) | Members

        (including mutable fields) |<br>

        | No supertypes or subtypes                  | Class and

        interface inheritance    |<br>

        | Accessed directly                          | Accessed via

        object references     |<br>

        | Not nullable                               |

        Nullable                           |<br>

        | Default value is zero                      | Default value is

        null              |<br>

        | Arrays are monomorphic                     | Arrays are

        covariant               |<br>

        | May tear under race                        | Initialization

        safety guarantees   |<br>

        | Have reference companions (boxes)          | Don't need

        reference companions    |<br>

        <br>

        The design of primitives represents various tradeoffs aimed at

        maximizing<br>

        performance and usability of the primitive types.  Reference

        types default to<br>

        `null`, meaning "referring to no object"; primitives default to

        a usable zero<br>

        value (which for most primitives is the additive identity). 

        Reference types<br>

        provide initialization safety guarantees against a certain

        category of data<br>

        races; primitives allow tearing under race for

        larger-than-32-bit values.  <br>

        We could characterize the design principles behind these

        tradeoffs as "make<br>

        objects safer, make primitives faster."<br>

        <br>

        The following figure illustrates the current universe of Java's

        types.  The<br>

        upper left quadrant is the built-in primitives; the rest of the

        space is<br>

        reference types.  In the upper-right, we have the abstract

        reference types --<br>

        abstract classes, interfaces, and `Object` (which, though

        concrete, acts more<br>

        like an interface than a concrete class).  The built-in

        primitives have wrappers<br>

        or boxes, which are reference types.<br>

        <br>

        <figure><br>

          <a href="field-type-zoo.pdf" title="Click for PDF"><br>

            <img src="field-type-zoo-old.png" alt="Current universe

        of Java field types"/><br>

          </a><br>

        </figure><br>

        <br>

        Valhalla aims to unify primitives and objects in that they can

        both be<br>

        declared with classes, but maintains the special runtime

        characteristics<br>

        primitives have.  But while everyone likes the flatness and

        density that<br>

        user-definable value types promise, in some cases we want them

        to be more like<br>

        classical objects (nullable, non-tearable), and in other cases

        we want them to<br>

        be more like classical primitives (trading some safety for

        performance).  <br>

        <br>

        ## Value classes: separating references from identity<br>

        <br>

        Many of the impediments to optimization that Valhalla seeks to

        remove center<br>

        around _unwanted object identity_.  The primitive wrapper

        classes have identity,<br>

        but it is a purely accidental one.  Not only is it not directly

        useful, it can<br>

        be a source of bugs.  For example, due to caching, `Integer` can

        be accidentally<br>

        compared correctly with `==` just often enough that people keep

        doing it.<br>

        Similarly, [value-based classes][valuebased] such as `Optional`

        have no need for<br>

        identity, but pay the costs of having identity anyway.  <br>
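        <br>
        The `Integer` pitfall can be reproduced in current Java.  The sketch below is<br>
        illustrative only: the class and method names are hypothetical, and the result<br>
        for values outside -128..127 depends on the boxing cache configuration, so it<br>
        is described as typical rather than guaranteed.<br>
        <br>
<pre>
```java
final class IntegerCacheDemo {
    // Boxes both arguments the way autoboxing does, then compares the references.
    static boolean sameReference(int x, int y) {
        Integer a = Integer.valueOf(x);
        Integer b = Integer.valueOf(y);
        return a == b; // reference comparison, not a numeric comparison
    }

    // Compares the boxed values by state, which is what callers usually intend.
    static boolean sameValue(int x, int y) {
        return Integer.valueOf(x).equals(Integer.valueOf(y));
    }

    public static void main(String[] args) {
        // Values in -128..127 are always cached, so this prints true.
        System.out.println(sameReference(127, 127));
        // Outside the cached range this is typically false on default settings.
        System.out.println(sameReference(1000, 1000));
        // State comparison does not depend on caching; this prints true.
        System.out.println(sameValue(1000, 1000));
    }
}
```
</pre>
        Because the small-value case happens to work, the misuse of `==` often<br>
        survives testing and only fails on larger values.<br>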

        <br>

        Our first step is allowing class declarations to explicitly

        disavow identity, by<br>

        declaring themselves as _value classes_.  The instances of a

        value class are<br>

        called _value objects_.  <br>

        <br>

        ```<br>

        value class ArrayCursor<T> { <br>

            T[] array;<br>

            int offset;<br>

        <br>

            public ArrayCursor(T[] array, int offset) { <br>

                this.array = array;<br>

                this.offset = offset;<br>

            }<br>

        <br>

            public boolean hasNext() { <br>

                return offset < array.length;<br>

            }<br>

        <br>

            public T next() { <br>

                return array[offset];<br>

            }<br>

        <br>

            public ArrayCursor<T> advance() { <br>

                return new ArrayCursor<>(array, offset+1);<br>

            }<br>

        }<br>

        ```<br>

        <br>

        This says that an `ArrayCursor` is a class whose instances have

        no identity --<br>

        that instead they have _value semantics_.  As a consequence, it

        must give up the<br>

        things that depend on identity; the class and its fields are

        implicitly final.  <br>

        <br>

        But, value classes are still classes, and can have most of the

        things classes<br>

        can have -- fields, methods, constructors, type parameters,

        superclasses (with<br>

        some restrictions), nested classes, class literals, interfaces,

        etc.  The<br>

        classes they can extend are restricted: `Object` or abstract

        classes with no<br>

        instance fields, empty no-arg constructor bodies, no other

        constructors, no instance<br>

        initializers, no synchronized methods, and whose superclasses

        all meet this same<br>

        set of conditions.  (`Number` meets these conditions.)<br>

        <br>

        Classes in Java give rise to types; the class `ArrayCursor`

        gives rise to a type<br>

        `ArrayCursor` (actually a parametric family of instantiations

        `ArrayCursor<T>`.)<br>

        `ArrayCursor` is still a reference type, just one whose

        references refer to<br>

        value objects rather than identity objects. For the types in the

        upper-right<br>

        quadrant of the diagram (interfaces, abstract classes, and

        `Object`), references<br>

        to these types might refer to either an identity object or a

        value object.<br>

        (Historically, JVMs were effectively forced to represent object

        references with<br>

        pointers; for references to value objects, JVMs now have more

        flexibility.)<br>

        <br>

        Because `ArrayCursor` is a reference type, it is nullable

        (because references<br>

        are nullable), its default value is null, and loads and stores

        of references are<br>

        atomic with respect to each other even in the presence of data

        races, providing<br>

        the initialization safety we are used to with classical objects.<br>

        <br>

        Because instances of `ArrayCursor` have value semantics, `==`

        compares by state<br>

        rather than identity.  This means that value objects, like

        primitives, are<br>

        _freely copyable_; we can explode them into their fields and

        re-aggregate them<br>

        into another value object, and we cannot tell the difference. 

        (Because they<br>

        have no identity, some identity-sensitive operations, such as

        synchronization,<br>

        are disallowed.)<br>

        <br>

        So far we've addressed the first two lines of the table of

        differences above;<br>

        rather than identity being a property of all object instances,

        classes can<br>

        decide whether their instances have identity or not.  By

        allowing classes that<br>

        don't need identity to exclude it, we free the runtime to make

        better layout and<br>

        compilation decisions -- and avoid a whole category of bugs.<br>

        <br>

        In looking at the code for `ArrayCursor`, we might mistakenly

        assume it will be<br>

        inefficient, as each loop iteration appears to allocate a new

        cursor:<br>

        <br>

        ```<br>

        for (ArrayCursor<T> c = Arrays.cursor(array); <br>

             c.hasNext(); <br>

             c = c.advance()) {<br>

            // use c.next();<br>

        }<br>

        ```<br>

        <br>

        One should generally expect here that _no_ cursors are actually

        allocated.<br>

        Because an `ArrayCursor` is just its two fields, these fields

        will routinely get<br>

        scalarized and hoisted into registers, and the constructor call

        in `advance`<br>

        will typically compile down to incrementing one of these

        registers.<br>

        <br>

        ### Migration<br>

        <br>

        The JDK (as well as other libraries) has many [value-based

        classes][valuebased]<br>

        such as `Optional` and `LocalDateTime`.  Value-based classes

        adhere to the<br>

        semantic restrictions of value classes, but are still identity

        classes -- even<br>

        though they don't want to be.  Value-based classes can be

        migrated to true value<br>

        classes simply by redeclaring them as value classes, which is

        both source- and<br>

        binary-compatible.<br>

        <br>

        We plan to migrate many value-based classes in the JDK to value

        classes.<br>

        Additionally, the primitive wrappers can be migrated to value

        classes as well,<br>

        making the conversion between `int` and `Integer` cheaper; see

        the section<br>

        "Legacy Primitives" below.  (In some cases, this may be

        _behaviorally_<br>

        incompatible for code that synchronizes on the primitive

        wrappers.  [JEP<br>

        390][jep390] has supported both compile-time and runtime

        warnings for<br>

        synchronizing on primitive wrappers since Java 16.)  <br>

        <br>

        <figure><br>

          <a href="field-type-zoo.pdf" title="Click for PDF"><br>

            <img src="field-type-zoo-mid.png" alt="Java field types

        adding value classes"/><br>

          </a><br>

        </figure><br>

        <br>

        ### Equality<br>

        <br>

        Earlier we said that `==` compares value objects by state rather

        than by<br>

        identity.  More precisely, two value objects are `==` if they

        are of the same<br>

        type and their fields are pairwise equal, where

        equality is given by<br>

        `==` for primitives (except `float` and `double`, which are

        compared with<br>

        `Float::equals` and `Double::equals` to avoid anomalies), `==`

        for references to<br>

        identity objects, and recursively with `==` for references to

        value objects.  In<br>

        no case is a value object ever `==` to a reference to an

        identity object.<br>
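        <br>
        The floating-point "anomalies" mentioned above can be seen in current Java by<br>
        comparing primitive `==` with `Double::equals`.  This is a small sketch with<br>
        hypothetical class and method names; it shows only the existing behavior of<br>
        `double`, not the proposed `==` on value objects.<br>
        <br>
<pre>
```java
final class FloatingEqualityDemo {
    // Primitive comparison follows IEEE 754: NaN is unequal to itself,
    // and positive zero equals negative zero.
    static boolean primitiveEquals(double a, double b) {
        return a == b;
    }

    // Double::equals compares representations: NaN equals NaN,
    // and positive zero differs from negative zero.
    static boolean boxedEquals(double a, double b) {
        return Double.valueOf(a).equals(Double.valueOf(b));
    }

    public static void main(String[] args) {
        System.out.println(primitiveEquals(Double.NaN, Double.NaN)); // false
        System.out.println(boxedEquals(Double.NaN, Double.NaN));     // true
        System.out.println(primitiveEquals(0.0, -0.0));              // true
        System.out.println(boxedEquals(0.0, -0.0));                  // false
    }
}
```
</pre>
        Using the `equals` behavior for fields keeps `==` on value objects reflexive<br>
        even when a field holds `NaN`, and lets it distinguish the two zeros.<br>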

        <br>

        ### Value records<br>

        <br>

        While records have a lot in common with value classes -- they

        are final and<br>

        their fields are final -- they are still identity classes. 

        Records embody a<br>

        tradeoff: give up on decoupling the API from the representation,

        and in return<br>

        get various syntactic and semantic benefits.  Value classes

        embody another<br>

        tradeoff: give up identity, and get various semantic and

        performance benefits.<br>

        If we are willing to give up both, we can get both sets of

        benefits.  <br>

        <br>

        ```<br>

        value record NameAndScore(String name, int score) { }<br>

        ```<br>

        <br>

        Value records combine the data-carrier idiom of records with the

        improved <br>

        scalarization and flattening benefits of value classes.  <br>
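        <br>
        For contrast, here is how an ordinary (identity) record behaves in current<br>
        Java, without the `value` modifier.  The helper class and method names are<br>
        hypothetical, and the sketch assumes a Java version with records (16 or later).<br>
        <br>
<pre>
```java
// An ordinary record: compared by state under equals, but still an identity class.
record NameAndScore(String name, int score) { }

final class RecordIdentityDemo {
    // Reference comparison: distinguishes separately created instances.
    static boolean sameIdentity(Object a, Object b) {
        return a == b;
    }

    // State comparison: records derive equals from their components.
    static boolean sameState(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        NameAndScore first = new NameAndScore("ada", 10);
        NameAndScore second = new NameAndScore("ada", 10);
        System.out.println(sameState(first, second));    // true
        System.out.println(sameIdentity(first, second)); // false
    }
}
```
</pre>
        Under the rules described in the Equality section, a value record would be<br>
        expected to make the `==` comparison agree with the state comparison.<br>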

        <br>

        In theory, it would be possible to apply `value` to certain

        enums as well, but<br>

        this is not currently possible because the `java.lang.Enum` base

        class that<br>

        enums extend does not meet the requirements for superclasses of

        value classes (it<br>

        has fields and non-empty constructors).<br>

        <br>

        ## Unboxing values for flatness and density<br>

        <br>

        Value classes shed object identity, gaining a host of

        performance and<br>

        predictability benefits in the process.  They are an ideal

        replacement for many<br>

        of today's value-based classes, fully preserving their semantics

        (except for the<br>

        accidental identity these classes never wanted).  But

        identity-free reference<br>

        types are only one point on a spectrum of tradeoffs between

        abstraction and<br>

        performance, and other desired use cases -- such as numerics --

        may want a<br>

        different set of tradeoffs.<br>

        <br>

        Reference types are nullable, and therefore must account for

        null somehow in<br>

        their representation, which may involve additional footprint. 

        Similarly, they<br>

        offer the initialization safety guarantees for final fields that

        we have come to<br>

        expect from identity objects, which may entail limits on

        flatness.  For certain<br>

        use cases, it may be desirable to additionally give up something

        else to make<br>

        further flatness and footprint gains -- and that something else

        is<br>

        reference-ness.<br>

        <br>

        The built-in primitives are best understood as _pairs_ of types:

        a primitive<br>

        type (e.g., `int`) and its reference companion or box

        (`Integer`), with<br>

        conversions between the two (boxing and unboxing.)  We have both

        types because<br>

        the two have different characteristics.  Primitives are

        optimized for efficient<br>

        storage and access: they are not nullable, they tolerate

        uninitialized (zero)<br>

        values, and larger primitive types (`long`, `double`) may tear

        under racy<br>

        access.  References err on the side of safety and flexibility;

        they support<br>

        nullity, polymorphism, and offer initialization safety (freedom

        from tearing),<br>

        but by comparison to primitives, they pay a footprint and

        indirection cost.  <br>

        <br>

        For these reasons, value classes give rise to pairs of types as

        well: a<br>

        reference type and a _value companion type_.  We've seen the

        reference type so<br>

        far; for a value class `Point`, the reference type is called

        `Point`.  (The full<br>

        name for the reference type is `Point.ref`; `Point` is an alias

        for that.)  The<br>

        value companion type is called `Point.val`, and the two types

        have the same<br>

        conversions between them as primitives do today with their

        boxes.  (If we are<br>

        talking explicitly about the value companion type of a value

        class, we may<br>

        sometimes describe the corresponding reference type as its

        _reference<br>

        companion_.)<br>

        <br>

        ```<br>

        value class Point implements Serializable {<br>

            int x;<br>

            int y;<br>

        <br>

            Point(int x, int y) { <br>

                this.x = x;<br>

                this.y = y;<br>

            }<br>

        <br>

            Point scale(int s) { <br>

                return new Point(s*x, s*y);<br>

            }<br>

        }<br>

        ```<br>

        <br>

        The default value of the value companion type is the one for

        which all fields<br>

        take on their default value; the default value of the reference

        type is, like<br>

        all reference types, null.  <br>

        <br>

        In our diagram, these new types show up as another entity that

        straddles the<br>

        line between primitives and identity-free references, alongside

        the legacy<br>

        primitives: <br>

        <br>

        ** UPDATE DIAGRAM **<br>

        <br>

        <figure><br>

          <a href="field-type-zoo.pdf" title="Click for PDF"><br>

            <img src="field-type-zoo-new.png" alt="Java field types

        with extended primitives"/><br>

          </a><br>

        </figure><br>

        <br>

        ### Member access<br>

        <br>

        Both the reference and value companion types are seen to have

        the same instance<br>

        members.  Unlike today's primitives, value companion types can

        be used as<br>

        receivers to access fields and invoke methods, subject to

        accessibility<br>

        constraints: <br>

        <br>

        ```<br>

        Point.val p = new Point(1, 2);<br>

        assert p.x == 1;<br>

        <br>

        p = p.scale(2);<br>

        assert p.x == 2;<br>

        ```<br>

        <br>

        ### Polymorphism<br>

        <br>

        When we declare a class today, we set up a subtyping (is-a)

        relationship between<br>

        the declared class and its supertypes.  When we declare a value

        class, we set up<br>

        a subtyping relationship between the _reference type_ and the

        declared<br>

        supertypes. This means that if we declare:<br>

        <br>

        ```<br>

        value class UnsignedShort extends Number <br>

                                  implements

        Comparable<UnsignedShort> { <br>

           ...<br>

        }<br>

        ```<br>

        <br>

        then `UnsignedShort` is a subtype of `Number` and

        `Comparable<UnsignedShort>`,<br>

        and we can ask questions about subtyping using `instanceof` or

        pattern matching.<br>

        What happens if we ask such a question of the value companion

        type?<br>

        <br>

        ```<br>

        UnsignedShort.val us = ...<br>

        if (us instanceof Number) { ... }<br>

        ```<br>

        <br>

        Since subtyping is defined only on reference types, the

        `instanceof` operator<br>

        (and corresponding type patterns) will behave as if both sides

        were lifted to<br>

        the appropriate reference type, and we can answer the question

        that way.  (This<br>

        may trigger fears of expensive boxing conversions, but in

        reality no actual<br>

        allocation will happen.)<br>

        <br>

        We introduce a new relationship based on `extends` /

        `implements` clauses, which<br>

        we'll call "extends"; we define `A extends B` as meaning `A

        <: B` when A is a<br>

        reference type, and `A.ref <: B` when A is a value companion

        type.  The<br>

        `instanceof` relation, reflection, and pattern matching are

        updated to use<br>

        "extends".<br>

        <br>

        ### Arrays<br>

        <br>

        Arrays of reference types are _covariant_; this means that if `A

        <: B`, then<br>

        `A[] <: B[]`.  This allows `Object[]` to be the "top array

        type", at least for<br>

        arrays of references.  But arrays of primitives are currently

        left out of this<br>

        story.   We can unify the treatment of arrays by defining array

        covariance over<br>

        the new "extends" relationship; if A extends B, then `A[] <:

        B[]`.  For a value<br>

        class P, `P.val[] <: P.ref[] <: Object[]`, finally making

        `Object[]` the top<br>

        type for all arrays.<br>
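        <br>
        The gap that this unification closes is visible in current Java: arrays of<br>
        references are covariant, but arrays of primitives are not subtypes of<br>
        `Object[]`.  The following sketch uses hypothetical class and method names and<br>
        shows only today's behavior.<br>
        <br>
<pre>
```java
final class ArrayCovarianceDemo {
    // True when the argument can be viewed as an Object[].
    static boolean isObjectArray(Object candidate) {
        return candidate instanceof Object[];
    }

    // Covariance is checked at runtime: storing a String through an
    // Object[] view of an Integer[] fails with ArrayStoreException.
    static boolean storeOfWrongTypeFails() {
        Object[] view = new Integer[1];
        try {
            view[0] = "not an Integer";
            return false;
        } catch (ArrayStoreException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isObjectArray(new Integer[1])); // true
        System.out.println(isObjectArray(new int[1]));     // false
        System.out.println(storeOfWrongTypeFails());       // true
    }
}
```
</pre>
        With covariance defined over "extends", the analogue of the second check for<br>
        a `P.val[]` would be expected to succeed.<br>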

        <br>

        ### Equality<br>

        <br>

        Just as with `instanceof`, we define `==` on values by appealing

        to the<br>

        reference companion (though no actual boxing need occur). 

        Evaluating `a == b`,<br>

        where one or both operands are of a value companion type, can be

        defined as if<br>

        the operands are first converted to their corresponding

        reference type, and then<br>

        comparing the results.  This means that the following will

        succeed: <br>

        <br>

        ```<br>

        Point.val p = new Point(3, 4);<br>

        Point pr = p;<br>

        assert p == pr;<br>

        ```<br>

        <br>

        The base implementation of `Object::equals` delegates to `==`,

        which is a<br>

        suitable default for both identity and value classes.  <br>

        <br>

        ### Serialization<br>

        <br>

        If a value class implements `Serializable`, this is also really

        a statement<br>

        about the reference type.  Just as with other aspects described

        here,<br>

        serialization of value companions can be defined by converting

        to the<br>

        corresponding reference type and serializing that, and reversing

        the process at<br>

        deserialization time.<br>

        <br>

        Serialization currently uses object identity to preserve the

        topology of an<br>

        object graph.  This generalizes cleanly to objects without

        identity, because<br>

        `==` on value objects treats two identical copies of a value

        object as equal.  <br>

        So any observations we make about graph topology prior to

        serialization with<br>

        `==` are consistent with those after deserialization.<br>

        <br>

        ### Identity-sensitive operations<br>

        <br>

        Certain operations are currently defined in terms of object

        identity.  As we've<br>

        already seen, some of these, like equality, can be sensibly

        extended to cover<br>

        all instances.  Others, like synchronization, will become

        partial.  <br>

        Identity-sensitive operations include:<br>

        <br>

          - **Equality.**  We extend `==` on references to include

        references to value<br>

            objects.  Where it currently has a meaning, the new

        definition coincides<br>

            with that meaning.<br>

        <br>

          - **System::identityHashCode.**  The main use of

        `identityHashCode` is in the<br>

            implementation of data structures such as

        `IdentityHashMap`.  We can extend<br>

            `identityHashCode` in the same way we extend equality --

        deriving a hash on<br>

            value objects from the hashes of all the fields.<br>

        <br>

          - **Synchronization.**  This becomes a partial operation.  If

        we can<br>

            statically detect that a synchronization will fail at

        runtime (including<br>

            declaring a `synchronized` method in a value class), we can

        issue a<br>

            compilation error; if not, attempts to lock on a value

        object result in<br>

            `IllegalMonitorStateException`.  This is justifiable because

        it is<br>

            intrinsically imprudent to lock on an object for which you

        do not have a<br>

            clear understanding of its locking protocol; locking on an

        arbitrary<br>

            `Object` or interface instance is doing exactly that.<br>

        <br>

          - **Weak, soft, and phantom references.**  Capturing an exotic

        reference to a<br>

            value object becomes a partial operation, as these are

        intrinsically tied to<br>

            reachability (and hence to identity).  However, we will

        likely make<br>

            enhancements to `WeakHashMap` to support mixed identity and

        value keys.  <br>

            <br>

        ### What about Object?<br>

        <br>

        The root class `Object` poses an unusual problem, in that every

        class must<br>

        extend it directly or indirectly, but it is also instantiable

        (non-abstract),<br>

        and its instances have identity -- it is common to use `new

        Object()` as a way<br>

        to obtain a new object identity for purposes of locking.  <br>

        <br>

        ## Why two types?<br>

        <br>

        It is sensible to ask: why do we need companion types at all? 

        This is analogous<br>

        to the need for boxes in 1995: we'd made one set of tradeoffs

        for primitives,<br>

        favoring performance (non-nullable, zero-default, tolerant of<br>

        non-initialization, tolerant of tearing under race, unrelated to

        `Object`), and<br>

        another for references, favoring flexibility and safety.  Most

        of the time, we<br>

        ignored the primitive wrapper classes, but sometimes we needed

        to temporarily<br>

        suppress one of these properties, such as when interoperating

        with code that<br>

        expects an `Object` or the ability to express "no value".  The

        reasons we needed<br>

        boxes in 1995 still apply today: sometimes we need the

        affordances of<br>

        references, and in those cases, we appeal to the reference

        companion.  <br>

        <br>

        Reasons we might want to use the reference companion include: <br>

        <br>

         - **Interoperation with reference types.**  Value classes can

        implement<br>

           interfaces and extend classes (including `Object` and some

        abstract classes),<br>

           which means some class and interface types are going to be

        polymorphic over<br>

           both identity and value objects.  This polymorphism is

        achieved through<br>

           object references; a reference to `Object` may be a reference

        to an identity<br>

           object, or a reference to a value object.  <br>

        <br>

         - **Nullability.**  Nullability is an affordance of object

        _references_, not<br>

           objects themselves.  Most of the time, it makes sense that

        value types<br>

           are non-nullable (as the primitives are today), but there may

        be situations<br>

           where null is a semantically important value.  Using the

        reference companion<br>

           when nullability is required is semantically clear, and

        avoids the need to<br>

           invent new sentinel values for "no value."<br>

        <br>

           This need comes up when migrating existing classes; the

        method `Map::get`<br>

           uses `null` to signal that the requested key was not present

        in the map. But,<br>

           if the `V` parameter to `Map` is a value type, `null` is

        not a valid<br>

           value.  We can capture the "`V` or null" requirement by

        changing the<br>

           descriptor of `Map::get` to:<br>

        <br>

           ```<br>

           public V.ref get(K key);<br>

           ```<br>

        <br>

           where, whatever type `V` is instantiated as, `Map::get`

        returns the reference<br>

           companion. (For a type `V` that already is a reference type,

        this is just `V`<br>

           itself.) This captures the notion that the return type of

        `Map::get` will<br>

           either be a reference to a `V`, or the `null` reference.

        (This is a<br>

           compatible change, since both erase to the same thing.)<br>

        <br>

        <br>

         - **Self-referential types.**  Some types may want to directly

        or indirectly<br>

           refer to themselves, such as the "next" field in the node

        type of a linked<br>

           list:<br>

        <br>

           ```<br>

           class Node<T> {<br>

               T theValue;<br>

               Node<T> nextNode;<br>

           }<br>

           ```<br>

        <br>

           We might want to represent this as a value class, but if the

        type of<br>

           `nextNode` were `Node.val<T>`, the layout of `Node`

        would be<br>

           self-referential, since we would be trying to flatten a

        `Node` into its own<br>

           layout. <br>

        <br>

         - **Protection from tearing.**  For a value class with a

        non-atomic value<br>

           companion type, we may want to use the reference companion in

        cases where we<br>

           are concerned about tearing; because loads and stores of

        references are<br>

           atomic, `C.ref` is immune to the tearing under race that

        `C.val` might be<br>

           subject to.<br>

        <br>

         - **Compatibility with existing boxing.**  Autoboxing is

        convenient, in that it<br>

           lets us pass a primitive where a reference is required.  But

        boxing affects<br>

           far more than assignment conversion; it also affects method

        overload<br>

           selection.  The rules are designed to prefer overloads that

        require no<br>

           conversions to those requiring boxing (or varargs)

        conversions.  Having both<br>

           a value and reference type for every value class means that

        these rules can<br>

           be cleanly and intuitively extended to cover value classes.<br>
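
        <br>

        The overload-selection rules mentioned in the last bullet can be observed<br>
        today with `int` and `Integer`, which stand in here for a value companion<br>
        and its reference companion.  The following is a minimal sketch in current<br>
        Java (not Valhalla syntax); the class and method names are hypothetical.  It<br>
        shows that an overload needing no boxing conversion, even one that needs a<br>
        widening conversion, is preferred over one that requires boxing.<br>
        <br>
<pre>
```java
// Illustrative sketch in today's Java; all names here are hypothetical.
class OverloadSelectionSketch {
    static String exact(int x)     { return "int"; }
    static String exact(Integer x) { return "Integer"; }

    // No overload takes int directly: the compiler must widen or box.
    static String widenOrBox(long x)    { return "long"; }
    static String widenOrBox(Integer x) { return "Integer"; }

    public static void main(String[] args) {
        System.out.println(exact(42));                  // int
        System.out.println(exact(Integer.valueOf(42))); // Integer
        System.out.println(widenOrBox(42));             // long
    }
}
```
</pre>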

        <br>

        ## Refining the value companion<br>

        <br>

        Value classes have several options for refining the behavior of

        the value<br>

        companion type and how it is exposed to clients.<br>

        <br>

        ### Classes with no good default value<br>

        <br>

        For a value class `C`, the default value of `C.ref` is the same

        as any other<br>

        reference type: `null`.  For the value companion type `C.val`,

        the default value<br>

        is the one where all of its fields are initialized to their

        default value.  <br>

        <br>

        The built-in primitives reflect the design assumption that zero

        is a reasonable<br>

        default.  The choice to use a zero default for uninitialized

        variables was one<br>

        of the central tradeoffs in the design of the built-in

        primitives.  It gives us<br>

        a usable initial value (most of the time), and requires less

        storage footprint<br>

        than a representation that supports null (`int` uses all 2^32 of

        its bit<br>

        patterns, so a nullable `int` would have to either make some 32

        bit signed<br>

        integers unrepresentable, or use a 33rd bit).  This was a

        reasonable tradeoff<br>

        for the built-in primitives, and is also a reasonable tradeoff

        for many (but not<br>

        all) other potential value classes (such as complex numbers, 2D

        points,<br>

        half-floats, etc.).<br>

        <br>

        But for other potential value classes, such as `LocalDate`,

        there _is_ no<br>

        reasonable default.  If we choose to represent a date as the

        number of days<br>

        since some epoch, there will invariably be bugs that stem

        from<br>

        uninitialized dates; we've all been mistakenly told by computers

        that something<br>

        will happen on or near 1 January 1970.  Even if we could choose

        a default other<br>

        than the zero representation, an uninitialized date is still

        likely to be an<br>

        error -- there simply is no good default date value.  <br>
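
        <br>

        To make the 1 January 1970 failure mode concrete, here is a small sketch in<br>
        current Java.  It assumes a hypothetical representation that stores a date<br>
        as a count of days since the epoch in a `long`; an array slot that was never<br>
        written holds zero, which decodes to a plausible but wrong date.<br>
        <br>
<pre>
```java
import java.time.LocalDate;

// Illustrative sketch; the class and method names are hypothetical.
class ZeroDefaultDateSketch {
    // Assumed representation: a date stored as days since 1970-01-01.
    static LocalDate decode(long daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch);
    }

    // Reads a slot that was allocated but never assigned.
    static LocalDate readUninitializedSlot() {
        long[] storedDates = new long[3]; // every element defaults to 0
        return decode(storedDates[0]);
    }

    public static void main(String[] args) {
        System.out.println(readUninitializedSlot()); // 1970-01-01
    }
}
```
</pre>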

        <br>

        For this reason, value classes have the choice of encapsulating

        or exposing<br>

        their value companion type.  If the class is willing to tolerate

        an<br>

        uninitialized (zero) value, it can freely share its `.val`

        companion with the<br>

        world; if uninitialized values are dangerous (such as for

        `LocalDate`), the companion can<br>

        be encapsulated within the class or package.  <br>

        <br>

        Encapsulation is accomplished using ordinary access control.  By

        default, the<br>

        value companion is `private`, and need not be declared

        explicitly; a class that<br>

        wishes to share its value companion can make it public:<br>

        <br>

        ```<br>

        public value record Complex(double real, double imag) { <br>

            public value companion Complex.val;<br>

        }<br>

        ```<br>

        <br>

        ### Atomicity and tearing<br>

        <br>

        For the primitive types longer than 32 bits (long and double),

        it is not<br>

        guaranteed that reads and writes from different threads (without

        suitable<br>

        coordination) are atomic with respect to each other.  The result

        is that, if<br>

        accessed under data race, a long or double field or array

        element can be seen to<br>

        "tear", and a read might see the low 32 bits of one write and

        the high 32 bits<br>

        of another.  (Declaring the containing field `volatile` is

        sufficient to restore<br>

        atomicity, as is properly coordinating with locks or other

        concurrency control,<br>

        or not sharing across threads in the first place.)<br>
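
        <br>

        The shape of a torn value can be worked out with plain bit arithmetic.  The<br>
        sketch below does not provoke a real data race (which would be<br>
        nondeterministic); it only computes what a reader would observe if it<br>
        combined the high 32 bits of one write with the low 32 bits of another.  The<br>
        class and method names are hypothetical.<br>
        <br>
<pre>
```java
// Illustrative sketch; simulates a torn read, it does not race threads.
class TornLongSketch {
    // Combines the high half of one written value with the low half of another.
    static long tornRead(long highSource, long lowSource) {
        long highBits = highSource & 0xFFFFFFFF00000000L;
        long lowBits  = lowSource  & 0x00000000FFFFFFFFL;
        return highBits | lowBits;
    }

    public static void main(String[] args) {
        long first = 0L;   // all bits clear
        long second = -1L; // all bits set
        // Neither writer stored this value, yet a torn read could observe it.
        System.out.println(tornRead(first, second)); // 4294967295
    }
}
```
</pre>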

        <br>

        This was a pragmatic tradeoff given the hardware of the time;

        the cost of 64-bit<br>

        atomicity on 1995 hardware would have been prohibitive, and

        problems only arise<br>

        when the program already has data races -- and most numeric code

        deals with<br>

        thread-local data.  Just like with the tradeoff of nulls vs

        zeros, the design of<br>

        the built-in primitives permits tearing as part of a tradeoff

        between<br>

        performance and correctness, where primitives chose "as fast as

        possible" and<br>

        reference types chose more safety.<br>

        <br>

        Today, most JVMs give us atomic loads and stores of 64-bit

        primitives, because<br>

        the hardware makes them cheap enough.  But value classes bring

        us back to<br>

        1995; atomic loads and stores of larger-than-64-bit values are

        still expensive<br>

        on many CPUs, leaving us with a choice between making operations on

        value types slower<br>

        and permitting tearing when accessed under race.  <br>

        <br>

        It would not be wise for the language to select a

        one-size-fits-all policy about<br>

        tearing; choosing "no tearing" means that types like `Complex`

        are slower than<br>

        they need to be, even in a single-threaded program; choosing

        "tearing" means<br>

        that classes like `Range` can be seen to not exhibit invariants

        asserted by<br>

        their constructor.  Class authors have to choose, with full

        knowledge of their<br>

        domain, whether their types can tolerate tearing.  The default

        is no tearing<br>

        (safe by default); a class can opt for greater flattening at the

        cost of<br>

        potential tearing by declaring the value companion as

        `non-atomic`:<br>

        <br>

        ```<br>

        public value record Complex(double real, double imag) { <br>

            public non-atomic value companion Complex.val;<br>

        }<br>

        ```<br>

        <br>

        For classes like `Complex`, all of whose bit patterns are valid,

        this is very<br>

        much like the choice around `long` in 1995.  For other classes

        that might have<br>

        nontrivial representational invariants, they likely want to

        stick to the default<br>

        of atomicity.  <br>
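
        <br>

        The difference between the two kinds of class can be illustrated by<br>
        simulating a torn read over two fields.  This sketch uses an ordinary record<br>
        in current Java as a stand-in for a hypothetical `Range` value class with the<br>
        invariant that `lo` does not exceed `hi`; the names and helper methods are<br>
        assumptions for illustration, and no real race takes place.<br>
        <br>
<pre>
```java
// Illustrative sketch; simulates field-level tearing without real threads.
class TornRangeSketch {
    // Stand-in for a value class whose constructor enforces lo not above hi.
    record Range(int lo, int hi) {
        Range {
            if (lo > hi) {
                throw new IllegalArgumentException("lo must not exceed hi");
            }
        }
    }

    // Fields a reader might observe if it saw lo from the newer write
    // and hi from the older one.
    static int[] tornFields(Range older, Range newer) {
        return new int[] { newer.lo(), older.hi() };
    }

    static boolean invariantHolds(int lo, int hi) {
        return !(lo > hi);
    }

    public static void main(String[] args) {
        int[] seen = tornFields(new Range(1, 2), new Range(5, 6));
        // Both writes were valid, but the mixture (5, 2) is not.
        System.out.println(invariantHolds(seen[0], seen[1])); // false
    }
}
```
</pre>
        For `Complex`, any mixture of two valid values is itself a valid value, which<br>
        is why tolerating tearing is a reasonable choice there.<br>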

        <br>

        ## Migrating legacy primitives<br>

        <br>

        As part of generalizing primitives, we want to adjust the

        built-in primitives to<br>

        behave as consistently with value classes as possible.  While we

        can't change<br>

        the fact that `int`'s reference companion is the oddly-named

        `Integer`, we can give them<br>

        more uniform aliases (`int.ref` is an alias for `Integer`; `int`

        is an alias for<br>

        `Integer.val`) -- so that we can use a consistent rule for

        naming companions.<br>

        Similarly, we can extend member access to the legacy primitives,

        and allow<br>

        `int[]` to be a subtype of `Integer[]` (and therefore of

        `Object[]`).<br>

        <br>

        We will redeclare `Integer` as a value class with a public value

        companion:<br>

        <br>

        ```<br>

        value class Integer { <br>

            public value companion Integer.val;<br>

        <br>

            // existing methods<br>

        }<br>

        ```<br>

        <br>

        where the type name `int` is an alias for `Integer.val`.  The

        primitive array<br>

        types will be retrofitted such that arrays of primitives are

        subtypes of arrays<br>

        of their boxes (`int[] <: Integer[]`).  <br>
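
        <br>

        For comparison, the array subtyping that exists today can be checked with a<br>
        small sketch in current Java (the class and method names are hypothetical).<br>
        Arrays of boxes are already covariant, so an `Integer[]` is an `Object[]`; an<br>
        `int[]` is not, and that is the gap the retrofit described above is meant to<br>
        close.<br>
        <br>
<pre>
```java
// Illustrative sketch of array subtyping in today's Java.
class ArraySubtypingSketch {
    static boolean isObjectArray(Object candidate) {
        return candidate instanceof Object[];
    }

    public static void main(String[] args) {
        System.out.println(isObjectArray(new Integer[] { 1, 2 })); // true
        System.out.println(isObjectArray(new int[] { 1, 2 }));     // false
    }
}
```
</pre>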

        <br>

        ## Unifying primitives with classes<br>

        <br>

        Earlier, we had a chart of the differences between primitive and

        reference<br>

        types: <br>

        <br>

        | Primitives                                 |

        Objects                            |<br>

        | ------------------------------------------ |

        ---------------------------------- |<br>

        | No identity (pure values)                  |

        Identity                           |<br>

        | `==` compares values                       | `==` compares

        object identity      |<br>

        | Built-in                                   | Declared in

        classes                |<br>

        | No members (fields, methods, constructors) | Members

        (including mutable fields) |<br>

        | No supertypes or subtypes                  | Class and

        interface inheritance    |<br>

        | Accessed directly                          | Accessed via

        object references     |<br>

        | Not nullable                               |

        Nullable                           |<br>

        | Default value is zero                      | Default value is

        null              |<br>

        | Arrays are monomorphic                     | Arrays are

        covariant               |<br>

        | May tear under race                        | Initialization

        safety guarantees   |<br>

        | Have reference companions (boxes)          | Don't need

        reference companions    |<br>

        <br>

        The addition of value classes addresses many of these directly. 

        Rather than<br>

        saying "classes have identity, primitives do not", we make

        identity an optional<br>

        characteristic of classes (and derive equality semantics from

        that).  Rather<br>

        than primitives being built in, we derive all types, including

        primitives, from<br>

        classes, and endow value companion types with the members and

        supertypes<br>

        declared with the value class.  Rather than having primitive

        arrays be<br>

        monomorphic, we make all arrays covariant under the `extends`

        relation.  <br>

        <br>

        The remaining differences now become differences between

        reference types and<br>

        value types:<br>

        <br>

        | Value types                                   | Reference

        types                  |<br>

        | --------------------------------------------- |

        -------------------------------- |<br>

        | Accessed directly                             | Accessed via

        object references   |<br>

        | Not nullable                                  |

        Nullable                         |<br>

        | Default value is zero                         | Default value

        is null            |<br>

        | May tear under race, if declared `non-atomic` | Initialization

        safety guarantees |<br>

        <br>

        <br>

        ### Choosing which to use<br>

        <br>

        How would we choose between declaring an identity class or a

        value class, and<br>

        the various options on value companions?  Here are some quick

        rules of thumb: <br>

        <br>

         - If you need mutability, subclassing, or aliasing, choose an

        identity class.  <br>

         - If uninitialized (zero) values are unacceptable, choose a

        value class with <br>

           the value companion encapsulated.  <br>

         - If you have no cross-field invariants and are willing to

        tolerate tearing to<br>

           enable more flattening, choose a value class with a

        non-atomic value<br>

           companion.<br>

        <br>

        ## Summary<br>

        <br>

        Valhalla unifies, to the extent possible, primitives and

        objects.   The<br>

        following table summarizes the transition from the current world

        to Valhalla.<br>

        <br>

        | Current World                               |

        Valhalla                                                  |<br>

        | ------------------------------------------- |

        --------------------------------------------------------- |<br>

        | All objects have identity                   | Some objects

        have identity                                |<br>

        | Fixed, built-in set of primitives           | Open-ended set

        of primitives, declared via classes        |<br>

        | Primitives don't have methods or supertypes | Primitives are

        classes, with methods and supertypes       |<br>

        | Primitives have ad-hoc boxes                | Primitives have

        regularized reference companions          |<br>

        | Boxes have accidental identity              | Reference

        companions have no identity                     |<br>

        | Boxing and unboxing conversions             | Primitive

        reference and value conversions, but same rules |<br>

        | Primitive arrays are monomorphic            | All arrays are

        covariant                                  |<br>

        <br>

        <br>

        [valuebased]:

<a class="moz-txt-link-freetext" href="https://docs.oracle.com/javase/8/docs/api/java/lang/doc-files/ValueBased.html">https://docs.oracle.com/javase/8/docs/api/java/lang/doc-files/ValueBased.html</a><br>

        [growing]: <a class="moz-txt-link-freetext" href="https://dl.acm.org/doi/abs/10.1145/1176617.1176621">https://dl.acm.org/doi/abs/10.1145/1176617.1176621</a><br>

        [jep390]: <a class="moz-txt-link-freetext" href="https://openjdk.java.net/jeps/390">https://openjdk.java.net/jeps/390</a><br>

        <br>

      </font></font>

  </body>

</html>