<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<font size="4"><font face="monospace">As we've hinted at, we've made
some progress refining the essential differences between
primitive and reference types, which has enabled us to shed the
`.val` / `.ref` distinction and lean more heavily on
nullability. The following document outlines the observations
that have enabled this current turn of direction and some of its
consequences. <br>
<br>
This document is mostly to be interpreted in the context of the
Valhalla journey, and so talks about where we were a few months
ago and where we're heading now. <br>
<br>
<br>
<br>
# Rehabilitating primitive classes: a nullity-centric approach<br>
<br>
Over the course of Project Valhalla, we have observed that there
are two<br>
distinct groups of value types. We've tried stacking them in
various ways, but<br>
there are always two groups, which we've historically described
as "objects<br>
without identity" and "primitive classes", and which admit
different degrees of<br>
flattening. <br>
<br>
The first group, which we are now calling "value objects" or
"value classes",<br>
represent the minimal departure from traditional classes to
disavow object<br>
identity. The existing classes that are described as
"value-based", such as<br>
`Optional` or `LocalDate`, are candidates for migrating to value
classes. Such<br>
classes give up object identity; identity-sensitive behaviors
are either recast<br>
as state-based (such as for `==` and
`Objects::identityHashCode`) or partialized<br>
(`synchronized`, `WeakReference`), and such classes must live
without the<br>
affordances of identity (mutability, layout polymorphism.) In
return, they<br>
avoid being burdened by "accidental identity" which can be a
source of bugs, and<br>
gain significant optimization for stack-based values (e.g.,
scalarization in<br>
calling convention) and other JIT optimizations. <br>
<br>
The second group, which we had been calling "primitive classes"
(we are now<br>
moving away from that term), are those that are more like the
existing<br>
primitives, such as `Decimal` or `Complex`. Where ordinary
value classes, like<br>
identity classes, gave rise to a single (reference) type, these
classes gave<br>
rise to two types, a value type (`X.val`) and a reference type
(`X.ref`). This<br>
pair of types was directly analogous to legacy primitives and
their boxes. These<br>
classes come with more restrictions and more to think about, but
are rewarded<br>
with greater heap flattening. This model -- after several
iterations -- seemed<br>
to meet the goals for expressiveness and performance: we can
express the<br>
difference between `int`-like behavior and `Integer`-like
behavior, and get<br>
routine flattening for `int`-like types. But the result still
had many<br>
imbalances; the distinction was heavyweight, and a significant
fraction of the<br>
incremental specification complexity was centered only on these
types. We<br>
eventually concluded that the source of this was trying to model
the `int` /<br>
`Integer` distinction directly, and that this distinction, while
grounded in<br>
user experience, was just not "primitive" enough. <br>
<br>
In this document, we will break down the characteristics of
so-called "primitive<br>
classes" into more "primitive" (and hopefully less ad-hoc)
distinctions. This<br>
results in a simpler model, streamlines the syntactic baggage,
and enables us to<br>
finally reunite with an old friend, null-exclusion (bang)
types. Rather than<br>
treating "value types" and "reference types" as different
things, we can treat<br>
the existing primitives (and the "value projection" of
user-defined primitive<br>
classes) as being restricted references, whose restrictions
enable the desired<br>
runtime properties. <br>
<br>
## Primitives and objects<br>
<br>
In a previous edition of _State of Valhalla_, we outlined a host
of differences<br>
between primitives and objects:<br>
<br>
| Primitives                                  | Objects                                    |<br>
| -------------------------------------------- | ------------------------------------------ |<br>
| No identity (pure values)                    | Identity                                   |<br>
| `==` compares state                          | `==` compares object identity              |<br>
| Built-in                                     | Declared in classes                        |<br>
| No members (fields, methods, constructors)   | Members (including mutable fields)         |<br>
| No supertypes or subtypes                    | Class and interface inheritance            |<br>
| Represented directly in memory               | Represented indirectly through references  |<br>
| Not nullable                                 | Nullable                                   |<br>
| Default value is zero                        | Default value is null                      |<br>
| Arrays are monomorphic                       | Arrays are covariant                       |<br>
| May tear under race                          | Initialization safety guarantees           |<br>
| Have reference companions (boxes)            | Don't need reference companions            |<br>
<br>
Over many iterations, we have chipped away at this list, mostly
by making<br>
classes richer: value classes can disavow identity (and thereby
opt into<br>
state-based `==` comparison); the lack of members and supertypes
are an<br>
accidental restriction that can go away with declarable value
classes; we can<br>
make primitive arrays covariant with arrays of their boxes; we
can let some<br>
class declarations opt into non-atomicity under race. That
leaves the<br>
following, condensed list of differences: <br>
<br>
| Primitives                          | Objects                                    |<br>
| ----------------------------------- | ------------------------------------------ |<br>
| Represented directly in memory      | Represented indirectly through references  |<br>
| Not nullable                        | Nullable                                   |<br>
| Default value is zero               | Default value is null                      |<br>
| Have reference companions (boxes)   | Don't need reference companions            |<br>
<br>
The previous approach ("primitive classes") started with the
assumption that<br>
this is the list of things to be modeled by the value/reference
distinction. In<br>
this document we go further, by showing that flattening (direct
representation)<br>
is derived from more basic principles around nullity and
initialization<br>
requirements, and perhaps surprisingly, the concept of
"primitive type" can<br>
disappear almost completely, save only for historical vestiges
related to the<br>
existing eight primitives. The `.val` type can be replaced by
restricted<br>
references whose restrictions enable the desired
representational properties. As<br>
is consistent with the goals of Valhalla, flattenability is an
emergent<br>
property, gained by giving up those properties that would
undermine<br>
flattenability, rather than being a linguistic concept on its
own.<br>
<br>
### Initialization<br>
<br>
The key distinction between today's primitives and objects has
to do with<br>
_initialization requirements_. Primitives are designed to be
_used<br>
uninitialized_; if we declare a field `int count`, it is
reliably initialized to<br>
zero by the JVM before any code can access it. This initial
value is a<br>
perfectly good default, and it is not a bug to read or even
increment this field<br>
before it has been explicitly assigned a value by the program,
because it has<br>
_already_ been initialized to a known good value by the JVM.
The zero value<br>
pre-written by the JVM is not just a safety net; it is actually
part of the<br>
programming model that primitives start out life with "good
enough" defaults.<br>
This is part of what it means to be a primitive type.<br>
<br>
Objects, on the other hand, are not designed for uninitialized
use; they must be<br>
initialized via constructors before use. The default zero
values written to an<br>
object's fields by the JVM typically do not constitute a valid state according<br>
to the class's specification, and even if they did, they would rarely make a good<br>
default value. Therefore, we require that class instances be
initialized by<br>
their constructors before they can be exposed to the rest of the
program. To<br>
ensure that this happens, objects are referenced exclusively
through _object<br>
references_, which _can_ be safely used uninitialized -- because
they reliably<br>
have the usable default value of `null`. (Some may quibble with
this use of<br>
"safely" and "usable", because null references are fairly
limited, but they do<br>
their limited job correctly: we can easily and safely test
whether a reference<br>
is null, and if we accidentally dereference a null reference, we
get a clear<br>
exception rather than accessing uninitialized object state.) <br>
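<br>
A minimal illustration in today's Java (the `Counter` class and its fields are<br>
hypothetical, chosen only to show the two default-initialization behaviors): <br>
<br>
```<br>
class Counter {<br>
    int hits;        // the JVM reliably initializes this to 0; reading or<br>
                     // incrementing it before explicit assignment is well-defined<br>
    String label;    // the JVM reliably initializes this to null; it must be<br>
                     // assigned (or null-checked) before its state can be used<br>
<br>
    int touch() {<br>
        hits++;                  // fine: uses the JVM-written zero<br>
        return label.length();   // throws NullPointerException rather than<br>
                                 // exposing uninitialized object state<br>
    }<br>
}<br>
```<br>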
<br>
> Primitives can be safely used without explicit
initialization; objects cannot.<br>
> Object references are nullable _precisely because_ objects
cannot be used<br>
> safely without explicit initialization. <br>
<br>
### Nullability<br>
<br>
A key difference between today's primitives and references is
that primitives<br>
are non-nullable and references are nullable. One might think
this was<br>
primarily a choice of convenience: null is useful for references
as a universal<br>
sentinel, and not all that useful for primitives (when we want
nullable<br>
primitives we can use the box classes -- but we usually don't.)
But the<br>
reality is not one of convenience, but of necessity: nullability
is _required_<br>
for the safety of objects, and usually _detrimental_ to the
performance of<br>
primitives.<br>
<br>
Nullability for object references is a forced move because null is what<br>
prevents us from accessing uninitialized object state. Nullability for<br>
primitives is usually not needed, but that's not the only reason
primitives are<br>
non-nullable. If primitives were nullable, `null` would be
another state that<br>
would have to be represented in memory, and the costs would be
out of line with<br>
the benefits. Since a 64-bit `long` uses all of its bit
patterns, a nullable<br>
`long` would require at least 65 bits, and alignment
requirements would likely<br>
round this up to 128 bits, doubling memory usage. (The density
cost here is<br>
substantial, but it gets worse because most hardware today does
not have cheap<br>
atomic 128 bit loads and stores. Since tearing might conflate a
null value with<br>
a non-null value -- even worse than the usual consequences of
tearing -- this<br>
would push us strongly towards using an indirection instead.)
So<br>
non-nullability is a precondition for effective flattening and
density of<br>
primitives, and nullable primitives would involve giving up the
flatness and<br>
density that are the reason to have primitives in the first
place. <br>
<br>
> Nullability interferes with heap flattening.<br>
<br>
To summarize, the design of primitives and objects implicitly
stems from the<br>
following facts: <br>
<br>
- For most objects, the uninitialized (zeroed) state is either
invalid or not a<br>
good-enough default value;<br>
- For primitives, the uninitialized (zeroed) state is both
valid and a<br>
good-enough default value; <br>
- Having the uninitialized (zeroed) state be a good-enough
default is a<br>
precondition for reliable flattening;<br>
- Nullability is required when the uninitialized (zeroed)
state is not a<br>
good-enough default; <br>
- Nullability not only has a footprint cost, but often is an
impediment to<br>
flattening.<br>
<br>
> Primitives exist in the first place because they can be
flattened to give us<br>
> better numeric performance; flattening requires giving up
nullity and<br>
> tolerance of uninitialized (zero) values.<br>
<br>
These observations were baked into the language (and other
languages too), but<br>
the motivation for these decisions was then "erased" by the
rigid distinction<br>
between primitives and objects. Valhalla seeks to put that
choice back into the<br>
user's hands.<br>
<br>
### Getting the best of both worlds<br>
<br>
Project Valhalla promises the best of both worlds: sufficiently
constrained<br>
entities can "code like a class and work like an int." Classes
that give up<br>
object identity can get some of the runtime benefits of
primitives, but to get<br>
full heap flattening, we must embrace the two defining
characteristics of<br>
primitives described so far: non-nullability and safe
uninitialized use. <br>
<br>
Some candidates for value classes, such as `Complex`, are safe
to use<br>
uninitialized because the default (zero) value is a good initial
value. Others,<br>
like `LocalDate`, simply have no good default value (zero or
otherwise), and<br>
therefore need the initialization protocol enabled by null-default object<br>
references. This distinction is inherent to the semantics of the domain; some<br>
domains simply do not have a reasonable default value, and this is a choice that<br>
the class author must capture when the code is written. <br>
<br>
There is a long list of classes that are candidates to be value
classes; some<br>
are like `Complex`, but many are more like `LocalDate`. The
latter group can<br>
still benefit significantly from eliminating identity, but can't
necessarily get<br>
full heap flattening. The former group, which are most like
today's primitives,<br>
can get all the benefits, including heap flattening -- when
their instances are<br>
non-null. <br>
<br>
### Declaring value classes<br>
<br>
As in previous iterations, a class can be declared as a _value
class_:<br>
<br>
```<br>
value class LocalDate { ... }<br>
```<br>
<br>
A value class gives up identity and its consequences (e.g.,
mutability) -- and<br>
that's it. The resulting `LocalDate` type is still a reference
type, and<br>
variables of type `LocalDate` are still nullable. Instances can
get significant<br>
optimizations for on-stack use but are still usually represented
in the heap via<br>
indirections. <br>
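<br>
To make the consequences concrete, here is a sketch (assuming the proposed<br>
`value` modifier; the `Point` class and the comments are hypothetical): <br>
<br>
```<br>
value class Point {<br>
    private int x;<br>
    private int y;<br>
    public Point(int x, int y) { this.x = x; this.y = y; }<br>
}<br>
<br>
Point p = new Point(1, 2);<br>
Point q = new Point(1, 2);<br>
// p == q is true: with no identity to compare, == compares state instead<br>
Point r = null;   // still allowed: Point remains a nullable reference type<br>
```<br>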
<br>
### Implicitly constructible value classes<br>
<br>
In order to get the next group of benefits, a value class must
additionally<br>
attest that it can be used uninitialized. Because this is a
statement of how<br>
instances of this class come into existence, modeling this as a
special kind of<br>
constructor seems natural:<br>
<br>
```<br>
value class Complex { <br>
private double re;<br>
private double im;<br>
<br>
public implicit Complex();<br>
public Complex(double re, double im) { ... }<br>
<br>
...<br>
}<br>
```<br>
<br>
These two constructors say that there are two ways a `Complex`
instance comes<br>
into existence: the first is via the traditional constructor
that takes real and<br>
imaginary values (`new Complex(1.0, 1.0)`), and the second is
via the _implicit_<br>
constructor that produces the instance used to initialize fields
and array<br>
elements to their default values. That the implicit constructor
cannot have a<br>
body is a signal that the "zero default" is not something the
class author can<br>
fine-tune. A value class with an implicit constructor is called
an _implicitly<br>
constructible_ value class.<br>
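<br>
A small usage sketch (the variable name is hypothetical; the comments describe<br>
the intended semantics rather than settled syntax): <br>
<br>
```<br>
Complex c = new Complex(1.0, 1.0);   // the explicit constructor: caller-chosen state<br>
<br>
// The implicit constructor has no body and is never invoked as an explicit<br>
// `new` expression in source code; it stands for the all-zero instance<br>
// (re == 0.0, im == 0.0) that is used to initialize fields and array elements<br>
// of this type to their default values, as described above.<br>
```<br>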
<br>
Having an implicit constructor is a necessary but not sufficient
condition for<br>
heap flattening. The other required condition is that the variable
that holds a<br>
`Complex` needs to be non-nullable. In the previous iteration,
the `.val` type<br>
was non-nullable for the same reason primitive types were, and
therefore `.val`<br>
types could be fully flattened. However, after several rounds
of teasing apart<br>
the fundamental properties of primitives and value types,
nullability has<br>
finally sedimented to a place in the model where a sensible
reunion between<br>
value types and non-nullable types may be possible. <br>
<br>
## Null exclusion <br>
<br>
Non-nullable reference types have been a frequent request for
Java for years,<br>
having been explored in `C#`, Kotlin, and Scala. The goals of
non-nullable<br>
types are sensible: richer types mean safer programs. It is a
pervasive<br>
problem in Java libraries that we are not able to express within
the language<br>
whether a returned object reference might be null, or is known
never to be null,<br>
and programmers can therefore easily make wrong assumptions
about nullability. <br>
<br>
To date, Project Valhalla has deliberately steered clear of
non-nullable types<br>
as a standalone feature. This is not only because the goals of
Valhalla were too<br>
ambitious to burden the project with another ambitious goal
(though that is<br>
true), but for a more fundamental reason: the assumptions one
might make in a<br>
vacuum about the semantics of non-nullable types would likely
become hidden<br>
sources of constraints for the value type design, which was
already bordering on<br>
over-constrained. Now that the project has progressed
sufficiently, we are more<br>
confident that we can engage with the issue of null exclusion.<br>
<br>
A _refinement type_ (or _restriction type_) is a type that is
derived from<br>
another type that excludes certain values from the derived
type's value set,<br>
such as "the non-negative integers". In the most general form, a
refinement type<br>
is defined by one or more predicates (Liquid Haskell and Clojure
Spec are<br>
examples of this); range types in Pascal are a more constrained
form of<br>
refinement type. Non-nullable types ("bang" types) can
similarly be viewed as a<br>
constrained form of refinement type, characterized by the
predicate `x != null`.<br>
(Note that the null-excluding refinement type `X!` of a
reference type is still<br>
a reference type.)<br>
<br>
Rather than saying that primitive classes give rise to two
types, `X.val` and<br>
`X.ref`, we can observe that the null-excluding type `X!` of an<br>
implicitly constructible value class can have the same runtime
characteristic as<br>
the `.val` type in the previous round. Both the
declaration-site property that<br>
a value class is implicitly constructible, and the use-site
property that a<br>
variable is null-excluding, are necessary to routinely get
flattening. <br>
<br>
Related to null exclusion is _null-adjunction_; this takes a
non-nullable type<br>
(such as `int`) or a type of indeterminate nullability (such as
a type variable<br>
`T` in a generic class that can be instantiated with either
nullable or<br>
non-nullable type parameters) and produces a type that is
explicitly nullable<br>
(`int?` or `T?`.) In the current form of the design, there is
only one place<br>
where the null-adjoining type is strictly needed -- when generic
code needs to<br>
express "`T`, but might be null". The canonical example of this is `Map::get`;<br>
it wants to return `V?`, to capture the fact that `Map`
uses `null` to<br>
represent "no mapping".<br>
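<br>
A hedged sketch of what such a signature could look like (this is a simplified,<br>
hypothetical `Map` shape, not a proposal for `java.util.Map` itself): <br>
<br>
```<br>
interface Map<K, V> {<br>
    // Whatever nullability V itself has, the result of get() may additionally<br>
    // be null, meaning "no mapping for this key"; hence the return type V?.<br>
    V? get(Object key);<br>
}<br>
```<br>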
<br>
For a given class `C`, the type `C!` is clearly non-nullable,
and the type `C?`<br>
is clearly nullable. What of the unadorned name `C`? This has
_unspecified_<br>
nullability. Unspecified nullability is analogous to raw types
in generics (we<br>
could call this "raw nullability"); we cannot be sure what the
author had in<br>
mind, and so must find a balance between the desire for greater
null safety and<br>
tolerance of ambiguity in author intent.<br>
<br>
Readers who are familiar with explicitly nullable and
non-nullable types in<br>
other languages may be initially surprised at some of the
choices made regarding<br>
null-exclusion (and null-adjunction) types here. The
interpretation outlined<br>
here is not necessarily the "obvious" one, because it is
constrained by the<br>
needs of null-exclusion, by the needs of Valhalla, and by the
migration-compatibility<br>
constraints needed for the ecosystem to make a successful
transition to types<br>
that have richer nullability information. <br>
<br>
While the theory outlined here will allow all class types to
have a<br>
null-excluding refinement type, it is also possible that we will
initially<br>
restrict null-exclusion to implicitly constructible value
types. There are<br>
several reasons to consider pursuing such an incremental path,
including the<br>
fact that we will be able to reify the non-nullability of
implicitly<br>
constructible value types in the JVM, whereas the null-exclusion
types of other<br>
classes such as `String` or of ordinary value classes such as
`LocalDate` would<br>
need to be done through erasure, increasing the possible sources
of null<br>
pollution. <br>
<br>
### Goals<br>
<br>
We adopt the following set of goals for adding null-excluding
refinement types: <br>
<br>
- More complete unification of primitives with classes;<br>
- Flatness is an emergent property that can derive from more
basic semantic<br>
constraints, such as identity-freedom, implicit
constructibility, and<br>
non-nullity;<br>
- Merge the concept of "value companion" (`.val` type) into the
null-restricted<br>
refinement type of implicitly constructible value classes;<br>
- Allow programmers to annotate type uses to explicitly exclude
or affirm nulls<br>
in the value set;<br>
- Provide some degree of runtime nullness checking to detect
null pollution;<br>
- Annotating an existing API (one based on identity classes)
with additional<br>
nullness information should be binary- and source-compatible.<br>
<br>
The last goal is a source of strong constraints, and not one to
be taken<br>
lightly. If an existing API that specifies "this method never
returns null"<br>
cannot be compatibly migrated to one where this constraint is
reflected in the<br>
method declaration proper, the usefulness of null-exclusion
types is greatly<br>
reduced; library maintainers will be put to a bad choice of
forgoing a feature<br>
that will make their APIs safer, or making an incompatible
change in order to do<br>
so. If we were building a new language from scratch, the
considerations might<br>
be different, but we do not have that luxury. "Just copying"
what other<br>
languages have done here is a non-starter. <br>
<br>
### Interoperation between nullable and non-nullable types<br>
<br>
We enable conversions between a nullable type and a compatible
null-excluding<br>
refinement type by adding new widening and narrowing conversions
between `T?`<br>
and `T!` that have analogous semantics to the existing boxing
and unboxing<br>
conversions between `Integer` and `int`. Just as with boxing
and unboxing,<br>
widening from a non-nullable type to a nullable type is
unconditional and never<br>
fails, and narrowing from a nullable type to a non-nullable type
may fail by<br>
throwing `NullPointerException`. These conversions for
null-excluding types<br>
would be sensible in assignment context, cast context, and
method invocation<br>
context (both loose and strict, unlike boxing for primitives
today.) This would<br>
allow existing assignments, invocation, and overload
applicability checks to<br>
continue to work even after migrating one of the types involved,
as required for<br>
source-compatibility.<br>
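<br>
A rough sketch of these conversions in assignment context (assuming the `!` and<br>
`?` syntax; the variables and the `computeOrNull` helper are hypothetical): <br>
<br>
```<br>
String? maybe      = computeOrNull();<br>
String! definitely = "hello";<br>
<br>
String? a = definitely;   // widening: unconditional, never fails (like int -> Integer)<br>
String! b = maybe;        // narrowing: permitted in assignment context, but throws<br>
                          // NullPointerException at run time if maybe is null<br>
```<br>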
<br>
Checking for bad values can mirror the approach taken for
generics. When a<br>
richer compile-time type system erases to a less-rich runtime
type system, type<br>
safety derives from a mix of compile-time type checking and
synthetic runtime<br>
checks. In both cases, there is a possibility of pollution
which can be<br>
injected at the boundary between legacy and new code, by
malicious code, or<br>
through injudicious use of unchecked casts and raw types. And
like generics, we<br>
would like to offer the possibility that if a program compiles
in its entirety<br>
with no unchecked warnings, null-excluding types will not be
observed to contain<br>
null. To achieve this, we will need a combination of runtime
checks, new<br>
unchecked warnings, and possibly restrictions on initialization.
<br>
<br>
The intrusion on the type-checking of generics here is
considerable; nullity<br>
will have to be handled in type inference, bounds conformance,
subtyping, etc.<br>
In addition, there are new sources of heap pollution and new
conditions under<br>
which a variable may be polluted. The _Universal Generics_ JEP
outlines a<br>
number of unchecked warnings that must be issued in order to
avoid null<br>
pollution in type variables that might be instantiated either
with a nullable or<br>
null-excluding type. While this work was designed for `ref` and
`val` types,<br>
much of it applies directly to null-excluding types.<br>
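<br>
For example (a hedged sketch; the `Holder` class is hypothetical, and the exact<br>
warning conditions are those the _Universal Generics_ JEP would prescribe): <br>
<br>
```<br>
class Holder<T> {<br>
    T value;<br>
    void clear() {<br>
        value = null;   // warrants an unchecked warning: T might be instantiated<br>
                        // with a null-excluding type such as Complex!, and this<br>
                        // assignment would then be a source of null pollution<br>
    }<br>
}<br>
```<br>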
<br>
The liberal use of conversion rather than subtyping here may be
surprising to<br>
readers who are familiar with other languages that support
null-excluding types.<br>
At first, it may appear to be "giving up all the benefit" of
having annotated<br>
APIs for nullness, since a nullable value may be assigned
directly to a<br>
non-nullable type without requiring a cast. But the reality is
that for the<br>
first decade at least, we will at best be living in a mixed
world where some<br>
APIs are migrated to use nullness information and some will not,
and forcing<br>
users to modify code that uses these libraries (and then do so
again and again<br>
as more libraries migrate) would be an unacceptable tax on Java
users, and a<br>
deterrent to libraries migrating to use these features. <br>
<br>
Starting from `T! <: T?` -- and forcing explicit conversions
when you want to go<br>
from nullable to non-nullable values -- does seem an obvious
choice if you have<br>
the luxury of building a type system from scratch. But if we
want to make<br>
migration to null-excluding types a source-compatible change for
libraries and<br>
clients, we cannot accept a strict subtyping approach. (Even if
we did, we<br>
could still only use subtyping in one direction, and would have
to add an<br>
additional implicit conversion for the other direction -- a
conversion that is<br>
similar to the narrowing conversion proposed here.)<br>
<br>
Further, primitives _already_ use boxing and unboxing
conversions to go between<br>
their nullable (box) and non-nullable (primitive) forms. So
choosing subtyping<br>
for references (plus an unbalanced implicit conversion) and
boxing/unboxing<br>
conversion for primitives means our treatment of null-excluding
types is<br>
gratuitously different for primitives than for other classes.<br>
<br>
Another consequence of wanting migration compatibility for
annotating a library<br>
with nullness constraints is that nullness constraints cannot
affect overload<br>
selection. Compatibility is not just for clients, it is also
for subclasses.<br>
<br>
### Null exclusion for implicitly constructible value classes<br>
<br>
Implicitly constructible value classes go particularly well with
null exclusion,<br>
because we can choose a memory representation that _cannot_
encode null,<br>
enabling a more compact and direct representation. <br>
<br>
The Valhalla JVM has support for such a representation, and so
we describe the<br>
null-exclusion type of an implicitly constructible value class
as _strongly null<br>
excluding_. This means that its null exclusion is reified by
the JVM. Such a<br>
variable can never be seen to contain null, because null simply
does not have a<br>
runtime representation for these types. This is only possible
because these<br>
classes are implicitly constructible: the default zero
value written by the<br>
JVM is known to be a valid value of the domain. As with
primitives, these types<br>
are explicitly safe to use uninitialized. <br>
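<br>
Concretely (a sketch assuming the `!` syntax and the `Complex` declaration<br>
above; the holder class is hypothetical): <br>
<br>
```<br>
class Samples {<br>
    Complex! origin;   // strongly null-excluding: reified by the JVM, eligible for<br>
                       // flattening, and starts life as the implicitly constructed<br>
                       // Complex (re == 0.0, im == 0.0) rather than as null<br>
    Complex  other;    // ordinary nullable reference; starts life as null<br>
}<br>
```<br>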
<br>
A strongly null-excluding type will have a type mirror, as type
mirrors describe<br>
reifiable types. <br>
<br>
### Null exclusion for other classes<br>
<br>
For identity classes and non-implicitly-constructible value
classes, the story<br>
is not quite as nice. Since there is no JVM representation of
"non-nullable<br>
String", the best we can do is translate `String!` to `String`
(a form of<br>
erasure), and then try to keep the nulls at bay. This means
that we do not get<br>
the flattening or density benefits, and null-excluding variables
may still be<br>
subject to heap pollution. We can try to minimize this with a
combination of<br>
static type checking and generated runtime checks. We refer to
the<br>
null-exclusion type of an identity or non-implicitly
constructible value class<br>
as _weakly null-excluding_.<br>
<br>
There is an additional source of potential null pollution, aside
from the<br>
sources analogous to generic heap pollution: the JVM itself.
The JVM<br>
initializes references in the heap to null. If `String!` erases
to an ordinary<br>
`String` reference, there is at least a small window in time
when this<br>
supposedly non-nullable field contains null. We can erect
barriers to reduce<br>
the window in which this can be observed, but these barriers
will not be<br>
foolproof. For example, the compiler could enforce that a field
of type<br>
`String!` either has an initializer or is definitely assigned in
every<br>
constructor. However, if the receiver escapes during
construction, all bets are<br>
off, just as they are with initialization safety for final
fields.<br>
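<br>
For instance (a hedged sketch of the kind of check the compiler could make; the<br>
`Person` class is hypothetical): <br>
<br>
```<br>
class Person {<br>
    String! name;             // weakly null-excluding: erases to an ordinary String<br>
<br>
    Person(String! name) {<br>
        this.name = name;     // ok: definitely assigned in every constructor<br>
    }<br>
<br>
    Person() { }              // would be rejected: name has neither an initializer<br>
                              // nor a definite assignment, so the JVM-written null<br>
                              // would be observable through a non-nullable field<br>
}<br>
```<br>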
<br>
We have a similar problem with arrays of `String!`; newly
created arrays<br>
initialize their elements to the default value for the component
type, which is<br>
`null`, and we don't even have the option of requiring an
initializer as we<br>
would with fields. (Since a `String![]` is also a `String[]`,
one option is to<br>
outlaw the direct creation of arrays of weakly null-excluding
types, instead<br>
providing reflective API points which will safely create the
array and<br>
initialize all elements to a non-null value.)<br>
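<br>
The covariance loophole mentioned above can be sketched as follows (hedged; the<br>
array-creation details are deliberately elided, since they are still open): <br>
<br>
```<br>
String![] names = ...;        // however such an array ultimately gets created<br>
String[]  plain = names;      // allowed: a String![] is also a String[]<br>
plain[0] = null;              // nothing flags this through the plain String[] view,<br>
                              // so the null-excluding array is now polluted<br>
```<br>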
<br>
A weakly null-excluding type will not have a type mirror, as the
nullity<br>
information is erased for these types. Generic signatures would
be extended to<br>
represent null-exclusion, and similarly the `Type` hierarchy
would reflect such<br>
signatures. <br>
<br>
Because of erasure and the new possibilities for pollution,
allowing<br>
null-exclusion types for identity classes introduces significant
potential new<br>
complexity. For this reason, we may choose a staged approach
where<br>
null-restricted types are initially limited to the strongly
null-restricted<br>
ones.<br>
<br>
### Null exclusion for other value classes<br>
<br>
Value classes that are not implicitly constructible are similar
to identity<br>
classes in that their null-exclusion types are only weakly
null-excluding.<br>
These classes are the ones for which the author has explicitly
decided that the<br>
default zero value is not a valid member of the domain, so we
must ensure that<br>
in no case does this invalid value ever escape. This effectively
means that we<br>
must similarly erase these types to a nullable representation to
ensure that the<br>
zero value stays contained. (There are limited heroics the VM
can do with<br>
alternate representations for null when these classes are small
and have readily<br>
identifiable slack bits, but this is merely a potential
optimization for the<br>
future.) <br>
<br>
### Atomicity<br>
<br>
Primitives additionally have the property that
larger-than-32-bit primitives<br>
(`long` and `double`) may tear under race. The allowance for
tearing was an<br>
accommodation to the fact that numeric code is often
performance-critical, and so<br>
a tradeoff was made to allow for more performance at the cost of
less safety for<br>
incorrect programs. The corresponding box types, as well as
primitive variables<br>
declared `volatile`, are guaranteed not to tear, even under
race. (See the<br>
document entitled "Understanding non-atomicity and tearing" for
more detail.)<br>
<br>
Implicitly constructible value classes can be declared as
"non-atomic" to<br>
indicate that their null-exclusion types may tear under race (if
not declared<br>
`volatile`), just as with `long` and `double`. The classes
`Long` and `Double`<br>
would be declared non-atomic (though most implementations still
offer atomic<br>
access for 64-bit primitives.)<br>
<br>
### Flattening<br>
<br>
Flattening in the heap is an emergent property, which is
achieved when we give<br>
up the degrees of freedom that would prevent flattening:<br>
<br>
- Identity prevents flattening entirely;<br>
- Nullability prevents flattening in the absence of heroics
involving exotic<br>
representations for null; <br>
- The inability to use a class without initialization requires
nullability at<br>
the VM representation level, undermining flattening;<br>
- Atomicity prevents flattening for larger value objects.<br>
<br>
Putting this together, the null-exclusion type of implicitly
constructible value<br>
classes is flattenable in the heap when the class is non-atomic
or the layout is<br>
suitably small. For ordinary value classes, we can still get
flattening in the<br>
calling convention: all identity-free types can be flattened on
the stack,<br>
regardless of layout size or nullability.<br>
<br>
### Summarizing null-exclusion<br>
<br>
The feature described so far is at the weak end of the spectrum
of features<br>
described by "non-nullable types". We make tradeoffs to enable
gradual<br>
migration compatibility, moving checks to the boundary -- where
in some cases<br>
they might not happen due to erasure, separate compilation, or
just dishonest<br>
clients. <br>
<br>
Users may choose to look at this as "glass X% full" or "glass
(100-X)% empty".<br>
We can now more clearly say what we mean, migrate incrementally
towards more<br>
explicit and safe code without forking the ecosystem, and catch
many errors<br>
earlier in time. On the other hand, it is less explicit where
we might<br>
experience runtime failures, because the conversions that can fail, like<br>
unboxing, are implicit. And<br>
some users will surely complain merely because this is not what
their favorite<br>
language does. But it is the null-exclusion we can actually
have, rather than<br>
the one we wish we might have in an alternate universe. <br>
<br>
This approach yields a significant payoff for the Valhalla
story. Valhalla<br>
already had to deal with considerable new complexity to handle
the relationship<br>
between reference and value types -- but this new complexity
applied only to<br>
primitive classes. For less incremental complexity, we can have
a more uniform<br>
treatment of null-exclusion across all class types. The story
is significantly<br>
simpler and more unified than we had previously: <br>
<br>
- Everything, including the legacy primitives, is an object (an
instance of<br>
some class);<br>
- Every type, including the legacy primitives, is derived from
a class;<br>
- All types are reference types (they refer to objects), but
some reference<br>
types (non-nullable references to implicitly constructible
objects) exhibit<br>
the runtime behavior of primitives;<br>
- Some reference types exclude null, and some null-excluding
reference types<br>
are reifiable with a known-good non-null default;<br>
- Every type can have a corresponding null-exclusion type.<br>
<br>
## Planning for a null-free future (?)<br>
<br>
Users prefer working with unannotated types (e.g., `Foo`) rather
than explicitly<br>
annotated types (`Foo!`, `Foo?`), where possible. The
unannotated type `Foo`<br>
could mean one of three things: an alias for `Foo!`, an alias
for `Foo?`, or a<br>
type of "raw" (unknown) nullity. Investigations into
null-excluding type<br>
systems have shown that the better default would be to treat an
unannotated name<br>
as indicating non-nullability, and use explicitly nullable types
(`T?`) to<br>
indicate the presence of null, because returning or accepting
null is generally<br>
a less common case. Of course, today `String` means "possibly
nullable String"<br>
in Java, meaning that, yet again, we seem to have chosen the
wrong default. <br>
<br>
Our friends in the `C#` community have explored the possibility
of a<br>
"flippening". `C#` started with the Java defaults, and later
provided a<br>
compiler mode to flip the default on a per-module basis, with
checking (or<br>
pollution risk) at the boundary between modules with opposite
defaults. This is<br>
an interesting experiment and we look forward to seeing how this
plays out in<br>
the `C#` ecosystem. <br>
<br>
Alternately, another possible approach for Java is to continue
to treat the<br>
unadorned name as having "raw" or "unknown" nullity, encouraging
users to<br>
annotate types with either `!` or `?`. This approach has been
partially<br>
explored in the `JSpecify` project. Within this approach is a
range of options<br>
for what the language will do with such types; there is a risk
of flooding users<br>
with warnings. We may want to leave such analysis to
extralinguistic type<br>
checkers, at least initially -- but we would like to not
foreclose on the<br>
possibility of an eventual flippening.<br>
<br>
</font></font>
</body>
</html>