<html><head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body>

    <font size="4"><font face="monospace">There has been, not

        surprisingly, a lot of misunderstanding about atomicity,

        non-atomicity, and tearing.  In particular, various syntactic

        expressions of non-atomicity (e.g., a `non-atomic` class

        keyword) tend to confuse users into thinking that non-atomic

        access is somehow a *feature*, rather than a way of providing more

        precise control over the breakage modes of already-broken

        programs (to steer optimizations for non-broken programs.)<br>

        <br>

        I've written the following as an attempt to help people

        understand the role of atomicity and tearing in the model;

        comments are welcome (though let's steer clear of trying to

        paint the bikeshed in this thread.)<br>

        <br>

        <br>

        <br>

        # Understanding non-atomicity and tearing<br>

        <br>

        Almost since the beginning of Project Valhalla, the design has

        included some<br>

        form of "non-atomicity" or "tearability".  Addressing this in

        the programming<br>

        model is necessary if we are to achieve the heap flattening that

        Valhalla wants<br>

        to deliver, but unfortunately this aspect of the feature set is

        frequently<br>

        misunderstood.  <br>

        <br>

        Whether non-atomicity is expressed syntactically as a class

        modifier,<br>

        constructor modifier, supertype, or some other means, the

        concept is the same: a<br>

        class indicates its willingness to give up certain guarantees in

        order to gain<br>

        additional heap flattening.  <br>

        <br>

        Unlike most language features, which express either the presence

        or absence of<br>

        things that are at some level "normal" (e.g., the presence or

        absence of `final`<br>

        means a variable either can be reassigned, or cannot),

        non-atomicity is different;<br>

        it is about what the possible observable effects are when an

        instance of this<br>

        class is accessed with a data race.  Programs with data races

        are _already<br>

        broken_, so rather than opting into or out of a feature,

        non-atomicity is<br>

        expressing a choice between "breakage mode A" and "breakage mode

        B".  <br>

        <br>

        > Non-atomicity is best thought of not as a _feature_ or the

        absence thereof,<br>

        > but an alternate choice about the runtime-visible behavior

        of _already-broken<br>

        > programs_.  <br>

        <br>

        ## Background: flattening and tearing in built-in primitives<br>

        <br>

        Flattening and non-atomicity have been with us since Java 1.0. 

        The eight<br>

        built-in primitive types are routinely flattened into object

        layouts and arrays.<br>

        This "direct" storage results from several design choices made

        about primitives:<br>

        primitive types are non-nullable, and their zero values

        represent explicitly<br>

        "good" default values and therefore even "uninitialized"

        primitives have useful<br>

        initial values.  <br>

        <br>

        Further, the two 64-bit primitive types (`long` and `double`)

        are explicitly<br>

        permitted to _tear_ when accessed via a data race, as if they

        are read and<br>

        written using two 32-bit loads and stores.  When a mutable

        `long` or `double` is<br>

        read with a data race, it may be seen to have the high-order 32

        bits of one<br>

        previous write and the low-order 32 bits of another.  This is

        because at the<br>

        time, atomic 64-bit loads and stores were prohibitively

        expensive on most<br>

        processors, so we faced a tradeoff: punish well-behaved programs

        with<br>

        poorly-performing numerics, or allow already-broken programs

        (concurrent<br>

        programs with insufficient synchronization) to be seen to

        produce broken numeric<br>

        results.  <br>
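        <br>
        As a rough illustration of the "two 32-bit loads and stores" model,<br>
        the sketch below builds the values a racy reader might observe when a<br>
        `long` changes from one value to another.  It does not use threads or<br>
        provoke a real race (most current 64-bit JVMs will not tear a `long` in<br>
        practice); the class `TornLongSketch` and its `tear` method are<br>
        hypothetical names introduced only for this example.<br>
        <br>
<pre>
```java
// Sketch only: models a 64-bit write as two independent 32-bit stores.
// TornLongSketch and tear() are hypothetical names, used for illustration.
final class TornLongSketch {
    // Combine the high-order 32 bits of one value with the low-order 32 bits of another.
    static long tear(long highFrom, long lowFrom) {
        return (highFrom & 0xFFFFFFFF00000000L) | (lowFrom & 0x00000000FFFFFFFFL);
    }

    public static void main(String[] args) {
        long before = 0L;   // value held before the racing write
        long after = -1L;   // value being written (all 64 bits set)

        // Under the two-halves model, a racy reader may observe any of these four.
        System.out.println(Long.toHexString(tear(before, before))); // old value
        System.out.println(Long.toHexString(tear(after, after)));   // new value
        System.out.println(Long.toHexString(tear(after, before)));  // torn: new high, old low
        System.out.println(Long.toHexString(tear(before, after)));  // torn: old high, new low
    }
}
```
</pre>
        The last two results are values that no thread ever wrote, which is<br>
        what distinguishes a torn read from a merely stale one.<br>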

        <br>

        In most similar situations, Java would have come down on the

        side of<br>

        predictability and correctness. However, numeric performance was

        important<br>

        enough, and data races enough of an "all bets are off" sort of

        thing, that this<br>

        set of decisions was a pragmatic compromise.  While tearing

        sounds scary, it is<br>

        important to reiterate that tearing only happens when programs

        _are already<br>

        broken_, and that even if we outlawed tearing, _something else

        bad_ would still<br>

        happen.<br>

        <br>

        Valhalla takes these implicit characteristics of primitives and

        formalizes them<br>

        to explicit characteristics of value classes in the programming

        model, enabling<br>

        user-defined classes to gain the runtime characteristics of

        primitives.  <br>

        <br>

        ## Data races and consistency<br>

        <br>

        A _data race_ is when a nonfinal heap variable (array element or

        nonfinal field)<br>

        is accessed by multiple threads, at least one access is a

        write, and the reads<br>

        and writes of that variable are not ordered by _happens-before_

        (see JLS Ch17 or<br>

        _Java Concurrency in Practice_ Ch16.)  In the presence of a data

        race, the<br>

        reading thread may see a stale (out of date) value for that

        variable.  <br>

        <br>

        "Stale" doesn't sound so bad, but in a program with multiple

        variables, the<br>

        error states can multiply with the number and configuration of

        mutable<br>

        variables.  Suppose we have two `Range` classes:<br>

        <br>

        ```<br>

        class MutableRange {<br>

            int low, high;<br>

        <br>

            // obvious constructor, accessor, and updater methods<br>

            // constructor and updater methods validate invariant low

        <= high<br>

        }<br>

        <br>

        class ImmutableRange { <br>

            final int low, high;<br>

        <br>

            // obvious constructor and accessors, constructor validates

        invariant<br>

        }<br>

        <br>

        static final MutableRange mr = new MutableRange(0, 10);<br>

        static ImmutableRange ir = new ImmutableRange(0, 10);<br>

        ```<br>

        <br>

        For `mr`, we have a final reference to a mutable object, so there

        are two mutable<br>

        variables here (`mr.low` and `mr.high`.)  We update our range

        value through a<br>

        method that mutates `low` and/or `high`.  By contrast, `ir` is a

        mutable<br>

        reference to an immutable object, with one mutable variable

        (`ir`), and we<br>

        update our range value by creating a new `ImmutableRange` and

        mutating the<br>

        reference `ir` to refer to it.  <br>

        <br>

        More things can go wrong when we racily access the mutable

        range, because there<br>

        are more mutable variables.  If Thread A writes `low` and then

        writes `high`,<br>

        and Thread B reads `low` and `high`, then under racy access B could

        see stale or<br>

        up-to-date values for either field, and even if it sees an

        up-to-date value for<br>

        `high` (the one written later), that still doesn't mean it would

        see an<br>

        up-to-date value for `low`.  This means that in addition to

        seeing out-of-date<br>

        values for either or both, we could observe an instance of

        `MutableRange` to not<br>

        obey the invariant that is checked by constructors and setters.

         <br>
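        <br>
        To make the extra failure modes concrete, here is a small sketch that<br>
        enumerates the field combinations a racy reader could observe when a<br>
        writer moves a mutable range from one valid state to another, one field<br>
        at a time.  No real threads are involved; it treats "stale or<br>
        up-to-date" as an independent choice per field, as described above.  The<br>
        class `RacyRangeSketch` and its method are hypothetical names.<br>
        <br>
<pre>
```java
// Sketch only: enumerates (low, high) pairs a racy reader might observe.
// RacyRangeSketch and countBrokenViews() are hypothetical names.
final class RacyRangeSketch {
    // Returns how many observable pairs violate the invariant low <= high.
    static int countBrokenViews(int oldLow, int oldHigh, int newLow, int newHigh) {
        int[] lows = { oldLow, newLow };     // reader may see a stale or a fresh low
        int[] highs = { oldHigh, newHigh };  // independently, a stale or a fresh high
        int broken = 0;
        for (int low : lows) {
            for (int high : highs) {
                boolean ok = low <= high;
                System.out.println("low=" + low + " high=" + high
                        + (ok ? "" : "  (invariant violated)"));
                if (!ok) {
                    broken++;
                }
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        // The writer moves the range from [0, 10] to [20, 30].
        countBrokenViews(0, 10, 20, 30);
    }
}
```
</pre>
        For the move from [0, 10] to [20, 30], the combination of a fresh<br>
        `low` (20) with a stale `high` (10) is the one that breaks the<br>
        invariant, even though both states the writer intended were valid.<br>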

        <br>

        Suppose instead we racily access the immutable range.  At least

        there are fewer<br>

        possible error states; a reader might see a stale _reference_ to

        the immutable<br>

        object.  Access to `low` and `high` through that stale reference

        would see<br>

        out-of-date values, but those out-of-date values would at least

        be consistent<br>

        with each other (because of the initialization safety guarantees

        of final<br>

        fields.)<br>
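        <br>
        The following sketch, under the assumption of a plain (non-volatile,<br>
        non-final) static reference, shows the immutable case with real threads:<br>
        a writer keeps publishing new ranges through a data race while a reader<br>
        checks the invariant on whatever reference it happens to see.  The names<br>
        `FinalRange` and `RacyPublishSketch` are hypothetical stand-ins for the<br>
        `ImmutableRange` example above.<br>
        <br>
<pre>
```java
// Sketch only: FinalRange and RacyPublishSketch are hypothetical names.
final class FinalRange {
    final int low, high;

    FinalRange(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low > high");
        }
        this.low = low;
        this.high = high;
    }
}

final class RacyPublishSketch {
    // Deliberately neither final nor volatile: access to this reference is racy.
    static FinalRange ir = new FinalRange(0, 10);

    // A writer thread keeps replacing ir while the calling thread reads it.
    // Returns the number of reads that saw a range violating low <= high.
    static int run(int iterations) {
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= iterations; i++) {
                ir = new FinalRange(i, i + 10);
            }
        });
        writer.start();
        int violations = 0;
        for (int i = 0; i < iterations; i++) {
            FinalRange snapshot = ir;  // read the reference once; it may be stale
            if (snapshot.low > snapshot.high) {
                violations++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("violations: " + run(100_000));
    }
}
```
</pre>
        The reader may well observe out-of-date ranges, but because the fields<br>
        are final and the constructor does not leak `this`, each range it<br>
        observes should be one that some constructor call actually validated.<br>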

        <br>

        When primitives other than `long` or `double` are accessed with

        a data race, the<br>

        failure modes are like that of `ImmutableRange`; when we accept

        that `long` or<br>

        `double` could tear under race, we are additionally accepting

        the failure modes<br>

        of `MutableRange` under race for those types as well, as if the

        high- and<br>

        low-order 32-bit quantities were separate fields (in exchange

        for better<br>

        performance).  Accepting non-atomicity of large primitives

        merely _increases_<br>

        the number of observable failure modes for broken programs; even

        with atomic<br>

        access, such programs are still broken and can produce

        observably incorrect<br>

        results.<br>

        <br>

        Note that a `long` or `double` will never tear if it is `final`,

        `volatile`,<br>

        only accessed from a single thread, or accessed concurrently

        with appropriate<br>

        synchronization.  Tearing only happens in the presence of

        concurrent access to<br>

        mutable variables with insufficient synchronization.  <br>
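        <br>
        As a minimal sketch of one of these escape hatches, the example below<br>
        shares a `volatile long` between a writer and a reader.  JLS 17.7<br>
        specifies that reads and writes of volatile `long` and `double` values<br>
        are always atomic, so the reader should only ever see one of the two<br>
        values that were actually written.  The class name `VolatileLongSketch`<br>
        is hypothetical.<br>
        <br>
<pre>
```java
// Sketch only: VolatileLongSketch is a hypothetical name.
final class VolatileLongSketch {
    // volatile makes every read and write of this 64-bit field atomic (JLS 17.7).
    static volatile long value = 0L;

    // A writer alternates between all-zeros and all-ones while the caller reads.
    // Returns how many reads saw anything other than those two values.
    static int run(int iterations) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                value = (i % 2 == 0) ? -1L : 0L;
            }
        });
        writer.start();
        int torn = 0;
        for (int i = 0; i < iterations; i++) {
            long seen = value;  // a single atomic 64-bit read
            if (seen != 0L && seen != -1L) {
                torn++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return torn;
    }

    public static void main(String[] args) {
        System.out.println("torn reads: " + run(100_000));
    }
}
```
</pre>
        Removing `volatile` would make the access a data race; the<br>
        specification would then permit torn reads, although many current<br>
        64-bit JVMs would still not produce them in practice.<br>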

        <br>

        ## Non-atomicity and value types<br>

        <br>

        Hardware has improved significantly since Java 1.0, so the

        specific tradeoff<br>

        faced by the Java designers regarding `long` and `double` is no

        longer an issue,<br>

        as most processors have fast atomic 64-bit load and store

        operations today.<br>

        However, Valhalla will still face the same problem, as value

        types can easily<br>

        exceed 64 bits in size, and whatever the limit on efficient

        atomic loads and<br>

        stores is, we can easily write value types that will exceed that

        size.  This<br>

        leaves us with three choices:<br>

        <br>

         - Never allow tearing of values, as with `int`; <br>

         - Always allow tearing of values under race, as with `long`;<br>

         - Allow tearing of values under race based on some sort of

        opt-in or opt-out.<br>

        <br>

        Note that tearing is not anything anyone ever _wants_, but it is

        sometimes an<br>

        acceptable tradeoff to get more flattening.  It was a sensible

        tradeoff for<br>

        `long` and `double` in 1995, and will continue to be a sensible

        tradeoff for at<br>

        least some value types going forward.  <br>

        <br>

        The first choice -- values are always atomic -- offers the most

        safety, but<br>

        means we must forgo one of the primary goals of Valhalla for all

        but the<br>

        smallest value types.  <br>

        <br>

        This leaves us with "values are always like `long`", or "values

        can opt into /<br>

        out of being like `long`."  Types like `long` have the

        interesting property that<br>

        all bit patterns correspond to valid values; there are no

        representational<br>

        invariants for `long`.  On the other hand, values are classes,

        and can have<br>

        representation invariants that are enforced by the constructor. 

        Having<br>

        representational invariants for immutable classes be seen to not

        hold would be a<br>

        significant and novel failure mode, and so we took the safe

        route, requiring<br>

        class authors to make the tradeoff between flattening and

        failure modes under<br>

        race.  <br>

        <br>

        Just as with `long` and `double`, a value will never tear if the

        variable that<br>

        holds the value is `final`, `volatile`, only accessed from a

        single thread, or<br>

        accessed concurrently with appropriate synchronization.  Tearing

        only happens in<br>

        the presence of concurrent access to mutable variables with

        insufficient<br>

        synchronization.  <br>

        <br>

        Further, tearing under race will only happen for non-nullable

        variables of value<br>

        types that support default instances.  <br>

        <br>

        What remains is to offer sensible advice to authors of value

        classes as to when<br>

        to opt into non-atomicity.  If a class has any cross-field

        invariants (such as<br>

        `ImmutableRange`), atomicity should definitely be retained.  In

        the remaining<br>

        cases, class authors (like the creators of `long` or `double`)

        must make a<br>

        tradeoff about the perceived value of atomicity vs flattening

        for the expected<br>

        range of users of the class.  <br>

        <br>

        <br>

      </font></font>

  </body>

</html>