<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<font size="4"><font face="monospace">There has been, not
surprisingly, a lot of misunderstanding about atomicity,
non-atomicity, and tearing. In particular, various syntactic
expressions of non-atomicity (e.g., a `non-atomic` class
keyword) tend to confuse users into thinking that non-atomic
access is somehow a *feature*, rather than providing more
precise control over the breakage modes of already-broken
programs (to steer optimizations for non-broken programs.)<br>
<br>
I've written the following as an attempt to help people
understand the role of atomicity and tearing in the model;
comments are welcome (though let's steer clear of trying to
paint the bikeshed in this thread.)<br>
<br>
<br>
<br>
# Understanding non-atomicity and tearing<br>
<br>
Almost since the beginning of Project Valhalla, the design has
included some<br>
form of "non-atomicity" or "tearability". Addressing this in
the programming<br>
model is necessary if we are to achieve the heap flattening that
Valhalla wants<br>
to deliver, but unfortunately this aspect of the feature set is
frequently<br>
misunderstood. <br>
<br>
Whether non-atomicity is expressed syntactically as a class
modifier,<br>
constructor modifier, supertype, or some other means, the
concept is the same: a<br>
class indicates its willingness to give up certain guarantees in
order to gain<br>
additional heap flattening. <br>
<br>
Unlike most language features, which express either the presence
or absence of<br>
things that are at some level "normal" (e.g., the presence or
absence of `final`<br>
means a variable either can be reassigned, or cannot),
non-atomicity is different;<br>
it is about what the possible observable effects are when an
instance of this<br>
class is accessed with a data race. Programs with data races
are _already<br>
broken_, so rather than opting into or out of a feature,
non-atomicity is<br>
expressing a choice between "breakage mode A" and "breakage mode
B". <br>
<br>
> Non-atomicity is best thought of not as a _feature_ or the
absence thereof,<br>
> but an alternate choice about the runtime-visible behavior
of _already-broken<br>
> programs_. <br>
<br>
## Background: flattening and tearing in built-in primitives<br>
<br>
Flattening and non-atomicity have been with us since Java 1.0.
The eight<br>
built-in primitive types are routinely flattened into object
layouts and arrays.<br>
This "direct" storage results from several design choices made
about primitives:<br>
primitive types are non-nullable, and their zero values
represent explicitly<br>
"good" default values and therefore even "uninitialized"
primitives have useful<br>
initial values. <br>
<br>
Further, the two 64-bit primitive types (`long` and `double`)
are explicitly<br>
permitted to _tear_ when accessed via a data race, as if they
are read and<br>
written using two 32-bit loads and stores. When a mutable
`long` or `double` is<br>
read with a data race, it may be seen to have the high-order 32
bits of one<br>
previous write and the low-order 32 bits of another. This is
because, when Java<br>
was designed, atomic 64-bit loads and stores were prohibitively
expensive on most<br>
processors, so we faced a tradeoff: punish well-behaved programs
with<br>
poorly-performing numerics, or allow already-broken programs
(concurrent<br>
programs with insufficient synchronization) to be seen to
produce broken numeric<br>
results. <br>
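<br>
To make the torn read described above concrete, here is a minimal Java sketch<br>
that models the effect deterministically rather than trying to provoke it<br>
(current 64-bit JVMs typically will not tear a `long` in practice). The names<br>
`TornLongDemo` and `tornRead` are made up for this illustration.<br>
<br>
<pre>
```java
// Deterministic model of a torn 64-bit read; it simulates the effect, it does not race.
final class TornLongDemo {
    /** Combines the high-order 32 bits of one write with the low-order 32 bits of another. */
    static long tornRead(long highSource, long lowSource) {
        long high = highSource & 0xFFFFFFFF00000000L; // upper half comes from one write
        long low = lowSource & 0x00000000FFFFFFFFL;   // lower half comes from another write
        return high | low;
    }

    public static void main(String[] args) {
        long first = 0L;   // a write with all bits clear
        long second = -1L; // a write with all bits set
        // A racy reader could observe a value that neither writer ever stored.
        System.out.println(Long.toHexString(tornRead(first, second))); // prints ffffffff
    }
}
```
</pre>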
<br>
In most similar situations, Java would have come down on the
side of<br>
predictability and correctness. However, numeric performance was
important<br>
enough, and data races enough of an "all bets are off" sort of
thing, that this<br>
set of decisions was a pragmatic compromise. While tearing
sounds scary, it is<br>
important to reiterate that tearing only happens when programs
_are already<br>
broken_, and that even if we outlawed tearing, _something else
bad_ would still<br>
happen.<br>
<br>
Valhalla takes these implicit characteristics of primitives and
formalizes them<br>
to explicit characteristics of value classes in the programming
model, enabling<br>
user-defined classes to gain the runtime characteristics of
primitives. <br>
<br>
## Data races and consistency<br>
<br>
A _data race_ is when a nonfinal heap variable (array element or
nonfinal field)<br>
is accessed by multiple threads, at least one access is a
write, and the reads<br>
and writes of that variable are not ordered by _happens-before_
(see JLS Ch17 or<br>
_Java Concurrency in Practice_ Ch16.) In the presence of a data
race, the<br>
reading thread may see a stale (out of date) value for that
variable. <br>
<br>
"Stale" doesn't sound so bad, but in a program with multiple
variables, the<br>
error states can multiply with the number and configuration of
mutable<br>
variables. Suppose we have two `Range` classes:<br>
<br>
```<br>
class MutableRange {<br>
int low, high;<br>
<br>
// obvious constructor, accessor, and updater methods<br>
// constructor and updater methods validate invariant low
<= high<br>
}<br>
<br>
class ImmutableRange { <br>
final int low, high;<br>
<br>
// obvious constructor and accessors, constructor validates
invariant<br>
}<br>
<br>
final static MutableRange mr = new MutableRange(0, 10);<br>
static ImmutableRange ir = new ImmutableRange(0, 10);<br>
```<br>
<br>
For `mr`, we have a final reference to a mutable range, so there
are two mutable<br>
variables here (`mr.low` and `mr.high`.) We update our range
value through a<br>
method that mutates `low` and/or `high`. By contrast, `ir` is a
mutable<br>
reference to an immutable object, with one mutable variable
(`ir`), and we<br>
update our range value by creating a new `ImmutableRange` and
mutating the<br>
reference `ir` to refer to it. <br>
<br>
More things can go wrong when we racily access the mutable
range, because there<br>
are more mutable variables. If Thread A writes `low` and then
writes `high`,<br>
and Thread B reads `low` and `high`, then under racy access B could
see stale or<br>
up-to-date values for either field, and even if it sees an
up-to-date value for<br>
`high` (the one written later), that still doesn't mean it would
see an<br>
up-to-date value for `low`. This means that in addition to
seeing out-of-date<br>
values for either or both, we could observe an instance of
`MutableRange` to not<br>
obey the invariant that is checked by constructors and setters.
<br>
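<br>
As a rough sketch, one ordering that such a race permits can be replayed on a<br>
single thread. A real race can also produce stale reads; this shows only the<br>
case where the reader runs between the writer's two stores. The names<br>
`MutableRangeRaceDemo`, `set`, and `readBetweenWrites` are assumptions made<br>
for the illustration, filling in the "obvious" methods of `MutableRange`.<br>
<br>
<pre>
```java
// Single-threaded replay of one interleaving that a data race permits.
final class MutableRangeRaceDemo {
    static final class MutableRange {
        int low, high;

        MutableRange(int low, int high) {
            set(low, high);
        }

        // Validates the invariant, then writes the two fields one after the other.
        void set(int newLow, int newHigh) {
            if (newLow > newHigh) throw new IllegalArgumentException("low > high");
            low = newLow;
            high = newHigh;
        }
    }

    /** Returns {low, high} as seen by a reader that runs between the writer's two stores. */
    static int[] readBetweenWrites(MutableRange mr, int newLow, int newHigh) {
        mr.low = newLow;        // writer: first store
        int seenLow = mr.low;   // reader: sees the new low
        int seenHigh = mr.high; // reader: still sees the old high
        mr.high = newHigh;      // writer: second store
        return new int[] { seenLow, seenHigh };
    }

    public static void main(String[] args) {
        MutableRange mr = new MutableRange(0, 10);
        int[] seen = readBetweenWrites(mr, 20, 30);
        // Both updates were valid on their own, yet the reader observes low > high.
        System.out.println("low=" + seen[0] + " high=" + seen[1]); // prints low=20 high=10
    }
}
```
</pre>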
<br>
Suppose instead we racily access the immutable range. At least
there are fewer<br>
possible error states; a reader might see a stale _reference_ to
the immutable<br>
object. Access to `low` and `high` through that stale reference
would see<br>
out-of-date values, but those out-of-date values would at least
be consistent<br>
with each other (because of the initialization safety guarantees
of final<br>
fields.)<br>
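<br>
A minimal sketch of the immutable case, again replayed on a single thread: the<br>
reader loads the reference once, the writer then replaces it, and the reader's<br>
view is out of date but internally consistent. `ImmutableRangeDemo` and<br>
`readAcrossUpdate` are hypothetical names chosen for the example.<br>
<br>
<pre>
```java
// Single-threaded replay: a stale reference still yields a consistent (low, high) pair.
final class ImmutableRangeDemo {
    static final class ImmutableRange {
        final int low, high;

        ImmutableRange(int low, int high) {
            if (low > high) throw new IllegalArgumentException("low > high");
            this.low = low;
            this.high = high;
        }
    }

    // The only mutable variable: a reference to an immutable object.
    static ImmutableRange ir = new ImmutableRange(0, 10);

    /** Returns {low, high} read through a reference that was loaded before the update. */
    static int[] readAcrossUpdate(int newLow, int newHigh) {
        ImmutableRange snapshot = ir;             // reader: one load of the reference
        ir = new ImmutableRange(newLow, newHigh); // writer: replaces the reference
        return new int[] { snapshot.low, snapshot.high };
    }

    public static void main(String[] args) {
        int[] seen = readAcrossUpdate(20, 30);
        // Out of date, but the invariant low <= high still holds.
        System.out.println("low=" + seen[0] + " high=" + seen[1]); // prints low=0 high=10
    }
}
```
</pre>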
<br>
When primitives other than `long` or `double` are accessed with
a data race, the<br>
failure modes are like those of `ImmutableRange`; when we accept
that `long` or<br>
`double` could tear under race, we are additionally accepting
the failure modes<br>
of `MutableRange` under race for those types as well, as if the
high- and<br>
low-order 32-bit quantities were separate fields (in exchange
for better<br>
performance). Accepting non-atomicity of large primitives
merely _increases_<br>
the number of observable failure modes for broken programs; even
with atomic<br>
access, such programs are still broken and can produce
observably incorrect<br>
results.<br>
<br>
Note that a `long` or `double` will never tear if it is `final`,
`volatile`,<br>
only accessed from a single thread, or accessed concurrently
with appropriate<br>
synchronization. Tearing only happens in the presence of
concurrent access to<br>
mutable variables with insufficient synchronization. <br>
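<br>
As a hedged illustration of the `volatile` case, the sketch below has one<br>
thread alternate a `volatile long` between all-zeros and all-ones while<br>
another reads it; JLS 17.7 guarantees the reader only ever sees one of the<br>
two whole values. The names `NoTearDemo` and `sawOnlyWholeValues` are made up<br>
here. This demonstrates the guarantee and is not a tearing detector: on<br>
typical 64-bit hardware the same loop would likely also pass without<br>
`volatile`.<br>
<br>
<pre>
```java
// Demonstrates that reads and writes of a volatile long are atomic (JLS 17.7).
final class NoTearDemo {
    static volatile long shared;

    /** Returns true if the reader only ever observed 0L or -1L. */
    static boolean sawOnlyWholeValues(int iterations) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                shared = (i % 2 == 0) ? 0L : -1L; // alternate all-zeros and all-ones
            }
        });
        writer.start();

        boolean onlyWhole = true;
        for (int i = 0; i < iterations; i++) {
            long v = shared; // concurrent read, atomic because the field is volatile
            if (v != 0L && v != -1L) {
                onlyWhole = false; // would indicate a torn value
            }
        }

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return onlyWhole;
    }

    public static void main(String[] args) {
        System.out.println(sawOnlyWholeValues(1_000_000)); // prints true
    }
}
```
</pre>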
<br>
## Non-atomicity and value types<br>
<br>
Hardware has improved significantly since Java 1.0, so the
specific tradeoff<br>
faced by the Java designers regarding `long` and `double` is no
longer an issue,<br>
as most processors have fast atomic 64-bit load and store
operations today.<br>
However, Valhalla will still face the same problem, as value
types can easily<br>
exceed 64 bits in size, and whatever the limit on efficient
atomic loads and<br>
stores is, we can easily write value types that will exceed that
size. This<br>
leaves us with three choices:<br>
<br>
- Never allow tearing of values, as with `int`; <br>
- Always allow tearing of values under race, as with `long`;<br>
- Allow tearing of values under race based on some sort of
opt-in or opt-out.<br>
<br>
Note that tearing is not anything anyone ever _wants_, but it is
sometimes an<br>
acceptable tradeoff to get more flattening. It was a sensible
tradeoff for<br>
`long` and `double` in 1995, and will continue to be a sensible
tradeoff for at<br>
least some value types going forward. <br>
<br>
The first choice -- values are always atomic -- offers the most
safety, but<br>
means we must forgo one of the primary goals of Valhalla for all
but the<br>
smallest value types. <br>
<br>
This leaves us with "values are always like `long`", or "values
can opt into /<br>
out of being like `long`." Types like `long` have the
interesting property that<br>
all bit patterns correspond to valid values; there are no
representational<br>
invariants for `long`. On the other hand, values are classes,
and can have<br>
representational invariants that are enforced by the constructor.
Having<br>
representational invariants for immutable classes be seen to not
hold would be a<br>
significant and novel failure mode, and so we took the safe
route, requiring<br>
class authors to make the tradeoff between flattening and
failure modes under<br>
race. <br>
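<br>
To see why flattening is what exposes an immutable class to this failure mode,<br>
here is a hand-rolled model, not how the JVM actually lays out values: ranges<br>
are stored inline as adjacent `int` slots, so replacing a range becomes two<br>
stores instead of one reference write. `FlatRangeModel` and its methods are<br>
hypothetical names, and the interleaving is replayed on a single thread.<br>
<br>
<pre>
```java
// Hand-rolled model of flattened storage; an assumption for illustration only.
final class FlatRangeModel {
    // Range i occupies slots 2*i (low) and 2*i + 1 (high); there is no per-range object.
    private final int[] slots;

    FlatRangeModel(int count) {
        slots = new int[2 * count]; // every range starts as the all-zero default (0, 0)
    }

    void store(int i, int low, int high) {
        if (low > high) throw new IllegalArgumentException("low > high");
        slots[2 * i] = low;
        slots[2 * i + 1] = high;
    }

    /** Models a racy reader that runs after the first store but before the second. */
    int[] loadAfterPartialStore(int i, int low, int high) {
        if (low > high) throw new IllegalArgumentException("low > high");
        slots[2 * i] = low;                              // first store lands
        int[] seen = { slots[2 * i], slots[2 * i + 1] }; // reader loads both slots
        slots[2 * i + 1] = high;                         // second store lands afterwards
        return seen;
    }

    static boolean holdsInvariant(int[] range) {
        return range[0] <= range[1];
    }

    public static void main(String[] args) {
        FlatRangeModel model = new FlatRangeModel(1);
        model.store(0, 0, 10);
        int[] seen = model.loadAfterPartialStore(0, 20, 30);
        // The constructor-style check passed for both writes, yet the observed pair is (20, 10).
        System.out.println(holdsInvariant(seen)); // prints false
    }
}
```
</pre>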
<br>
Just as with `long` and `double`, a value will never tear if the
variable that<br>
holds the value is `final`, `volatile`, only accessed from a
single thread, or<br>
accessed concurrently with appropriate synchronization. Tearing
only happens in<br>
the presence of concurrent access to mutable variables with
insufficient<br>
synchronization. <br>
<br>
Further, tearing under race will only happen for non-nullable
variables of value<br>
types that support default instances. <br>
<br>
What remains is to offer sensible advice to authors of value
classes as to when<br>
to opt into non-atomicity. If a class has any cross-field
invariants (such as<br>
`ImmutableRange`), atomicity should definitely be retained. In
the remaining<br>
cases, class authors (like the creators of `long` or `double`)
must make a<br>
tradeoff about the perceived value of atomicity vs flattening
for the expected<br>
range of users of the class. <br>
<br>
<br>
</font></font>
</body>
</html>