atomicity for value types

Tue Jan 14 20:43:28 UTC 2020

On Jan 14, 2020, at 9:11 AM, Doug Lea <dl at cs.oswego.edu> wrote:
> 
> On 1/13/20 4:44 PM, Tobi Ajila wrote:
>> Hi John
>> 
>> Given that inline types can be flattened there is a possibility that
>> data races will occur in places where users were not expecting it
>> before. So your `__AlwaysAtomic` modifier is a necessary tool as the
>> existing spec will only enforce atomicity for 32bit primitives and
>> references. I just want to confirm if the intention of the
>> __AlwaysAtomic bit on an inline class is only to ensure atomic reads and
>> writes of inline types and that there are no happens-before ordering
>> expectations as there are with the existing volatile modifier on fields.
>> 
> 
> In which case "__AlwaysOpaque" would be a more accurate term.

Very interesting!  I guess this is the most relevant definition of opaque:
  http://gee.cs.oswego.edu/dl/html/j9mm.html#opaquesec

Doug, in honor of one of your pet expressions, I would have
preferred to spell this keyword “not-too-tearable”, but
that’s truly an opaque phrase.

OK, so the above document defines a nice linear scale of four
memory access modes, ordered from weak to strong: Plain,
Opaque, Release/Acquire, and Volatile.  “Any guaranteed
property of a weaker mode, plus more, holds for a stronger
mode.”

For a JIT writer, this means stronger modes will require
additional ordering constraints, in IR and/or as hardware
fence instructions, and perhaps also stronger memory
access instructions.  In the worst case (which we may
see with inline types) library calls may be required
to perform some accesses — plus the space overhead of
control variables for things like seq-locks or mutexes.

The effect of the Plain mode on atomicity is described here:
> Additionally, while Java Plain accesses to int, char, short, float, byte, and reference types are primitively bitwise atomic, for the others, long, double, as well as compound Value Types planned for future JDK releases, it is possible for a racy read to return a value with some bits from a write by one thread, and other bits from another, with unusable results.

Then, Opaque mode tightens up the behavior of Plain mode by
adding Bitwise Atomicity (what I want here), plus three more
guarantees: Per-variable antecedence acyclicity, Coherence,
and Progress.

The document then suggests that these three more guarantees
won’t inconvenience the JIT writer:
> Opaque mode does not directly impose any ordering constraints with respect to other variables beyond Plain mode.

But I think there might be inconveniences.  Our current prototype
doesn’t mess with STM or HTM, but just buffers every new value
(under always-atomic or volatile) into a freshly allocated heap node,
issues a Release fence, and publishes the node reference into the
relevant 64-bit variable.  The node reference itself is stored in Plain
(Relaxed) mode, not Opaque or Release mode, and subsequent loads
are also relaxed (no IR or HW fences).

What we are doing with this buffering trick is meeting the requirements
of atomicity by using the previously-specified mechanisms for safe
publication (of regular identity classes with final instance variables).
In order to use this trick correctly we need to ensure that the specified
behavior of the always-atomic store does not make additional requirements.
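
A minimal sketch of that buffering trick at the Java source level, not
the prototype's actual code.  The names BufferedPairDemo, PairBuffer,
Holder, and countTornReads are made up for illustration, and the
"inline value" is assumed to be a pair of longs.  The all-final node is
published with a plain reference store and read with a plain reference
load; the final-field rules are what keep a racing reader from seeing a
half-built pair.

```java
final class BufferedPairDemo {
    // Hypothetical buffer node: all fields final, so the final-field
    // (safe publication) rules apply to it.
    static final class PairBuffer {
        final long lo;
        final long hi;
        PairBuffer(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    // The container holds only a reference, which is naturally atomic
    // even in Plain mode.
    static final class Holder {
        PairBuffer value = new PairBuffer(0, 0); // plain, non-volatile field

        void store(long lo, long hi) {
            value = new PairBuffer(lo, hi);  // buffer, then publish with a plain store
        }

        PairBuffer load() {
            return value;                    // one plain load of the reference
        }
    }

    // Races a writer against a reader and counts pairs that no store
    // ever created (every stored pair satisfies lo + hi == 0).
    static long countTornReads(int iterations) {
        Holder h = new Holder();
        Thread writer = new Thread(() -> {
            for (long i = 1; i <= iterations; i++) h.store(i, -i);
        });
        writer.start();
        long torn = 0;
        for (int i = 0; i < iterations; i++) {
            PairBuffer v = h.load();
            if (v.lo + v.hi != 0) torn++;
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return torn;
    }

    public static void main(String[] args) {
        System.out.println("torn reads observed: " + countTornReads(1_000_000));
    }
}
```

The reader may well see stale pairs, since nothing here orders the
accesses, but it should never see a mixed one.  That is atomicity
without any of the extra Opaque guarantees.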

When I look at the HotSpot code, I find that, if I were to classify
loads and stores of always-atomic as always-Opaque, I would find
myself adding more IR constraints than if I simply use the trick of
buffering for safe publication.  Maybe HotSpot is doing some overkill
on Opaque mode (see notes below for evidence of that) but I can’t
help thinking that at least the requirement of Progress (for Opaque)
will require the loop optimizer to take special care with always-Opaque
variables that it would not have to take with merely always-atomic ones.

This is a roundabout way of saying, “really Opaque? Why not just atomic?”
If I take always-Opaque as the definition I can use a clearly defined
category in upcoming JMM revisions (good!) but OTOH I get knock-on
requirements (slow-downs) from that same category (bad!).

It’s not right to say, “but always-atomic values will *always* be
*slow* as well, so quit complaining about lost optimizations”.
That’s because the JVM will often pack small always-atomic values
into single memory units (64- or 128-bit, whatever the hardware
supports with native atomicity).  In such cases, Plain order has
a real performance benefit relative to Opaque order, yes?
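
For illustration only, here is the kind of packing meant above, done by
hand in Java for a hypothetical two-int value (PackedPair and its
methods are invented names).  The JVM would do the equivalent
internally; note that at the Java language level a plain long store is
not itself guaranteed atomic, so this shows the layout arithmetic and
nothing more.

```java
final class PackedPair {
    // Pack two 32-bit components into one 64-bit unit: x in the high
    // half, y in the low half (masked so its sign bits do not leak).
    static long pack(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFF_FFFFL);
    }

    static int x(long packed) { return (int) (packed >> 32); }

    static int y(long packed) { return (int) packed; }

    public static void main(String[] args) {
        long p = pack(-7, 42);
        System.out.println(x(p) + ", " + y(p)); // prints "-7, 42"
    }
}
```

Once both components live in one hardware-atomic unit, a single load or
store moves the whole value, and no buffer or fence is needed to keep
it from tearing.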

So, in the end, I’d like to call it always-atomic, and leave Opaque
mode as an additional opt-in for these types.

— John

P.S. More background, FTR:

Our intention with always-atomic types is to guarantee a modest
extension of type safety, that combinations of field values which
appear from memory reads will never be different from combinations
that have been created by constructor code (or else they are the default
combination).  This appeal to constructors extends type safety in the
sense that the inline class is able to exclude from its value set some of
the composite values that would otherwise appear if the composite
were an unconstrained tuple type (a.k.a. direct product).  If the class
has the power to absolutely constrain its value set in this way, any
type safety properties that depend on exclusion of values can be
proven, while if tearing is allowed, type safety proofs must confront
all values physically possible to the corresponding tuple type.
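
A small sketch of what that means, using a hypothetical Range class
whose constructor admits only pairs with lo <= hi.  The names
TearingSketch, Range, and FlatSlot are invented for illustration.  The
tear is replayed by hand in a single thread, so the outcome is
deterministic; a real tear would come from a racing reader landing
between the two component stores.

```java
final class TearingSketch {
    // Hypothetical value class: the constructor excludes pairs with lo > hi.
    static final class Range {
        final long lo, hi;
        Range(long lo, long hi) {
            if (lo > hi) throw new IllegalArgumentException("lo > hi");
            this.lo = lo;
            this.hi = hi;
        }
    }

    // Models a flattened, non-atomic container: the components sit in
    // separate slots and are written one at a time.
    static final class FlatSlot {
        long lo, hi;
        void writeLo(Range r) { lo = r.lo; }
        void writeHi(Range r) { hi = r.hi; }
    }

    // Replays one interleaving: the "reader" samples the slot after the
    // writer has stored only the first component of the new value.
    static long[] tornSnapshot() {
        FlatSlot slot = new FlatSlot();
        Range a = new Range(1, 2);
        Range b = new Range(10, 20);
        slot.writeLo(a);
        slot.writeHi(a);                        // slot holds (1, 2)
        slot.writeLo(b);                        // half-way through storing b
        return new long[] { slot.lo, slot.hi }; // sampled here: (10, 2)
    }

    public static void main(String[] args) {
        long[] v = tornSnapshot();
        // prints "observed (10, 2)": lo > hi, a pair no constructor produced
        System.out.println("observed (" + v[0] + ", " + v[1] + ")");
    }
}
```

Both components were legitimately stored at some point, yet the
combination lies outside the value set the constructor allows.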

In particular, if an inline type implements a var-handle holding the
components of an unsafe addressing mode (Object, long), tearing
could introduce arbitrary combinations of (previously stored) Object
and long values, breaking delicate type safety invariants.  Such inlines
need to be marked always-atomic to exclude tearing.

This concern is separate from any other control over memory
ordering and race exclusion.  In particular, the JVM will (by default)
freely reorder reads and writes of always-atomic inlines like any
other Plain mode operations, as if the inline were a naturally
atomic value (like an int or reference).  The semantics will be
“as if” the inline value were actually represented as a reference
to an all-final box-like object.  We call such an object a “buffer”
to distinguish it from a user-visible “box” or “wrapper object”.

The point is not to always require memory allocations when
storing an always-atomic value (though such allocations are
a valid tactic).  The point is to align atomicity constraints with
the safe-publication rules enjoyed by non-inline identity objects,
whether or not actual memory buffers are created.

In that case, why not just make inline values safe by default,
by analogy with safe publication of their (all-final) indirect
cousins?  The problem is that safe-publication is likely to be
more expensive for inlines than for identity objects, so it needs
to be an opt-in feature.  Hence the very special always-atomic
modifier.

(Thought exercise:  What would always-atomic mean if applied
to stateful identity objects?  I think it would mean that updates
to the object would become transactional, as if the whole object
state were a single always-atomic tuple.  I suppose every method
body would be treated as a transaction, probably under an N
readers / 1 writer discipline.  As with inlines, such a transaction
would be enforced by whatever tricks the JVM could muster,
maybe TSX on Intel, or STM, or hidden buffering.)

To finish the analogy with naturally atomic types (int, reference),
an inline can be accessed in Opaque, Release/Acquire, or Volatile
mode by means of appropriate var-handle operations (or the
equivalent).  Such accesses will be subject to stronger ordering
constraints.  They will also be atomic whether or not the inline
type was declared always-atomic.
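
For reference, this is what the four modes look like today on a
naturally atomic field, using the existing java.lang.invoke.VarHandle
API.  The class AccessModes and its field are made up for the example;
how the same calls would apply to an inline-typed field is assumed by
analogy, not something the current API documents.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class AccessModes {
    int counter; // ordinary field; the access mode is chosen per access

    static final VarHandle COUNTER;
    static {
        try {
            COUNTER = MethodHandles.lookup()
                    .findVarHandle(AccessModes.class, "counter", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Walks the scale from weak to strong, bumping the value at each step.
    static int roundTrip() {
        AccessModes m = new AccessModes();
        COUNTER.set(m, 1);                           // Plain store
        int plain = (int) COUNTER.get(m);            // Plain load
        COUNTER.setOpaque(m, plain + 1);             // Opaque
        int opaque = (int) COUNTER.getOpaque(m);
        COUNTER.setRelease(m, opaque + 1);           // Release store
        int acquired = (int) COUNTER.getAcquire(m);  // Acquire load
        COUNTER.setVolatile(m, acquired + 1);        // Volatile
        return (int) COUNTER.getVolatile(m);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // prints 4
    }
}
```

In a single thread all four modes behave identically; the differences
show up only in what racing threads are allowed to observe.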

BTW some code in HotSpot refers to Plain mode as Relaxed.
Both the JVM and the Unsafe API work in the expected way
with stores that Release and reads that Acquire.  Confusingly,
other HotSpot code assigns the term RELAXED to Opaque,
using UNORDERED for Plain.  Naming is hard.  Also, the
HotSpot code treats RELAXED/Opaque as stronger not
weaker than VOLATILE.  Not everyone is on the same page
yet.

So in the current state of the HotSpot code, Opaque (aka. RELAXED,
which is treated as stronger than VOLATILE) does in fact add IR
ordering constraints where Plain mode does not.