User model stacking: current status

Sun Jul 3 01:40:54 UTC 2022

On 5 May 2022, at 12:21, Brian Goetz wrote:

>> There are lots of other things to discuss here, including a 
>> discussion of what does non-atomic B2 really mean, and whether there 
>> are additional risks that come from tearing _between the null and the 
>> fields_.
>
> So, let's discuss non-atomic B2s.  (First, note that atomicity is 
> only relevant in the heap; on the stack, everything is 
> thread-confined, so there will be no tearing.)
>
> If we have:
>
>     non-atomic __b2 class DateTime {
>         long date;
>         long time;
>     }
>
> then the layout of a B2 (or a B3.ref) is really (long, long, boolean), 
> not just (long, long), because of the null channel.  (We may be able 
> to hide the null channel elsewhere, but that's an optimization.)

That goes straight to a desired optimization, but it leaves something 
valuable in the dust.

The valuable thing is one of the “affordances of references”, which 
is that a reference to an immutable value can be safely published.  This 
is a core feature of the JMM that applies to all value-based classes.

The behavior you are citing is inconsistent with a reference to an 
object containing an immutable field (of type `DateTime.val`).  It is 
consistent with a reference to a mutable field or to an array of type 
`DateTime.val[]`, but none of our current wrapper types work like that.  
(Arrays do, which is a problem with arrays.)

I see how you got there:  You want to apply full flattening to 
`DateTime.ref`, simply adding a boolean.  That’s a nice data structure 
but it departs from how we expect boxing of values to work.  There are 
extra races between the components of `DateTime`, as well as a race 
between null and non-null states.  With today’s value-based classes, a 
mutable `DateTime` reference will only show races between null and 
non-null, and between earlier and later pairs of field values.  With 
this proposed feature, a mutable reference will act as if the wrapper 
object being referenced were no longer immutable, and not safely 
published.  I think this is too much of a sharp edge, even for an opt-in 
feature.

What I would prefer here is a principle that boxes (including 
`DateTime.ref`) are always safely publishable.  The possibility to race 
on individual object states should be confined to the value companion 
type.

That somewhat reduces the optimization for heap variables of reference 
type of non-atomics.  I think that’s a fine price to pay, in order to 
avoid putting new exceptions into the JMM’s current assurances about 
safe publication.  The optimizations on the val-companion are 
unaffected.  This is good:  Reasoning about strange race conditions can 
concentrate around uses of the val-companion, and all uses of the 
ref-companion would be race-safe.

Part of my discomfort here is that when we say that the fields of a 
value-based class are final, we are telling users that their instances 
can be safely published.  I don’t want to claw that back, even for a 
corner case like explicitly non-atomic value classes.

I do see that there could be a workaround, if a class `Foo` allowed 
field-races even on its reference companion `Foo.ref`:  Manually make a 
value-based wrapper class `AtomicFoo` which is (implicitly declared as) 
atomic and has a final `Foo` field as its sole payload.  In that case I 
think the JMM will assure me (am I right?) that a variable of type 
`AtomicFoo` accesses a stable set of `Foo` fields, even if that 
`AtomicFoo` variable is updated by data races, because its nested field 
is not raced.  And that should be true even if the JVM aggressively 
flattens `AtomicFoo` into the `Foo` fields plus two null channels.  
That’s all consistent, but I think it will cause bugs as people fumble 
around with a mix of `Foo` and `AtomicFoo` values in containers like 
`ArrayList` or `Object[]`.
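
As a rough sketch of the shape of that workaround in today's Java (where `Foo` is an ordinary identity class, so nothing actually tears): `Foo`, `AtomicFoo`, and `MixedContainerDemo` are placeholder names for illustration only, and the sketch does not settle what the JMM would promise for a flattened `AtomicFoo`.

```java
import java.util.Objects;

// Placeholder for a class whose reference companion might allow field races.
final class Foo {
    final long date;
    final long time;
    Foo(long date, long time) { this.date = date; this.time = time; }
}

// Hand-written wrapper: a single final Foo field is the whole payload.
final class AtomicFoo {
    private final Foo foo;
    AtomicFoo(Foo foo) { this.foo = Objects.requireNonNull(foo); }
    Foo get() { return foo; }
}

final class MixedContainerDemo {
    // Counts how many elements still need unwrapping. This is the kind of
    // bookkeeping that creeps in once Foo and AtomicFoo share a container.
    static int countWrapped(Object[] items) {
        int wrapped = 0;
        for (Object item : items) {
            if (item instanceof AtomicFoo) {
                wrapped++;
            }
        }
        return wrapped;
    }
}
```

A container such as `Object[]` accepts both `Foo` and `AtomicFoo`, so every reader has to check which one it got before touching the fields.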

>
> If two threads racily write (d1, t1) and (d2, t2) to a shared mutable 
> DateTime, it is possible for an observer to observe (d1, t2) or (d2, 
> t1).  Saying non-atomic says "this is the cost of data races".

(So of course that’s OK for mutable copies of `DateTime.val`, but 
that’s not how references behave now or should behave in the future.)
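
To make the quoted (d1, t2) / (d2, t1) outcome concrete, here is a small deterministic sketch. It does not use real threads; it replays one possible interleaving of two field-by-field writers by hand. `FlatDateTimeSlot` and `TearingReplay` are made-up names standing in for a heap variable of a fully flattened, non-atomic `DateTime.val`, not any Valhalla API.

```java
// Stand-in for a flattened, non-atomic (long, long) heap variable.
final class FlatDateTimeSlot {
    long date;
    long time;

    // A non-atomic write of (d, t) is really two independent stores.
    void writeDate(long d) { date = d; }
    void writeTime(long t) { time = t; }

    String snapshot() { return "(" + date + ", " + time + ")"; }
}

final class TearingReplay {
    // Replays one interleaving of T1 writing (d1, t1) and T2 writing (d2, t2).
    static String replay(long d1, long t1, long d2, long t2) {
        FlatDateTimeSlot slot = new FlatDateTimeSlot();
        slot.writeDate(d1);   // T1, first store
        slot.writeDate(d2);   // T2, first store
        slot.writeTime(t2);   // T2, second store
        slot.writeTime(t1);   // T1, second store
        return slot.snapshot();   // T2's date combined with T1's time
    }

    public static void main(String[] args) {
        System.out.println(replay(1, 10, 2, 20));   // prints (2, 10)
    }
}
```

With final fields behind an ordinary reference, an observer could only ever see (1, 10) or (2, 20).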

> But additionally, if we have a race between writing null and (d, t), 
> there is another possible form of tearing.
>
> Let's write this out more explicitly.  Suppose that T1 writes a 
> non-null value (d, t, true), and T2 writes null as (0, 0, false). Then 
> it would be possible to observe (0, 0, true), which means that we 
> would be conceivably exposing the zero value to the user, even though 
> a B2 class might want to hide its zero.

This is another reason to confine races to the value companion, because 
we are making a plan to protect value companions specially, for cases 
like this.
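
The (0, 0, true) outcome in the quoted text can be replayed the same way. The sketch below is a single-threaded replay of one interleaving, assuming the (long, long, boolean) layout described earlier and a "wide" null write that zeroes every field. `NullableFlatSlot` and `NullTearingReplay` are illustrative names only.

```java
// Stand-in for a flattened nullable slot: payload fields plus a null channel.
final class NullableFlatSlot {
    long date;
    long time;
    boolean nonNull;   // the null channel; false means "null"
}

final class NullTearingReplay {
    // Replays T1 writing (d, t, true) against T2 writing null as (0, 0, false)
    // and reports what an observer would read afterwards.
    static String replayWideNullWrite(long d, long t) {
        NullableFlatSlot s = new NullableFlatSlot();
        s.date = d;          // T1
        s.time = t;          // T1
        s.date = 0;          // T2, wide null write
        s.time = 0;          // T2
        s.nonNull = false;   // T2
        s.nonNull = true;    // T1, the last store to the null channel
        return s.nonNull ? "(" + s.date + ", " + s.time + ")" : "null";
    }
}
```

For any non-zero `d` and `t`, this interleaving leaves the observer looking at a non-null (0, 0), a value that neither writer intended to publish.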

> So, suppose instead that we implemented writing a null as simply 
> storing false to the synthetic boolean field.  Then, in the event of 
> a race between reader and writer, we could only see values for date 
> and time that were previously put there by some thread.  This 
> satisfies the OOTA (out of thin air) safety requirements of the JMM.

I think the right approach here is starting with the semantics of 
value-based classes (which include safe publication) and working out the 
allowed implementation techniques.

The semantics of a flattened ref are (or should be) that it must behave 
*as if* it were a non-flattened ref.  (“Should be”:  We are talking 
optimization here, not a changeable variation in the user model adopted 
randomly as the JIT comes and goes.)  A ref, in fact, to a VBC.

A non-flattened ref is a thing which you first query as to null-ness, 
and then if non-null you can load the VBC’s field or fields.  (Without 
races.)
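
A minimal sketch of that read pattern with today's references, assuming an ordinary class with final fields as a stand-in for a VBC. `BoxedDateTime` and `RefReadDemo` are hypothetical names. Writers race on a plain static field, but they only ever publish pairs whose two fields are equal, so any mixed pair seen by the reader would be a torn read.

```java
// Stand-in for a value-based class: all fields final, no leaking of `this`.
final class BoxedDateTime {
    final long date;
    final long time;
    BoxedDateTime(long date, long time) { this.date = date; this.time = time; }
}

final class RefReadDemo {
    // Deliberately plain (non-volatile): updates to this reference may race.
    static BoxedDateTime shared;

    // Returns the number of reads that saw date != time.
    static long countTornReads(int writesPerThread) {
        shared = null;
        Runnable writer = () -> {
            for (int i = 1; i <= writesPerThread; i++) {
                shared = (i % 3 == 0) ? null : new BoxedDateTime(i, i);
            }
        };
        Thread w1 = new Thread(writer);
        Thread w2 = new Thread(writer);
        w1.start();
        w2.start();
        long torn = 0;
        while (w1.isAlive() || w2.isAlive()) {
            BoxedDateTime local = shared;   // step 1: one read of the reference
            if (local != null && local.date != local.time) {   // step 2: fields
                torn++;
            }
        }
        try {
            w1.join();
            w2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return torn;
    }
}
```

The reader can see null, an older pair, or a newer pair, but the final-field rules of the JMM mean the count of mixed pairs stays at zero. That is the behavior a flattened ref would have to preserve.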

So if there is a null channel sitting inside or next to some data 
fields, the read-access code has to first check for null, and if not 
null then to load a consistent view of the fields, in such a way that 
racing writes of null or other values do not impair the consistency.

The write-access code can write null by asserting the null flag and (as 
others have observed) it is an implementation puzzle whether to “clear 
out” the other storage.  (My take is that the JVM could do this during 
GC at a safepoint, but it is hard to do so at other times.)

The write-access code can write non-null by (atomically) setting the 
field values and then (if that did not already de-assert the null 
channel) de-asserting the null channel.  Again, the fields should be 
written as a group consistently, so as not to interfere with racing 
reads or writes.  The null channel need not be written consistently.

All this would imply that the size of a flattened ref, perhaps including 
its null channel, should be no larger than a naturally atomic unit of 
memory, which is 64 or maybe 128 bits today.

Your argument above, which I think I buy, is that it is also probably 
possible to place the null channel outside of the naturally atomic unit 
that contains the other fields; this would allow 9-byte and 17-byte 
refs.
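
One way to picture the 9-byte case is a small value whose fields fit a single 64-bit unit, with the null channel held in a separate byte. The sketch below is only a model under several assumptions: the two fields are 32-bit, `AtomicLong` stands in for the naturally atomic unit, and the null channel is declared `volatile` purely to keep the model simple (it takes no position on which ordering a JVM would really need). `FlatRefSlot` and its methods are made-up names.

```java
import java.util.concurrent.atomic.AtomicLong;

final class FlatRefSlot {
    // Payload read and written as one 64-bit unit, so fields never mix.
    private final AtomicLong payload = new AtomicLong();
    // Null channel kept outside the atomic unit (the "ninth byte").
    private volatile boolean isNull = true;

    static long pack(int date, int time) {
        return ((long) date << 32) | (time & 0xFFFF_FFFFL);
    }
    static int dateOf(long bits) { return (int) (bits >> 32); }
    static int timeOf(long bits) { return (int) bits; }

    // Write non-null: store the fields as a group, then de-assert null.
    void set(int date, int time) {
        payload.set(pack(date, time));
        isNull = false;
    }

    // Write null: touch only the null channel.
    void setNull() { isNull = true; }

    // Read: check the null channel first, then take one payload snapshot.
    // Returns null for a null ref, otherwise {date, time}.
    int[] get() {
        if (isNull) {
            return null;
        }
        long bits = payload.get();
        return new int[] { dateOf(bits), timeOf(bits) };
    }
}
```

Racing writers can still overwrite each other, but a reader only ever decodes a payload that some single call to `set` stored as a whole.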

Such a racing null channel, with non-racing payload fields, can be 
modeled in classic Java in the JMM like this:

```
class RacyNullable<V> {  // V stands for some value-based class
   private boolean isNull = true;  // deliberately non-final and non-volatile
   private static final Object GARB = new Object();  // any value OK, even null
   private Object v = GARB;  // deliberately non-final; null and GARB never observed
   @SuppressWarnings("unchecked")
   public V get() { return isNull ? null : (V) v; }
   public void set(V v) {
     if (v == null) { isNull = true; if (EAGER_CLEANUP)  cleanup(); }
     else { this.v = v; /*race here!*/ isNull = false; }
   }
   private final boolean EAGER_CLEANUP = false;
   private void cleanup() { if (isNull) /*race here?*/ v = GARB; }
}
```

I think really nice flattened refs can be built with “as if” 
semantics that follow that pattern.  They won’t flatten quite as well 
as some of the “no holds barred” cases discussed by the EG, but they 
would behave… “as if” …they follow the JMM without surprises.

The one race (outside of the cleanup method) is innocuous if the cleanup 
method is used with restraint.  How to do that is a puzzle.

> …
> So we have a choice for how we implement writing nulls, with a 
> pick-your-poison consequence:
>
>  - If we do a wide write, and write all the fields to zero, we risk 
> exposing a zero value even when the zero is a bad value;

Yes, that’s like flipping `EAGER_CLEANUP` above.

(After all, if we go to the trouble of making `C.val` access-controlled, 
let’s not have racy refs let the cat back out of the bag!)

>  - If we do a narrow write, and only write the null field, we risk 
> pinning other OOPs in memory

That’s the one I prefer.  I think it’s actually a reasonable thing 
to try for.  Basically, the GC would have to special-case those fields 
in a similar way that it special-cases weak-reference fields.  For 
WR’s the GC clears them under certain non-local conditions.  In this 
case the GC would clear them under a very local condition, the setting 
of the null channel.  GC folks growl about requests like this, but I 
think this one is reasonable.
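
A toy model of the narrow-write trade-off, for illustration only: a real GC would do this work internally rather than through Java methods, and `NarrowNullSlot`, `rawPayloadForGc`, and `clearIfNull` are invented names. The model shows that a narrow null write leaves the old OOP strongly reachable until something clears it under the local condition "null channel is set".

```java
final class NarrowNullSlot {
    private Object payload;          // stands in for an OOP inside the flattened value
    private boolean isNull = true;   // the null channel

    void set(Object v) { payload = v; isNull = false; }

    // Narrow write: only the null channel changes; the payload stays behind.
    void setNull() { isNull = true; }

    Object get() { return isNull ? null : payload; }

    // What a tracing collector would still treat as a live reference.
    Object rawPayloadForGc() { return payload; }

    // Models GC-side cleanup, assumed to run only when no mutator is racing
    // (for example at a safepoint).
    void clearIfNull() { if (isNull) payload = null; }
}
```

After `setNull()`, `get()` returns null but `rawPayloadForGc()` still returns the old object; only `clearIfNull()` releases it.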

— John

P.S. Next up, a long-ish study on how to put access control on `C.val`!