[External] : Re: User model stacking

Dan Heidinga heidinga at redhat.com
Fri Apr 29 14:25:23 UTC 2022


On Thu, Apr 28, 2022 at 10:13 AM Brian Goetz <brian.goetz at oracle.com> wrote:
>
> I threw a lot of ideas out there, and people have reacted to various corners of them.  That’s fair; it’s a lot to take in at once.  Let me focus on what I think the most important bit of what I wrote is, which is, where we get permission for relaxation of atomicity.
>
> Until now, we’ve been treating atomicity as a property of ref-ness, because the JMM requires that loads and stores of refs be atomic wrt each other.  But I think this is (a) too much of a simplification and (b) too risky, because it will not be obvious that by opting into val, you are opting out of atomicity.  Worse, people will write classes that are intended to be used with val, but for which the loss of atomicity will be astonishing.
>
> Another problem with the “until now” straw man is that B2 and B3 are gratuitously different with respect to the flattening they can get.  This makes the user performance model, and the recommendations of which to use in which situations, harder to understand, and leaves holes for “but what if I want X and Y”, even if we could deliver both.
>
> My conclusion is that problem here is that we’re piggybacking atomicity on other things, in non-obvious ways.  The author of the class knows when atomicity is needed to protect invariants (specifically, cross-field invariants), and when it is not, so let that simply be selected at the declaration site.  Opting out of atomicity is safer and less surprising, so that argues for tagging classes that don’t need atomicity as `non-atomic`.  (For some classes, such as single-field classes, it makes no difference, because preserving atomicity has no cost, so the VM will just do it.)
>
> In addition to the explicitness benefits, now atomicity works uniformly across B2 and B3, ref and val.  Not only does this eliminate the asymmetries, but it means that classes that are B2 because they don’t have a good default, can *routinely get better flattening* than they would have under the status quo straw man; previously there was a big flattening gap, even with heroics like stuffing four ints into 128 bit atomic loads.  When the user says “this B2 is non-atomic”, we can immediately go full-flat, maybe with some extra footprint for null.  So:
>
>  - No difference between an atomic B2 and an atomic B3.ref
>  - Only difference between atomic B2 and atomic B3.val is the null footprint
>  - Only difference between non-atomic B2 and non-atomic B3.val is the null footprint

I'd like to reserve judgement on this stacking as I'm uncomfortable
(uncertain maybe?) about the practicality of the extra null channel.
Without having validated the extra null channel, I'm concerned we're
exposing a broader set of options in the language that will, in
practice, map down to the existing 3 buckets we've been talking about.
Maybe this factoring allows a slightly larger number of classes to be
flattened or leaves the door open for them to get it in the future?

In previous discussions around the extra null channel for flattened
values, we were really looking at a narrowly applicable optimization -
basically for nullable values that would fit within 64 bits.  With this
stacking, and the information that Intel allows atomic accesses up to
128 bits, the extra null channel becomes more widely applicable.
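
To make that concrete, here's a rough sketch in plain Java (not the
proposed syntax; the class and method names are invented for the
example) of the narrow case we had in mind: a nullable 32-bit payload
plus a one-bit null channel packed into a single 64-bit word, so that
one atomic load or store carries the value and its null-ness together.

// Illustration only: a nullable 32-bit payload and a one-bit "null
// channel" packed into one 64-bit word.  A single volatile long
// load/store then moves both, so they can never be observed out of
// sync with each other.
final class PackedNullableInt {
  private static final long PRESENT = 1L << 32;  // bit 32 = null channel
  private volatile long word;                    // 0 means "null"

  void set(Integer v) {
    word = (v == null) ? 0L : (PRESENT | (v & 0xFFFF_FFFFL));
  }

  Integer get() {
    long w = word;               // one atomic 64-bit read
    return (w & PRESENT) == 0 ? null : (int) w;
  }
}

With 128-bit atomic accesses in the picture, the same packing trick
plausibly stretches to payloads larger than the old 64-bit limit.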

The J9 folks haven't spent much time (Tobi, feel free to correct me)
really digging into the implementation of how this will work in
practice.  Does your team have more experience with the extra null
channel?  Enough to believe it is a viable and maintainable approach?

Some of my hesitation comes from experience writing structs with
multi-field invariants in C, where memory barriers and careful
read/write protocols are needed to keep the data consistent in the
face of races.  Widening the set of cases that have a multi-field
invariant *created and enforced by the VM* by adding an additional
null channel will make it more likely that the VM (and optimized JIT
code!) can do the wrong thing.
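
For flavor, here's the kind of hand-rolled protocol I mean, sketched
in Java rather than C: a sequence lock guarding a two-field invariant.
The class and the exact fence placement are mine, and getting them
right is exactly the fiddly part.

import java.lang.invoke.VarHandle;

// Writer maintains the invariant x == y; readers retry until they get
// a snapshot that no write overlapped.
final class SeqLockedPair {
  private volatile int seq;     // odd while a write is in progress
  private long x, y;

  synchronized void write(long v) {   // synchronized: one writer at a time
    seq = seq + 1;                    // odd: write in progress
    VarHandle.storeStoreFence();      // data stores stay after the odd seq
    x = v;
    y = v;
    VarHandle.storeStoreFence();      // data stores stay before the even seq
    seq = seq + 1;                    // even: write complete
  }

  long[] read() {
    while (true) {
      int before = seq;
      VarHandle.loadLoadFence();      // data loads stay after reading seq
      long rx = x;
      long ry = y;
      VarHandle.loadLoadFence();      // data loads stay before re-reading seq
      int after = seq;
      if (before == after && (before & 1) == 0) {
        return new long[] { rx, ry }; // consistent snapshot
      }
      // a write raced with us; retry
    }
  }
}

The null channel turns every flattened, nullable value into a little
multi-field invariant of this flavor, except that the VM and the JIT,
rather than the class author, are on the hook for getting the protocol
right everywhere the value is read or written.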

I've been puzzling over examples related to the nullcheck for
flattened, nullable values.  The best outcome for these values is to
be completely scalarized in jitted code, which makes it easy to
(incorrectly) separate reading the null channel from the rest of the
data.

__non-atomic __nullable value class Foo {
  long x;
  long y;
}

Foo localFoo = someArray[i];
if (localFoo != null) {
  ... do something with localFoo.x or y ...
}

Because there's a local variable involved, it would be invalid to not
privatize a copy of the value before doing the null check, even if the
IR treats the fields as individual scalars.

Similarly, a direct read of one of the fields:
  long aY = someArray[i].y;
is really:
  privatize nullcheck & y
  if (nullcheck fails) throw NPE;
  long aY = privatized y;

which may tear; it will be surprising to users that the nullcheck,
which is injected by the VM, is not consistent with the data it guards.
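
To make the worry tangible, here's a hedged simulation in plain Java
of the shape of the problem (not the proposed syntax, and not what the
JIT actually emits; FlatFooSlot and its methods are invented for the
example): the flattened storage is modelled as three ordinary fields,
and a reader that consults the null channel separately from the
payload can observe a mix of different writes.

// Hand-modelled "flattened nullable Foo": the injected null channel
// and the two payload fields live in separate, non-atomic memory
// locations, the way a fully flattened layout would hold them.
final class FlatFooSlot {
  boolean present;    // the injected null channel
  long x, y;          // payload; the writer always stores x == y

  void write(long v) {        // store a non-null Foo(v, v)
    present = true;
    x = v;
    y = v;
  }

  void writeNull() {          // store null: only the channel changes
    present = false;
  }
}

public class NullChannelTearDemo {
  public static void main(String[] args) throws InterruptedException {
    FlatFooSlot slot = new FlatFooSlot();
    slot.write(0);

    Thread writer = new Thread(() -> {
      for (long v = 1; v <= 50_000_000L; v++) {
        slot.writeNull();
        slot.write(v);
      }
    });

    Thread reader = new Thread(() -> {
      while (writer.isAlive()) {
        // The hazard: the null check and the payload reads are
        // separate accesses, so they can interleave with the
        // writer's stores.
        if (slot.present) {
          long ax = slot.x;
          long ay = slot.y;
          if (ax != ay) {
            System.out.println("torn read: x=" + ax + " y=" + ay);
          }
        }
      }
    });

    writer.start();
    reader.start();
    writer.join();
    reader.join();
  }
}

Run long enough, this can print pairs with x != y (and the reader can
act on a stale answer from the null channel), which is exactly the
kind of inconsistency that will puzzle users who assume the
VM-injected nullcheck protects the data it guards.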

There's a lot of engineering that needs to be worked through (or maybe
your team has already worked through it?) before we can feel confident
that the set of buckets is meaningfully growing at the implementation
level and that it won't lead to surprising results for users.  (And
even then, I expect the initial implementation will use an indirection
in most cases....)

I have always been somewhat uneasy about the injected null channel
approach, and concerned about how difficult it will be for service
engineers to support when something goes wrong.  If there's experience
that can be shared showing this works well in an implementation, then
I'll be less concerned.

--Dan

>
> This is a very nice place to be.  (There are interesting discussions to be had about the null/zero part, whether we even need it now, how to denote it, what the default is, etc), but before we dive back into those, I’d like to sync on this, because this is the high order bit of what I’m getting at here.  Factor out atomicity in the user model, which in turn renders the matrix much simpler.
>
> A side benefit is that `non-atomic` is new and weird!  Which will immediately cause anyone who stumbles across it to run to Stack Overflow (“What is a non-atomic value class”), where they’ll find an excellent and very scary explanation of tearing.  As part of adding value types, we’ve exposed a previously hidden, scary thing, but done so in an explicit way.  I think this is much better than stapling it to one corner on the matrix, hidden behind something that looks like something else.
>
> > - The default is atomicity / integrity FOR ALL BUCKETS (safe by default)
> > - The default is nullability FOR ALL BUCKETS
> > - All unadorned type names are reference types / nullable
> > - All Val-adorned type names (X.val) are non-nullable (or .zero, or .whatever)
> > - Atomicity is determined by declaration site, can’t be changed at use site
>


