[External] : Re: User model stacking
Brian Goetz
brian.goetz at oracle.com
Thu Apr 28 14:13:00 UTC 2022
I threw a lot of ideas out there, and people have reacted to various corners of them. That’s fair; it’s a lot to take in at once. Let me focus on what I think the most important bit of what I wrote is, which is where we get permission for relaxation of atomicity.
Until now, we’ve been treating atomicity as a property of ref-ness, because the JMM requires that loads and stores of refs be atomic wrt each other. But I think this is (a) too much of a simplification and (b) too risky, because it will not be obvious that by opting into val, you are opting out of atomicity. Worse, people will write classes that are intended to be used with val, but for which the loss of atomicity will be astonishing.
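To make the surprise concrete, here is a plain-Java sketch (not Valhalla syntax) of what a torn read could observe: each write of a two-field value is itself consistent, but a reader racing with non-atomic stores can see one field from one write and the other field from another. The `Range` type and its lo <= hi invariant are my own illustration, not anything from the draft spec.

```java
// Each Range instance is valid on its own: the constructor
// enforces the cross-field invariant lo <= hi.
record Range(int lo, int hi) {
    Range {
        if (lo > hi) throw new IllegalArgumentException("lo > hi");
    }
}

public class TearingDemo {
    public static void main(String[] args) {
        Range a = new Range(0, 5);   // valid value written by one thread
        Range b = new Range(10, 20); // valid value written by another

        // Without atomicity, a racing reader of a flattened Range
        // could observe lo from b and hi from a -- a value that no
        // thread ever wrote, and that violates the invariant:
        int tornLo = b.lo();
        int tornHi = a.hi();
        System.out.println("torn observation: lo=" + tornLo + ", hi=" + tornHi
                + " invariantHolds=" + (tornLo <= tornHi));
    }
}
```

The point is that no amount of constructor checking protects against this: the torn value never went through a constructor.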
Another problem with the “until now” straw man is that B2 and B3 are gratuitously different with respect to the flattening they can get. This makes the user performance model, and the recommendations of which to use in which situations, harder to understand, and leaves holes for “but what if I want X and Y”, even if we could deliver both.
My conclusion is that the problem here is that we’re piggybacking atomicity on other things, in non-obvious ways. The author of the class knows when atomicity is needed to protect invariants (specifically, cross-field invariants), and when it is not, so let that simply be selected at the declaration site. Opting out of atomicity is safer and less surprising, so that argues for tagging classes that don’t need atomicity as `non-atomic`. (For some classes, such as single-field classes, it makes no difference, because preserving atomicity has no cost, so the VM will just do it.)
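A sketch of what declaration-site selection might look like. This is hypothetical syntax: `non-atomic` is not a real modifier today, and the `value class` syntax itself is still in flux; the class names are my own illustrations.

```java
// Range has a cross-field invariant (lo <= hi), so its author keeps
// the default: atomicity is preserved, no torn values are observable.
value class Range {
    int lo;
    int hi;
}

// Complex has no cross-field invariant -- any (re, im) pair is a valid
// value -- so its author opts out, permitting maximal flattening.
non-atomic value class Complex {
    double re;
    double im;
}

// For a single-field class the modifier makes no difference: a single
// field's load or store is already atomic, so the VM preserves
// atomicity for free either way.
value class Celsius {
    double degrees;
}
```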
In addition to the explicitness benefits, now atomicity works uniformly across B2 and B3, ref and val. Not only does this eliminate the asymmetries, but it means that classes that are B2 because they don’t have a good default can *routinely get better flattening* than they would have under the status quo straw man; previously there was a big flattening gap, even with heroics like stuffing four ints into 128-bit atomic loads. When the user says “this B2 is non-atomic”, we can immediately go full-flat, maybe with some extra footprint for null. So:
- No difference between an atomic B2 and an atomic B3.ref
- Only difference between atomic B2 and atomic B3.val is the null footprint
- Only difference between non-atomic B2 and non-atomic B3.val is the null footprint
This is a very nice place to be. There are interesting discussions to be had about the null/zero part (whether we even need it now, how to denote it, what the default is, etc.), but before we dive back into those, I’d like to sync on this, because this is the high-order bit of what I’m getting at here. Factor out atomicity in the user model, which in turn renders the matrix much simpler.
A side benefit is that `non-atomic` is new and weird! Which will immediately cause anyone who stumbles across it to run to Stack Overflow (“What is a non-atomic value class?”), where they’ll find an excellent and very scary explanation of tearing. As part of adding value types, we’ve exposed a previously hidden, scary thing, but done so in an explicit way. I think this is much better than stapling it to one corner of the matrix, hidden behind something that looks like something else.
> - The default is atomicity / integrity FOR ALL BUCKETS (safe by default)
> - The default is nullability FOR ALL BUCKETS
> - All unadorned type names are reference types / nullable
> - All Val-adorned type names (X.val) are non-nullable (or .zero, or .whatever)
> - Atomicity is determined by declaration site, can’t be changed at use site