User model stacking: current status

Fri Jun 3 19:14:39 UTC 2022

Continuing to shake this tree.

I'm glad we went through the exploration of "flattenable B3.ref"; while 
I think we probably could address the challenges of tearing across the 
null channel / data channels boundary, I'm pretty willing to let this 
one go.  Similarly I'm glad we went through the "atomicity orthogonal to 
buckets" exploration, and am ready to let that one go too.

What I'm not willing to let go of is making atomicity explicit in the 
model.  Not only is piggybacking non-atomicity on something like 
val-ness too subtle and surprising, but non-atomicity seems like it is a 
property that the class author needs to ask for.  Flatness is an 
important benefit, but only when it doesn't get in the way of safety.

Recall that we have three different representation techniques:

  - no-flat -- use a pointer
  - low-flat -- for sufficiently small (depending on size of atomic 
instructions provided by the hardware) values, pack multiple fields into 
a single, atomically accessed unit.
  - full-flat -- flatten the layout, access individual fields 
directly, may allow tearing.
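
To make the low-flat vs. full-flat distinction concrete, here is a
minimal sketch in today's Java. It is an analogy, not how the VM would
do it: the names LowFlatSlot and FullFlatSlot are made up, a two-int
value stands in for a small value class, and a 64-bit AtomicLong stands
in for whatever atomic unit the hardware provides.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration: a two-int value (x, y) stored two ways.
final class LowFlatSlot {
    // Both components share one 64-bit word, so every read or write
    // of the pair is a single atomic access.
    private final AtomicLong bits = new AtomicLong();

    static long pack(int x, int y) {
        // x goes in the high 32 bits; y is masked so its sign bits
        // do not spill into x's half.
        return ((long) x << 32) | (y & 0xFFFFFFFFL);
    }

    static int unpackX(long bits) { return (int) (bits >> 32); }
    static int unpackY(long bits) { return (int) bits; }

    void write(int x, int y) { bits.set(pack(x, y)); }
    long read() { return bits.get(); }
}

final class FullFlatSlot {
    // Components are laid out inline and written one at a time; a
    // racing reader may see x from one write and y from another.
    int x;
    int y;

    void write(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```

The low-flat version pays for its atomicity with the pack/unpack work,
and only applies while the whole value fits in one atomic unit; the
full-flat version has no such limit but gives up atomicity of the pair.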

The "low-flat" bucket got some attention recently when we discovered 
that there are usable 128-bit atomics on Intel (based on a recent 
revision of the chip spec), but this is not a slam-dunk; it requires 
some serious compiler heroics to pack multiple values into single 
accesses.  But there may be targets of opportunity here for single-field 
values (like Optional) or final fields.  And we can always fall back to 
no-flat whenever the VM feels like it.

One of the questions that has been raised is how similar B3.ref is to 
B2, specifically with respect to atomicity.  We've gone back and forth 
on this.

Having shaken the tree quite a bit, what feels like the low energy state 
to me right now is:

  - The ref types of all non-identity classes are treated uniformly; 
B3.ref and B2.ref are translated the same, treated the same, have the 
same atomicity, the same nullity, etc.
  - The only difference across the spectrum of non-identity classes is 
the treatment of the val type.  For B2, this means the val type is 
*illegal*; for B3, this means it is atomic; for B3n, it is non-atomic 
(which in practice will mean more flatness.)
  - (controversial) For all types, the ref type is the default. This 
means that some current value-based classes can migrate not only to B2, 
but to B3 or B3n.  (And that we could migrate to B2 today and further to 
B3 tomorrow.)

While this is technically four flavors, I don't think it needs to feel 
that complex.  I'll pick some obviously silly modifiers for exposition:

  - class B1 { }
  - zero-hostile value class B2 { }
  - value class B3 { }
  - tearing-happy value class B3n { }

In other words: one new concept ("value class"), with two sub-modifiers 
(zero-hostile, and tearing-happy) which affect the behavior of the val 
type (forbidden for B2, loosened for B3n.)

For heap flattening, what this gets us is:

  - B1 -- no-flat
  - B2, B3.ref, B3n.ref -- low-flat (atomic, with null channel)
  - B3 -- low-flat (atomic, no null channel)
  - B3n -- full-flat (non-atomic, no null channel)

This is a slight departure from earlier tree-shakings with respect to 
tearing.  In particular, refs do not tear at all, so programs that use 
all refs will never see tearing (but it is still possible to get a torn 
value using .val and then box that into a ref.)

If you turn this around, the declaration-site decision tree becomes:

  - Do I need identity (mutability, subclassing, aliasing)?  Then B1.
  - Are uninitialized values unacceptable?  Then B2.
  - Am I willing to tolerate tearing to enable more flattening? Then B3n.
  - Otherwise, B3.
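
The declaration-site tree above is small enough to write down as code.
This is only a sketch of the order of the questions; the Bucket enum,
the DeclarationSite class, and the parameter names are invented here
for illustration and are not proposed API.

```java
// Hypothetical names; B3N stands for the "B3n" bucket in the text.
enum Bucket { B1, B2, B3, B3N }

final class DeclarationSite {
    // Asks the questions in the same order as the list above.
    static Bucket choose(boolean needsIdentity,
                         boolean zeroUnacceptable,
                         boolean toleratesTearing) {
        if (needsIdentity) return Bucket.B1;      // mutability, subclassing, aliasing
        if (zeroUnacceptable) return Bucket.B2;   // no val type at all
        if (toleratesTearing) return Bucket.B3N;  // non-atomic val, more flattening
        return Bucket.B3;                         // atomic val
    }
}
```

Note that the order matters: a class that cannot accept uninitialized
values lands in B2 even if its author would have tolerated tearing.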

And the use-site decision tree becomes:

  - For B1, B2 -- no choices to make.
  - Do I need nullity?  Then .ref
  - Do I need atomicity, and the class doesn't already provide it?  Then 
.ref
  - Otherwise, can use .val
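
The use-site tree can be sketched the same way. Again, the names
(Projection, UseSite, and the four flags) are made up for exposition.
The first two flags describe the class: classHasValType is false for B1
and B2, and classIsAtomic is true for B3 and false for B3n.

```java
// Hypothetical names for the two projections a use site can pick.
enum Projection { REF, VAL }

final class UseSite {
    static Projection choose(boolean classHasValType,
                             boolean classIsAtomic,
                             boolean needNull,
                             boolean needAtomicity) {
        // B1, B2: no choices to make, only the ref type exists.
        if (!classHasValType) return Projection.REF;
        // Only the ref type has a null channel.
        if (needNull) return Projection.REF;
        // A B3n val may tear; the ref type restores atomicity.
        if (needAtomicity && !classIsAtomic) return Projection.REF;
        return Projection.VAL;
    }
}
```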

The main downside of making ref the default is that people will grumble 
about having to say .val at the use site all the time. And they will!  
And it does feel a little odd that you have to opt into val-ness at both 
the declaration and use sites.  But it unlocks a lot of things (see 
Kevin's list for more):

  - The default name is the safest version.
  - Every unadorned name works the same way; it's always a reference 
type.  You don't need to maintain a mental database around "which kind 
of name is this".
  - Migration from B1 -> B2 -> B3 is possible.  This is huge (and more 
than we had hoped for when we started this game.)

(The one thing to still worry about is that while refs can't tear, you 
can still observe a torn value through a ref, if someone tore it and 
then boxed it.  I don't see how we defend against this, but the 
non-atomic label should be enough of a warning.)
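
To illustrate the "tore it and then boxed it" scenario, here is a
deterministic replay of one unlucky interleaving, written in today's
Java on a single thread. Everything in it is a stand-in: Range plays a
tearing-happy value class with the informal invariant lo <= hi,
FlatField plays a full-flat field of that type, and box() plays the
val-to-ref conversion (which, as assumed here, copies the components
without re-running any constructor validation).

```java
// Hypothetical stand-in for a value class with invariant lo <= hi.
final class Range {
    final int lo, hi;
    Range(int lo, int hi) { this.lo = lo; this.hi = hi; }
    boolean consistent() { return lo <= hi; }
}

// Hypothetical stand-in for a full-flat field: two components,
// written one at a time, with no atomicity across the pair.
final class FlatField {
    int lo, hi;

    // "Boxing": snapshot the current components into an immutable object.
    Range box() { return new Range(lo, hi); }
}

final class TornBoxDemo {
    // Replays the writer's and reader's steps in one fixed order.
    static Range tornSnapshot() {
        FlatField f = new FlatField();
        f.lo = 0;
        f.hi = 10;              // field holds Range(0, 10)
        f.lo = 20;              // writer starts storing Range(20, 30)
        Range boxed = f.box();  // reader boxes mid-write: lo=20, hi=10
        f.hi = 30;              // writer finishes; field is consistent again
        return boxed;           // the box is immutable, so it stays torn
    }
}
```

The reference returned here never changes and is never itself torn on
access, yet it carries a value no writer ever stored. That is the sense
in which refs can't tear but can still show you a torn value.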

On 5/6/2022 10:04 AM, Brian Goetz wrote:
> In this model, (non-atomic B3).ref takes the place of (non-atomic B2) 
> in the stacking I've been discussing.  Is that what you're saying?
>
>     class B1 { }  // ref, identity, atomic
>     value-based class B2 { }  // ref, non-identity, atomic
>     [ non-atomic ] value class B3 { }  // ref or val, zero is ok, both 
> projections share atomicity
>
> If we go with ref-default, then this is a small leap from yesterday's 
> stacking, because "B3" and "B2" are both reference types, so if you 
> want a tearable, non-atomic reference type, saying `non-atomic value 
> class B3` and then just using B3 gets you that. Then:
>
>  - B2 is like B1, minus identity
>  - B3 means "uninitialized values are OK, you get two types, a 
> zero-default and a non-default"
>  - Non-atomicity is an extra property we can add to B3, to get more 
> flattening in exchange for less integrity
>  - The use cases for non-atomic B2 are served by non-atomic B3 (when 
> .ref is the default)
>
> I think this still has the properties I want; I can freely choose the 
> reasonable subsets of { identity, has-zero, nullable, atomicity } that 
> I want; the orthogonality of non-atomic across buckets becomes 
> orthogonality of non-atomic with nullity, and the "B3.ref is just like 
> B2" is shown to be the "false friend."
>