User model stacking
Brian Goetz
brian.goetz at oracle.com
Wed Apr 27 16:44:01 UTC 2022
Here are some considerations for stacking the user model. (Again, please let’s resist the temptation to jump to the answer and then defend it.)
We have a stacking today which says:
- B1 is ordinary identity classes, giving rise to a single reference type
- B2 are identity-free classes, giving rise to a single reference type
- B3 are flattenable identity-free classes, giving rise to both a reference (L/ref) and primitive (Q/val) type.
This stacking has some pleasant aspects. B2 differs from B1 by “only one bit”: identity. The constraints on B2 are those that come from the lack of identity (mutability, extensibility, locking, etc.) B2 references behave like the object references we are familiar with; nullability, final field guarantees, etc. B3 further makes reference-ness optional; reference-free B3 values give up the affordances of references: they are zero-default and tearable. This stacking is nice because it can be framed as a sequence of “give up some X, get some Y”.
People keep asking “do we need B2, or could we get away with B1/B3”. The main reason for having this distinction is that some id-free classes have no sensible default, and so want to use null as their default. This is a declaration-site property; B3 means that the zero value is reasonable, and use sites can opt into / out of zero-default / nullity. We’d love to compress away this bucket but forcing a zero on classes that can’t give it a reasonable interpretation is problematic. But perhaps we can reduce the visibility of this in the model.
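To make “no sensible default” concrete, here is a rough sketch in today’s Java. The classes Complex and Rational and the helper zeroPassesConstructor are hypothetical names invented for illustration; records stand in for identity-free classes, and the “zero value” is approximated by passing all-zero fields to the constructor, which a real zero default would bypass entirely.

```java
import java.util.function.Supplier;

// Sketch only: today's Java has no zero-default values, so we approximate the
// "all fields zero" instance by calling the constructor with zeros.
final class ZeroDefaultSketch {

    /** All-zero fields mean 0 + 0i, a perfectly reasonable default (B3-like). */
    record Complex(double re, double im) { }

    /** All-zero fields mean 0/0, which the author never intended to exist (B2-like). */
    record Rational(long num, long den) {
        Rational {
            if (den == 0) throw new IllegalArgumentException("den must be non-zero");
        }
    }

    /** Returns true if the all-zero instance would be accepted by the class's own constructor. */
    static boolean zeroPassesConstructor(Supplier<?> allZeroFields) {
        try {
            allZeroFields.get();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Complex zero ok:  " + zeroPassesConstructor(() -> new Complex(0, 0)));
        System.out.println("Rational zero ok: " + zeroPassesConstructor(() -> new Rational(0, 0)));
    }
}
```

A class like the hypothetical Rational is the kind that would want null, not zero, as its default.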
The degrees of freedom we could conceivably offer are
{ identity or not, zero-capable or not, atomic or not } x { use-site, declaration-site }
In actuality, not all of these boxes make sense (disavowing the identity of an ArrayList at the use site), and some have been disallowed by the stacking (some characteristics have been lumped.) Here’s another way to stack the declaration:
- Some classes can disavow identity
- Identity-free classes can further opt into zero-default (currently, B3, polarity chosen at use site)
- Identity-free classes can further opt into tearability (currently, B3, polarity chosen at use site)
It might seem the sensible move here is to further split B3 into B3a and B3b (where all B3 support zero default, and a/b differ with regard to whether immediate values are tearable). But that may not be the ideal stacking, because we want good flattening for B2 (and B3.ref) also. Ideally, the difference between B2 and B3.val is nullity only (Kevin’s antennae just went up.)
So another possible restacking is to say that atomicity is something that has to be *opted out of* at the declaration site (and maybe also at the use site.) With deliberately-wrong syntax:
__non-id class B2 { }
__non-atomic __non-id class B2a { }
__zero-ok __non-id class B3 { }
__non-atomic __zero-ok __non-id class B3a { }
In this model, you can opt out of identity, and then you can further opt out of atomicity and/or null-default. This “pulls up” the atomicity/tearability to a property of the class (I’d prefer safe by default, with opt out), and makes zero-*capability* an opt-in property of the class. Then for those that have opted into zero-capability, at the use site, you can select .ref (null) / .val (zero). Obviously these all need better spellings. This model frames specific capabilities as modifiers on the main bucket, so it could be considered either a two bucket, or a four bucket model, depending on how you look.
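One way to read the four declarations above is as a single identity-free bucket plus two independent declaration-site flags. The sketch below models that in ordinary Java; every name in it (Opt, Projection, IdFreeClassDecl, and the constants) is made up for illustration and is not proposed syntax or API.

```java
import java.util.EnumSet;
import java.util.Set;

// Sketch only: a data model of the restacking, not a language feature.
final class StackingSketch {

    /** Declaration-site opt-ins available to an identity-free class. */
    enum Opt { NON_ATOMIC, ZERO_OK }

    /** Use-site choices: .ref (null default) or .val (zero default). */
    enum Projection { REF, VAL }

    record IdFreeClassDecl(String name, Set<Opt> opts) {

        /** .val is selectable only if the author opted into zero-capability. */
        Set<Projection> projections() {
            return opts.contains(Opt.ZERO_OK)
                    ? EnumSet.of(Projection.REF, Projection.VAL)
                    : EnumSet.of(Projection.REF);
        }

        /** Tearing is possible only if the author opted out of atomicity. */
        boolean mayTear() {
            return opts.contains(Opt.NON_ATOMIC);
        }
    }

    // The four declarations from the text, in the same order.
    static final IdFreeClassDecl B2  = new IdFreeClassDecl("B2",  EnumSet.noneOf(Opt.class));
    static final IdFreeClassDecl B2A = new IdFreeClassDecl("B2a", EnumSet.of(Opt.NON_ATOMIC));
    static final IdFreeClassDecl B3  = new IdFreeClassDecl("B3",  EnumSet.of(Opt.ZERO_OK));
    static final IdFreeClassDecl B3A = new IdFreeClassDecl("B3a", EnumSet.of(Opt.NON_ATOMIC, Opt.ZERO_OK));

    public static void main(String[] args) {
        for (IdFreeClassDecl d : new IdFreeClassDecl[] { B2, B2A, B3, B3A }) {
            System.out.println(d.name() + " projections=" + d.projections() + " mayTear=" + d.mayTear());
        }
    }
}
```

Seen this way, the “two bucket” reading counts only identity versus non-identity, while the “four bucket” reading counts each flag combination separately.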
The author is in the best place to make the atomicity decision, since they know the integrity constraints. Single field classes, or classes with only single field invariants (denominator != 0), do not need atomicity. Classes with multi-field invariants do.
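To illustrate why multi-field invariants are the ones at risk, here is a deterministic simulation of a torn read in today’s Java. It does not use real flattened values or threads; Range, FlatRangeSlot, and the half-write methods are hypothetical stand-ins that model a non-atomic write as two separate field stores, with a reader arriving in between.

```java
// Sketch only: simulates a flattened, non-atomic two-field value.
final class TearingSketch {

    /** A value with a multi-field invariant: lo <= hi. */
    record Range(long lo, long hi) {
        Range {
            if (lo > hi) throw new IllegalArgumentException("lo > hi");
        }
    }

    /** Stand-in for flattened storage: the two fields live in separate slots. */
    static final class FlatRangeSlot {
        private long lo;
        private long hi;

        FlatRangeSlot(Range initial) {
            lo = initial.lo();
            hi = initial.hi();
        }

        /** First half of a non-atomic write: only the lo slot is updated. */
        void writeLoOnly(Range r) { lo = r.lo(); }

        /** Second half of a non-atomic write: the hi slot catches up. */
        void writeHiOnly(Range r) { hi = r.hi(); }

        /** Raw read of whatever the slots hold; bypasses the constructor check. */
        long[] readRaw() { return new long[] { lo, hi }; }
    }

    /** Returns true if a reader between the two half-writes sees lo > hi. */
    static boolean readerSeesBrokenInvariant() {
        FlatRangeSlot slot = new FlatRangeSlot(new Range(0, 10));
        Range next = new Range(20, 30);
        slot.writeLoOnly(next);        // writer is halfway through
        long[] seen = slot.readRaw();  // reader observes (20, 10)
        slot.writeHiOnly(next);        // writer finishes
        return seen[0] > seen[1];
    }

    public static void main(String[] args) {
        System.out.println("torn read breaks lo <= hi: " + readerSeesBrokenInvariant());
    }
}
```

Both (0, 10) and (20, 30) satisfy the invariant, yet the interleaved read yields (20, 10), a value no constructor call ever produced. A class with a single field has no such intermediate state to expose.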
This differs from the previous stacking in that it moves the spotlight from _references_ and their properties, to the properties themselves. It says to class writers: you should declare the ways in which you are willing to trade safety for performance; you can opt out of the requirement for references and nulls (saving some footprint) and atomicity (faster access). It says to class *users*, you can pick the combination of characteristics, allowed by the author, that meet your needs (can always choose null default if you want, just use a ref.)
There are many choices here about “what are the defaults”. More opting in at the declaration site might mean less need to opt in at the use site. Or not.
(We are now in the stage which I call “shake the box”; we’ve named all the moving parts, and now we’re looking for the lowest-energy state we can get them into.)