[External] Foo / Foo.ref is a backward default; should be Foo.val / Foo

Dan Heidinga heidinga at redhat.com
Mon Apr 25 16:24:58 UTC 2022


> In the future world, which of these declarations do we expect to see?
>
>     public final class Integer { … }
>
> or
>
>     public mumble value class int { … }
>
> The tension is apparent here too; I think most Java developers would hope that, were we writing the world from scratch, that we’d declare the latter, and then do something to associate the compatibility shim with the real type.  (Whatever we do, we still need an Integer.class on our class path, because existing code will want to load it.)  This tension carries over into how we declare Complex; are we declaring the “box”, or are we declaring the primitive?
>
> Let’s state the opposing argument up front, because it was our starting point: having to say “Complex.val” for 99% of the utterances of Complex would likely be perceived as “boy those Java guys love their boilerplate” (call this the “lol java” argument for short.)  But, since then, our understanding of how this will all actually work has evolved, so it is appropriate to question whether this argument still holds the weight we thought it did at the outset.

I agree this is the heart of the issue: how do developers say whether
they want a B2 or B3 value class?  If they've opted into a B3 value,
shouldn't we respect author intent (like we do with default methods)
rather than making them repeat "please give me B3 semantics" at every
use site?  I'm deliberately phrasing this a bit antagonistically as
we're going to hear this (and worse) from users who have been waiting
for improved memory density from Valhalla.

>
> > 1. The option with fewer hazards should usually be the default. Users won't opt themselves into extra safety, but they will sometimes opt out of it. Here, the value type is the one that has attendant risks -- risk of a bad default value, risk of a bad torn value. We want using `Foo.val` to *feel like* cracking open the shell of a `Foo` object and using its innards directly. But if it's spelled as plain `Foo` it won't "feel like" anything at all.
>
> Let me state it more strongly: unboxed “primitives” are less safe.  Despite all the efforts from the brain trust, the computational physics still points us towards “the default is zero, even if you don’t like that value” and “these things can tear under race, even though they resemble immutable objects, which don’t.”  The insidious thing about tearing is that it is only exhibited in subtly broken programs.  The “subtly” part is the really bad part.  So we have four broad options:
>
>  - neuter primitives so they are always as safe as we might naively hope, which will result in either less performance or a worse programming model;
>  - keep a strong programming model, but allow users to trade some safety (which non-broken programs won’t suffer for) with an explicit declaration-site and/or use-site opt-in (“.val”)
>  - same, but try to educate users about the risk of tearing under data race (good luck)
>  - decide the tradeoff is impossible, and keep the status quo
>
> The previous stake in the ground was #3; you are arguing towards #2.

My understanding was we were going to guide most users towards B2
values and would treat B3 as the rare, "expert" mode, for when density
really matters.  Does that decrease the education problem?
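
As a rough analogy using only today's Java (not Valhalla syntax), the
int/Integer pair already shows the default-value difference being
discussed: a reference array starts out null, while a primitive array
starts out zero whether or not zero is a meaningful value.  The class
and method names below are made up for illustration.

```java
public class DefaultValueSketch {
    // Reference elements default to null, so a missing value is detectable.
    static Integer firstBoxed() {
        Integer[] boxed = new Integer[1];
        return boxed[0];
    }

    // Primitive elements default to 0, which looks like a real value.
    static int firstPrimitive() {
        int[] flat = new int[1];
        return flat[0];
    }

    public static void main(String[] args) {
        System.out.println(firstBoxed());      // prints "null"
        System.out.println(firstPrimitive());  // prints "0"
    }
}
```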

>
> > 2. In the current plan a `Foo.ref` should be a well-behaved bucket 2 object. But it sure looks like that `.ref` is specifically telling it NOT to be -- like it's saying "no, VM, *don't* optimize this to be a value even if you can!" That's of course not what we mean. With the change I'm proposing, `Foo.val` does make sense: it's just saying "hey runtime, while you already *might* have represented this as a value, now I'm demanding that you *definitely* do". That's a normal kind of a thing to do.
>
> A key aspect of this is the bike shed tint; .val is not really the right indicator  given that the reference type is also a “value class”.  I think we’re comfortable giving the “value” name to the whole family of identity-free classes, which means that .val needs a new name.  Bonus points if the name connotes “having burst free of the constraints of reference-hood”: unbound, loose, exploded, compound value, etc.  And also is pretty short.
>

".prim" anyone?  (backs slowly away from the bikeshed)

> > 3. This change would permit compatible migration of an id-less to primitive class. It's a no-op, and use sites are free to migrate to the value type if and when ready. And if they already expose the type in their API, they are free to weigh the costs/benefits of foisting an incompatible change onto *their* users. They have facilities like method deprecation to do it with. In the current plan, this all seems impossible; you would have to fix all your problematic call sites *atomically* with migrating the class.
>
> This is one of my favorite aspects of this direction.  If you recall, you were skeptical from the outset about migrating classes in place at all; the previous stake in the ground said “well, they can migrate to value classes, but will never be able to shed their null footprint or get ultimate flattening.”  With this, we can migrate easily from VBC to B2 with no change in client code, and then _further_ have a crack at migrating to full flatness inside the implementation capsule. That’s sweet.
>

Changing a class from B2 to B3 changes the default descriptor spelling
from "L" to "Q".  Why does this have to be done atomically?  Existing
descriptors, spelled with "L", would still work.  Code that's
recompiled would pick up the Q descriptors.  If the author wants Qs,
whether they get them "for free" or by adding ".val", there are the
same compatibility concerns: they have to take explicit action to get
what they want and to keep descriptors working.
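
To make the descriptor difference concrete, here is a sketch that uses
today's int/Integer pair as a stand-in, since "Q" descriptors are not
something a released JDK will print.  The method shapes are
hypothetical; only java.lang.invoke.MethodType is real API.

```java
import java.lang.invoke.MethodType;

public class DescriptorSketch {
    // Descriptor of a hypothetical method "void m(Integer)": reference spelling.
    static String refDescriptor() {
        return MethodType.methodType(void.class, Integer.class)
                         .toMethodDescriptorString();
    }

    // Descriptor of a hypothetical method "void m(int)": primitive spelling.
    static String primDescriptor() {
        return MethodType.methodType(void.class, int.class)
                         .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(refDescriptor());   // prints "(Ljava/lang/Integer;)V"
        System.out.println(primDescriptor());  // prints "(I)V"
    }
}
```

In today's JVM a call site links by exact descriptor, so the two
spellings are distinct methods as far as linkage is concerned.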

> > 4. It's much (much) easier on the mental model because *every (id-less) class works in the exact same way*. Some just *also* give you something extra, that's all. This pulls no rugs out from under anyone, which is very very good.
> >
> > 5. The two kinds of types have always been easily distinguishable to date. The current plan would change that. But they have important differences (nullability vs. the default value chief among them) just as Long and long do, and users will need to distinguish them. For example you can spot the redundant check easily in `Foo.val foo = ...; / requireNonNull(foo);`.
>
> It is really nice that *any* unadorned identifier is immediately recognizable as being a reference, with all that entails — initialization safety and nullity.  The “mental database” burden is lower, because Foo is always a reference, and Foo.whatever is always direct/immediate/flat/whatever.
>
> > 6. It's very nice when the *new syntax* corresponds directly to the *new thing*. That is, until a casual developer *sees* `.val` for the first time, they won't have to worry about it.

That's nice initially, but a few releases after B3 values are
available, will we still want the syntax to highlight (scream?) "new
thing"?

> >
> > 7. John seemed to like my last fruit analogy, so consider these two equivalent fruit stand signs:
> >
> > a) "for $1, get one apple OR one orange . . . with every orange purchased you must also take a free apple"
> > b) "apples $1 . . . optional free orange with each purchase"
> >
> > Enough said I think :-)
> >
> > 8. The predefined primitives would need less magic. `int` simply acts like a type alias for `Integer.val`, simple as that. This actually shows that the whole feature will be easier to learn because it works very nearly how people already know primitives to work. Contrast with: we hack it so that what would normally be called `Integer` gets called `int` and what normally gets called `Integer.ref` or maybe `int.ref` gets called `Integer` ... that is much stranger.
>
> One more: the .getClass() anomaly goes away.
>
> If we have
>
>     mumble primitive mumble Complex { … }
>
>     Complex.val c = …
>
> then what do we get when we ask c for its getClass?  The physics again point us at returning Complex.ref.class, not Complex.val.class, but under the old scheme, where the val projection gets the good name, it would seem anomalous, since we ask a val for its class and get the ref mirror.  But under the Kevin interpretation, we can say “well, the CLASS is Complex, so if you ask getClass(), you get Complex.class.”
>
>
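
Today's boxing is a reasonable analogy for that answer.  A sketch in
plain Java (the class and method names are made up): asking a boxed
int for its class yields the Integer mirror, never int.class.

```java
public class GetClassSketch {
    // Box the int and ask the resulting object for its class.
    static Class<?> classOfBoxedInt(int i) {
        Object o = i;          // boxing conversion to Integer
        return o.getClass();
    }

    public static void main(String[] args) {
        System.out.println(classOfBoxedInt(42) == Integer.class);  // prints "true"
        System.out.println(int.class == Integer.class);            // prints "false"
    }
}
```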



More information about the valhalla-spec-experts mailing list