User model stacking: current status

Thu Jun 30 14:21:14 UTC 2022

----- Original Message -----
> From: "Dan Heidinga" <heidinga at redhat.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "Kevin Bourrillion" <kevinb at google.com>, "daniel smith"
> <daniel.smith at oracle.com>, "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, June 30, 2022 3:35:16 PM
> Subject: Re: User model stacking: current status

> <snip>
>> >
>> > I'm confused by your assertion that "nullable becomes less important
>> > because there is a notion of default value." That default value - the
>> > all zeros value that the VM paints on freshly allocated instances - is
>> > something we've agreed many value classes want to encapsulate.  That's
>> > the whole story of "no good default" value classes.  We've spent a lot
>> > of time plumbing those depths before arriving at this point where such
>> > NGD classes want to be expressed with references to ensure their "bad"
>> > default values don't leak.  So I'm kind of confused by this assertion.
>>
>> I would like to separate the concerns about null: there is the perspective of the
>> maintainer/writer of a class and the perspective of the user of a class.
>> I was not talking about the maintainer POV, which has to deal with the no good
>> default class, but about the user POV, which only needs to deal with fields and
>> arrays being initialized with the default value instead of null.
>>
>> I don't disagree with the current model, i think the model is not enough, not
>> exposing a way to declare primary val classes (val is always secondary in the
>> proposed model) is moving the burden of dealing with the val/ref world from the
>> maintainer of a class to the users of a class. I will develop that in a later
>> mail.
>>
> 
> Remember, this was a deliberate choice.  We started with exposing
> values (val default) and allowing tearing by default before finally
> iterating to this solution (ref default).  Now, the "good name" is
> always safe.  It's always the reference type, can't be torn, and
> doesn't leak the uninitialized value. On the downside, it requires
> users to say ".val" when they want the direct value.  And that's ugly.
> But if we're down to arguing syntax, then we're in a pretty good
> place.

There is a swath of val value classes that are also safe by default: all the ones that cannot tear (because they are not declared as tearable) and that support the all-zeros default.
For example, Scala and Kotlin restrict their value classes to a single field/single property, which is enough to cover things like quantities, units, etc., where all you need is a lightweight wrapper on top of a reference or a primitive type.
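
To make the single-field case concrete, here is a minimal sketch in today's Java. It assumes a record as a stand-in for a value class (records still have identity, and the value-class syntax is not settled), and the name Meters is hypothetical. The point is only that the all-zeros default of such a wrapper is a perfectly good value.

```java
// Hypothetical single-field wrapper; a record stands in for a value class here.
record Meters(double value) {
    // Adds two quantities and returns a new wrapper.
    Meters plus(Meters other) {
        return new Meters(value + other.value);
    }
}

class MetersDemo {
    public static void main(String[] args) {
        // The all-zeros state is meaningful: it is simply zero meters.
        Meters zero = new Meters(0.0);
        Meters total = zero.plus(new Meters(2.5));
        System.out.println(total.value()); // prints 2.5
    }
}
```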

> 
> Looking forward to the email that clearly outlines the problem you see here.
> 
>> >
>> > Overall - we're winning more than we expected to with this model.
>> > More cases can be scalarized on the stack than we initially thought
>> > and we can still offer heap flattening for the smaller set of use
>> > cases that really benefit from it.
>> >
>> >>
>> >> You are judging your model with the questions of the past, not the questions we
>> >> will have 10 years after the new model is introduced.
>> >
>> > As always, today's solutions are tomorrow's problems.  Can you be more
>> > specific about the questions you think will be asked in the next 10
>> > years so we can dig into those ?
>>
>> The proposed model is similar to the eclair model from the POV of the users of
>> the value class. I think we did not do a good postmortem on why the eclair
>> model failed from the user POV, because we discovered that the VM could be much
>> smarter than we previously thought. So the proposed model exhibits the same
>> issue. I will dig up my notes on the eclair model and rewrite them in terms of
>> the current model.
>>
> 
> The downfall of the eclair model was the use of interfaces.  The VM
> can't enforce them during verification due to long standing verifier
> rules which we won't change.  It also forced subtyping relationships
> between the wrapper and the filling of the eclair. And that's
> problematic in other places.  Brian laid out on the EG call that the
> current model is more of a boxing / unboxing (projection / embedding)
> model that uses similar language models to insert the conversions (if
> needed).

Yes, the ref/val model is better if, as for the VM, there is a conversion (it can be an auto-conversion) between val and ref.
But because of that, we have not explored the other drawbacks of exposing a twin-types model front and center to the users.

> 
>> >
>> >>
>> >> If anyone has the choices, then everyone has more responsibility.  And given
>> >> that the performance differences between Point.ref and Point.val accrue pretty
>> >> much exclusively in the heap, which is to say, apply only to implementation
>> >> code and not API, sticking the implementation with this burden seems
>> >> reasonable.
>> >>
>> >>
>> >> no, you can not change a Point.ref to a Point.val without breaking the backward
>> >> compatibility, so it's an issue for APIs.
>> >
>> > Point.ref (the "L" carrier) and Point.val (the "Q" carrier) are
>> > spelled differently from a VM perspective.  So changing from one to
>> > the other is making a new API.  The benefit of the approach we've
>> > landed on though, is that the difference should be small for API
>> > points as we can scalarize the identity-less L on the stack.  For
>> > backwards compatibility, just leave it!  Better to use the L in api
>> > signatures and limit the Q's to heap storage (fields and arrays).
>>
>> I think we can get both, i would like a Point.ref followed by a
>> Objects.requireNonNull to be equivalent to a Point.val from the user POV.
>> For example
>>   public void foo(Point p) {
>>     Objects.requireNonNull(p);
>>     ...
>>   }
>>
>> should be equivalent to
>>   public void foo(Point.val p) {
>>     ...
>>   }
>>
>> This requires never having a Point.val in the method descriptor and using the
>> TypeRestriction attribute when Point.val is used.
> 
> Is this a question/concern about the parametric VM proposal?  If not,
> I'm confused by the mention of "TypeRestriction".

TypeRestriction is a tool that keeps the change binary backward compatible, and source backward compatible too if we have auto-boxing between ref and val, so people can modify an existing method signature without having to take care of backward compatibility.

> 
> Looking at the example, those two functions will be equivalent from a
> user point of view, except that they can assign null to "p" in the
> first copy of foo() and can't in the second.

Yes, one can say that the version with .val is safer, because passing null explicitly will result in a compile-time error instead of a runtime error.
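
The runtime half of that comparison can be shown in today's Java. This is only a sketch: a record stands in for the value class Point, and the helper nullRejectedAtRuntime is a hypothetical name. The .val half cannot be written yet, so it appears only as a comment.

```java
import java.util.Objects;

// Hypothetical stand-in: a record plays the role of the value class Point.
record Point(int x, int y) {}

class NullCheckDemo {
    // The ".ref" flavor: the compiler accepts null, the check fails at run time.
    static int foo(Point p) {
        Objects.requireNonNull(p);
        return p.x() + p.y();
    }

    // Returns true if foo(null) fails with a NullPointerException at run time.
    static boolean nullRejectedAtRuntime() {
        try {
            foo(null); // compiles fine today
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(foo(new Point(1, 2)));    // prints 3
        System.out.println(nullRejectedAtRuntime()); // prints true
        // With a foo(Point.val p) signature, the call foo(null) would be
        // expected to be rejected by the compiler instead (assumption, the
        // language rules are not final).
    }
}
```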

If the idea of the language model is that .val is used in implementation and .ref is used in API, then TypeRestriction is a good tool for that because it allows users to take an incremental approach: use .ref everywhere and later add .val where you want performance. It has a price: because it's a kind of erasure, we will have signature clashes like with generics.
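
For reference, this is what the generics clash looks like today. It is a sketch with hypothetical names (ErasureClash, size). By analogy, and this is an assumption about a TypeRestriction-based design, foo(Point) and foo(Point.val) would share one descriptor and so could not be overloads of each other.

```java
import java.lang.reflect.Method;
import java.util.List;

class ErasureClash {
    static int size(List<String> names) {
        return names.size();
    }

    // Uncommenting this overload fails to compile with a name clash:
    // both methods have the same erasure, size(List).
    // static int size(List<Integer> numbers) { return numbers.size(); }

    // Looks up the erased parameter type that the class file records for size.
    static String erasedParameterType() {
        for (Method m : ErasureClash.class.getDeclaredMethods()) {
            if (m.getName().equals("size")) {
                return m.getParameterTypes()[0].getName();
            }
        }
        throw new IllegalStateException("size not found");
    }

    public static void main(String[] args) {
        System.out.println(size(List.of("a", "b")));  // prints 2
        System.out.println(erasedParameterType());    // prints java.util.List
    }
}
```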

>  From a VM point of view,
> we should be able to scalarize the value type in both cases (though
> we'll need some extra metadata for the first case).

Scalarized, yes, but will it work in terms of calling convention if those methods are overridden?

> 
> Even when calling them, given the box/unbox rules, I'm not clear where
> you see the difference showing up?  That they may have to cast
> "(Point.val)p" (not clear on the language rules here) to call the
> second if they have a Point in hand?

Yes, it depends on whether we have the equivalent of auto-unboxing or not.

> 
>>
>> I believe this is the kind of heroic efforts we will have to do so users can add
>> ".val" to a parameter type of a method without thinking too much.
>> Obviously, i would prefer a world where the maintainer of a value class has to
>> deal with this kind of stuff instead of the users but if we keep the proposed
>> model, i think we will have to polish it around the edges.
> 
> Can you be more clear on the heroic efforts you see required?  I can
> speculate but I'll probably be wrong =)

yes, i will.

> 
>>
>> >
>> >>
>> >> If your description of the world was true, then we do not need Q-type, the
>> attribute Preload, which says that an L-type is a value type, is enough.
>> >> In that case, then the VM model and the language model you propose are more in
>> >> sync.
>> >
>> > Preload and L-type give identity-less values flattening on the stack.
>> > That's part of the story.  For heap flattening we still need the Q.
>>
>> Yes, i had forgotten that we need Q-type for generics, as Brian reminded me/us
>> during our meeting.
>>
>> >
>> > I thought we covered this in the EG discussion.  Are you just reading
>> > into the record the concerns raised in the meeting to get the answers
>> > captured ?
>>
>>
>> I think the meeting was very useful to me because i did not understand correctly
>> the proposed model.
>> I have another set of worries now, but as i said, i want to comb through my notes
>> before raising another set of concerns.
> 
> +1
> 
> --Dan

Rémi