B3, default values, and implicit initialization

Remi Forax forax at univ-mlv.fr
Thu Apr 27 18:31:30 UTC 2023


> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, April 27, 2023 7:59:23 PM
> Subject: Fwd: B3, default values, and implicit initialization

> The following was received on valhalla-spec-comments.

> Without deep-diving into the specific syntax suggestion, which I'll summarize as
> "default values are values, and we have fields to specify values, so why
> wouldn't we do that?", the main thing here is that there is a concern that we
> might make the distinction between B2 and B3 "too subtle", and that would "lead
> to overuse of B3".

> There is a challenge with the specifics here, in that the stated goal is to
> explicitly specify the default, but (a) it is not possible to explicitly
> specify an arbitrary default (which using a field would imply), and (b) it is
> not possible to conveniently denote "the instance where all the fields have
> their default values". So I don't think the field-centric approach is a winner,
> but again, that's not the main value of such a suggestion.

> I'm not a fan of arguments that start with "we should force the user..."; while
> we often do go down this road (e.g., subtypes of sealed types must be
> explicitly sealed, final, or non-sealed, to avoid ambiguities over defaults),
> such arguments should start with "here is what would go wrong if people didn't
> understand the defaults properly."

> So, what I'd like to see is more specifics on _how_ B3 might be overused and
> _why_ that is bad before considering these syntactic directions.

> Some additional relevant observations about the current direction:

> - Value classes, without any reference to the special property of "usable
> without initialization" or atomicity under race, are very simple: they are like
> ordinary classes without identity. As a direct consequence, identity-sensitive
> operations (==, synchronized, etc) are either redefined based on state or are
> partialized, and features that require identity (mutability, layout
> polymorphism) are disallowed. They are otherwise like identity classes in every
> other way -- they are reference types, variables of such class types are
> nullable, etc. This is the "simplest possible" interpretation of value classes,
> in that the only thing we take away is identity.

> - This interpretation of value classes is a "safe default"; if you don't go the
> extra mile and talk about uninitialized use, or atomicity, you get something
> with the same safety properties as identity classes. You have to explicitly do
> something extra to get the B3 properties.
Yes!
I think this is the important point: the current incarnation of B3 is a B2; you need to add a '!' at the use site to explicitly ask for trouble.

And, lest we forget, this is also the curse of B3: to work well, you need B3! everywhere along the way, but the JITs tend to reason locally, so it's easy to end up with boxing in the middle (e.g. a method parameter typed as B3 with no bang, or as an interface), making the performance of code using B3! far worse than that of code using B2, or B3 with no bang.

The proposed syntax, declaring a pseudo-field, is not too far from the current one (a default constructor); the declaration is done inside the class, not as a modifier.

I do not find this syntax attractive, especially the "new" in "default = new"; I can hear my students asking "new what?"

regards, 
Rémi 

> -------- Forwarded Message --------
> Subject: 	Re: B3, default values, and implicit initialization
> Date: 	Thu, 27 Apr 2023 11:23:54 +0100
> From: 	Stephen Colebourne <scolebourne at joda.org>

> To: 	valhalla-spec-comments at openjdk.java.net

> From my perspective, the difference between B2 and B3 is vital, as I
> fear developers will greatly overuse B3. I don't think "default
> constructors" are the right focus.

> The initial discussions of B2 vs B3 focussed on one main question -
> does the type have a sensible default. `LocalDate` does not, but
> `Decimal` or `Optional` does (zero or empty). The big issue Valhalla
> faces during adoption from my perspective is the messaging, which is
> far too easy to simplify to B3 faster than B2 faster than B1. The net
> result would be many more 1970-01-01 type bugs. Without great care
> here we could be creating the potential for many null-like
> "million-dollar mistakes".
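
To make the epoch-date bug concrete, here is a minimal sketch in current Java. `EpochDate` is a hypothetical type assumed for illustration (it is not `java.time.LocalDate` and does not appear in the original message); it stores a day count since the epoch, so its all-zero state silently reads as a real date.

```java
import java.time.LocalDate;

// Hypothetical date type, assumed for illustration: stores days since 1970-01-01.
record EpochDate(long epochDay) {
    LocalDate toLocalDate() {
        return LocalDate.ofEpochDay(epochDay);
    }
}

class EpochDefaultDemo {
    // Stand-in for "the instance where all the fields have their default values".
    static EpochDate zeroValue() {
        return new EpochDate(0L);
    }

    public static void main(String[] args) {
        // An "uninitialized" date cannot be told apart from a genuine 1 January 1970.
        System.out.println(zeroValue().toLocalDate()); // prints 1970-01-01
    }
}
```

Nothing in the type distinguishes "never set" from "set to the epoch", which is the failure mode the message warns about.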

> To counteract this, the syntax IMO needs to place the issue at hand
> directly in the face of the developer. And the key question is "what
> is the sensible default value for an instance of this type". Given
> this, I think all authors of *all* value types should be forced to
> *explicitly* define what the default value is. ie. it isn't something
> where the language should choose one or the other as the default
> (sic).

> The obvious syntax is a field, which is implicitly public static
> final. I don't feel that a class-level keyword is the right choice
> here:

> public value LocalDate {
> default = null;
> }
> public value Decimal {
> default = new;
> }

> In each case, the author has had to explicitly choose what the
> sensible default is, and therefore implicitly chooses whether it is B2
> or B3 - without any opportunity to be distracted by the performance
> model. Neither B2 nor B3 is chosen as the favourite by the language.
> "It is a compilation error when a value class declaration does not
> specify a default value".

> By contrast, default constructors are one or two steps removed from
> the actual decision point that the class author should actually be
> thinking about, which is what the sensible default is. It is also the
> case that the default constructor is never actually invoked, which
> will be an ongoing point of surprise.

> Terminology in specs just talks about what the default value is, eg
> "authors should select the most appropriate default value for their
> domain", "arrays are initialised to the default value of a value type"
> or "if the default is null then ...".

> The syntax is intended to make it perfectly reasonable to ask for
> `LocalDate.default` or `Decimal.default` and get a sensible answer -
> it looks like a "normal" constant in code. The use of `default = new`
> by itself deliberately invokes the idea of a default constructor that
> does nothing, without the need to spell it out.
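
As a rough approximation in current Java, the proposed `default` pseudo-field behaves like an ordinary `public static final` constant. Since `default` is a keyword, the sketch uses the assumed name `DEFAULT`; `Decimal` and `CalendarDate` are hypothetical stand-ins, not real library types.

```java
// Hypothetical stand-in for a type with a sensible zero (the "default = new" case).
record Decimal(long unscaled, int scale) {
    // Same state an all-fields-zero instance would have.
    public static final Decimal DEFAULT = new Decimal(0L, 0);
}

// Hypothetical stand-in for a type with no sensible default (the "default = null" case).
record CalendarDate(int year, int month, int day) {
    public static final CalendarDate DEFAULT = null;
}

class DefaultConstantDemo {
    public static void main(String[] args) {
        System.out.println(Decimal.DEFAULT);      // the zero decimal
        System.out.println(CalendarDate.DEFAULT); // prints null
    }
}
```

The difference from the proposal is that here the constant is built by an ordinary constructor call, whereas the proposed `default = new` would name a value the VM can produce without running any constructor.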

> Javadoc can be added to the `default` constant, which is very helpful.
> For example it might include justification as to why LocalDate does
> not have a default value of 0000-01-01 or 1970-01-01.
> Stephen

> On Tue, 28 Mar 2023 at 20:13, Brian Goetz <brian.goetz at oracle.com> wrote:

>> The recent update of JEP 401 contained a number of refinements to the user
>> model, specifically, separating the primitive/reference distinction into a
>> number of smaller distinctions (e.g., nullable vs non-nullable, optional vs
>> required construction.)  Overall this has been a very positive step forward.

>> We still have a need for the distinction between what we've been calling B2 and
>> B3; JEP 401 currently frames that in terms of "construction is optional."  This
>> is a big step forward; indeed, the key difference between them is whether the
>> class _needs_ the "variables start out as null, and all instances are created
>> by constructors" protection, or whether it admits the lighter-weight
>> initialization protocol of "there's a standard zero value, null-free
>> variables are initialized to that" that primitives enjoy today.  (Note that B3
>> classes don't require this lighter protocol, they merely enable it, much as
>> primitives all give you the option of boxing to get the full conservative
>> initialization protocol.)

>> The idea of framing this as "construction is optional" is a good one, but the
>> expression of it proposed in JEP 401 feels "not quite there".  In this note
>> I'll propose an alternative presentation, but the main goal here is around
>> terminology and user model rather than syntax (so please keep the syntax
>> agitation to a reasonable level.)

>> The key distinction between B2 and B3 is that B3 has a _default value_ which the
>> VM can summon at will.  This enables non-nullable heap variables to be
>> flattened, because we can initialize these the same way we initialize other
>> fields and array elements.  Further, that default value is highly constrained;
>> it is a physical zero, the result of initializing all fields to their default
>> value.
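
The "physical zero" can already be observed in current Java: fields that are never assigned hold the language-defined defaults (0, 0.0, false, null). The sketch below uses a hypothetical identity class `Sample`, assumed purely to show the state an all-defaults instance would carry.

```java
// Hypothetical class with no explicit constructor and no field initializers.
class Sample {
    int count;      // defaults to 0
    double ratio;   // defaults to 0.0
    boolean flag;   // defaults to false
    String label;   // defaults to null
}

class PhysicalZeroDemo {
    static String describe(Sample s) {
        return s.count + " " + s.ratio + " " + s.flag + " " + s.label;
    }

    public static void main(String[] args) {
        System.out.println(describe(new Sample())); // prints 0 0.0 false null
    }
}
```

Note that a reference-typed field is null in this state, so a class author opting into the default value has to be comfortable with every field, not only the numeric ones, being zeroed.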

>> Flattening is of course a goal, but it is not something that exists in the
>> programming model -- it's just an optimization.  What exists in the programming
>> model is the default value, and what this unlocks is the possibility for
>> variables to be _implicitly initialized_.  Reference-typed variables today
>> are _explicitly initialized_; variables start out null and have to be
>> initialized with a constructed value.  A class with a default value has the
>> option (opted in through null-exclusion) for its variables to be implicitly
>> initialized, which, like primitives, means that they start out with a valid
>> default value, and can be further assigned to.
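
The contrast between implicit and explicit initialization is visible today with primitives and their boxes. A minimal sketch in current Java (the class and method names are assumptions for illustration):

```java
class InitializationDemo {
    // Implicitly initialized: every element already holds the primitive zero.
    static long sumOfFreshPrimitiveArray() {
        long[] xs = new long[3];
        long sum = 0;
        for (long x : xs) {
            sum += x;
        }
        return sum;
    }

    // Explicitly initialized: elements start out null, so reading one as a long fails.
    static boolean unboxingFreshElementFails() {
        Long[] xs = new Long[3];
        try {
            long first = xs[0]; // auto-unboxing a null reference
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfFreshPrimitiveArray());   // prints 0
        System.out.println(unboxingFreshElementFails());  // prints true
    }
}
```

In the model described here, a null-excluded variable of a class with a default value would behave like the `long[]` case, and a nullable one like the `Long[]` case.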

>> Framed this way, the Valhalla performance story simplifies to:

>>  - Give up identity, get flattening on the stack;
>>  - Further give up explicit initialization, get flattening for small objects on
>>    the heap;
>>  - Further give up atomicity, get flattening for larger objects on the heap.

>> Giving up explicit initialization entails both the class opting out of explicit
>> initialization, _and_ the variable opting out of nullity.

>> The key new terminology that comes out of this is implicit vs explicit
>> initialization.

>> Syntactically, my preference is to indicate that the default value can be
>> summoned by giving a value class a _default constructor_:

>>     value class Complex {
>>         public final double re, im;

>>         public default Complex();
>>     }

>> A default constructor has no arguments, no body, no throws clause, and
>> implicitly initializes all fields to their default values.  Unlike identity
>> classes, value classes don't get constructors implicitly; a value class must
>> declare at least one constructor, default or otherwise.  This replaces the idea
>> of "optional constructor", which is a negative statement about construction
>> ("but you don't have to call me"), with a more direct and positive statement
>> that there is a _default constructor_ with the required properties.
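
The `public default Complex();` syntax is a proposal and does not compile today. As an approximation in current Java, a record with a no-argument constructor yields the same all-zero state; the assumption to keep in mind is that here the constructor actually runs, whereas the proposed default value would be produced by the VM without invoking any constructor.

```java
// Approximation in current Java of a value class with a default constructor.
record Complex(double re, double im) {
    // No-argument constructor that yields the all-zero state.
    Complex() {
        this(0.0, 0.0);
    }
}

class DefaultConstructorDemo {
    public static void main(String[] args) {
        Complex zero = new Complex();
        System.out.println(zero.equals(new Complex(0.0, 0.0))); // prints true
        System.out.println(zero.re() + " " + zero.im());        // prints 0.0 0.0
    }
}
```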

>> Note that this is similar to the existing concept of "default constructor",
>> which you get for free in an identity class if you don't specify any
>> constructors.  It is possible we can unify these features (and also with
>> constructors in "agnostic" abstract classes), but first let's work out what it
>> would mean in value classes, and see if we like it.

>> In this model, a B3 class is just a value class with a default constructor -> a
>> default constructor means that you have the choice of implicit or explicit
>> initialization -> non-nullity at the use site opts into implicit initialization
>> -> B3! gets flattening (for small layouts.)