From daniel.smith at oracle.com  Wed Dec  1 15:59:52 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 1 Dec 2021 15:59:52 +0000
Subject: EG meeting, 2021-12-01
Message-ID: <B48D84E4-14A7-4FC1-AB05-FC10A51758FD@oracle.com>

EG Zoom meeting today at 5pm UTC (9am PST, 12pm EST).

We can discuss "JEP update: Value Objects", the use of the term "value" here, and class file encodings.

From forax at univ-mlv.fr  Wed Dec  1 16:32:00 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 1 Dec 2021 17:32:00 +0100 (CET)
Subject: JEP update: Value Objects
In-Reply-To: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
Message-ID: <1464628180.1879848.1638376320697.JavaMail.zimbra@u-pem.fr>

Hi Daniel,
this is really nice.

Here are my remarks.

"It generally requires that an object's data be located at a fixed memory location"
remove "fixed", all OpenJDK GCs move objects.
Again later, remove "fixed" in "That is, a value object does not have a fixed memory address ...".

At the beginning of the section "Value class declarations", before the example, I think we also need a sentence saying that fields are implicitly final.

Class file and representation, about ACC_PERMITS_VALUE: what's the difference between "permits" and "allows" in English?

In section "Java language compilation",
"Each class file generated by javac includes a Preload attribute naming any value class that appears in one of the class file's field or method descriptors."
+ if a value class is the receiver of a method call/field access (the receiver is not part of the method descriptor in the bytecode).

In section "Performance model"
"... must ensure that fields and arrays storing value objects are updated atomically.",
not only stores, loads have to be done atomically too.

The part "Initially, developers can expect the following from the HotSpot JVM" is dangerous because it will be read as "HotSpot will do that forever".
We have to be more vague here, "a Java VM may ..."

regards,
Rémi

----- Original Message -----
> From: "daniel smith" <daniel.smith at oracle.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Tuesday, 30 November 2021 01:09:06
> Subject: JEP update: Value Objects

> I've been exploring possible terminology for "Bucket 2" classes, the ones that
> lack identity but require reference type semantics.
> 
> Proposal: *value classes*, instances of which are *value objects*
> 
> The term "value" is meant to suggest an entity that doesn't rely on mutation,
> uniqueness of instances, or other features that come with identity. A value
> object with certain field values is the same (per ==), now and always, as every
> "other" value object with those field values.
> 
> (A value object is *not* necessarily immutable all the way down, because its
> fields can refer to identity objects. If programmers want clean immutable
> semantics, they shouldn't write code (like 'equals') that depends on these
> identity objects' mutable state. But I think the "value" term is still
> reasonable.)
> 
> This feels like it may be an intuitive way to talk about identity without
> resorting to something verbose and negative like "non-identity".
> 
> If you've been following along all this time, there's potential for confusion: a
> "value class" has little to do with a "primitive value type", as we've used the
> term in JEP 401. We're thinking the latter can just become "primitive type",
> leading to the following two-axis interpretation of the Valhalla features:
> 
> -------------------------------------------------------------------------
> Value class reference type (B2 & B3.ref) | Identity class type (B1)
> -------------------------------------------------------------------------
> Value class primitive type (B3)          |
> -------------------------------------------------------------------------
> 
> Columns: value class vs. identity class. Rows: reference type vs. primitive
> type. (Avoid "value type", which may not mean what you think it means.)
> 
> Fortunately, the renaming exercise is just a problem for those of us who have
> been closely involved in the project. Everybody else will approach this grid
> with fresh eyes.
> 
> (Another old term that I am still finding useful, perhaps in a slightly
> different way: "inline", describing any JVM implementation strategy that
> encodes value objects directly as a sequence of field values.)
> 
> Here's a new JEP draft that incorporates this terminology and sets us up to
> deliver Bucket 2 classes, potentially as a separate feature from Bucket 3:
> 
> https://bugs.openjdk.java.net/browse/JDK-8277163
> 
> Much of JEP 401 ends up here; a revised JEP 401 would just talk about primitive
> classes and types as a special kind of value class.

From john.r.rose at oracle.com  Wed Dec  1 20:34:35 2021
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 1 Dec 2021 20:34:35 +0000
Subject: aconst_init
In-Reply-To: <CAJq4Gi5UC9RvGgQ+bDEb3pSWzrYJZCtj7Lnt_bzp3y=qS+gJ8A@mail.gmail.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi5UC9RvGgQ+bDEb3pSWzrYJZCtj7Lnt_bzp3y=qS+gJ8A@mail.gmail.com>
Message-ID: <40F9B99B-7F37-47FE-BBC1-FBCEEBFC28FB@oracle.com>

On Dec 1, 2021, at 7:58 AM, Dan Heidinga <heidinga at redhat.com> wrote:
> 
> Splitting a new thread off from Dan's email about the jep draft to
> talk about the `aconst_init` bytecode:
> 
>> aconst_init, with a CONSTANT_Class operand, produces an instance of the named value class, with all fields set to their default values. This operation always has private access: a linkage error occurs if anyone other than the value class or its nestmates attempts an aconst_init operation.
> 
> Can you confirm if this is purely a rename of the previous
> defaultvalue / initialvalue bytecodes?

I can confirm this, with one important exception:   The defaultvalue
bytecode has no access restrictions, while the aconst_init/initialvalue
bytecode does.

> I'm wondering how the name fits the eventual primitive values and
> their uses.  Will they also use this bytecode or will they continue to
> use a defaultvalue version?

For this reason, aconst_init/initialvalue is not useful for B3 types.
I think there is no need for yet another bytecode to cover the B3
types.  Instead, Class::__InitialValue should return null for
B1/B2 types (or any reference types: polys and arrays), and should
return the (boxed) zero for primitives, starting with int.class.

assert Integer.class.__InitialValue() == null;
assert int.class.__InitialValue() == 0;
assert Point.class.__InitialValue() == (new Point[1])[0];
assert Point.ref.class.__InitialValue() == null;

(__InitialValue is not really the eventual method name.)
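
For comparison, the results described above for existing types can already be observed in today's Java by reading element 0 of a freshly allocated one-element array. This is only a minimal sketch: `initialValue` is a hypothetical helper name, not a proposed API, and the `Point` and `Point.ref` cases are left out because they need Valhalla types.

```java
import java.lang.reflect.Array;

public class InitialValueSketch {
    // Hypothetical stand-in for Class::__InitialValue: returns the value that
    // a freshly allocated array element of the given type holds.
    static Object initialValue(Class<?> type) {
        Object array = Array.newInstance(type, 1);
        // Array.get boxes primitive elements, so int.class yields an Integer.
        return Array.get(array, 0);
    }

    public static void main(String[] args) {
        System.out.println(initialValue(Integer.class));   // reference type
        System.out.println(initialValue(int.class));       // primitive type
        System.out.println(initialValue(String[].class));  // array type
    }
}
```

The sketch leans on the same rule the asserts rely on: array elements start out at the default value of their component type.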

> The expected bytecode pattern for a "<new>" factory method is something like:
>  aconst_init MyValue
>  iconst_1
>  withfield MyValue.x:I
>  areturn
> Correct?

Yes, although it's likely there are intervening astore_0 and
aload_0 instructions, since "this" is probably modeled by the
compiler as local[0].
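
The shape of that factory can be approximated in today's Java, as a rough sketch only. A record stands in for the value class, a shared all-defaults instance stands in for the result of aconst_init, and a hypothetical `withX` method stands in for withfield. None of these names come from the JEP draft.

```java
public class FactorySketch {
    // Stand-in for a value class with one int field. A record is used only
    // because its fields are final and its equality is based on field values.
    record MyValue(int x) {
        // Models the result of aconst_init: every field at its default value.
        static final MyValue DEFAULT = new MyValue(0);

        // Models withfield: returns a copy with x replaced, never mutates.
        MyValue withX(int newX) {
            return new MyValue(newX);
        }

        // Models the "<new>" factory: default instance, then field updates.
        static MyValue make() {
            MyValue self = DEFAULT;  // aconst_init MyValue; astore_0
            self = self.withX(1);    // aload_0; iconst_1; withfield MyValue.x:I; astore_0
            return self;             // aload_0; areturn
        }
    }

    public static void main(String[] args) {
        System.out.println(MyValue.make());
    }
}
```

The local variable `self` plays the role of local[0] in the bytecode: each "field write" rebinds it to a new value rather than changing an object in place.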

By the way, this raises the question of how vigorously
the JVM should perform structural checks on the new
features, to ensure they are only used in the ways we
expect.  I think in general such checks should be
justified individually, rather than be applied by default.

Since <new> is just a static factory method, I would prefer
(though I understand reasons to the contrary) to have the
JVMS be agnostic about where <new> methods can occur.
In other words, treat <new> like a plain identifier; maybe
require that it be marked ACC_STATIC but allow it to
work like a nameless factory method in any context
where a classfile generator might choose to make use of it.

Taking an agnostic stance now would let us experiment
with translation strategies (in the future) which replace
uses of <init> (which have problematic security
characteristics, even recently) with uses of <new>.

(Reflection might omit off-label uses of <new>, just
like it omits <clinit>.  But the "guts" of MH reflection
can see <init> today and would see all such <new>s
tomorrow, so exposing it becomes a library issue,
not a JVMS decision.)

-- John

From daniel.smith at oracle.com  Wed Dec  1 23:29:37 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 1 Dec 2021 23:29:37 +0000
Subject: [External] : Re: JEP update: Value Objects
In-Reply-To: <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
Message-ID: <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>


> On Dec 1, 2021, at 8:48 AM, Dan Heidinga <heidinga at redhat.com> wrote:
> 
>> class file representation & interpretation
>> 
>> A value class is declared in a class file using the ACC_VALUE modifier (0x0100). At class load time, the class is considered to implement the interface ValueObject; an error occurs if a value class is not final, has a non-final instance field, or implements (directly or indirectly) IdentityObject.
> 
> I'll reiterate my earlier pleas to have javac explicitly make them
> implement ValueObject.  The VM can then check that they have both the
> bit and the interface.

So we went down the path of "maybe there's no need for a flag at all" in today's meeting, and it might be worth more consideration, but I convinced myself that the ACC_VALUE flag serves a useful purpose for validation and clarifying intent that can't be reproduced by a "directly/indirectly extends ValueObject" test.

As you suggest, though, we could mandate that ACC_VALUE implies 'implements ValueObject'. Some reasons not to require this:

- 'implements ValueObject' may be redundant if an ancestor implements ValueObject; but leaving it off risks a separate compilation error (e.g., ancestor used to implement ValueObject, doesn't anymore). So I think the proper compilation strategy would be to always implement it directly, even redundantly. There's an opportunity for a subtle compiler bug.

- It's extra ceremony in the class file. <shrug>

- Inferring is consistent with what we do for at least some identity classes. Inferring everywhere is, in some ways, simpler.*

(*Tangent about the idea of inferring IdentityObject in old versions, but requiring IdentityObject in new versions: the trouble with gating off less-preferred behavior in old versions is that it's still there and still must be supported. JVMs end up with two strategies instead of one. A (great strategy+ok strategy) combination is arguably *worse* than just (ok strategy) everywhere.)

> It's a simpler model if the interface is
> always there for values as the VM won't have to track whether it was
> injected for a value class or explicitly declared.  Why does that
> matter?  For two reasons: JVMTI will need to be consistent in the
> classfile bytes it returns and not include the interface if it was
> injected (less tracking), and given earlier conversations about
> whether to "hide" the injected interface from Class::getInterfaces,
> always having it for values removes one more sharp edge.

The plan of record is to make no distinction between inferred and explicit superinterfaces in reflection. Is that not acceptable for JVMTI? If there's no need for a distinction, does that address your concern about inferred supers?

From john.r.rose at oracle.com  Wed Dec  1 23:56:02 2021
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 1 Dec 2021 23:56:02 +0000
Subject: [External] : Re: JEP update: Value Objects
In-Reply-To: <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
Message-ID: <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>

On Dec 1, 2021, at 3:29 PM, Dan Smith <daniel.smith at oracle.com> wrote:

> So we went down the path of "maybe there's no need for a flag at all" in today's meeting, and it might be worth more consideration, but I convinced myself that the ACC_VALUE flag serves a useful purpose for validation and clarifying intent that can't be reproduced by a "directly/indirectly extends ValueObject" test.
> 
> As you suggest, though, we could mandate that ACC_VALUE implies 'implements ValueObject'.

Assuming ACC_VALUE is part of the design, there are actually four
things we can specify, for the case when a class file has ACC_VALUE set:

A. Inject ValueObject as a direct interface, whether or not it was already inherited.
B. Inject ValueObject as a direct interface, if  it is not already inherited.
C. Require ValueObject to be present as a direct interface, whether or not it was already inherited.
D. Require ValueObject to be present as an interface, either direct or inherited.

A and B will look magic to reflection.
B is slightly more parsimonious and less predictable than A.
C and D are less magic to reflection, and require a bit more "ceremony" in the class file.
D is less ceremony than C.
Also, the D condition is a normal subtype condition, while the C condition is unusual to the JVM.

I guess I prefer C and D over A and B because of the reflection magic problem,
and also because of Dan H's issue (IIUC) about "where do we look for the
metadata, if not in somebody's constant pool?"

Since D and C have about equal practical effect, and D is both simpler to
specify and less ceremony, I prefer D best of all.
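
The difference between the C and D conditions can be illustrated with ordinary reflection. In this sketch, `ValueObject` is a locally declared marker interface (a hypothetical stand-in, not the real one), and the two checks only approximate what a JVM might do at class load time.

```java
import java.util.Arrays;

public class RequireSketch {
    interface ValueObject {}                // hypothetical marker interface
    interface Shape extends ValueObject {}  // inherits the marker

    // Lists the marker directly, even though Shape already provides it.
    static final class Direct implements Shape, ValueObject {}

    // Gets the marker only through Shape.
    static final class Inherited implements Shape {}

    // Condition C: ValueObject must be listed as a direct superinterface.
    static boolean satisfiesC(Class<?> c) {
        return Arrays.asList(c.getInterfaces()).contains(ValueObject.class);
    }

    // Condition D: ValueObject must be a supertype, direct or inherited.
    static boolean satisfiesD(Class<?> c) {
        return ValueObject.class.isAssignableFrom(c);
    }

    public static void main(String[] args) {
        for (Class<?> c : new Class<?>[] {Direct.class, Inherited.class}) {
            System.out.println(c.getSimpleName()
                    + " C=" + satisfiesC(c) + " D=" + satisfiesD(c));
        }
    }
}
```

D is the ordinary subtype test, which is why it reads as the "normal" condition, whereas C needs a separate look at the class's own interface list.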

I agree that ACC_VALUE is useful to prevent "action at a distance".

There is the converse problem that comes from the redundancy:
What happens if the class directly implements or inherits ValueObject
and ACC_VALUE is not set?  I guess that is an error also.

-- John


From john.r.rose at oracle.com  Thu Dec  2 00:04:56 2021
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 2 Dec 2021 00:04:56 +0000
Subject: [External] : Re: JEP update: Value Objects
In-Reply-To: <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
Message-ID: <6776971B-F8B1-416D-8A4F-32EAE842AC03@oracle.com>

On Dec 1, 2021, at 3:56 PM, John Rose <john.r.rose at oracle.com> wrote:

> There is the converse problem that comes from the redundancy:
> What happens if the class directly implements or inherits ValueObject
> and ACC_VALUE is not set?  I guess that is an error also.

I hit send too soon:  That's probably true for concrete classes.
For abstracts, ACC_VALUE must not be set (yes?) and ValueObject
"just flows" along with all the other super types, with no particular
notice.  It all comes together when ACC_VALUE appears, and that
must be on a final, concrete class.

I keep wondering what ACC_VALUE "should mean" for an abstract.
Maybe it "should mean" that the abstract is thereby also forced to
implement VO, so that all subtypes will be VO's.

The slightly different meaning of ACC_PERMITS_VALUE is "hold
off on injecting IdentityObject at this point".  Because the type
might allow subtypes that implement VO (whether abstract or
concrete).  At this point it also allows IdentityObject to be
introduced in subtypes.  Mmm... It could also have been
spelled ACC_NOT_NECESSARILY_IDENTITY.

As we said in the meeting, it seems to need magic injection of
IdObj, even if we can require non-magic explicit presence of VO.
Dan H., will the metadata pointer of IdObj be a problem to access,
if it is magically injected?

From daniel.smith at oracle.com  Thu Dec  2 00:05:29 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 2 Dec 2021 00:05:29 +0000
Subject: JEP update: Value Objects
In-Reply-To: <1464628180.1879848.1638376320697.JavaMail.zimbra@u-pem.fr>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <1464628180.1879848.1638376320697.JavaMail.zimbra@u-pem.fr>
Message-ID: <97407E47-9296-4776-9B7B-22220931B785@oracle.com>

> On Dec 1, 2021, at 9:32 AM, Remi Forax <forax at univ-mlv.fr> wrote:
> 
> Hi Daniel,
> this is really nice.
> 
> Here are my remarks.
> 
> "It generally requires that an object's data be located at a fixed memory location"
> remove "fixed", all OpenJDK GCs move objects.
> Again later, remove "fixed" in "That is, a value object does not have a fixed memory address ...".

Yeah, was hoping I could weasel my way out of that with "generally", but okay. Changed to "particular memory location".

> At the beginning of the section "Value class declarations", before the example, i think we also need a sentence saying that fields are implicitly final.

Eh, this is putting more detail in the introductory paragraph than I want. I think I'm happier going the other direction: putting the rules about 'final' and 'abstract' class modifiers in the "subject to the following restrictions" list after the example. Then the intro is just two sentences about the 'value' keyword.

> Class file and representation, about ACC_PERMITS_VALUE, what's the difference between "permits" and "allow" in English ?

Very close synonyms, I'd say? I would use them interchangeably.

The reason I chose "permits" is because we already have a PermittedSubclasses attribute that serves a similar purpose.

> In section "Java language compilation",
> "Each class file generated by javac includes a Preload attribute naming any value class that appears in one of the class file's field or method descriptors."
> + if a value class is the receiver of a method call/field access (the receiver is not part of the method descriptor in the bytecode).

The need here is to identify inlinable classes at the declaration site. Use sites don't need it. (And the type of 'this' at the declaration site is, of course, already loaded.)

> In section "Performance model"
> "... must ensure that fields and arrays storing value objects are updated atomically.",
> not only stores, loads has to be done atomically too.

"read and written atomically", then.

> The part "Initially, developers can expect the following from the HotSpot JVM" is dangerous because it will be read as Hotspot will do that forever.
> We have to be more vague here, "a Java VM may ..."

Yes, message received. I'll ask around about the best way to document our intentions for the targeted release (perhaps outside the JEP) without suggesting a constraint on the abstract feature.


From daniel.smith at oracle.com  Thu Dec  2 00:25:08 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 2 Dec 2021 00:25:08 +0000
Subject: JEP update: Value Objects
In-Reply-To: <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
Message-ID: <39BB24B3-8214-4D39-BF31-2F51E30F75FD@oracle.com>

> On Dec 1, 2021, at 4:56 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> On Dec 1, 2021, at 3:29 PM, Dan Smith <daniel.smith at oracle.com> wrote:
>> 
>> So we went down the path of "maybe there's no need for a flag at all" in today's meeting, and it might be worth more consideration, but I convinced myself that the ACC_VALUE flag serves a useful purpose for validation and clarifying intent that can't be reproduced by a "directly/indirectly extends ValueObject" test.
>> 
>> As you suggest, though, we could mandate that ACC_VALUE implies 'implements ValueObject'.
> 
> Assuming ACC_VALUE is part of the design, there are actually four
> things we can specify, for the case when a class file has ACC_VALUE set:
> 
> A. Inject ValueObject as a direct interface, whether or not it was already inherited.
> B. Inject ValueObject as a direct interface, if  it is not already inherited.
> C. Require ValueObject to be present as a direct interface, whether or not it was already inherited.
> D. Require ValueObject to be present as an interface, either direct or inherited.

I realize my last sentence there is ambiguous, so thanks for spelling these out. I meant that Dan has suggested (D), and we could consider doing so. (The JEP says do either A or B, it's vague about what "considered to implement" means.)

> A and B will look magic to reflection.

This I'm unclear on. What's the magic? Are you imagining that certain superinterfaces be suppressed by reflection? As I said, our intent is to *not* suppress anything.

> B is slightly more parsimonious and less predictable than A.

Yeah, I'm not sure what I prefer. The distinction only matters, I think, for reflection.

> C and D are less magic to reflection, and require a bit more "ceremony" in the class file.
> D is less ceremony than C.
> Also, the D condition is a normal subtype condition, while the C condition is unusual to the JVM.

The "normal subtype condition" is a big reason to prefer D over C.

> I guess I prefer C and D over A and B because of the reflection magic problem,
> and also because of Dan H's issue (IIUC) about "where do we look for the
> metadata, if not in somebody's constant pool?"

I'll reiterate this point:

>> the trouble with gating off less-preferred behavior in old versions is that it's still there and still must be supported. JVMs end up with two strategies instead of one. A (great strategy+ok strategy) combination is arguably *worse* than just (ok strategy) everywhere.


We haven't really eliminated these problems if we're still inferring IdentityObject elsewhere. We've just (slightly) reduced their footprint. At the expense of living with two strategies instead of one.

> Since D and C have about equal practical effect, and D is both simpler to
> specify and less ceremony, I prefer D best of all.

I'm concerned about D's separate compilation problem: implementing ValueObject at compile time doesn't guarantee implementing ValueObject at runtime. That change is not, strictly speaking, a binary compatible change, but a superinterface author might think they could get away with it, and the resulting error message seems excessively punitive: "you can't load this class because some superinterface changed its mind about allowing identity class implementations". They wanted to allow more, and ended up allowing less.

Which means, to be safe, the compiler should always redundantly implement ValueObject in value classes, but then a compiler might forget to do so and introduce a subtle bug, ...

Tolerable, but it's a rough edge of D.
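
The hazard can be sketched with a toy model of the hierarchy as it looks at load time. Everything here is hypothetical: type names are plain strings, each map stands in for whatever class files happen to be on the class path, and `implementsAtRuntime` only approximates a D-style check.

```java
import java.util.List;
import java.util.Map;

public class SeparateCompilationSketch {
    // Walks direct superinterfaces transitively, looking for the target name.
    static boolean implementsAtRuntime(String type, String target,
                                       Map<String, List<String>> supers) {
        for (String s : supers.getOrDefault(type, List.of())) {
            if (s.equals(target) || implementsAtRuntime(s, target, supers)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Point was compiled when Shape still extended ValueObject.
        Map<String, List<String>> compileTime = Map.of(
                "Point", List.of("Shape"),
                "Shape", List.of("ValueObject"));
        // Shape is later recompiled without ValueObject; Point is not recompiled.
        Map<String, List<String>> runTime = Map.of(
                "Point", List.of("Shape"),
                "Shape", List.of());
        // If the compiler had listed ValueObject redundantly, Point is unaffected.
        Map<String, List<String>> redundant = Map.of(
                "Point", List.of("Shape", "ValueObject"),
                "Shape", List.of());

        System.out.println(implementsAtRuntime("Point", "ValueObject", compileTime));
        System.out.println(implementsAtRuntime("Point", "ValueObject", runTime));
        System.out.println(implementsAtRuntime("Point", "ValueObject", redundant));
    }
}
```

In the second configuration a D-style check on Point would fail at load time even though Point itself never changed, which is the error described above. The third configuration shows why always emitting the redundant superinterface avoids it.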


From forax at univ-mlv.fr  Thu Dec  2 07:08:01 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 2 Dec 2021 08:08:01 +0100 (CET)
Subject: [External] : Re: JEP update: Value Objects
In-Reply-To: <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
Message-ID: <95379176.1986412.1638428881927.JavaMail.zimbra@u-pem.fr>

> From: "John Rose" <john.r.rose at oracle.com>
> To: "daniel smith" <daniel.smith at oracle.com>
> Cc: "Dan Heidinga" <heidinga at redhat.com>, "valhalla-spec-experts"
> <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, 2 December 2021 00:56:02
> Subject: Re: [External] : Re: JEP update: Value Objects

> On Dec 1, 2021, at 3:29 PM, Dan Smith <daniel.smith at oracle.com> wrote:

>> So we went down the path of "maybe there's no need for a flag at all" in today's
>> meeting, and it might be worth more consideration, but I convinced myself that
>> the ACC_VALUE flag serves a useful purpose for validation and clarifying intent
>> that can't be reproduced by a "directly/indirectly extends ValueObject" test.

>> As you suggest, though, we could mandate that ACC_VALUE implies 'implements
>> ValueObject'.

> Assuming ACC_VALUE is part of the design, there are actually four
> things we can specify, for the case when a class file has ACC_VALUE set:

> A. Inject ValueObject as a direct interface, whether or not it was already
> inherited.
> B. Inject ValueObject as a direct interface, if it is not already inherited.
> C. Require ValueObject to be present as a direct interface, whether or not it
> was already inherited.
> D. Require ValueObject to be present as an interface, either direct or
> inherited.

> A and B will look magic to reflection.
> B is slightly more parsimonious and less predictable than A.
> C and D are less magic to reflection, and require a bit more "ceremony" in the
> class file.
> D is less ceremony than C.
> Also, the D condition is a normal subtype condition, while the C condition is
> unusual to the JVM.

> I guess I prefer C and D over A and B because of the reflection magic problem,
> and also because of Dan H's issue (IIUC) about "where do we look for the
> metadata, if not in somebody's constant pool?"

> Since D and C have about equal practical effect, and D is both simpler to
> specify and less ceremony, I prefer D best of all.

> I agree that ACC_VALUE is useful to prevent "action at a distance".

> There is the converse problem that comes from the redundancy:
> What happens if the class directly implements or inherits ValueObject
> and ACC_VALUE is not set? I guess that is an error also.

As Daniel said during the meeting and in a following email, from the POV of javac, the compiler should add "implements ValueObject" on all concrete value classes even if ValueObject is already present in the hierarchy, to avoid action at a distance (to detect when a supertype changes from implementing ValueObject to implementing IdentityObject, for example). With that requirement, for the VM, D and C are equivalent for all classes generated by javac.

So D is OK.

Rémi

> -- John

From daniel.smith at oracle.com  Thu Dec  2 15:04:59 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 2 Dec 2021 15:04:59 +0000
Subject: JEP update: Value Objects
In-Reply-To: <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
 <6776971B-F8B1-416D-8A4F-32EAE842AC03@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
Message-ID: <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>

On Dec 2, 2021, at 7:08 AM, Dan Heidinga <heidinga at redhat.com> wrote:

> When converting back from our internal form to a classfile for the
> JVMTI RetransformClasses agents, I need to either filter the interface
> out if we injected it or not if it was already there.  JVMTI's
> GetImplementedInterfaces call has a similar issue with being
> consistent - and that's really the same issue as reflection.
> 
> There's a lot of small places that can easily become inconsistent -
> and therefore a lot of places that need to be checked - to hide
> injected interfaces.  The easiest solution to that is to avoid
> injecting interfaces in cases where javac can do it for us so the VM
> has a consistent view.

I think you may be envisioning extra complexity that isn't needed here. The plan of record is that we *won't* hide injected interfaces. Our hope is that the implicit/explicit distinction is meaningless: that turning implicit into explicit via JVMTI would be a 100% equivalent change. I don't know JVMTI well, so I'm not sure if there's some reason to think that wouldn't be acceptable...

From daniel.smith at oracle.com  Thu Dec  2 23:11:07 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 2 Dec 2021 23:11:07 +0000
Subject: JEP update: Value Objects
In-Reply-To: <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
 <6776971B-F8B1-416D-8A4F-32EAE842AC03@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
 <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
Message-ID: <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>

> On Dec 2, 2021, at 1:04 PM, Dan Heidinga <heidinga at redhat.com> wrote:
> 
> On Thu, Dec 2, 2021 at 10:05 AM Dan Smith <daniel.smith at oracle.com> wrote:
>> 
>> On Dec 2, 2021, at 7:08 AM, Dan Heidinga <heidinga at redhat.com> wrote:
>> 
>> When converting back from our internal form to a classfile for the
>> JVMTI RetransformClasses agents, I need to either filter the interface
>> out if we injected it or not if it was already there.  JVMTI's
>> GetImplementedInterfaces call has a similar issue with being
>> consistent - and that's really the same issue as reflection.
>> 
>> There's a lot of small places that can easily become inconsistent -
>> and therefore a lot of places that need to be checked - to hide
>> injected interfaces.  The easiest solution to that is to avoid
>> injecting interfaces in cases where javac can do it for us so the VM
>> has a consistent view.
>> 
>> 
>> I think you may be envisioning extra complexity that isn't needed here. The plan of record is that we *won't* hide injected interfaces.
> 
> +1.  I'm 100% on board with this approach.  It cleans up a lot of the
> potential corner cases.
> 
>> Our hope is that the implicit/explicit distinction is meaningless: that turning implicit into explicit via JVMTI would be a 100% equivalent change. I don't know JVMTI well, so I'm not sure if there's some reason to think that wouldn't be acceptable...
> 
> JVMTI's "GetImplementedInterfaces" spec will need some adaptation as
> it currently states "Return the direct super-interfaces of this class.
> For a class, this function returns the interfaces declared in its
> implements clause."
> 
> The ClassFileLoadHook (CFLH) runs either with the original bytecodes
> as passed to the VM (the first time) or with "morally equivalent"
> bytecodes recreated by the VM from its internal classfile formats.
> The first time through the process the agent may see a value class
> that doesn't have the VO interface directly listed while after a call
> to {retransform,redefine}Classes, the VO interface may be directly
> listed.  The same issues apply to the IO interface with legacy
> classfiles so with some minor spec updates, we can paper over that.
> 
> Those are the only two places: GetImplementedInterfaces & CFLH and
> related redefine/retransform functions, I can find in the JVMTI spec
> that would be affected.  Some minor spec updates should be able to
> address both to ensure an inconsistency in the observed behaviour is
> treated as valid.

Useful details, thanks.

Would it be a problem if the ClassFileLoadHook gives different answers depending on the timing of the request (derived from original bytecodes vs. JVM-internal data)? If we need consistent answers, it may be that the "original bytecode" approach needs to reproduce the JVM's inference logic. If it's okay for the answers to change, there's less work to do.

To highlight your last point: we *will* need to work this out for inferred IdentityObject, whether we decide to infer ValueObject or not.

From brian.goetz at oracle.com  Sun Dec  5 18:36:05 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 5 Dec 2021 13:36:05 -0500
Subject: Fwd: Proposal: Static/final constructors for bucket-3 primitive
 classes.
In-Reply-To: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
References: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
Message-ID: <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>


The following was received on valhalla-spec-comments.

Summary: Various syntax options for no-arg constructors of "bucket 3" 
primitives, to enable users to pick a default value other than zero.

Analysis: The suggestion is well-intentioned, but it is built on some 
significant misunderstandings of the problem we are facing.

It assumes that it is sensible to allow a non-zero default value of a 
primitive to be specified by the class declaration.  While it is
entirely understandable why one would want this, the problem is not that 
there isn't a good syntax for it (there obviously is), nor that running 
the constructor multiple times is the problem -- it is deeper than 
that.  Numerous safety properties derive from the fact that newly
allocated objects and arrays are bulk-initialized to zero; compromising 
this seems likely to lead to exploits.


-------- Forwarded Message --------
Subject: 	Proposal: Static/final constructors for bucket-3 primitive 
classes.
Date: 	Fri, 3 Dec 2021 21:15:50 -0600
From: 	Clement Cherlin <clement.cherlin at gmail.com>
To: 	valhalla-spec-comments at openjdk.java.net


Motivation: A concern with primitive classes (bucket 3) is that the
all-zeroes default value may be inappropriate or even invalid in some
cases. This proposal suggests a language enhancement to give primitive
class authors control over the default value of their class without,
in most cases, requiring a constructor call to create an instance.

Proposed language change:
Primitive classes can apply either the keyword "static" or the
keyword "final", but not both, to their no-argument constructor.

A "final" no-arg constructor is evaluated once, at compile time. The
constructed object is treated as a static final constant, and can be
folded as a constant, or copied verbatim whenever a default value of
that class is instantiated.

A "static" no-arg constructor is evaluated once, when the class is
loaded. The constructed object is copied verbatim whenever a default
value of that class is instantiated.

Justification:
Presuming that non-zero default values need to exist, and we're going
to be constructing lots and lots of primitive objects and arrays of
primitive objects, it behooves us to make initialization of default
values as efficient as possible. Much of the time, there will be no
need to call a constructor / factory method, just make a copy of a
pre-existing default value (perhaps lazily).

Related work:
For classes without sensible default values, I have another proposal I
am working on to make initializing arrays of primitive objects possible
and efficient, without resorting to the all-zeroes default.

Cheers,
Clement Cherlin

From ccherlin at gmail.com  Sun Dec  5 23:09:20 2021
From: ccherlin at gmail.com (Clement Cherlin)
Date: Sun, 5 Dec 2021 17:09:20 -0600
Subject: Proposal: Static/final constructors for bucket-3 primitive
 classes.
In-Reply-To: <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
References: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
 <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
Message-ID: <CALEU8=wHGWkcQk4ypGJ+D+s6ENxfBdS1RHBJQyiZFqugSqc3JQ@mail.gmail.com>

On Sun, Dec 5, 2021 at 12:36 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>
> The following was received on valhalla-spec-comments.
>
> Summary: Various syntax options for no-arg constructors of "bucket 3"
> primitives, to enable users to pick a default value other than zero.
>
> Analysis: The suggestion is well-intentioned, but it is built on some
> significant misunderstandings of the problem we are facing.
>
> It assumes that it is sensible to allow a non-zero default value of a
> primitive to be specified by the class declaration.  While it is
> entirely understandable why one would want this, the problem is not that
> there isn't a good syntax for it (there obviously is), nor that running
> the constructor multiple times is the problem -- it is deeper than
> that.  Numerous safety properties derive from the fact that newly
> allocated objects and arrays are bulk-initialized to zero; compromising
> this seems likely to lead to exploits.

Thank you for your feedback. However, far from leading to new exploits,
my suggestion is aimed at fixing the flaws inherent in the current
design that make it extremely, unnecessarily difficult to use correctly
as a primitive class author.

The current design assumes that the all-zeroes value can and should be
the default value for every single primitive class. Initializing to zero is
simple, unambiguous and efficient. It is perfectly reasonable to have
all-zeroes as the "default default", so to speak. However, it is
completely unacceptable to make the "default default" the one and only
default, because it creates a value that was never constructed.
Numerous safety properties of existing classes also derive from the fact
that every instance was initialized by a constructor; compromising this
will inevitably lead to the same kinds of exploits that serialization did.

Consider a very slowly-growing, but not constant set of values which
ought to be expandable at runtime, such as, say, media type codes
for a transcoding server that supports dynamic plugins. It's not
constant, so it can't be an enum. We must validate any new instance
against a canonical list of permitted values before allowing it to be
constructed, lest invalid (possibly malicious) values sneak into the
system.

public primitive record MediaCode(byte b1, byte b2, byte b3, byte b4) {
    public MediaCode {
        if (!isValidMediaCode(b1, b2, b3, b4))
            throw new IllegalArgumentException();
    }
}

An invalid MediaCode of 0,0,0,0 is now trivially constructable, perhaps
accidentally, using

MediaCode[] mediaCodes = new MediaCode[numMediaCodes];
// time passes, mediaCodes is partially but not completely filled...
MediaCode whoops = mediaCodes[numMediaCodes - 1];

This permits injecting "nul" bytes into, say, a byte stream that will be
deserialized by C code expecting null-terminated strings, or recognizing
as a "media file" something that very much is not.

Sounds like that could easily lead to an exploit to me. And class
authors are helpless to prevent this easily foreseeable error. Even
making the constructor private won't help, because the zero default
cannot be suppressed, hidden or prevented in any way.

I've seen the suggestion "Make the class private". If the only solution
to the problem is to hide from it, that is a tacit admission that the
current design is unworkable.

Now consider the problems caused by the unwanted but mandatory
implicit initializers in this class:

public primitive class LongRational {
    private long numerator = 0;
    private long denominator = 0;
    ...
}

which I don't think I need to elaborate.
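
A minimal sketch of that failure mode in today's Java, under stated assumptions: primitive classes are not available in released Java, so an ordinary final class stands in, and the hypothetical ZERO_DEFAULT constant imitates the all-zeroes value that would exist without any constructor or factory having run. The names of(), truncate() and LongRationalDemo are illustrative only.

```java
// Sketch only: a plain final class stands in for the proposed primitive class.
final class LongRational {
    private final long numerator;
    private final long denominator;

    // Hypothetical stand-in for the all-zeroes default, which in the proposed
    // design would exist without any validation having run.
    static final LongRational ZERO_DEFAULT = new LongRational(0, 0);

    private LongRational(long numerator, long denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    // The only validated way in: rejects a zero denominator.
    static LongRational of(long numerator, long denominator) {
        if (denominator == 0) {
            throw new IllegalArgumentException("denominator must be non-zero");
        }
        return new LongRational(numerator, denominator);
    }

    // Integer part of the fraction; long division by zero throws ArithmeticException.
    long truncate() {
        return numerator / denominator;
    }
}

class LongRationalDemo {
    // Returns the simple name of the exception thrown by using the default, or "none".
    static String useDefault() {
        try {
            LongRational.ZERO_DEFAULT.truncate();
            return "none";
        } catch (ArithmeticException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(useDefault()); // prints "ArithmeticException"
    }
}
```

The validated factory rejects a zero denominator, yet the stand-in default still carries one, which is the situation described above.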

These are just two examples I thought of off the top of my head. I can
invent dozens more plausible ways that the all-zeroes default will
create exploitable bugs with very little effort, and you know that the,
ahem, professional bug exploiters will have even less trouble.

The following excerpt is from "Towards Better Serialization"
(Brian Goetz, June 2019),
https://cr.openjdk.java.net/~briangoetz/amber/serialization.html

> In an object-oriented system, the role of the constructor is to initialize
> an object with its invariants established; this allows the rest of the
> system to assume a basic degree of object integrity. In theory, we
> should be able to reason about the possible states an object might be
> in by reading the code for its constructors and any methods that
> mutate the object's state. But because serialization constitutes a
> hidden public constructor, you have to also reason about the state
> that objects might be in based on previous versions of the code
> (whose source code might not even exist any more, to say nothing
> of maliciously constructed bytestreams). By bypassing constructors,
> serialization completely subverts the integrity of the object model.

Strong words. "The role of the constructor is to initialize an object
with its invariants established." "Serialization constitutes a hidden
public constructor...", and "...bypassing constructors... completely
subverts the integrity of the object model."

I fully agree with all of those statements and sentiments.

Unless authors waste up to 8 bytes of space in every instance by
including an "isConstructed" boolean, or waste time revalidating the
state of every instance in every method call, the integrity of the
object model is subverted. Is not introducing footguns an important
goal? Is maintaining the integrity of the object model an important
goal?
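
The "isConstructed" workaround mentioned above could look roughly like the following sketch. It is an illustration in today's Java, not the proposed primitive-class syntax: CheckedCode, ZERO_DEFAULT, of() and toBytes() are hypothetical names, and ZERO_DEFAULT merely imitates an all-zeroes instance whose flag field is false because no factory ever set it.

```java
// Sketch only: an ordinary final class standing in for a primitive class.
final class CheckedCode {
    private final byte b1;
    private final byte b2;
    // Extra per-instance state: false in the all-zeroes default, true otherwise.
    private final boolean constructed;

    // Hypothetical stand-in for the all-zeroes default value.
    static final CheckedCode ZERO_DEFAULT = new CheckedCode((byte) 0, (byte) 0, false);

    private CheckedCode(byte b1, byte b2, boolean constructed) {
        this.b1 = b1;
        this.b2 = b2;
        this.constructed = constructed;
    }

    // Validated entry point: refuses nul bytes and marks the instance as constructed.
    static CheckedCode of(byte b1, byte b2) {
        if (b1 == 0 || b2 == 0) {
            throw new IllegalArgumentException("nul byte in code");
        }
        return new CheckedCode(b1, b2, true);
    }

    // Every method that relies on the invariant has to re-check the flag.
    byte[] toBytes() {
        if (!constructed) {
            throw new IllegalStateException("default instance was never constructed");
        }
        return new byte[] { b1, b2 };
    }
}
```

The flag costs space in every instance and a check in every method, which is the overhead the paragraph above objects to.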

There will be a compromise somewhere, but forcing all-zeroes on every
primitive class is the *wrong* compromise.

How is the JVM bulk-initializing an array to an author-controlled
default value via memcpy (or equivalent) likely to lead to exploits?
Specifically, how is it any more likely to lead to exploits than the
JVM initializing an array to an arbitrary, uncontrolled, possibly
inherently invalid default value via calloc (or equivalent)?

If static/final were required on primitive class constructors (or there
was another way to initialize an array, more on that later) then there
would be no possible way for an exception to be thrown mid-array-
initialization. Is that not safe?

If you really want belt-and-suspenders safety, the JVM can initialize to
zero, then reinitialize with a constructed default. I don't see the need
for it, but it's a possibility.

Really think about the LongRational case. If default-zero initialization
can make a simple numeric type (one of the primary anticipated use
cases for primitive classes) so unsafe that the *default instance* will
throw ArithmeticException if one so much as looks at it, what are we
doing?

Decreeing that primitive classes cannot ever opt out of an unsanitized,
unvalidated, all-zeroes value will render them completely unsuitable
for some roles that they would otherwise be ideal for. At that point,
we might as well drop Bucket 3 entirely and stick with nullable value
classes, since those have a preexisting, if unfortunate, default.

I do not want to see primitive class initialization become a foreseeable
and preventable disaster like serialization was. Any mistakes in the
design will be a lasting part of Java, for future developers to curse
and future blackhat hackers to exploit.

Cheers,
Clement Cherlin

From john.r.rose at oracle.com  Thu Dec  9 04:30:50 2021
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 08 Dec 2021 20:30:50 -0800
Subject: Proposal: Static/final constructors for bucket-3 primitive
 classes.
In-Reply-To: <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
References: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
 <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
Message-ID: <92B6DF83-478B-4D69-8E31-C2F25CB5DD08@oracle.com>

We have considered, at various points in the last six years or more, 
allowing user-defined primitive types to define (under user control) 
their own default values.  The syntax is unimportant, but the concept is 
simple:  Surely the user who defines a primitive type can also define 
default initializer expressions for each of the fields.

But this would be a trail of tears, which we have chosen to avoid, each 
time the suggestion comes up.

This feature is often visualized as a predefined bit pattern, which the 
JVM would keep handy, and just stamp down wherever a default initializer 
is needed.  It can't really be that simple, but even such a bit
pattern is problematic.

First of all is the problem of declaring the bit pattern.  Java natively 
uses the side effects of `<clinit>` to define constants using ad hoc 
bytecodes; it also defines (for some types but not others) a concept of 
constant expression.  Neither of those fits well into a classfile that 
would define a primitive with a default bit pattern.

If the bit pattern is defined using ad hoc bytecode, it must be defined 
in a new pseudo-method (not `<clinit>`), to execute not *during* the 
initialization of the newly-declared primitive class, but *before*.  
(Surely not! a reader might exclaim, but this is the sort of subtlety we 
have to deal with.)  During initialization of a class C, all fields of 
its own type C must be initialized *before* the first bytecode of 
`<clinit>` executes, so that the static initializer code has something 
to write on.  So there must be a "default value definition" phase,
call it `<defaultvalueinit>`, added after linking and before
initialization of C, so C's `<clinit>` method has something to work
with.  This `<defaultvalueinit>` is really the body of a no-argument
constructor of C, or its twin.  A no-argument constructor of C is not a
problem, but having it execute before C's `<clinit>` block is a huge
irregularity, which the JVM spec is not organized to support, at 
present.

This would turn into both JVMS and JLS spec. complexity, and more odd 
corners (and odd states) in the Java user experience.  Sure, a user will 
say, "but I promise not to do anything odd; I just want *this field*
to be the value `(int)1`".  Yes, but a spec. must define not only the
expected usages, but all possible usages, with no poorly-defined states.

OK, so if `<defaultvalueinit>` is not the place to define this
elusive bit pattern, what about something more declarative, like a 
`ConstantValue` attribute?  Surely we could put a similarly structured 
`DefaultValue` attribute on every non-static field of a value type, and 
that would give the JVM enough information to synthesize the required 
bit pattern *before* it runs `<clinit>`.

Consider the user model here:  A primitive declaration would allow its 
fields to have non-zero default values, *but only drawn from the 
restricted set of constant expressions*, because those are the ones 
which fit in the `ConstantValue` attribute.  (They are true bit patterns 
in the constant pool, plus `String` constants.)  There is no previous 
place in Java where we make such a restriction, except `case` labels.  
Can you hear the groans of users as we try to explain why only constant 
expressions are allowed in that context?  That's the muzak of the
trail of tears I mentioned above.

But we have condy to fix that (someone will surely say).  But that's
problematic, because the resolution of constant pool constants of a
class C requires C to be at least linked, and if the condy expression
makes a self-reference to C itself, that will trigger C's
initialization, at an awkward moment.  Have you ever debugged a tangled
initialization circularity, marked by mysterious NPEs on variables you
*know* you initialized?  I have.  It's a stop on the trail of tears I
mentioned.
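
A small sketch of such an initialization circularity, using ordinary static initializers rather than condy (the class names First, Second and InitCycleDemo are made up for illustration). When a thread that is already initializing a class requests its initialization again, the request returns immediately, so the second class observes a field that has not been assigned yet.

```java
class First {
    static final String VALUE;
    static {
        Second.touch();        // forces Second's initialization while First is mid-initialization
        VALUE = "initialized"; // assigned only after Second has already looked
    }
}

class Second {
    // Read during Second's initialization. If First started initializing first
    // (on the same thread), First.VALUE still holds its default value, null.
    static final String SEEN = First.VALUE;

    static void touch() { }
}

class InitCycleDemo {
    // Touches First before Second, then reports what Second observed.
    static String observedByPeer() {
        String ready = First.VALUE; // triggers First -> Second -> (recursive) First
        return Second.SEEN;
    }

    public static void main(String[] args) {
        System.out.println(observedByPeer()); // prints "null"
        System.out.println(First.VALUE);      // prints "initialized"
    }
}
```

The result depends on which class is touched first, which is part of what makes these bugs hard to track down: any later `Second.SEEN.length()` would throw a NullPointerException on a variable that looks initialized in the source.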

But if we really worked hard, and added a bunch of stuff to the JVMS and 
JLS, and persuaded users not to bother us about the odd restrictions (to 
constant expressions, or expressions which "don't touch the class
itself"), we *could* define some sort of declarative default value
initialization.

What then?  Well, ask the JVM engineers how they initialize heap 
variables, because those are the affected paths.  Those parts of the JVM 
are among the most performance-sensitive.  Currently, when a new object 
or array is created, its whole body (except the header) is sprayed with 
a nice even coat of all-zero-bit machine words.  This is pretty fast, 
and it's important to keep it fast.  What if creating an array
required painting some beautifully crafted arabesque of a bit pattern
defined by a creative user?  Well, it's doable, but much more
complicated.  You need to load the bit pattern into live registers and
(if it's an array of C) keep them live while you paint the whole
array.  That's got to be more expensive than spraying zeroes.
(There's even hardware that's good for spraying zeroes, on some
machines.)  Basically, if we generously allowed users even a limited set 
of pre-defined default primitive values, we would be inviting them to 
create mysterious performance problems *for their clients*.
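
To make the contrast concrete, here is a rough model in plain Java, under stated assumptions: a `long[]` stands in for flattened storage of two-word elements, and (0, 1) is a hypothetical user-chosen default. In a real JVM the stamping would happen inside the allocator rather than in Java code, so this sketch shows only the shape of the extra work, not its actual cost.

```java
import java.util.Arrays;

class DefaultFillDemo {
    // Each logical element occupies two longs: (numerator, denominator).
    static final int WORDS_PER_ELEMENT = 2;

    // All-zeroes default: the JVM's own zeroing of a fresh array is all that is needed.
    static long[] allocateZeroDefault(int count) {
        return new long[count * WORDS_PER_ELEMENT];
    }

    // Hypothetical non-zero default (0, 1): on top of the zeroing, an extra
    // pass has to stamp the pattern into every element.
    static long[] allocatePatternDefault(int count) {
        long[] storage = new long[count * WORDS_PER_ELEMENT];
        for (int i = 0; i < count; i++) {
            storage[i * WORDS_PER_ELEMENT + 1] = 1L; // denominator slot
        }
        return storage;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(allocateZeroDefault(2)));    // prints "[0, 0, 0, 0]"
        System.out.println(Arrays.toString(allocatePatternDefault(2))); // prints "[0, 1, 0, 1]"
    }
}
```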

Reflective creation of objects and arrays is also complicated by 
non-zero defaults, of course.  When you reflectively create a heap node, 
today you compute its size, allocate its memory, store some metadata to 
its header, and paint the rest zero.  That turns into something more 
complicated (see above about live registers) and metadata-driven, in the 
presence of non-zero defaults.

I haven't yet mentioned *reference* fields, but those are another can
of worms.  The JVM vigorously tracks references.  Suppose your primitive 
had a String-valued field, and you were allowed to declare a non-null 
default value for it, say `"empty"`.  If one of your customers creates 
an array of these things, suddenly there is a GC card mark (for many 
GCs) on *every element of the array*, and that is *before you do 
anything useful with it*.

References also support circularity, including indirect cycles from an 
instance of C back to C itself.  Can you guarantee that the computation 
of some tricky reference for your default value of `C.foo` won't
require linking of C itself, and a vicious circularity?  No, you
can't, and you won't like the feeling of debugging such a thing
either.  Trail of tears, again.

Finally, depending on which of the above flawed tactics is chosen for 
representing user-selected default values, there is the possibility that 
JVM code can observe a variable V of type C in its pre-initialization 
state, because (a) C's initialization specification is being loaded or
evaluated somehow, and (b) the variable V has been allocated but is 
waiting for an initialization bit pattern.  (V might be a static of C, 
or something in a related dependent class.  Also it could be a 
multi-threading situation, where V is being observed via a race 
condition; those are very hard to keep straight.)  During those moments, 
if V is loaded, then (voila!) it will have either garbage or those good 
old all-zero bits in it.  And the abstraction we were laboring to secure 
will be subverted.  This usually doesn't happen, but when it's an
accident it's a very subtle bug, and when it's on purpose it turns
into a security escalation.

It's best to keep the simple default all-zero conventions.  They are
robust and understandable and regular.  When they are inconvenient, 
users will find workarounds.

I hope this helps.

-- John

From forax at univ-mlv.fr  Thu Dec  9 07:12:17 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 9 Dec 2021 08:12:17 +0100 (CET)
Subject: Proposal: Static/final constructors for bucket-3 primitive
 classes.
In-Reply-To: <92B6DF83-478B-4D69-8E31-C2F25CB5DD08@oracle.com>
References: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
 <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
 <92B6DF83-478B-4D69-8E31-C2F25CB5DD08@oracle.com>
Message-ID: <2057846228.132574.1639033937029.JavaMail.zimbra@u-pem.fr>

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>, "clement
> cherlin" <clement.cherlin at gmail.com>
> Sent: Thursday, December 9, 2021 5:30:50 AM
> Subject: Re: Proposal: Static/final constructors for bucket-3 primitive classes.

> We have considered, at various points in the last six years or more, allowing
> user-defined primitive types to define (under user control) their own default
> values. The syntax is unimportant, but the concept is simple: Surely the user
> who defines a primitive type can also define default initializer expressions
> for each of the fields.

> But this would be a trail of tears, which we have chosen to avoid, each time the
> suggestion comes up.

> This feature is often visualized as a predefined bit pattern, which the JVM
> would keep handy, and just stamp down wherever a default initializer is needed.
> It can't really be that simple, but even such a bit pattern is problematic.

> First of all is the problem of declaring the bit pattern. Java natively uses the
> side effects of <clinit> to define constants using ad hoc bytecodes; it also
> defines (for some types but not others) a concept of constant expression.
> Neither of those fits well into a classfile that would define a primitive with
> a default bit pattern.

> If the bit pattern is defined using ad hoc bytecode, it must be defined in a new
> pseudo-method (not <clinit>), to execute not during the initialization of the
> newly-declared primitive class, but before. (Surely not! a reader might
> exclaim, but this is the sort of subtlety we have to deal with.) During
> initialization of a class C, all fields of its own type C must be initialized
> before the first bytecode of <clinit> executes, so that the static initializer
> code has something to write on. So there must be a "default value definition"
> phase, call it <defaultvalueinit>, added after linking and before
> initialization of C, so C's <clinit> method has something to work with. This
> <defaultvalueinit> is really the body of a no-argument constructor of C, or its
> twin. A no-argument constructor of C is not a problem, but having it execute
> before C's <clinit> block is a huge irregularity, which the JVM spec is not
> organized to support, at present.

> This would turn into both JVMS and JLS spec. complexity, and more odd corners
> (and odd states) in the Java user experience. Sure, a user will say, "but I
> promise not to do anything odd; I just want this field to be the value
> (int)1". Yes, but a spec. must define not only the expected usages, but all
> possible usages, with no poorly-defined states.

> OK, so if <defaultvalueinit> is not the place to define this elusive
> bit pattern, what about something more declarative, like a ConstantValue
> attribute? Surely we could put a similarly structured DefaultValue attribute on
> every non-static field of a value type, and that would give the JVM enough
> information to synthesize the required bit pattern before it runs <clinit>.

> Consider the user model here: A primitive declaration would allow its fields to
> have non-zero default values, but only drawn from the restricted set of
> constant expressions, because those are the ones which fit in the
> ConstantValue attribute. (They are true bit patterns in the constant pool, plus
> String constants.) There is no previous place in Java where we make such a
> restriction, except case labels. Can you hear the groans of users as we try to
> explain why only constant expressions are allowed in that context? That's the
> muzak of the trail of tears I mentioned above.

> But we have condy to fix that (someone will surely say).

you read my mind :) 

> But that's problematic, because the resolution of constant pool constants of a
> class C requires C to be at least linked, and if the condy expression makes a
> self-reference to C itself, that will trigger C's initialization, at an awkward
> moment. Have you ever debugged a tangled initialization circularity, marked by
> mysterious NPEs on variables you know you initialized? I have. It's a stop on
> the trail of tears I mentioned.

> But if we really worked hard, and added a bunch of stuff to the JVMS and JLS,
> and persuaded users not to bother us about the odd restrictions (to constant
> expressions, or expressions which "don't touch the class itself"), we could
> define some sort of declarative default value initialization.

> What then? Well, ask the JVM engineers how they initialize heap variables,
> because those are the affected paths. Those parts of the JVM are among the most
> performance-sensitive. Currently, when a new object or array is created, its
> whole body (except the header) is sprayed with a nice even coat of all-zero-bit
> machine words. This is pretty fast, and it's important to keep it fast. What if
> creating an array required painting some beautifully crafted arabesque of a bit
> pattern defined by a creative user? Well, it's doable, but much more
> complicated. You need to load the bit pattern into live registers and (if it's
> an array of C) keep them live while you paint the whole array. That's got to be
> more expensive than spraying zeroes. (There's even hardware that's good for
> spraying zeroes, on some machines.) Basically, if we generously allowed users
> even a limited set of pre-defined default primitive values, we would be
> inviting them to create mysterious performance problems for their clients.

> Reflective creation of objects and arrays is also complicated by non-zero
> defaults, of course. When you reflectively create a heap node, today you
> compute its size, allocate its memory, store some metadata to its header, and
> paint the rest zero. That turns into something more complicated (see above
> about live registers) and metadata-driven, in the presence of non-zero
> defaults.

> I haven't yet mentioned reference fields, but those are another can of worms.
> The JVM vigorously tracks references. Suppose your primitive had a
> String-valued field, and you were allowed to declare a non-null default value
> for it, say "empty". If one of your customers creates an array of these
> things, suddenly there is a GC card mark (for many GCs) on every element of the
> array, and that is before you do anything useful with it.

> References also support circularity, including indirect cycles from an instance
> of C back to C itself. Can you guarantee that the computation of some tricky
> reference for your default value of C.foo won't require linking of C itself,
> and a vicious circularity? No, you can't, and you won't like the feeling of
> debugging such a thing either. Trail of tears, again.

> Finally, depending on which of the above flawed tactics is chosen for
> representing user-selected default values, there is the possibility that JVM
> code can observe a variable V of type C in its pre-initialization state,
> because (a) C's initialization specification is being loaded or evaluated
> somehow, and (b) the variable V has been allocated but is waiting for an
> initialization bit pattern. (V might be a static of C, or something in a
> related dependent class. Also it could be a multi-threading situation, where V
> is being observed via a race condition; those are very hard to keep straight.)
> During those moments, if V is loaded, then (voila!) it will have either garbage
> or those good old all-zero bits in it. And the abstraction we were laboring to
> secure will be subverted. This usually doesn't happen, but when it's an
> accident it's a very subtle bug, and when it's on purpose it turns into a
> security escalation.

> It's best to keep the simple default all-zero conventions. They are robust and
> understandable and regular. When they are inconvenient, users will find
> workarounds.

> I hope this helps.

I fully agree, I think it's better to do the opposite and require that all primitive value classes (Bucket 3) have a default constructor, and that this constructor have a fixed sequence of bytecode instructions.

If a user does not provide a no-argument constructor, the compiler will provide one, and the verifier will check that this constructor exists.
If a user wants to declare that constructor, to be able to add javadoc to it, its body should contain a single statement, a call to default() with no arguments,
something like

public primitive value class Complex {
    public Complex() {
        default();
    }
}

From the VM's point of view, it's an initfactory with a defaultvalue (or whatever the name of that bytecode is) + areturn,
so this can be easily checked by the VM.

The idea of requiring such a constructor is to remind users that, whatever they do, people will still be able to create an empty B3.

> -- John

Rémi

From john.r.rose at oracle.com  Thu Dec  9 08:45:11 2021
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 9 Dec 2021 08:45:11 +0000
Subject: [External] : Re: Proposal: Static/final constructors for bucket-3
 primitive classes.
In-Reply-To: <2057846228.132574.1639033937029.JavaMail.zimbra@u-pem.fr>
References: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
 <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
 <92B6DF83-478B-4D69-8E31-C2F25CB5DD08@oracle.com>
 <2057846228.132574.1639033937029.JavaMail.zimbra@u-pem.fr>
Message-ID: <F9FE05D4-A0D3-4696-A0C7-D212B0A5CD14@oracle.com>

On Dec 8, 2021, at 11:12 PM, Remi Forax <forax at univ-mlv.fr> wrote:
> 
> I fully agree, i think it's better to do the opposite

I snapped a few neurons trying to read that the first time. 

> and force the fact that all primitive value classes (Bucket 3) must have a default constructor and that constructor have a fixed bytecode instructions.

Heavy on ceremony, even for Java, especially if you can't do anything valuable in the constructor body.
> 
> If a user does not provide a constructor without parameter, the compiler will provide one and the verifier will check that this constructor exist.

That's JVM ceremony, to what end?

Maybe we should disallow no-arg constructors altogether and leave room for a possible future feature along the lines of the special init phase. That future feature would run ad hoc bytecodes at class preparation time to build the default value and would throw an error if it touched the class. Kind of like superclass init actions; after those and before the proper clinit call.

It's possible but not a priority, because of the various expenses I sketched. So we could leave space for it to put in later if the costs were justified after all.

From forax at univ-mlv.fr  Thu Dec  9 15:25:50 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 9 Dec 2021 16:25:50 +0100 (CET)
Subject: [External] : Re: Proposal: Static/final constructors for
 bucket-3 primitive classes.
In-Reply-To: <F9FE05D4-A0D3-4696-A0C7-D212B0A5CD14@oracle.com>
References: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
 <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
 <92B6DF83-478B-4D69-8E31-C2F25CB5DD08@oracle.com>
 <2057846228.132574.1639033937029.JavaMail.zimbra@u-pem.fr>
 <F9FE05D4-A0D3-4696-A0C7-D212B0A5CD14@oracle.com>
Message-ID: <1803938440.516237.1639063550354.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose" <john.r.rose at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>, "clement
> cherlin" <clement.cherlin at gmail.com>
> Sent: Thursday, December 9, 2021 9:45:11 AM
> Subject: Re: [External] : Re: Proposal: Static/final constructors for bucket-3 primitive classes.

> On Dec 8, 2021, at 11:12 PM, Remi Forax <forax at univ-mlv.fr> wrote:
>> 
>> I fully agree, i think it's better to do the opposite
> 
> I snapped a few neurons trying to read that the first time.

hum, there is a missing 'but' after the comma ...

> 
>> and force the fact that all primitive value classes (Bucket 3) must have a
>> default constructor and that constructor have a fixed bytecode instructions.
> 
> Heavy on ceremony even for Java especially if you can't do anything valuable in
> the constructor body.
>> 
>> If a user does not provide a constructor without parameter, the compiler will
>> provide one and the verifier will check that this constructor exist.
> 
> That's JVM ceremony, to what end?

Users are used to constructors, but bucket 3 inherently has an escape hatch because you can create an instance bypassing the constructors.

Bypassing the constructors is bad; we know that because this is what serialization does. So instead of leaving people to figure out, out of the blue, that they should use B2 instead of B3 for such classes, i think it's better to maintain the illusion that every B3 has a default constructor with no parameters.

It makes the semantics of B3 very clear, by making a public default constructor mandatory.

The JVM ceremony is not strictly necessary; i propose it so that if people use a bytecode generator other than javac, things are still nicely aligned between the JLS and the JVMS view of the world, but it's less important.

> 
> Maybe we should disallow no-arg constructors altogether and leave room for a
> possible future feature along the lines of the special init phase. That future
> feature would run ad hoc byte codes at class preparation time to build the
> default value and would throw an error if it touched the class. Kind of like
> superclass init actions; after those and before the proper clinit call.
> 
> It's possible but not a priority, because of the various expenses I sketched. So
> we could leave space for it to put in later if the costs were justified after
> all.

We may do something like that in a possible future, but i think it's more important to make the semantics of B3 visible front and center.

Rémi

From john.r.rose at oracle.com  Thu Dec  9 18:15:06 2021
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 09 Dec 2021 10:15:06 -0800
Subject: [External] : Re: Proposal: Static/final constructors for bucket-3
 primitive classes.
In-Reply-To: <1803938440.516237.1639063550354.JavaMail.zimbra@u-pem.fr>
References: <CAGjFO8Zddj+Z2ggQO=Pbs2skY4VXZYZKcCtdyCO4jgvUb4+Xzw@mail.gmail.com>
 <6d0e4bd0-4dd2-9702-1a24-3c7ce5eedf00@oracle.com>
 <92B6DF83-478B-4D69-8E31-C2F25CB5DD08@oracle.com>
 <2057846228.132574.1639033937029.JavaMail.zimbra@u-pem.fr>
 <F9FE05D4-A0D3-4696-A0C7-D212B0A5CD14@oracle.com>
 <1803938440.516237.1639063550354.JavaMail.zimbra@u-pem.fr>
Message-ID: <3B4A412A-412D-4081-8BCA-1D1BF89C5564@oracle.com>

On 9 Dec 2021, at 7:25, forax at univ-mlv.fr wrote:

> We may do something like that in a possible future, but i think it's 
> more important to make the semantics of B3 visible front and center.

If you can only say one thing in such an explicit no-arg constructor 
(true initially and maybe forever) then it surely is strange that the 
silly thing has a body.  So that leads to some un-bodied presentation 
like `class P { public default P(); }`, which could be made more 
expressive later (or never, probably).

But that, in turn, hits near to one of the places where Java *already 
set the default* (rightly or wrongly). Java defines, under some 
circumstances, the no-arg constructor for a class implicitly.  Arguably 
this precedent applies (though not exactly) to the current case, of 
default construction of the default value.
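
The existing precedent mentioned here, the implicitly declared no-arg constructor, can be observed with reflection. The following is only an illustrative sketch of current Java behavior; the names ImplicitConstructorDemo, NoCtor, ArgCtor and hasNoArgConstructor are hypothetical.

```java
import java.lang.reflect.Constructor;

class ImplicitConstructorDemo {
    // Declares no constructor, so the compiler supplies a no-arg one.
    static class NoCtor { }

    // Declares a constructor, so no implicit no-arg constructor is added.
    static class ArgCtor {
        ArgCtor(int x) { }
    }

    // True if the class declares (explicitly or implicitly) a no-arg constructor.
    static boolean hasNoArgConstructor(Class<?> c) {
        for (Constructor<?> k : c.getDeclaredConstructors()) {
            if (k.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor(NoCtor.class));
        System.out.println(hasNoArgConstructor(ArgCtor.class));
    }
}
```

The analogy is loose, as the message says: today's implicit constructor disappears once any constructor is declared, whereas the default value of a primitive would always remain reachable.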

I think, in the end, making a new primitive (as opposed to a new value 
class) is going to be an activity for library experts, not end users.  
Maybe the IDEs (not the JLS) can help them avoid pitfalls, but 
primitives are inherently tricky things to define.  This means either 
that (a) it's OK to force the experts to do the extra ceremonies or 
(b) it's OK to assume they know the rules of that game, and the 
ceremony won't add anything.  I incline towards (b).

The vision I'm assuming here is that a _bare primitive_ is something 
inherently loosely assembled.  It's really just a bundle of scalar 
values.  If you want a class wrapped around that bundle, you should be 
declaring your value as a _primitive reference_ (assuming the option for 
the bare primitive must also be provided) or declaring your type as a 
true _value class_ (if the option for the bare primitive is not so 
important).

P.S. A friend kindly helped me update my metaphor firmware.  I meant to 
say that pushing the feature under discussion would lead us along a path 
of pain, with various experiences along the way.  But obviously not 
existential Jacksonian pain.  And that's all I want to say here about 
that.

From kevinb at google.com  Tue Dec 14 01:49:15 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 13 Dec 2021 17:49:15 -0800
Subject: basic conceptual model
Message-ID: <CAGKkBksqAtkYHvxFUbkj_7CCiOUOV2azRHDu+iT7ohcEfzf5zg@mail.gmail.com>

Hi,

So I've been threatening for a long time that I've been hard at work
writing up a coherent conceptual model for "how data looks/works inside a
running Java program today". I have a few purposes for it, but one is to
form a basis for explaining "and now here's precisely what parts Valhalla
will change and how".

*Data in Java programs: a basic conceptual model*
<https://docs.google.com/document/d/1J-a_K87P-R3TscD4uW2Qsbt5BlBR_7uX_BekwJ5BLSE/preview>

This model comes filtered through a particular set of perceptions and
biases. It's just *a* documented model and isn't trying to be a *the*. As
such, you don't have to agree with all of it, but it would still be very
helpful to know if it is inconsistent or confusing or ill-founded, or if
you just see a way it could be better. I'll gladly add comment access on
request.

The next document (whenever that is) will try to examine various options
for adjusting that particular model to accommodate Valhalla.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From john.r.rose at oracle.com  Tue Dec 14 02:40:58 2021
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 13 Dec 2021 18:40:58 -0800
Subject: basic conceptual model
In-Reply-To: <CAGKkBksqAtkYHvxFUbkj_7CCiOUOV2azRHDu+iT7ohcEfzf5zg@mail.gmail.com>
References: <CAGKkBksqAtkYHvxFUbkj_7CCiOUOV2azRHDu+iT7ohcEfzf5zg@mail.gmail.com>
Message-ID: <72C8172B-A3B5-4825-890F-CFC2D40D4253@oracle.com>

I have some comments.  Since the doc invites directly stuck-on comments, I've requested edit permission, as that seems necessary for me to stick on a comment.

Some free-floating notes:

Good use of "freely copyable" as a concept.  There's a tough case, happily not relevant to Java, of linear types (IIRC Rust has them) where a value is freely copyable, but only to the extent that the source forgets the value after the sink gets it.  Accounting for that would stress your terminology.

Another (more subtle) stress to your terminology is your assertion that a mutable variable "forgets" the previous value when a new value is stored.  That isn't strictly correct in the case of race conditions.  Only a volatile variable reliably "forgets" its previous value in the presence of races.

You don't actually define the term "value" but just illustrate it and make claims about it.  Maybe you have to do it that way?  Actually, you say it's "unit of data".  Referring to "data" as a known term (for readers who are programmers) is OK.

Saying "unit" is more mysterious.  You certainly don't mean units of measure, or functional programming unit types.  Are you meaning to imply that it has no subparts which might also be termed units?  That's OK as long as you have today's primitives (which I like to call "scalar primitives") and of course references (which are also scalars).  By "scalar" I mean an item of data that is not composed of further scalars.

From john.r.rose at oracle.com  Tue Dec 14 02:44:59 2021
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 13 Dec 2021 18:44:59 -0800
Subject: basic conceptual model
In-Reply-To: <72C8172B-A3B5-4825-890F-CFC2D40D4253@oracle.com>
References: <CAGKkBksqAtkYHvxFUbkj_7CCiOUOV2azRHDu+iT7ohcEfzf5zg@mail.gmail.com>
 <72C8172B-A3B5-4825-890F-CFC2D40D4253@oracle.com>
Message-ID: <7146FD7D-4901-4F3C-B144-9B3A9E0722ED@oracle.com>

Two more thoughts:  You could get away with saying "indivisible unit"; I think that would convey much of what you mean.  Also, a footnote drawing the reader's attention to native hardware types (long, byte, float, reference) would make it clear that a Java computation is meant to "bottom out" in operations on units of data familiar to assembly programmers.  They are indivisible units, but even more important, their operations are natural to real computers.

On 13 Dec 2021, at 18:40, John Rose wrote:

> I have some comments.  Since the doc invites directly stuck-on comments, I've requested edit permission, as that seems necessary for me to stick on a comment.
>
> Some free-floating notes:
>
> Good use of "freely copyable" as a concept.  There's a tough case, happily not relevant to Java, of linear types (IIRC Rust has them) where a value is freely copyable, but only to the extent that the source forgets the value after the sink gets it.  Accounting for that would stress your terminology.
>
> Another (more subtle) stress to your terminology is your assertion that a mutable variable "forgets" the previous value when a new value is stored.  That isn't strictly correct in the case of race conditions.  Only a volatile variable reliably "forgets" its previous value in the presence of races.
>
> You don't actually define the term "value" but just illustrate it and make claims about it.  Maybe you have to do it that way?  Actually, you say it's "unit of data".  Referring to "data" as a known term (for readers who are programmers) is OK.
>
> Saying "unit" is more mysterious.  You certainly don't mean units of measure, or functional programming unit types.  Are you meaning to imply that it has no subparts which might also be termed units?  That's OK as long as you have today's primitives (which I like to call "scalar primitives") and of course references (which are also scalars).  By "scalar" I mean an item of data that is not composed of further scalars.

From kevinb at google.com  Tue Dec 14 03:05:09 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 13 Dec 2021 19:05:09 -0800
Subject: basic conceptual model
In-Reply-To: <7146FD7D-4901-4F3C-B144-9B3A9E0722ED@oracle.com>
References: <CAGKkBksqAtkYHvxFUbkj_7CCiOUOV2azRHDu+iT7ohcEfzf5zg@mail.gmail.com>
 <72C8172B-A3B5-4825-890F-CFC2D40D4253@oracle.com>
 <7146FD7D-4901-4F3C-B144-9B3A9E0722ED@oracle.com>
Message-ID: <CAGKkBksb=aXQdK3a_wOn-2Mwst81PQeh+XDjZMcJ07wACo4E0A@mail.gmail.com>

>
> On 13 Dec 2021, at 18:40, John Rose wrote:
>


> > Another (more subtle) stress to your terminology is your assertion that
> a mutable variable "forgets" the previous value when a new value is
> stored.  That isn't strictly correct in the case of race conditions.  Only
> a volatile variable reliably "forgets" its previous value in the presence
> of races.
>

Indeed there was a revision where "(modulo race conditions)" was there and
I'll put it back.


> You don't actually define the term "value" but just illustrate it and
> make claims about it.  Maybe you have to do it that way?  Actually, you say
> it's "unit of data".  Referring to "data" as a known term (for readers who
> are programmers) is OK.
>

Yes, in general I am sure that I can't accomplish actual ground up
non-cyclical definition-definitions here. I think it should suffice to be
descriptive enough for the reader to course-correct their previous notions
in this direction (provided they want to).


> Saying "unit" is more mysterious.  You certainly don't mean units of
> measure, or functional programming unit types.  Are you meaning to imply
> that it has no subparts which might also be termed units?


Oh, I actually do not want to imply irreducibility at all. That all values
have had that property in Java is a fact I would label as
incidental-not-essential.

Glob, gob, blob, hunk, chunk, piece, .....


> That's OK as long as you have today's primitives (which I like to call
> "scalar primitives") and of course references (which are also scalars).  By
> "scalar" I mean an item of data that is not composed of further scalars.
>

A tangent, but there's enough math major still in me to object to this. :-)
Scalars are scalar because they scale things! This would be more similar to
a one-dimensional vector space than to a scalar....  imho the best
adjective for today's primitives is "primitive" and I'll plead my case
about that soon too. :-)

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From john.r.rose at oracle.com  Tue Dec 14 04:36:31 2021
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 13 Dec 2021 20:36:31 -0800
Subject: [External] : Re: basic conceptual model
In-Reply-To: <CAGKkBksb=aXQdK3a_wOn-2Mwst81PQeh+XDjZMcJ07wACo4E0A@mail.gmail.com>
References: <CAGKkBksqAtkYHvxFUbkj_7CCiOUOV2azRHDu+iT7ohcEfzf5zg@mail.gmail.com>
 <72C8172B-A3B5-4825-890F-CFC2D40D4253@oracle.com>
 <7146FD7D-4901-4F3C-B144-9B3A9E0722ED@oracle.com>
 <CAGKkBksb=aXQdK3a_wOn-2Mwst81PQeh+XDjZMcJ07wACo4E0A@mail.gmail.com>
Message-ID: <6EC05409-60DD-4E25-A9FC-026BADB09F74@oracle.com>

On 13 Dec 2021, at 19:05, Kevin Bourrillion wrote:

> ...
> Yes, in general I am sure that I can't accomplish actual ground up
> non-cyclical definition-definitions here. I think it should suffice to be
> descriptive enough for the reader to course-correct their previous notions
> in this direction (provided they want to).

Yup, I see that's how it's working in there.
>
>
>> Saying "unit" is more mysterious.  You certainly don't mean units of
>> measure, or functional programming unit types.  Are you meaning to imply
>> that it has no subparts which might also be termed units?
>
>
> Oh, I actually do not want to imply irreducibility at all. That all values
> have had that property in Java is a fact I would label as
> incidental-not-essential.
>
> Glob, gob, blob, hunk, chunk, piece, .....

In that case I claim "unit" has the wrong connotation, since it does (often) come with an expectation of irreducibility.  With that in mind I like the unassuming term "piece", or those other words.  If you are still in thesaurus mode:

https://www.thesaurus.com/browse/portion

>
>
>
>> That's OK as long as you have today's primitives (which I like to call
>> "scalar primitives") and of course references (which are also scalars).  By
>> "scalar" I mean an item of data that is not composed of further scalars.
>>
>
> A tangent, but there's enough math major still in me to object to this. :-)
> Scalars are scalar because they scale things! This would be more similar to
> a one-dimensional vector space than to a scalar....  imho the best
> adjective for today's primitives is "primitive" and I'll plead my case
> about that soon too. :-)

Sure, that's a good position for math majors like you and me.  And I'm sure you/they/we really squirm in the presence of discussions about "vector processing units" and "vector ISAs".  But the squirm-worthy folks that define VPUs also use the term "scalar" to mean "the value that's in a vector lane", and they assuredly do not mean that "scalar" can be identified with "single-lane vector".

From brian.goetz at oracle.com  Tue Dec 14 19:48:42 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 14 Dec 2021 14:48:42 -0500
Subject: Enhancing java.lang.constant for Valhalla
Message-ID: <bca1d885-db11-3619-493f-fa9322f278d1@oracle.com>

The jl.constant API will have to be updated somewhat for Valhalla.  
Since Valhalla was already on the drawing board when we designed jl.constant, 
it shouldn't be too bad, but there are a few subtleties.  Now that the 
descriptors are largely settling down, we can take a stab at this.

ClassDesc (the base abstraction) has several factories to get a CD from 
a String:

     of(String qualifiedName)
     of(String packageName, String unqualifiedClassName)
     ofDescriptor(String fieldDescriptor)

Obviously we can already represent extended primitives with the last of 
these, but doing nothing else would make them somewhat second-class.

For reference, we also have combinators:

     nested(String unqualifiedNestedName): give me a ClassDesc for a 
       class nested in this one
     arrayType(): give me a ClassDesc for the array with this component type
     componentType(): (partial) give me a ClassDesc for the component 
       type of this one, assuming this one is an array type
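
As a quick orientation, the factories and combinators listed above can be exercised against the existing java.lang.constant API (Java 12 or later). This is only a sketch; the class name ClassDescTour, the helper descriptorOf and the com.foo.Bar names are placeholders.

```java
import java.lang.constant.ClassDesc;

class ClassDescTour {
    // Descriptor string produced for a dot-separated (external) binary name.
    static String descriptorOf(String qualifiedName) {
        return ClassDesc.of(qualifiedName).descriptorString();
    }

    public static void main(String[] args) {
        // Three ways to name the same class symbolically; nothing is loaded.
        ClassDesc bar = ClassDesc.of("com.foo.Bar");
        ClassDesc same = ClassDesc.of("com.foo", "Bar");
        ClassDesc fromDesc = ClassDesc.ofDescriptor("Lcom/foo/Bar;");
        System.out.println(bar.equals(same) && bar.equals(fromDesc));

        // Combinators derive new descriptors from an existing one.
        System.out.println(bar.nested("Baz").descriptorString());
        ClassDesc array = bar.arrayType();
        System.out.println(array.descriptorString());
        System.out.println(array.componentType().equals(bar));

        // componentType() is partial: it returns null for a non-array type.
        System.out.println(bar.componentType());
    }
}
```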

With the addition of Q descriptors, this library reveals itself to be 
L-biased; ClassDesc.of("com.foo.Bar") gives us an L-Bar. ("You could 
look in the classfile", I hear some of you say.  Not so fast; this is a 
symbolic API, not a reflective one, by design.)  But this is OK; L is a 
reasonable default.

The fully orthogonal version would involve adding:

     static ClassDesc ofValue(String qualifiedName)
     static ClassDesc ofValue(String packageName, String unqualifiedClassName)
     boolean isValue()
     ClassDesc valueType() // flip to Q
     ClassDesc refType()   // flip to L

But the first two are not really necessary, since they can be expressed 
either with ClassDesc.of(name).valueType() or with 
ClassDesc.ofDescriptor(desc), and I'm inclined to go that route -- the 
canonical constructor is ofDescriptor, the others are conveniences 
around that, and complex transforms are done by combinators.
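
To make the intended L/Q flip concrete, here is a rough sketch that works on plain descriptor strings. It deliberately does not use ClassDesc, because isValue(), valueType() and refType() are only proposed in this message, and released JDKs, as far as I know, do not accept Q descriptors. The class CarrierFlip and its methods are hypothetical stand-ins, not the proposed implementation.

```java
class CarrierFlip {
    // True if the descriptor uses the (proposed) Q carrier.
    static boolean isValue(String descriptor) {
        check(descriptor);
        return descriptor.charAt(0) == 'Q';
    }

    // Flip to Q: "Lcom/foo/Bar;" becomes "Qcom/foo/Bar;".
    static String valueType(String descriptor) {
        check(descriptor);
        return "Q" + descriptor.substring(1);
    }

    // Flip to L: "Qcom/foo/Bar;" becomes "Lcom/foo/Bar;".
    static String refType(String descriptor) {
        check(descriptor);
        return "L" + descriptor.substring(1);
    }

    // Only plain class descriptors are handled; arrays and built-ins are rejected.
    private static void check(String d) {
        boolean ok = d.length() >= 3
                && (d.charAt(0) == 'L' || d.charAt(0) == 'Q')
                && d.endsWith(";");
        if (!ok) {
            throw new IllegalArgumentException("not a class descriptor: " + d);
        }
    }

    public static void main(String[] args) {
        String l = "Lcom/foo/Bar;";
        String q = valueType(l);
        System.out.println(q + " " + isValue(q) + " " + refType(q));
    }
}
```

Both flips are idempotent in this sketch, which matches the reading of valueType() and refType() as combinators rather than constructors.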

Separately, over in bytecode-API land, we have identified a desire for 
another overload of ClassDesc::of, which is one that takes an internal 
(slash-separated) name.

One of the horrors of classfile APIs is that the classfile format is 
woefully inconsistent about names.  Sometimes it wants an internal 
binary name (foo/Bar), sometimes a descriptor (Lfoo/Bar;), and there are 
other exceptions (e.g., module and package names use dots, operand to 
`new` is sometimes an internal binary name, but a descriptor for 
arrays.)  So accepting any sort of String immediately raises the 
question: "in what format?"  A more strongly typed API would use 
ClassDesc in some places, but given that we might have an internal 
binary name, or external binary name, or descriptor in hand, we need a 
way to convert all of these to a ClassDesc.  For the external binary 
name and descriptor, we have ClassDesc::of and ::ofDescriptor, but we're 
missing one for internal binary names.  So I'm proposing:

     ClassDesc ofInternal(String internalBinaryName)

to round out the set.
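
Until such a factory exists, the same conversion can be approximated on top of ofDescriptor. The helper below is a hypothetical stand-in for the proposed ofInternal, written against the existing API; the class name InternalNames is made up.

```java
import java.lang.constant.ClassDesc;

class InternalNames {
    // Hypothetical helper: wraps a slash-separated internal binary name
    // (e.g. "foo/Bar") into an L descriptor and delegates to ofDescriptor.
    static ClassDesc ofInternal(String internalBinaryName) {
        return ClassDesc.ofDescriptor("L" + internalBinaryName + ";");
    }

    public static void main(String[] args) {
        ClassDesc fromInternal = ofInternal("foo/Bar");
        ClassDesc fromExternal = ClassDesc.of("foo.Bar");
        // Both spellings should describe the same class.
        System.out.println(fromInternal.equals(fromExternal));
        System.out.println(fromInternal.descriptorString());
    }
}
```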

So, summarizing the new methods (modulo naming changes to reflect 
changes in Valhalla language syntax):

     ClassDesc ofInternal(String internalBinaryName)
     boolean isValue()
     ClassDesc valueType()
     ClassDesc refType()


Also, eventually, all the *Impl classes in this library can become B2 
primitives, since identity doesn't matter.


From daniel.smith at oracle.com  Wed Dec 15 16:38:03 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 15 Dec 2021 16:38:03 +0000
Subject: EG meeting, 2021-12-15
Message-ID: <ADD51DF5-FCDA-4802-84D8-4BB3B98AD90C@oracle.com>

EG Zoom meeting today at 5pm UTC (9am PST, 12pm EST).

Possible topics:

"JEP update: Value Objects": discussed some details about inferred superinterfaces, including impact on JVMTI

"basic conceptual model": Kevin shared his notes describing key Java programming model concepts, in anticipation of changes coming from primitive classes

"Enhancing java.lang.constant for Valhalla": Brian explored evolution of java.lang.constant


From kevinb at google.com  Wed Dec 15 18:42:55 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 15 Dec 2021 10:42:55 -0800
Subject: We have to talk about "primitive".
Message-ID: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>

(Okay, so we're doing this)

I think the rename to "primitive classes" happened during my outage last
year. When I came back I made the decision to like it.

Since then, I've found that in my explanatory model I'm fighting against it
constantly. I think it may actually be fatally flawed.

The points I raise here were surely already known at the time, and I know
there were good reasons for overriding them. But I feel the need to come
back and push harder on their importance.

Background: the textbook definition of "primitive" is centered on their
nature of being elements-not-molecules, and I see no dispute about it.
Also, there's no disputing the fact that we're allowed to adopt a different
meaning if we so choose. So that's not even the fatal flaw.

The main problem I think we can't escape is that we'll still need some word
that means only the eight predefined types. (For the sake of argument let's
assume we can pick one and lean hard on it, whether that's "predefined",
"built-in", "elemental", "leaf type", or whatever.)

Definitely, our trying to minimize their specialness is virtuous. They
should be like helium: yes, they are molecules when you want a molecule!
But on any deeper look they will clearly be "actually" elements, and the
distinction will matter often enough.

So we have to attempt to shift users' understanding of "primitive" while at
the same time injecting a new term to mean exactly what primitive used to
mean. That's the old Indiana Jones switch and I don't have to tell you how
that turned out for him.

It would be difficult to pull off in a world where we were just pushing
some new server and the whole world gets the new model at once. But in this
universe where every version of Java ever made all have to coexist, it's
looking to me like a guaranteed source of never-ending confusion.

I also think it robs us of our ability to smoothly portray the real changes
of Valhalla. We want to be able to say "elements are still elements! now we
have molecules too". Pedagogically that is always preferable to "elements
aren't really what you thought they were". Okay, the real comparison is a
little more nuanced than that, but I'll get to that now.

An alternative that seems to work fine, in my mental model at least, is:

   - Primitive types are examples of value types, and have always been.
   - Java never supported any other kinds of value types before, so we
   didn't distinguish the terms before.
   - Everything you associate with primitive types remains true.
   - But most of those traits really come from their value-type-ness.

(I plan to make the above shifts to my model document already.)

   - Now we have user-defined value types too.
   - The way we user-define a type is with a class, so a value type is
   defined by a "value class" (sorry B2).
   - The primitive types will now each get a value class.
   - These 8 classes will look as much like user-defined types as Object
   does.
   - They, like Object, will have a "cheat" in their source code that no
   one else gets to use. (Object's is that there is no implied `extends
   Object` or `super();`; these need no fields because the data they store is
   magically handled by the VM. These feel like similar cheats.)

Then mopping up the rest:

   - Existing classes probably need a term like "reference classes" (in the
   model I'm going to circulate that doubles down on values-are-not-objects,
   then this wants to be "object classes", even though that feels weird at
   first).
   - I think the term for bucket 2 classes really ought to center on
   identitylessness, e.g. "noid", "noident", "idfree", or something. Anything
   else is getting away from the essential meaning of the bucket; plus, we
   want people to call bucket 1 classes "identity classes", don't we?

Footnote: for a more concrete manifestation of this problem: I am sure we
cannot possibly get away with Class.isPrimitive() being true for these
classes. Right?
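
For reference, this is what Class.isPrimitive() reports today. The sketch only shows current behavior; the class name IsPrimitiveDemo and the helper countPrimitive are made up for illustration.

```java
class IsPrimitiveDemo {
    // Counts how many of the given classes report isPrimitive() == true.
    static int countPrimitive(Class<?>... classes) {
        int n = 0;
        for (Class<?> c : classes) {
            if (c.isPrimitive()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // True for the eight predefined types (and for void.class).
        System.out.println(int.class.isPrimitive());
        System.out.println(double.class.isPrimitive());
        // False for wrapper classes, arrays and ordinary classes.
        System.out.println(Integer.class.isPrimitive());
        System.out.println(int[].class.isPrimitive());
        System.out.println(String.class.isPrimitive());
    }
}
```

Existing code commonly treats a true result as meaning "one of the predefined types", which is the compatibility concern behind the footnote.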

Thoughts?

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Wed Dec 15 19:17:25 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 15 Dec 2021 14:17:25 -0500
Subject: We have to talk about "primitive".
In-Reply-To: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
Message-ID: <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>


> Background: the textbook definition of "primitive" is centered on 
> their nature of being elements-not-molecules, and I see no dispute 
> about it. Also, there's no disputing the fact that we're allowed to 
> adopt a different meaning if we so choose. So that's not even the 
> fatal flaw.

Yes, that's definitely a point against -- these things are "not 
primitive" in the atomic sense.  OTOH, they are *very much like* today's 
primitives in many other ways.  So this is a choice between being 
strictly linguistically accurate and appealing to existing mental 
models.  Tough choice.

> The main problem I think we can't escape is that we'll still need some 
> word that means only the eight predefined types. (For the sake of 
> argument let's assume we can pick one and lean hard on it, whether 
> that's "predefined", "built-in", "elemental", "leaf type", or whatever.)

I've been calling them the built-in primitives; we've test-driven other 
terms like "basic" primitives.  Assume we'll agree on a term.  Also, no 
matter how we try, they will be different from the extended primitives 
in some ways, such as:

 - Their reference companions have weird names (e.g., Integer);
 - They permit a seemingly circular declaration (i.e., the declaration 
   of "class int" will use "int" in its representation);
 - They will be translated differently, because the VM has built-in 
   carriers for I/J/F/D, whereas extended primitives will use the L and Q 
   carriers;
 - There will probably be some special treatment in reflection for 
   these eight types;

Most of these are things about which we can say "OK, fine, these are 
historical warts."
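
The built-in carriers mentioned in the list are visible through Class.descriptorString() (Java 12 or later). This is just an illustration of today's descriptors; CarrierDescriptors and its descriptor helper are hypothetical names, and the Q carrier is not shown because it exists only in the design under discussion.

```java
class CarrierDescriptors {
    // Field descriptor string for a class, as used in class files.
    static String descriptor(Class<?> c) {
        return c.descriptorString();
    }

    public static void main(String[] args) {
        // Built-in primitives have single-letter carriers.
        System.out.println(descriptor(int.class));
        System.out.println(descriptor(long.class));
        System.out.println(descriptor(float.class));
        System.out.println(descriptor(double.class));
        // Their reference companions use the L carrier like any other class.
        System.out.println(descriptor(Integer.class));
    }
}
```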

There may be other asymmetries too, that derive from compatibility 
constraints.  As you say, the game is minimization.

> An alternative that seems to work fine, in my mental model at least, is:
>
>   * Primitive types are examples of value types, and have always been.
>   * Java never supported any other kinds of value types before, so we
>     didn't distinguish the terms before.
>   * Everything you associate with primitive types remains true.
>   * But most of those traits really come from their value-type-ness.
>

FTR, there is one big difference, which has a few consequences.  The big 
difference is reference-ness; value and primitive classes give rise to 
reference types, whereas primitive classes additionally give rise to a 
"primitive" type.  That the "primitive" type gives up reference-ness 
means it gives up nullability and non-tearability.

I think what you're saying here, at root, is to give the "value" name to 
extended primitives, and find another name to give to B2?

From daniel.smith at oracle.com  Wed Dec 15 20:10:44 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 15 Dec 2021 20:10:44 +0000
Subject: We have to talk about "primitive".
In-Reply-To: <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
 <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>
Message-ID: <1FB47341-5AC1-4F5F-AC7B-F1A24F53D4D8@oracle.com>

On Dec 15, 2021, at 12:17 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

>> The main problem I think we can't escape is that we'll still need some word that means only the eight predefined types. (For the sake of argument let's assume we can pick one and lean hard on it, whether that's "predefined", "built-in", "elemental", "leaf type", or whatever.)
>
> I've been calling them the built-in primitives; we've test-driven other terms like "basic" primitives.  Assume we'll agree on a term.  Also, no matter how we try, they will be different from the extended primitives in some ways, such as:
>
>  - Their reference companions have weird names (e.g., Integer);
>  - They permit a seemingly circular declaration (i.e., the declaration of "class int" will use "int" in its representation);
>  - They will be translated differently, because the VM has built-in carriers for I/J/F/D, whereas extended primitives will use the L and Q carriers;
>  - There will probably be some special treatment in reflection for these eight types;
>
> Most of these are things about which we can say "OK, fine, these are historical warts."
>
> There may be other asymmetries too, that derive from compatibility constraints.  As you say, the game is minimization.

Yes, this is a good list. Add to it:
- They are named with a lower-case keyword
- They exclusively get to use special operators (for now)

My high-level response to "primitive=one of 8 types" is that it may be giving the good name to, and drawing attention to, something that doesn't matter much. Sure, we'll need to specify a distinction for the purpose of the things on the list, but I don't think most programmers should really care whether the value they're working with belongs to one of the 8 special types or not.

These especially don't matter:
- Aliased reference type names: going forward, everybody should be saying `int.ref` instead
- Circular declarations: less than 100 people in the world need to care about this (maybe exaggerating)
- Weird JVM features: yes, but the JVM has lots of quirks, ergonomics are not the top priority

And the operator limitation is not fundamental, certainly could be addressed in the future.

So we're left with, for most Java programmers, a set of special types that get spelled with keywords and get some special behavior in the reflection API. My initial sense is that's not enough to put them in their own different-noun category.

Meanwhile, if we can tell programmers "primitives have members/classes now, and libraries can define additional primitives", that can build on existing intuitions pretty well. For example, the primitive type/reference type duality still exists, and pretty much works the same. Asking them to do s/primitive type/value type/ in this context is its own Indiana Jones maneuver.


From kevinb at google.com  Wed Dec 15 22:18:48 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 15 Dec 2021 14:18:48 -0800
Subject: We have to talk about "primitive".
In-Reply-To: <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
 <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>
Message-ID: <CAGKkBkudUX9QDodf8Uc82joCf-OE7OkW7gW0RU0kYdMuBiCMoQ@mail.gmail.com>

On Wed, Dec 15, 2021 at 11:17 AM Brian Goetz <brian.goetz at oracle.com> wrote:

> Background: the textbook definition of "primitive" is centered on their
> nature of being elements-not-molecules, and I see no dispute about it.
> Also, there's no disputing the fact that we're allowed to adopt a different
> meaning if we so choose. So that's not even the fatal flaw.
>
> Yes, that's definitely a point against -- these things are "not primitive"
> in the atomic sense.  OTOH, they are *very much like* today's primitives in
> many other ways.  So this is a choice between being strictly linguistically
> accurate and appealing to existing mental models.  Tough choice.
>

My way of trying to cut through tough choices like that is to ask: Which
traits that we associate with primitives today can we assess as being
*essential* to their meaning, and which are the ones that are *incidental*?


>  - Their reference companions have weird names (e.g., Integer);
>  - They permit a seemingly circular declaration (i.e., the declaration of
> "class int" will use "int" in its representation);
>  - They will be translated differently, because the VM has built-in
> carriers for I/J/F/D, whereas extended primitives will use the L and Q
> carriers;
>  - There will probably be some special treatment in reflection for these
> eight types;
>
> Most of these are things about which we can say "OK, fine, these are
> historical warts."
>

I think there's a deeper conceptual need as well. To understand something
that can recursively contain things of its own kind, I think many people
want to have a sense of "but where does that all stop?" What are the leaves
in that tree? The answer is "(builtin-)primitives and references", the buck
stops there. The fact that 100% of all of your data is all actually made up
of those things alone (grouped into containers like objects) is
significant, to me. So that's an eternal way that they're special that
isn't a historical wart.

> An alternative that seems to work fine, in my mental model at least, is:
>
>    - Primitive types are examples of value types, and have always been.
>    - Java never supported any other kinds of value types before, so we
>    didn't distinguish the terms before.
>    - Everything you associate with primitive types remains true.
>    - But most of those traits really come from their value-type-ness.
>
> FTR, there is one big difference, which has a few consequences.  The big
> difference is reference-ness; value and primitive classes give rise to
> reference types, whereas primitive classes additionally give rise to a
> "primitive" type.  That the "primitive" type gives us reference-ness means
> it gives up nullability and non-tearability.
>

I'm not sure I understood this, but I do want to at least add a bullet to
my list:

   - Now, every value type will come along with a corresponding reference
   type. (We didn't need that before because we could just hand-code 8
   reference types and done.)

As for tearability: from *this* perspective 64-bit values are already
technically tearable, so nothing new here. It's from a different
perspective, that of writing a class, where expectations have to be
weakened.


> I think what you're saying here, at root, is to give the "value" name to
> extended primitives, and find another name to give to B2?
>

Yes, that's somewhere near the end of my message. I think B2 should stay
centered on the concept of identitylessness.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Wed Dec 15 23:06:12 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 15 Dec 2021 18:06:12 -0500
Subject: [External] : Re: We have to talk about "primitive".
In-Reply-To: <CAGKkBkudUX9QDodf8Uc82joCf-OE7OkW7gW0RU0kYdMuBiCMoQ@mail.gmail.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
 <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>
 <CAGKkBkudUX9QDodf8Uc82joCf-OE7OkW7gW0RU0kYdMuBiCMoQ@mail.gmail.com>
Message-ID: <77745ddc-aded-b656-113f-6f687e7a1236@oracle.com>

It took us a while to unravel this one, but I think we did.

The JMM says that loads and stores of references, and of 
32-bit-and-smaller primitive values, are atomic with respect to other 
loads and stores of the same variable.  This means that you'll see a 
valid value, though it could be a stale one.  For 64-bit primitives, a 
load is treated as two 32-bit loads, which are individually atomic.

The initialization safety guarantees -- that you see correct values of 
final fields even when loading a reference with a race -- rests on the 
atomicity properties above.

What this says is that tearing/non-tearing is a property of 
reference-vs-primitive-ness; accessing a (fat) value through a reference 
gives you *more guarantees* than accessing it directly. 
(Correspondingly, this has more costs.)

All of this is to say, as I think you are saying: primitives of a 
certain size were always tearable, and they still are; references never 
were, and they are still not.
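
For readers who want to see the arithmetic behind "treated as two 32-bit loads", here is a minimal sketch. It does not provoke a real race (whether one shows up depends on the JVM and the hardware); it only simulates, deterministically, a reader that picks up the high half from one write and the low half from another. The class and method names are made up for illustration.

```java
// Deterministic simulation of a torn 64-bit read; not a real data race.
final class TornLongDemo {
    // Combine the high 32 bits of one value with the low 32 bits of another.
    // This is the result a reader could observe if the two halves of a
    // non-volatile long were loaded on either side of a concurrent write.
    static long tornRead(long highFrom, long lowFrom) {
        long high = highFrom & 0xFFFFFFFF00000000L;
        long low = lowFrom & 0x00000000FFFFFFFFL;
        return high | low;
    }

    public static void main(String[] args) {
        long before = 0L;   // all bits clear
        long after = -1L;   // all bits set
        long torn = tornRead(after, before);
        // Neither writer ever stored this value.
        System.out.println(Long.toHexString(torn)); // prints ffffffff00000000
    }
}
```

Declaring the field `volatile` rules this outcome out for `long` and `double`; the language specification makes volatile 64-bit accesses atomic.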

> As for tearability: from *this* perspective 64-bit values are already 
> technically tearable, so nothing new here. It's from a different 
> perspective, that of writing a class, where expectations have to be 
> weakened.

From kevinb at google.com  Wed Dec 15 23:14:21 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 15 Dec 2021 15:14:21 -0800
Subject: We have to talk about "primitive".
In-Reply-To: <1FB47341-5AC1-4F5F-AC7B-F1A24F53D4D8@oracle.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
 <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>
 <1FB47341-5AC1-4F5F-AC7B-F1A24F53D4D8@oracle.com>
Message-ID: <CAGKkBks7+_aU=4Awrh0xCzUQpBY_-c5pWQfDxAA-Hydaako+cg@mail.gmail.com>

On Wed, Dec 15, 2021 at 12:10 PM Dan Smith <daniel.smith at oracle.com> wrote:

> Yes, this is a good list. Add to it:
> - They are named with a lower-case keyword
> - They exclusively get to use special operators (for now)
>

(Well that parenthetical turns my blood cold....)
Leaning away from that though: I'm most worried about ==/!= because they
are overloaded across ALL types, of all kinds, and including types that
will be migrating behind the scenes. All the combinations here seem like
potential Puzzler-Whack-A-Mole. But the only thing for it is to sit down
and look at the whole matrix...
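
A minimal sketch of three cells of that matrix as it stands today, using only `int` and `Integer`. The class and method names are made up for illustration; what `==` will mean for value and primitive classes is exactly the open question and is not modeled here.

```java
// Sketch: how == already behaves differently across int and Integer.
final class EqualsMatrixDemo {
    // int == int: numeric comparison.
    static boolean primitiveEq(int a, int b) { return a == b; }

    // Integer == Integer: reference (identity) comparison.
    static boolean referenceEq(Integer a, Integer b) { return a == b; }

    // Integer == int: the Integer is unboxed, then compared numerically.
    static boolean mixedEq(Integer a, int b) { return a == b; }

    public static void main(String[] args) {
        Integer a = 1000;
        Integer b = 1000;
        System.out.println(primitiveEq(1000, 1000)); // true
        System.out.println(mixedEq(a, 1000));        // true
        System.out.println(a.equals(b));             // true
        // Boxing small values (-128..127) is required to reuse cached objects.
        System.out.println(referenceEq(127, 127));   // true
        // For larger values the result depends on whether the runtime
        // happened to cache the boxes, so no output is claimed here.
        System.out.println(referenceEq(a, b));
    }
}
```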


> My high-level response to "primitive=one of 8 types" is that it may be
> giving the good name to, and drawing attention to, something that doesn't
> matter much. Sure, we'll need to specify a distinction for the purpose of
> the things on the list, but I don't think most programmers should really
> care whether the value they're working with belongs to one of the 8 special
> types or not.
>

I'm not sure "primitive" IS the good name. Maybe "value" is the good name?
Agreed that most programmers most of the time can interact with all
"molecules" in the same consistent way, and that is very good.
But I don't think the need for the concept ever fades *too* far into the
background. A mental graph needs leaves.
If the concept is still needed *sometimes*, then I think it's a problem if
the term you always knew that concept by got taken away.


> - Circular declarations: less than 100 people in the world need to care
> about this (maybe exaggerating)
>

Oh, many more people will want to understand how we square that circle than
have any absolute technical need to. They'll wake up one night thinking
"wait, what the hell is an int made of then?" As they descend into that pit
we want them to hit some simple workable explanation they can bounce off of
and get back to work.

Something like "the contents of a value are either (a) the other values
they contain (see their fields), or (b) for primitives, the contents
defined by the platform itself (see no fields)".

(At least I believe an `int` class would not need or want a `value` field
since it can just use `this` for that... right?)


> So we're left with, for most Java programmers, a set of special types that
> get spelled with keywords and get some special behavior in the reflection
> API. My initial sense is that's not enough to put them in their own
> different-noun category.
>

Example: many usages of Class.isPrimitive() are basically recursing an
object graph and simply need to know where to stop.
Did we hit bottom or not? It's a basic kind of question.
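
A minimal sketch of that kind of usage, assuming an acyclic graph made only of the two small classes declared in the example (real walkers also need cycle detection and special cases for arrays and JDK classes). `LeafCounter`, `Point`, and `Segment` are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Sketch: count the leaves of a small object graph, using
// Class.isPrimitive() on each field type to decide where to stop.
final class LeafCounter {
    static final class Point { int x = 1; int y = 2; }
    static final class Segment {
        Point from = new Point();
        Point to = new Point();
        boolean closed;
    }

    static int countLeaves(Object node) {
        if (node == null) return 1;                 // a null reference is a leaf
        int leaves = 0;
        for (Field f : node.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) continue;
            if (f.getType().isPrimitive()) {
                leaves++;                           // hit bottom: a built-in primitive
            } else {
                try {
                    f.setAccessible(true);
                    leaves += countLeaves(f.get(node)); // keep descending
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return leaves;
    }

    public static void main(String[] args) {
        System.out.println(countLeaves(new Segment())); // prints 5
    }
}
```

If `isPrimitive()` started returning true for composite types, a walker like this would stop one level too early.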


> Meanwhile, if we can tell programmers "primitives have members/classes
> now, and libraries can define additional primitives", that can build on
> existing intuitions pretty well. For example, the primitive type/reference
> type duality still exists, and pretty much works the same. Asking them to
> do s/primitive type/value type/ in this context is its own Indiana Jones
> maneuver.
>

Obviously I see it as meaningfully different from an Indiana Jones
maneuver, and wouldn't have used the term if I didn't.

On an island where only pear trees grow they wouldn't have a word for
"fruit".
A traveler comes, "here, these are apples".
Well, they're going to make a word for fruit.
Most times they used to say "pear", it was really the fruitness that
mattered.
They start saying "fruit" more than "pear". They still need the word "pear"
sometimes.

Contrast: traveler says "here, these are pears."
"What?"
"These are the apple kind of pears, and what you have are heritage pears."
For a while no one knows what the hell "bring me a pear" means anymore.
Also, for some reason everyone is being chased by a giant spherical boulder.

One feels more destabilizing than the other.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From john.r.rose at oracle.com  Thu Dec 16 01:49:34 2021
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 15 Dec 2021 17:49:34 -0800
Subject: [External] : Re: We have to talk about "primitive".
In-Reply-To: <77745ddc-aded-b656-113f-6f687e7a1236@oracle.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
 <e0e56903-782f-e4c8-f1c9-d232cdfa9c46@oracle.com>
 <CAGKkBkudUX9QDodf8Uc82joCf-OE7OkW7gW0RU0kYdMuBiCMoQ@mail.gmail.com>
 <77745ddc-aded-b656-113f-6f687e7a1236@oracle.com>
Message-ID: <BBCFE4E9-65DB-40EF-90F1-4FB28F708D26@oracle.com>

On 15 Dec 2021, at 15:06, Brian Goetz wrote:

> It took us a while to unravel this one, but I think we did.
> ... What this says is that tearing/non-tearing is a property of 
> reference-vs-primitive-ness; accessing a (fat) value through a 
> reference gives you *more guarantees* than accessing it directly. 
> (Correspondingly, this has more costs.)
>
> All of this is to say, as I think you are saying: primitives of a 
> certain size were always tearable, and they still are; references 
> never were, and they are still not.

Of course references don't tear, and more to the point, `final` fields 
reached by references also don't tear, because they are (a) safely 
published and (b) never mutated after publication.  So, as Brian says, 
wrapping a reference around some chunks of state has a special benefit 
(as well as a special cost).  The reference wrapper freezes those chunks 
in place, relative to each other.
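
A minimal sketch of that freezing effect in today's Java, assuming a hypothetical `Pair` class whose two `final` fields are always written with the same value. The reference is published through a plain (racy) field on purpose; the reader may see a stale `Pair`, but never a half-written one.

```java
// Sketch: two fields frozen relative to each other by a reference wrapper.
final class FrozenPairDemo {
    // Both fields are final and `this` does not escape the constructor, so
    // any thread that sees a Pair reference sees both constructor writes.
    static final class Pair {
        final long lo;
        final long hi;
        Pair(long v) { this.lo = v; this.hi = v; }
    }

    // Deliberately not volatile: the reference is published with a race.
    static Pair shared = new Pair(0);

    // Returns how many observed Pairs had lo != hi.
    static int countInconsistent(int iterations) {
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= iterations; i++) {
                shared = new Pair(i);
            }
        });
        writer.start();
        int inconsistent = 0;
        for (int i = 0; i < iterations; i++) {
            Pair p = shared;                 // one atomic reference load
            if (p.lo != p.hi) inconsistent++;
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return inconsistent;
    }

    public static void main(String[] args) {
        System.out.println(countInconsistent(100_000)); // prints 0
    }
}
```

Storing `lo` and `hi` as two separate mutable fields, without the wrapper, would give up that guarantee.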

From john.r.rose at oracle.com  Thu Dec 16 03:15:32 2021
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 15 Dec 2021 19:15:32 -0800
Subject: We have to talk about "primitive".
In-Reply-To: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
Message-ID: <842DC86C-7AA5-4D9F-BEA0-89BD195EA7A2@oracle.com>

On 15 Dec 2021, at 10:42, Kevin Bourrillion wrote:

> ...
> The main problem I think we can't escape is that we'll still need some 
> word
> that means only the eight predefined types. (For the sake of argument 
> let's
> assume we can pick one and lean hard on it, whether that's 
> "predefined",
> "built-in", "elemental", "leaf type", or whatever.)

As others have said, we'll pick a term for this.  The idea of calling 
out a "leaf" in a data graph is compelling to me.  As you say, 
people are going to wonder what is the foundation of the whole scheme.  
(No, it's not objects all the way down, at least that's not what we 
are aiming for.)

(But, spoiler alert, the division between leaf/scalar/basic type and 
composite/class type is *less important in daily practice* than the ad 
hoc mental models programmers make about which types they choose to view 
as composite and which are indivisible.  Typical example:  Most 
programmers choose to regard `String` as a sort of nullable primitive.  
I'll pick up that thread later.)

I like the term "basic type", and (as we already discussed) I like 
"scalar" also, because "scalar" correctly suggests something 
about how it's processed in hardware.

Here's a point I think is also important and has not been discussed 
much yet:  A concept like "basic type" (or "scalar type") should 
include references as well as Java's eight current primitive types.  
Like an `int` or other basic primitive, a reference is copied by value, 
processed efficiently (probably in a hardware register), and is a 
"leaf item" with respect to a single object layout or method type 
signature.  Also, like `int`, a reference has its own special operators 
in the language and special bytecodes in the JVM.  Like `int`, it has a 
default value `null` (instead of `0`).

The main difference of a reference from an `int` is the fact that it has 
a far end:  You can often (not always) find other values by indirecting 
the reference and loading a field or calling a method or querying a 
super type.  (Because it has a far end, it also has a nominal subtype to 
classify what might be at the far end.  But I'm speaking here about 
references per se, apart from their subtypes.)  Despite their "far 
end", people treat some reference types, like `String`, as if they 
were leaves; you stop at the `String` and don't bother thinking about 
its fields.  Users don't care that there's an array somewhere on the 
other end, unless they are engineering the string class itself.  So a 
reference has a far end, unlike an `int`, but, like an `int`, a 
reference *often* is treated like an unstructured value, in code.

Bottom line:  There are a handful of built-in basic types.  These are 
used to compose classes.  They are the primitives and the references.  
When we consider a reference apart from its class (say, as `jl.Object`), 
it can be comfortably called a *basic type*, and then that handful of 
built-in basic types consists of the (basic) primitives and references.

OK, that's enough on that.  Whether "reference" is a basic type is 
less important than how we choose to extend (or not extend) the reach of 
the term "primitive".

For historic reasons we use the word ~~fruit~~ *primitive* to mean a 
basic type other than a reference.  Now that we have user-defined 
`int`-like things, we have to decide whether and how to connect the old 
word to the new things.  Since user-defined `int`-like things are (we 
think) very like `int` in many ways, a term like "extended 
primitive" makes sense.

This is how I get to the terms "basic primitive" and "extended 
primitive".  Or "scalar primitive" and "extended primitive".

As I read your messages, you would prefer to keep the term 
"primitive" narrow, because of the possible confusion of telling 
users "hey, what you think of as primitives are now the ~~heirloom~~ 
basic primitives."  Personally, I think users will say, to our 
unveiling "extended primitives", something like this:

>> Well, that's not exactly what the dictionary says primitive means, 
>> if you can make new composite ones.  But I do know that Java has 
>> non-reference types and calls them "primitive".  And I also know 
>> it would be really cool to define new types that work like `int`, 
>> such as `UnsignedInt` or `HalfFloat` or the like.  I get why they 
>> don't want to build all such types into the language; in fact maybe 
>> I'd like to try my hand someday at defining my own.  So, 
>> "extended primitive".  It's on:  The Java primitives are now an 
>> open-ended set just like the Java objects.

In other words, in saying "extended primitive" (and also "basic 
primitive") we lean away from the dictionary definition of 
"primitive" and into the Java definition.  That feels like a 
non-confusing choice to me.

>
> Definitely, our trying to minimize their specialness is virtuous.

Yep.  We also call this "healing the rift", sometimes.

> ...
> So we have to attempt to shift users' understanding of "primitive" 
> while at
> the same time injecting a new term to mean exactly what primitive used 
> to
> mean. That's the old Indiana Jones switch and I don't have to tell you 
> how
> that turned out for him.

So, no, it's not the Indy switch, at all.  Users know what ~~fruit~~ 
primitives are in Java, and they will have no problem with adding new 
~~imported exotic apples~~ extended primitives to the familiar set of 
primitive types.  And in exchange for this infusion of wonderful new 
types, they will learn a new term for the old types, which is ~~pears~~ 
basic primitives (or scalar primitives).

>
> It would be difficult to pull off in a world where we were just 
> pushing
> some new server and the whole world gets the new model at once. But in 
> this
> universe where every version of Java ever made all have to coexist, 
> it's
> looking to me like a guaranteed source of never-ending confusion.
>
> I also think it robs us of our ability to smoothly portray the real 
> changes
> of Valhalla. We want to be able to say "elements are still elements! 
> now we
> have molecules too".

There are two kinds of users w.r.t. the question of "what's a 
primitive" and you can't please both.  You and I want to please 
different kinds.  The user I want to please is one who thinks of "Java 
primitive" as a kind of non-nullable scalar number (or boolean or 
char).  The user you want to please thinks of "Java primitive" as 
"all leaves in the Big Graph".  The latter user will be disappointed 
if we say "Java primitives" can be non-leaves.  The former user will 
be delighted.  The latter user sees a `String` and wants to crack out 
its underlying array, in a Gollum-like quest for the roots of the 
mountains.  The former user treats a `String` as a primitive.  There are 
more of the former than the latter; we should cater to them.  It's the 
former who I was channeling above, concluding with "The Java 
primitives are now an open-ended set just like the Java objects."

> Pedagogically that is always preferable to "elements
> aren't really what you thought they were". Okay, the real comparison 
> is a
> little more nuanced than that, but I'll get to that now.
>
> An alternative that seems to work fine, in my mental model at least, 
> is:
>
>    - Primitive types are examples of value types, and have always 
> been.
>    - Java never supported any other kinds of value types before, so we
>    didn't distinguish the terms before.
>    - Everything you associate with primitive types remains true.
>    - But most of those traits really come from their value-type-ness.
>
> (I plan to make the above shifts to my model document already.)

The term "value" can be applied to composites in B3 alone, to 
composites in B2 alone, or to both.  (Or neither.)  All the basic types, 
including references, are values as well.

This is a big choice, where to "spend" the term "value".

Our choice will be informed and supported by our account about what *we 
mean* by the term "value".

If the word value means "a primitive thing that can be stored in a 
register", then we can't extend it.  So that won't fly.

For us the word value means something like that but adjusted, "a thing 
that is freely copyable and can be stored in one or more registers".

But look how that affects B2 and B3:

B3 are values, obviously; there is no reference to confuse their free 
copying.  (There is also no reference to help us adjoin `null` to the 
value set, and no reference to help us perform safe publication.)

B2 are references to... well, values as well.  They might be on the 
heap, or they might be elsewhere; we don't care because the freely 
copyable values are not also accompanied by object identity.

Both B1 and B2 *references* (per se) are, confusingly, also values, 
since basic types (and/or references) are freely copyable.

But a B2 reference is a value, which refers to another value.  (Proof 
they are distinct values:  One is possibly null, the other isn't.)  
And like a user using `String`, the value-ness of a B2 reference can be 
treated as a single, simple, atomic thing, without further reference to 
substructure.  In particular, because it's not B1, there's no 
possibility of state under the B2 reference; there's just the value 
you care about.

I think, because the term value applies in so many places (including B1 
references), it will be tricky to use it as a classification (like 
"pear") instead of an assertion of use (like "fruit").

But given the choice between using the term "value" to classify 
types, distinguishing them from B1 types, I think the correct choice is 
to apply the term to B2, as "value object" vs. "identity 
object".

The value-ness of B3 (as loose aggregates) and B1 (as references) is 
going to add a bit of confusion.  Dan did a round of naming where he 
used the term "pure object" as the opposite of "identity 
object"; now we are at "value object" vs. "identity object", I 
think.

>
>    - Now we have user-defined value types too.
>    - The way we user-define a type is with a class, so a value type is
>    defined by a "value class" (sorry B2).
>    - The primitive types will now each get a value class.
>    - These 8 classes will look as much like user-defined types as 
> Object
>    does.
>    - They, like Object, will have a "cheat" in their source code that 
> no
>    one else gets to use. (Object's is that there is no implied 
> `extends
>    Object` or `super();`; these need no fields because the data they 
> store is
>    magically handled by the VM. These feel like similar cheats.)

I don't disagree with any of the above, but I think the value classes 
live in B2 not in B3.  The B3 types are derived from the B2 types, by 
"dumping out" the class fields.  Note that every single B3 type 
(non-reference) has a unique companion B2 type (reference).  The 
semantic difference between those types is like the semantic difference 
between `int` and `Integer`.  Narrow but useful.

Separate question:  Does the declaring form for a B3/B2 type pair 
"look like" a B2-only declaration, but with an added mode switch?  
Or does it "look like" a B3-declaration, something that's not a 
full-on class-that-defines-objects?  We could go either way on that.  
Either way, one declaration will define two related types.

Suppose we have this B2-only class declaration syntax:

```
__ByValue class NamedInt { String name; int value; ... }
```

Then a B2-tilted syntax for a B3/B2 pair might look like:

```
__ByValue __AlsoPrimitive class Point { double x, y; ... }
```

And a B3-tilted syntax for the same pair might look like:

```
__ExtendedPrimitive Point { double x, y; ... }
```

(F.D.: I think the B3-tilted syntax is less likely to succeed.)

Either way, you can draw out a B3 type from the first and a B2 type from 
the second.

As a sort of mental experiment, you can also imagine a "two headed" 
declaration syntax that would provide independent specification of the 
names of both types:

```
__PrimitiveType int &  /*int is B3*/
__PrimitiveBox class Integer /*int.ref=Integer is B2*/
     extends Comparable<Integer> {
   ... one body with two heads ...
}
```

Why do that?  Well, it makes it clear that a one-headed declaration 
could in principle start with either the B3 or the B2 end of the stick.  
Also it helps us think, a little, about retrofitting the very odd legacy 
wrapper names.


>
> Then mopping up the rest:
>
>    - Existing classes probably need a term like "reference classes" 
> (in the
>    model I'm going to circulate that doubles down on 
> values-are-not-objects,
>    then this wants to be "object classes", even though that feels 
> weird at
>    first).
>    - I think the term for bucket 2 classes really ought to center on
>    identitylessness, e.g. "noid", "noident", "idfree", or something. 
> Anything
>    else is getting away from the essential meaning of the bucket; 
> plus, we
>    want people to call bucket 1 classes "identity classes", don't we?

If we spend the good word "value" on B3, we must then find a word 
like "noid" for B2.  But since I think "value-ness" is centered 
in B2 from the start, I'd rather find a one-off term for B3!  (And 
that's "primitive" as argued above.)

But let's grant, for a moment, that we don't want "value" for 
B2.  What term characterizes B2 types?  As you say, they are objects but 
they don't have identity, so "noid", etc.  That's a true 
description.  But it's not the main point of B2 types.  The point of 
B2 types is not that we dislike object identity (we like it a lot in 
many cases!).  The point of B2 types is they can be regarded as tidy 
bundles of field values, and/or tidy abstractions (like `String`) of 
simple values, without confounding state changes.  After looking at this 
from many angles, I prefer to say that, while B2 has the *negative* 
characteristic of being identity-free, it has the *positive* 
characteristic of being *freely copyable*.  The "freely" is so free 
that copying often happens outside of the JVM heap.  In fact, a B2 type 
is a value.

Maybe there's a different way of characterizing the *positive* nature 
of B2, but I think it comes down to, "B2 types are plain values".  
Until I get an even better account for B2's special power (one that 
doesn't begin with the word "not" or "no" or "doesn't"), 
I'm going to be very happy to declare B2 types as "value classes" 
and work with their instances as "value objects".

So, while I see why you want to avoid the paradox of "extended 
primitives", and your very correct identification of "values" in 
B3, I prefer to talk about B3 as primitives (primitive values) and B2 as 
value objects.

BTW, I agree that B3 values should not be objects; maybe we can call 
them instances, although instance/class/object are terms that usually 
appear together.  Obviously both B1 and B2 contain 
instances/classes/objects.

BTW again, I updated my own Zoo of Field Types diagram here, and you 
might wish to give it a look, since it's relevant to this discussion:

http://cr.openjdk.java.net/~jrose/values/type-kinds-venn.pdf

(that's cr.openjdk.java.net/~jrose/values/type-kinds-venn.pdf if the 
URL police got the previous line)

> Footnote: for a more concrete manifestation of this problem: I am sure 
> we
> cannot possibly get away with Class.isPrimitive() being true for these
> classes. Right?

Yeah, `Class::isPrimitive` is a query on types, not classes.  In other 
words, the `Class` mirror, for this call, is serving to reflect a type, 
for example one of `int.class` or `Integer.class`.  If we apply the term 
"primitive" to classes, then we will need a not-so-good name, like 
`Class::isPrimitiveClass`.  However, if we choose to make extended 
primitives reflect very similarly to basic primitives, then we can 
choose to have `Class::isPrimitive` return true *for their 
non-reference types*.

There is no reference type for which `Class::isPrimitive` is true.  
Despite my fondness for the concept of "basic types" there is no 
`Class::isBasicType`.  There could be, in the future, though I don't 
think it pulls its weight.  We could also have 
`Class::isBasicPrimitive`.  Or we could choose to break less code by 
keeping `Class::isPrimitive` true only for nine mirrors, and define 
`Class::isReferenceType` and/or `Class::isNonReferenceType` to provide 
the query for ~~fruit~~ basic or extended primitive types.
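
For reference, a minimal sketch of the status quo being discussed: the nine mirrors for which `Class::isPrimitive` is true today are the eight primitive types plus `void`. Only existing `java.lang.Class` behavior is used; none of the proposed methods (`isPrimitiveClass`, `isBasicPrimitive`, `isReferenceType`) exist, so they are not called. The class name `PrimitiveMirrors` is made up.

```java
import java.util.List;

// Sketch: the nine Class mirrors for which isPrimitive() is true today.
final class PrimitiveMirrors {
    static final List<Class<?>> MIRRORS = List.of(
        boolean.class, byte.class, short.class, char.class,
        int.class, long.class, float.class, double.class,
        void.class);

    static long countPrimitive(List<Class<?>> classes) {
        return classes.stream().filter(Class::isPrimitive).count();
    }

    public static void main(String[] args) {
        System.out.println(countPrimitive(MIRRORS));       // 9
        System.out.println(Integer.class.isPrimitive());   // false: a reference type
        System.out.println(int[].class.isPrimitive());     // false: arrays are references
        System.out.println(Integer.TYPE == int.class);     // true: same mirror
    }
}
```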

From brian.goetz at oracle.com  Thu Dec 16 17:50:44 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 16 Dec 2021 12:50:44 -0500
Subject: [External] : Re: Enhancing java.lang.constant for Valhalla
In-Reply-To: <CAJq4Gi4WOm39QJXqBtYPno-PokTg78apGvc0rDOuS50MFj38_A@mail.gmail.com>
References: <bca1d885-db11-3619-493f-fa9322f278d1@oracle.com>
 <CAJq4Gi4WOm39QJXqBtYPno-PokTg78apGvc0rDOuS50MFj38_A@mail.gmail.com>
Message-ID: <bcf55909-1ee8-79ab-0f44-affcf7fea910@oracle.com>

This reminds me of an earlier version of the jl.constant API, where we 
tried to track the varargs bit.  In the end, we dropped this, because it 
washed off too easily in the API.  We could have a preload() bit that 
travels with the ClassDesc, which would then have to be propagated into 
a bit mask in MethodTypeDesc, which would have to carry the bits around 
(and expose them) through combinators like dropArguments().  Seems 
possible, but also seems like there's gonna be some whack-a-mole 
handling the wash-off cases.

Stepping back, the ClassDesc type was originally intended to model the 
C_Class constant pool entry.  And there's no L* flavor of C_Class.  
But, it is also reasonable to use ClassDesc as a way of describing a 
field descriptor in a bytecode API (and similar for MethodTypeDesc.)

Presumably the preload attribute is one attribute for the whole class.  
That means that a classfile reader would have to parse that attribute, 
and when dispensing ClassDesc to clients, would have to look in the 
table to see whether the class is there?

Also, how would this affect ClassDesc::equals?  Would LFoo and L*Foo be 
equal?
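
A minimal sketch of the wash-off concern against the `java.lang.constant` API as it exists today. The `Starred` record and its `preload` flag are hypothetical; nothing like them is in the JDK. The combinator used is `MethodTypeDesc.dropParameterTypes`, the existing counterpart of the `dropArguments()` mentioned above.

```java
import java.lang.constant.ClassDesc;
import java.lang.constant.ConstantDescs;
import java.lang.constant.MethodTypeDesc;

// Sketch: a side-band "star" bit does not survive a trip through MethodTypeDesc.
final class StarWashOffDemo {
    // Hypothetical pairing of a descriptor with a preload ("star") bit.
    record Starred(ClassDesc desc, boolean preload) {}

    // Put the descriptor into a method type, apply a combinator, read it back.
    // What comes back is a plain ClassDesc; the bit has nowhere to live.
    static ClassDesc throughMethodType(Starred in) {
        MethodTypeDesc mt =
            MethodTypeDesc.of(ConstantDescs.CD_void, in.desc(), ConstantDescs.CD_int);
        MethodTypeDesc dropped = mt.dropParameterTypes(1, 2); // drop the int parameter
        return dropped.parameterType(0);
    }

    public static void main(String[] args) {
        Starred foo = new Starred(ClassDesc.of("Foo"), true);
        ClassDesc out = throughMethodType(foo);
        System.out.println(out.descriptorString());   // LFoo;
        System.out.println(out.equals(foo.desc()));   // true
        // Today equality is decided by the descriptor alone.
        System.out.println(
            ClassDesc.of("Foo").equals(ClassDesc.ofDescriptor("LFoo;"))); // true
    }
}
```

Carrying the bit inside `ClassDesc` itself would avoid the side-band, at the cost of having to answer the `equals` question above.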


On 12/16/2021 12:26 PM, Dan Heidinga wrote:
> The updated api looks pretty good for handling both L and Q descriptors.
>
> There's one case that isn't handled here though - L* descriptors.
> With the bucket 2 & 3 design, we really have 3 kinds of descriptors:
> L, Q, and L*.  Over the years, we've spent a lot of time as an EG
> talking about stars on descriptors and working through the issues
> related to "stars washing off".  I think this API is one of the cases
> where stars may wash off and therefore needs some way to indicate that
> a given L descriptor is actually an L* descriptor.
>
> Why are stars important?  If we don't have stars - which are the L
> form of Q's "go and look" preload contract - we lose out on calling
> convention and layout optimizations for the bucket 2 classes with
> their L descriptors.  Without the "*" on the descriptor, we may miss
> the preload signal for a given ClassDesc and lose our chance to
> optimize.
>
> Tracking this extra boolean state in the ClassDesc seems like a
> reasonable thing to do to let the stars flow through the system and
> ensure they are available at classfile generation time when we want to
> write the preload attribute.
>
>> So, summarizing the new methods (modulo naming changes to reflect changes in Valhalla language syntax):
>>
>>      ClassDesc ofInternal(String internalBinaryName)
>>      boolean isValue()
>>      ClassDesc valueType()
>>      ClassDesc refType()
> Having 3 forms of descriptors unfortunately causes some issues with
> the "isValue()" and "valueType()" apis.  Do both bucket 2 and 3 return
> "true" for isValue()?  There are also two possible results for
> valueType() when called on an L - add a star or convert to a Q
> descriptor.
>
> Figuring out the right methods to add - and naming them - overlaps
> somewhat with the other thread about the meaning of "primitive".  So
> borrowing John's terminology, I think we can get by with a single new
> potential API after redefining the contract for some of the other
> proposed apis:
>
>       ClassDesc ofInternal(String internalBinaryName)
>       boolean isValue()                         // true for both L* & Q
>       ClassDesc valueType()                // creates L* descriptor
>       ClassDesc extendedPrimitive()    // creates Q descriptor
>       ClassDesc refType()
>
> Alternatively, a single 'ClassDesc valueType(boolean isQ)' could be
> added but I think the multiple method approach is better as aligning
> the names makes the intention clearer.
>
> --Dan
>

From kevinb at google.com  Thu Dec 16 18:31:15 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 16 Dec 2021 10:31:15 -0800
Subject: We have to talk about "primitive".
In-Reply-To: <842DC86C-7AA5-4D9F-BEA0-89BD195EA7A2@oracle.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
 <842DC86C-7AA5-4D9F-BEA0-89BD195EA7A2@oracle.com>
Message-ID: <CAGKkBkse3WhkW=08OZtY99xOj4rLUbjQ_61o9xosTkWLOaeGtQ@mail.gmail.com>

Really appreciate the attention and insight here. I must respond on the
installment plan.


On Wed, Dec 15, 2021 at 7:15 PM John Rose <john.r.rose at oracle.com> wrote:

> On 15 Dec 2021, at 10:42, Kevin Bourrillion wrote:
>
> ...
> The main problem I think we can't escape is that we'll still need some
> word
> that means only the eight predefined types. (For the sake of argument
> let's
> assume we can pick one and lean hard on it, whether that's "predefined",
> "built-in", "elemental", "leaf type", or whatever.)
>
> As others have said, we'll pick a term for this. The idea of calling out a
> "leaf" in a data graph is compelling to me. As you say, people are going to
> wonder what is the foundation of the whole scheme. (No it's not objects all
> the way down, at least that's not what we are aiming for.)
>
> (But, spoiler alert, the division between leaf/scalar/basic type and
> composite/class type is *less important in daily practice* than the ad
> hoc mental models programmers make about which types they choose to view as
> composite and which are indivisible. Typical example: Most programmers
> choose to regard String as a sort of nullable primitive. I'll pick up
> that thread later.)
>
Yes, I agree.
(Because I hate to drop a metaphor) Physicists want to know that the proton
is divisible, but they can do a hell of a lot without paying attention to
that fact.


> I like the term "basic type", and (as we already discussed) I like
> "scalar" also, because "scalar" correctly suggests something about how it's
> processed in hardware.
>
Note I'm stipulating that we'll find the most perfect term there is (next
to "primitive"), and all my arguments remain.


> Here's a point I think is also important and has not been discussed much
> yet: A concept like "basic type" (or "scalar type") should include
> references as well as Java's eight current primitive types.
>
I think I've said this somewhere in this thread as well, "I'm quite
comfortable with the idea that references are the ninth primitive type",
but I backpedaled from that by the time I'd finished the conceptual model
<https://docs.google.com/document/d/1J-a_K87P-R3TscD4uW2Qsbt5BlBR_7uX_BekwJ5BLSE/preview>.
To suggest that "reference" is a type implies that each reference has
*two* static
types. Pros/cons:

+ Each of those static types always functions in the exact same way.
Non-reference values just don't have the second one (or it equals the
first).
- It makes *three* types involved overall (1. it's a reference / 2. it has
this constraint on the referent's dynamic type / 3. the referent has this
dynamic type).
- It means that what users see in their code might be the first *or* the
second of those, which seems like losing ground on the opaqueness of
references. Valhalla acts to strengthen the implementation-detailness of
references, e.g. we want users to think, "the dot means member access, and
Java dereferences first if necessary". (I see the main distinction between
the "values-are-objects" and "values-ain't-objects" candidate models as
being how *far* it goes down that line.)

So my alternative is: just let *valueness* be what unifies them. They are
only special in that (a) the static type functions totally differently and
(b) their opaqueness and everything that is done to provide that. Right now
I feel like this gets the job done.

> As I read your messages, you would prefer to keep the term "primitive"
> narrow, because of the possible confusion of telling users "hey, what you
> think of as primitives are now the heirloom basic primitives."
> Personally, I think users will say, to our unveiling "extended primitives",
> something like this:
>
> Well, that's not exactly what the dictionary says primitive means, if you
> can make new composite ones. But I do know that Java has non-reference
> types and calls them "primitive". And I also know it would be really cool
> to define new types that work like `int`, such as `UnsignedInt` or
> `HalfFloat` or the like. I get why they don't want to build all such types
> into the language; in fact maybe I'd like to try my hand someday at
> defining my own. So, "extended primitive". It's on: The Java primitives are
> now an open-ended set just like the Java objects.
>
I have quibbles here and there but I definitely agree that everyone can
find a map through this. But:


> In other words, in saying "extended primitive" (and also "basic
> primitive") we lean away from the dictionary definition of "primitive" and
> into the Java definition. That feels like a non-confusing choice to me.
>
This might be okay except for my central point: that we simultaneously need
a new term meaning exactly the dictionary definition.

> So we have to attempt to shift users' understanding of "primitive" while at
> the same time injecting a new term to mean exactly what primitive used to
> mean. That's the old Indiana Jones switch and I don't have to tell you how
> that turned out for him.
>
> So, no, it's not the Indy switch, at all. Users know what fruit
> primitives are in Java, and they will have no problem with adding new imported
> exotic apples extended primitive to the familiar set of primitive types.
> And in exchange for this infusion of wonderful new types, they will learn a
> new term for the old types, which is pears basic primitives (or scalar
> primitives).
>
It would be a *worse* "Indiana Jones switch" if these were sibling
concepts. But even if he was swapping the idol for a less detailed idol
he'd better start runnin'.


> It would be difficult to pull off in a world where we were just pushing
> some new server and the whole world gets the new model at once. But in
> this
> universe where every version of Java ever made all have to coexist, it's
> looking to me like a guaranteed source of never-ending confusion.
>
> I also think it robs us of our ability to smoothly portray the real
> changes
> of Valhalla. We want to be able to say "elements are still elements! now
> we
> have molecules too".
>
> There are two kinds of users w.r.t. the question of "what's a primitive"
> and you can't please both. You and I want to please different kinds. The
> user I want to please is one who thinks of "Java primitive" as a kind of
> non-nullable scalar number (or boolean or char). The user you want to
> please thinks of "Java primitive" as "all leaves in the Big Graph". The
> latter user will be disappointed if we say "Java primitives" can be
> non-leaves. The former user will be delighted. The latter user sees a
> String and wants to crack out its underlying array, in a Gollum-like
> quest for the roots of the mountains. The latter user treats a String as
> a primitive. There are more of the former than the latter; we should cater
> to them. It's the former who I was channeling above, concluding with "The
> Java primitives are now an open-ended set just like the Java objects."
>
Here's where I suggest that we categorize our existing associations with
"primitive" into essential vs. incidental.

And generally, to claim essentialness for a meaning that's at odds with the
generally accepted meaning should be subject to a form of Sagan's razor
<https://www.google.com/search?q=sagan%27s+razor>. But I suppose the
question is what that generally accepted meaning is. For one thing, if any
other language has user-defined compound types that it calls "primitives"
already that would be very useful to know.


> The term "value" can be applied to composites in B3 alone, to composites
> in B2 alone, or to both. (Or neither.) All the basic types, including
> references, are values as well.
>
> This is a big choice, where to "spend" the term "value".
>
Just a reminder (esp. for others observing) that my conceptual model
document shows at least one thorough viewpoint that "value" has a strong
existing meaning, and one that Valhalla doesn't even have to shift at all.

> B2 are references to... well, values as well.

Just lacking identity already allows for substitutable copies; that isn't
necessarily valueness, and if users don't have to think B2 instances are a
whole new kind of thing they've never seen before, that is very (sorry)
valuable.

If I get to hang onto the meaning of "value" in my document, then I can use
it to explain B2: a B2 instance is an object, whose identity is either
nonexistent or completely unobservable (no difference). That makes the VM
free to substitute equal copies, or (whenever indistinguishable) to
*represent* it as a compound value instead, or even to box that compound
value up again as needed. In any case it is still, meaningfully, an object.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Thu Dec 16 20:57:41 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 16 Dec 2021 15:57:41 -0500
Subject: [External] : Re: Enhancing java.lang.constant for Valhalla
In-Reply-To: <CAJq4Gi76ZWSp8ishgqTUU9B9T93SXWvwrUTKbPHfm3bmCE2Dcg@mail.gmail.com>
References: <bca1d885-db11-3619-493f-fa9322f278d1@oracle.com>
 <CAJq4Gi4WOm39QJXqBtYPno-PokTg78apGvc0rDOuS50MFj38_A@mail.gmail.com>
 <bcf55909-1ee8-79ab-0f44-affcf7fea910@oracle.com>
 <CAJq4Gi76ZWSp8ishgqTUU9B9T93SXWvwrUTKbPHfm3bmCE2Dcg@mail.gmail.com>
Message-ID: <189870bb-c730-0c69-2119-3b71a00cd34a@oracle.com>


> If the preload() bit is tied to the ClassDesc, do we need to worry
> about a bit mask in MethodTypeDesc?  Isn't the MethodTypeDesc composed
> of ClassDesc returnType and ClassDesc[] of parameters?  I feel like
> I'm missing some complexity here...

That's how the implementation happens to work, but conceptually, a 
MethodType has N+1 descriptors and N pre-load bits. (It's convenient
that the implementation works in terms of ClassDesc.)

>
>> Also, how would this affect ClassDesc::equals?  Would LFoo and L*Foo be equal?
> I think they'd have to be equal as they represent the same descriptor.
> It's only the presence of side channel info - the star - that makes
> them different and the difference is only in the "go and look"
> behaviour.  The VM will treat them as identical when dealing with
> descriptor strings.
>

Which makes me ask ... if it really is a side channel, does it really go 
*in* the ClassDesc?

From daniel.smith at oracle.com  Fri Dec 17 00:08:02 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 17 Dec 2021 00:08:02 +0000
Subject: JEP update: Primitive Classes
Message-ID: <C0D5FC99-F100-416D-A185-285679813E44@oracle.com>

First, I've made some minor revisions to the Value Objects JEP in the last couple of weeks. You can see it here:
https://openjdk.java.net/jeps/8277163

Second, I've put together a draft of a revised JEP 401, Primitive Classes. This removes content that became part of the Value Objects feature, and refines how we talk about the relationship between primitive types and reference types. Working outside of JBS for now, because I don't want to disrupt the already-Candidate JEP 401 artifact until we're at least ready to Submit the Value Objects piece.

A key idea is that primitive values and value objects are distinct entities, with different types, but they're both instances of the same class (thanks for the good ideas here, Kevin!).

(I'll acknowledge the ongoing discussion about whether "primitive" is the right term to use here. But for now, sticking with the status quo.)

Happy to hear your thoughts!

---

Summary
-------

Support new, developer-declared primitive types in Java. This is a
[preview language and VM feature](http://openjdk.java.net/jeps/12).


Goals
-----

This JEP introduces primitive classes, special kinds of
[value classes][jep-values] that define new primitive types.

The Java programming language will be enhanced to recognize primitive class
declarations and support new primitive types in its type system.

The Java Virtual Machine will be enhanced with a new `Q` carrier type to encode
declared primitive types.


Non-Goals
---------

This JEP is concerned with the core treatment of developer-declared primitives.
Additional features to improve integration with the Java programming language
are not covered here, but are expected to be developed in parallel.
Specifically:

-   [JEP 402][jep402] will enhance the basic primitives (`int`, `boolean`, etc.)
    by giving them primitive class declarations.

-   [A separate JEP][jep-generics] will update Java's generics so that primitive
    types can be used as type arguments.

Other followup efforts may enhance existing APIs to take advantage of primitive
classes, or introduce new language features and APIs built on top of primitive
classes.


Motivation
----------

Java developers work with two kinds of values: primitives and objects.

Primitives offer better performance, because they are typically *inlined*: stored
directly (without headers or pointers) in variables, on the computation stack,
and, ultimately, in CPU registers. Hence, memory reads do not have additional
indirections, primitive arrays are stored densely and contiguously in memory,
primitive-typed fields can be similarly compact, primitive values do not require
garbage collection, and primitive operations are performed within the CPU.

Objects offer better abstractions, including fields, methods, constructors,
access control, and nominal subtyping. But objects traditionally perform poorly
in comparison to primitives, because they are primarily stored in heap-allocated
memory and accessed by reference.

*Value objects*, introduced by [another JEP][jep-values], significantly improve
object performance in many contexts, providing a good fusion of the better
abstractions of objects with the better performance of primitives.

However, certain invariant properties of objects limit how much they can be
optimized, particularly when stored in fields and arrays. Specifically:

-   A variable of a reference type may be `null`, so the inlined layout of a
    value object typically requires some additional bits to encode `null`.
    For example, a variable storing an `int` can fit in 32 bits, but for a value
    class with a single `int` field, a variable of that class type could
    use up to 64 bits.
    
-   A variable of a reference type must be modified atomically. This often makes
    it impractical to inline a value object, because its layout would be too
    large for efficient atomic modification. Large primitive types (currently,
    `double` and `long`) make no such atomicity guarantees, so variables of
    these types can be modified efficiently without indirect representations
    (concurrency is instead managed at a higher level).

Primitive classes give developers the capability to define new primitive types
that aren't subject to these limitations. Programs can make use of class
features without giving up any of the performance benefits of primitives.

Applications of developer-declared primitives include:

-   Numbers of varieties not supported by the basic primitives, such as
    unsigned bytes, 128-bit integers, and half-precision floats;

-   Points, complex numbers, colors, vectors, and other multi-dimensional
    numerics;

-   Numbers with units: sizes, rates of change, currency, etc.;

-   Bitmasks and other compressed encodings of data;

-   Map entries and other data structure internals;

-   Data-carrying tuples and multiple returns;

-   Aggregations of other primitive types, potentially multiple layers deep.


Description
-----------

The features described below are preview features, enabled with the
`--enable-preview` compile-time and runtime flags.


### Primitive classes

A *primitive class* is a special kind of value class that introduces a new
primitive type.

As value classes, primitive classes have no identity. This allows their
instances to be freely converted between value objects and simpler *primitive
values*. A primitive value can be thought of as a bare sequence of field values,
without any headers or extra pointers.

A primitive class is declared with the `primitive` contextual keyword.

```
primitive class Point implements Shape {
    private double x;
    private double y;

    public Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    public double x() { return x; }
    public double y() { return y; }

    public Point translate(double dx, double dy) {
        return new Point(x+dx, y+dy);
    }

    public boolean contains(Point p) {
        return equals(p);
    }
}

interface Shape {
    boolean contains(Point p);
}
```

(Alternatively, we might prefer the class to be declared as `primitive Point`.)

Primitive class declarations are subject to the [same restrictions][jep-values]
as other value class declarations. For example, the instance fields of a
primitive class are implicitly `final`, so cannot be assigned outside of a
constructor or initializer.

In addition, no instance field of a primitive class declaration may have a
primitive type that depends, directly or indirectly, on the declaring class. In
other words, with the exception of reference-typed fields, the class must allow
for flat, fixed-size layouts without cycles.

In most other ways, a primitive class declaration is just like any other class
declaration. It can have superinterfaces, type parameters, enclosing instances,
inner classes, overloaded constructors, `static` members, and the full range of
access restrictions on its members.


### Primitive types

The name of a primitive class denotes that class's primitive type. Primitive
types store instances of the named class as primitive values. Instances can be
created with normal class instance creation expressions.

```
Point p1 = new Point(1.0, -0.5);
```

Field access and method invocation are supported by primitive types. The members
of a primitive type are the same as the members of the class.

```
assert p1.x() == 1.0;
Point p2 = p1.translate(0.0, 1.0);
System.out.println(p2.toString());
```

Primitive types support the `==` and `!=` operators when comparing two values of
the same type. As is the case for value objects, the `==` comparison recursively
compares the values' fields.

```
Point p3 = new Point(1.8, 3.6);
Point p4 = p3.translate(0.0, 0.0);
assert p3 == p4;
```
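
For comparison with current Java, here is a minimal sketch that is not part of this JEP. The record `PointRecord` is a hypothetical stand-in for `Point`. Records have identity, so `==` compares references and only `equals` compares components; the `==` described above for primitive types would instead give the component-wise answer.

```java
// Hypothetical stand-in for Point, written as a record in current Java.
record PointRecord(double x, double y) {
    PointRecord translate(double dx, double dy) {
        return new PointRecord(x + dx, y + dy);
    }
}

class EqualityContrast {
    public static void main(String[] args) {
        PointRecord p3 = new PointRecord(1.8, 3.6);
        PointRecord p4 = p3.translate(0.0, 0.0);
        System.out.println(p3 == p4);      // identity comparison of two distinct objects
        System.out.println(p3.equals(p4)); // component-wise comparison
    }
}
```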

Like a value class reference type, an expression of a primitive type cannot be
used as the operand of a `synchronized` statement.

*Unlike* other value classes, a `this` expression in the body of a primitive
class has a primitive type.


### Default values and `null`

Like the basic primitive types (`int`, `boolean`, etc.), declared primitive
types do not allow `null`.

Whenever a field or array component is created, the longstanding behavior is to
set its initial value to the *default value* of its type. For reference types,
this value is `null`, and for the basic primitive types, this value is 0 or
`false`.

For a declared primitive type, the default value is the *initial instance* of
the class: an instance whose fields are all set to their own default values.

```
Object[] os = new Object[5];
assert os[0] == null;
Point[] ps = new Point[5];
assert ps[0].x() == 0.0 && ps[0].y() == 0.0;
```

As shorthand, the default value of a primitive type can be expressed with the
class name followed by the `default` keyword.

```
assert Point.default.x() == 0.0 &&
       Point.default.y() == 0.0;
```

Note that the initial instance of a primitive class is created without invoking
any constructors or instance initializers, and is available to anyone with
access to the class (or its reflective `Class` object). Primitive classes are
not able to specify an initial instance that sets fields to something other than
their default values.

Methods of primitive classes should be designed to work on the initial instance.
If this isn't feasible (for example, a reference-typed field is expected to be
non-null), it may not be appropriate for the class to have a primitive type.
Instead, it can be declared as a normal value class.
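
As a rough illustration in current Java (a sketch under assumptions, not the proposed feature), the hypothetical record `Label` below emulates an initial instance by constructing an object with every field at its default value. Its `length` method is written to tolerate the `null` field that such an instance would carry.

```java
// Hypothetical example: a class whose all-default state has a null reference field.
record Label(String text, int weight) {
    // Stand-in for an initial instance: every field holds its default value.
    static final Label INITIAL = new Label(null, 0);

    // Works even when 'text' is null, as it is in INITIAL.
    int length() {
        return text == null ? 0 : text.length();
    }
}

class InitialInstanceSketch {
    public static void main(String[] args) {
        System.out.println(Label.INITIAL.length());       // 0
        System.out.println(new Label("abc", 1).length()); // 3
    }
}
```

If `length` called `text.length()` unconditionally, the all-default instance would throw, which is the kind of class the paragraph above suggests declaring as a normal value class instead.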


### Multi-threaded reads and writes

As for the basic primitive types `double` and `long`, when a field or array
component has a declared primitive type, reads and writes might not be atomic.
As a result, in a multi-threaded program, unexpected instances may be
encountered.

```
Point[] ps = new Point[]{ new Point(0.0, 1.0) };
new Thread(() -> ps[0] = new Point(1.0, 0.0)).start();
Point p = ps[0]; // may be (1.0, 1.0), among other possibilities
```

Like initial instances, primitive class instances produced by non-atomic reads
and writes are created without invoking any constructors or instance
initializers. There is no opportunity for the class to ensure that the field
values of the new object are compatible with each other (for example, a `start`
index may end up being greater than an `end` index).
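
The following single-threaded sketch in current Java shows how a mixed instance such as (1.0, 1.0) can arise. It is a simulation under assumptions, not JVM behavior: a two-element `double[]` stands in for a flattened `Point`, and the hypothetical method `readBetweenFieldWrites` places the read between the two field stores by hand.

```java
import java.util.Arrays;

class TornReadSketch {
    // Simulates a reader copying a two-field slot after the writer has
    // stored x but before it has stored y.
    static double[] readBetweenFieldWrites(double[] slot, double[] update) {
        slot[0] = update[0];          // writer stores the new x
        double[] seen = slot.clone(); // reader copies the slot at this point
        slot[1] = update[1];          // writer stores the new y
        return seen;
    }

    public static void main(String[] args) {
        double[] slot = {0.0, 1.0};   // stands in for a flattened Point(0.0, 1.0)
        double[] seen = readBetweenFieldWrites(slot, new double[]{1.0, 0.0});
        System.out.println(Arrays.toString(seen)); // [1.0, 1.0]
        System.out.println(Arrays.toString(slot)); // [1.0, 0.0]
    }
}
```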

To ensure that a particular primitive-typed field is always read from and
written to atomically, the field can be declared `volatile`. But there is no
mechanism for a primitive class to ensure that *all* fields and array components
of its type are considered volatile.

A class with a complex integrity constraint in its constructor may not be a good
candidate to be a primitive class. Instead, it can be declared as a normal value
class.


### Reference types

Primitive values are *monomorphic*: they belong to a single type with a specific
set of fields known at compile time and runtime. Values of different primitive
types can't be mixed.

To participate in the *polymorphic* reference type hierarchy, primitive values
are converted to value objects with a *value object conversion*. This occurs
implicitly when assigning from a primitive type to a reference type. The result
is an instance of the same class, just in a different form.

```
Shape s = p1; // value object conversion
assert s.getClass() == Point.class;
```

When invoking an inherited method of a primitive type, the receiver value
undergoes value object conversion to have the type expected by the method
declaration.

```
Point p = new Point(0.3, 7.2);
// toString is declared by Object
p.toString(); // value object conversion
```

It is sometimes useful to talk about the reference type of a primitive class.
This type is expressed with the class name followed by the `ref` contextual
keyword. A variable with a primitive class reference type stores either a value
object belonging to the named class or `null`. 

```
Point.ref[] prs = new Point.ref[10];
prs[1] = new Point(1.0, 1.0);
prs[4] = new Point(4.0, 4.0);
for (Point.ref pr : prs) {
    if (pr != null)
        System.out.println(pr);
}
```

The `ref` type is useful when `null` is needed or when the runtime
characteristics of reference types are preferred (for example, a large sparse
array might be more efficiently encoded with references).

The relationship between the types `Point` and `Point.ref` is similar to the
traditional relationship between the types `int` and `Integer`. However, `Point`
and `Point.ref` both correspond to the same class declaration; the values of
both types are instances of a single `Point` class. At run time, the conversion
between a primitive value and a value object is more lightweight than
traditional boxing conversion.

Value objects can be converted back to primitive values with a *primitive value
conversion*. `null` cannot be converted to a primitive value, so attempts to
convert it cause an exception.

```
Point p = prs[1]; // primitive value conversion
prs[1] = null;
p = prs[1]; // NullPointerException
```
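
The closest analogue in current Java is unboxing a `null` `Integer`, which also throws `NullPointerException`. The sketch below uses only existing language behavior; the class and method names are illustrative.

```java
class NullUnboxingSketch {
    // Unboxes the argument; throws NullPointerException when it is null.
    static int unbox(Integer boxed) {
        return boxed;
    }

    public static void main(String[] args) {
        Integer[] refs = new Integer[2];
        refs[1] = 7;
        System.out.println(unbox(refs[1])); // 7
        try {
            unbox(refs[0]);                 // refs[0] is null
        } catch (NullPointerException e) {
            System.out.println("NullPointerException on null");
        }
    }
}
```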

When invoking a method overridden by a primitive class, the receiver object
undergoes primitive value conversion to have the type expected by the method
declaration.

```
Shape s = new Point(0.7, 3.2);
// 'contains' is declared by Point
s.contains(Point.default); // primitive value conversion
```


#### Overload resolution and type arguments

Value object conversion and primitive value conversion are allowed in *loose*,
but not *strict*, invocation contexts. This follows the pattern of boxing and
unboxing: a method overload that is applicable without applying the conversions
takes priority over one that requires them.

```
void m(Point p, int i) { ... }
void m(Point.ref pr, Integer i) { ... }

void test(Point.ref pr, Integer i) {
    m(pr, i); // prefers the second declaration
    m(pr, 0); // ambiguous
}
```
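
The boxing pattern that this rule follows can be observed in current Java. In the sketch below (illustrative names only), each call is applicable to exactly one overload without conversion, so the strict phase selects it.

```java
class OverloadPhasesSketch {
    static String m(int i)     { return "m(int)"; }
    static String m(Integer i) { return "m(Integer)"; }

    public static void main(String[] args) {
        Integer boxed = 1;
        System.out.println(m(boxed)); // m(Integer): applicable without unboxing
        System.out.println(m(1));     // m(int): applicable without boxing
    }
}
```

A two-parameter pair such as `n(int, int)` and `n(Integer, Integer)` called as `n(boxed, 0)` is rejected by javac as ambiguous, which mirrors the `m(pr, 0)` case above.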

For now, Java's generics only work with reference types.
[Another JEP][jep-generics] will enhance generics to interoperate with primitive
types.

Thus, provisionally, type arguments must be inferred to be reference types. Type
inference treats value object and primitive value conversions the same as boxing
and unboxing; for example, a primitive value passed where an inferred type is
expected will lead to a reference-typed inference constraint.

```
var list = List.of(new Point(1.0, 5.0));
// infers List<Point.ref>
```


#### Array subtyping

Traditionally, primitive array types are not related to reference array
types: an `int[]` cannot be assigned to an `Object[]` variable.

Arrays of declared primitive types are more flexible: the type `Point[]` is a
subtype of `Point.ref[]`, which is a subtype of `Object[]`.

(Basic primitive array types like `int[]` will also gain this capability with
[JEP 402][jep402].)

When a reference is stored in an array of static type `Object[]`, if the array's
runtime component type is `Point` then the operation will perform both an array
store check (checking that the object is an instance of class `Point`) and a
primitive value conversion (converting the object to a primitive value).

Similarly, reading from an array of static type `Object[]` will cause a
value object conversion if the array stores primitive values.

```
Object replace(Object[] objs, int i, Object val) {
    Object result = objs[i]; // may perform value object conversion
    objs[i] = val; // may perform primitive value conversion
    return result;
}

Point[] ps = new Point[]{ new Point(3.0, -2.1) };
replace(ps, 0, new Point(-2.1, 3.0));
replace(ps, 0, null); // NPE from primitive value conversion
```
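
The array store check half of this behavior already exists in current Java and can be seen with `Integer[]`. This sketch reuses the shape of `replace` above; note that storing `null` succeeds for today's reference arrays, whereas the draft has it fail for `Point[]`.

```java
class ArrayStoreSketch {
    // Stores val through an Object[] view; the JVM checks the runtime component type.
    static Object replace(Object[] objs, int i, Object val) {
        Object result = objs[i];
        objs[i] = val; // may throw ArrayStoreException
        return result;
    }

    public static void main(String[] args) {
        Integer[] ints = { 3 };
        System.out.println(replace(ints, 0, 4)); // 3
        try {
            replace(ints, 0, "not an Integer");
        } catch (ArrayStoreException e) {
            System.out.println("ArrayStoreException");
        }
        replace(ints, 0, null);                  // allowed for Integer[]
        System.out.println(ints[0]);             // null
    }
}
```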


### `class` file representation & interpretation

A primitive class is declared in a `class` file using the `ACC_PRIMITIVE`
modifier (`0x0800`). At class load time, an error occurs if a primitive class is
not a value class (via `ACC_VALUE`, `0x0100`). At preparation time, an error
occurs if a primitive class has a primitive type circularity in its instance
fields.

A declared primitive type is represented with a new `Q` descriptor prefix
(`QPoint;`). The class's reference type is represented using the usual `L`
descriptor (`LPoint;`).
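
The `L` form can be produced today with the existing `java.lang.constant` API, as in the sketch below. The class name `Point` (in the unnamed package) and the helper methods are illustrative assumptions; the `Q` form is specific to this draft and is not constructed here.

```java
import java.lang.constant.ClassDesc;
import java.lang.constant.ConstantDescs;
import java.lang.constant.MethodTypeDesc;

class DescriptorSketch {
    // Field descriptor of the reference type for a class in the unnamed package.
    static String refDescriptor(String className) {
        return ClassDesc.of(className).descriptorString();
    }

    // Method descriptor for 'boolean contains(<className>)' using the L form.
    static String containsDescriptor(String className) {
        return MethodTypeDesc.of(ConstantDescs.CD_boolean, ClassDesc.of(className))
                             .descriptorString();
    }

    public static void main(String[] args) {
        System.out.println(refDescriptor("Point"));      // LPoint;
        System.out.println(containsDescriptor("Point")); // (LPoint;)Z
    }
}
```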

Primitive values with `Q` types are one-slot stack values, even though they may
represent aggregates of much more than 32 or 64 bits. No particular encoding of
primitive values is mandated.

Verification treats a `Q` type as a subtype of the corresponding `L` type; e.g.,
`QPoint;` is a subtype of `LPoint;`. Conversions from primitive values to value
objects occur implicitly, as needed.

The `this` parameter of a primitive class's instance method has a primitive
type.

Classes mentioned by primitive types in field and method descriptors are loaded
during linkage, before the first access of that field or method.

A `CONSTANT_Class` constant pool entry may refer to a primitive type using a `Q`
descriptor as a "class name". A `CONSTANT_Class` using the plain name of a
primitive class represents the class's reference type.

The `aconst_init` instruction may refer to either a primitive type or a
reference type. This determines whether a primitive value or a value object is
produced.

Similarly, a `CONSTANT_Fieldref` or `CONSTANT_Methodref` may refer to a field or
method as a member of a primitive type or a reference type. In the case of
`withfield`, this determines the result type of the operation.

The `anewarray` and `multianewarray` instructions can be used to create arrays
of declared primitive types. Array subtyping allows these arrays to be viewed as
instances of reference array types.

The `checkcast`, `instanceof`, and `aastore` opcodes support primitive value
types, performing primitive value conversions (including `null` checks) when
necessary.

Primitive classes may be initialized for the same reasons as other classes (for
example, before a static method is invoked). In addition, primitive class
initialization is triggered by the `aconst_init` instruction, by each of the
`anewarray` and `multianewarray` instructions when used with a primitive type,
and (recursively) by initialization of another class that declares a
primitive-typed field mentioning the primitive class.


### Core reflection

Every primitive class has a `java.lang.Class` object representing the class.
For both primitive values and value objects, the `getClass` method of the
class's instances returns this object. A class literal (`Point.class`) can also
be used to express this object.

Tentatively: this `Class` object returns `true` from the `isPrimitive` method,
and `getModifiers` shows its `Modifier.PRIMITIVE` flag set.

For uses that need to model *types*, there is one `Class` object representing
the primitive type, and another representing the reference type. Each of these
has the same behavior as the `Class` object representing the class in most
respects, except for methods to explicitly tell them apart and map from one to
the other.

Tentatively: the `Class` object representing the class doubles as a
representation of the primitive type. A separate `Class` object exists for the
purpose of representing the reference type.


### Other APIs

The following APIs also gain new behaviors:

-   `java.lang.constant` encodes `Q` types in `CONSTANT_Class` structures and
    field and method descriptors

-   `java.lang.invoke` recognizes `Q` types and supports `L`-to-`Q` conversions

-   `javax.lang.model` recognizes primitive class declarations


### Performance model

In typical usage, in heap storage and during fully-optimized code execution,
declared primitive types should have a footprint and execution overhead
comparable to the basic primitive types. For example, a `Point`, as declared
above, can be expected to directly occupy 128 bits in local variables,
parameters, fields, and array components. A field access simply extracts the
first or second 64 bits. There are no additional pointers or metadata fields.
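
The layout described above can be imitated by hand in today's Java. The sketch
below assumes `Point` has two `double` fields, `x` and `y`, and stores an array
of points as consecutive 64-bit slots with no per-element header or pointer.
`FlatPointArray` is a hypothetical name. A JVM would do this transparently and
is free to choose a different encoding.

```java
class FlatPointArray {
    // Two 64-bit slots per element: x in the first, y in the second.
    private final double[] slots;

    FlatPointArray(int length) {
        // Freshly allocated storage is all zeros, so every element starts as (0.0, 0.0).
        slots = new double[2 * length];
    }

    void set(int i, double x, double y) {
        slots[2 * i] = x;
        slots[2 * i + 1] = y;
    }

    // A field access simply reads the first or second slot of element i.
    double x(int i) { return slots[2 * i]; }
    double y(int i) { return slots[2 * i + 1]; }

    int length() { return slots.length / 2; }

    // Payload size in bits: 128 per element, with no extra pointers or metadata.
    int payloadBits() { return slots.length * Double.SIZE; }
}
```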

Notably, a primitive class with a single instance field can be expected to have
minimal overhead compared to operating on a value of the field's type directly.

However, JVMs are ultimately free to encode primitive values however they see
fit. Some classes may be considered too large to represent inline. Certain
JVM components, in particular those that are less performance-tuned, may prefer
to interact with primitive values as objects. A primitive value might carry with
it a cached value object pointer to reduce the overhead of future conversions.
Etc.

Value objects that are instances of primitive classes can be expected to behave
much like instances of [other value classes][jep-values].


### HotSpot implementation

This section describes implementation details of this release of the HotSpot
virtual machine, for the information of OpenJDK engineers. These details are
subject to change in future releases and should not be assumed by users of
HotSpot or other JVMs.

Values of `Q` types in HotSpot are encoded as follows:

-   Primitive classes whose field layouts exceed a size threshold are always
    encoded as regular heap objects. Fields marked `volatile` always store
    regular heap objects.

-   Otherwise, primitive values are encoded in fields and arrays as a flattened
    sequence of field values. Array components may be padded to achieve good
    alignment.

-   In the interpreter and C1, primitive values on the stack are represented as
    value objects. Each read of a primitive-typed field or array allocates a
    heap object.

-   In C2, primitive values on the stack are scalarized, effectively encoding
    each field as a separate variable. Methods with Q-typed parameters support
    both a pointer-based entry point (for interpreter and C1 calls) and a
    scalarized entry point (for C2-to-C2 calls). Value objects are also
    scalarized when working with the primitive class's reference type. Heap
    allocations occur where any other supertype is used.

Default values are generally encoded as sequences of zeros, simplifying the task
of field and array creation. However, in cases where a field or array encodes
primitive values as heap pointers, the default value is a non-zero pointer.
(Circularities may require this value to be `null` temporarily, but the `null`
must be hidden from program code.)

Some array types, like `[Ljava/lang/Object;` and `[LPoint;`, allow for both
pointer-based and flattened arrays. Reads and writes for these types dynamically
check a flag and perform the necessary conversions when operating on flattened
arrays.


Alternatives
------------

Making use of the basic primitive types, rather than declaring new primitives,
will often produce a program with equivalent or slightly better performance.
However, this approach gives up the valuable abstractions provided by classes.
It's easy to, say, interpret a `double` with the wrong units, pass an
out-of-range `int` to a library method, or fail to keep two `boolean` flags
together in the right order.

Normal value classes provide many of the benefits of primitive classes, without
the substantial disruptions to the language and JVM type systems. With
additional innovation in JVM implementation techniques and hardware
capabilities, the gap may close further. However, the limitations outlined in
the "Motivation" section are pretty fundamental. For example, a value class type
wrapping a single `long` field and supporting the full range of `long` values
for that field can never be encoded in fewer than 65 bits. Primitive classes
give programmers who need fine-grained control a more reliable performance
model.
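
The 65-bit figure can be checked with a short calculation. The sketch assumes
the extra state is `null`, which a reference type must be able to represent in
addition to the 2^64 possible `long` values. The class and method names are
illustrative.

```java
import java.math.BigInteger;

class BitsForNullableLong {
    // Minimum number of bits needed to distinguish `states` different values.
    static int bitsNeeded(BigInteger states) {
        return states.subtract(BigInteger.ONE).bitLength();
    }

    // Number of distinct long values: 2^64.
    static BigInteger longValues() {
        return BigInteger.ONE.shiftLeft(64);
    }

    public static void main(String[] args) {
        BigInteger withNull = longValues().add(BigInteger.ONE); // every long, plus null
        System.out.println("long only:      " + bitsNeeded(longValues()) + " bits");
        System.out.println("long plus null: " + bitsNeeded(withNull) + " bits");
    }
}
```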

We considered many different approaches to boxing and polymorphism before
settling on a model in which primitive values and value objects are two
different representations, with two different types, of the same class
instances. This strategy balances the traditional understanding of primitive
types, with familiar semantics, performance expectations, and conversions to
objects, with the simplicity of a single named class declaration for modeling
data in both the primitive and reference spaces. Strategies in which a primitive
value *is an* object obscure some important differences between the types.
Strategies in which conversions occur between two different class-like entities
introduce distracting complexity.


Risks and Assumptions
---------------------

There are security risks involved in allowing instance creation outside of
constructors, via default instances and non-atomic reads and writes. Developers
will need to understand the implications, and recognize when it would be unsafe
to declare a class `primitive`.

This JEP does not address the interaction of primitive classes with the basic
primitives or generics; these features will be addressed by other JEPs (see
below). But, ultimately, all three JEPs will need to be completed to deliver a
cohesive language design.


Dependencies
-----------

This JEP depends on [Value Objects][jep-values], which establishes the semantics
of primitives when treated as objects. Primitive classes are a special case of
value classes.

In support of this JEP, there are separate efforts to improve the JVM
Specification (in particular its treatment of `class` file validation) and the
Java Language Specification (in particular its treatment of types). These
changes address technical debt and facilitate the specification of these new
features.

In [JEP 402][jep402] we propose to update the basic primitive types (`int`,
`boolean`, etc.) to be represented by primitive classes, unifying the two kinds
of primitive types. The existing wrapper classes will be repurposed to represent
the corresponding types' primitive classes.

In another JEP we will propose modifying the generics model in Java to make type
parameters *universal*, that is, instantiable by all types, both reference and primitive.

In the future, JVM class and method specialization ([JEP 218][jep218], with
revisions) will allow generic classes and methods to specialize field, array,
and local variable layouts when parameterized by primitive types.

[jep402]: https://openjdk.java.net/jeps/402
[jep218]: https://openjdk.java.net/jeps/218
[jep-values]: https://openjdk.java.net/jeps/8277163
[jep-generics]: https://openjdk.java.net/jeps/8261529


From brian.goetz at oracle.com  Fri Dec 17 13:11:14 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 17 Dec 2021 08:11:14 -0500
Subject: [External] : Re: Enhancing java.lang.constant for Valhalla
In-Reply-To: <CAJq4Gi7NdZWxDtn9agkh1Qx3pfMzGf7W2GsE2irr65M6bBBGmA@mail.gmail.com>
References: <bca1d885-db11-3619-493f-fa9322f278d1@oracle.com>
 <CAJq4Gi4WOm39QJXqBtYPno-PokTg78apGvc0rDOuS50MFj38_A@mail.gmail.com>
 <bcf55909-1ee8-79ab-0f44-affcf7fea910@oracle.com>
 <CAJq4Gi76ZWSp8ishgqTUU9B9T93SXWvwrUTKbPHfm3bmCE2Dcg@mail.gmail.com>
 <189870bb-c730-0c69-2119-3b71a00cd34a@oracle.com>
 <CAJq4Gi7NdZWxDtn9agkh1Qx3pfMzGf7W2GsE2irr65M6bBBGmA@mail.gmail.com>
Message-ID: <b556dc68-27c5-e394-a695-6bd85c157d4a@oracle.com>

Let's do an ASM thought experiment.

The descriptors live in (a) {method,field}_info metadata, and (b) 
C_{Field,Method}Ref constants referred to by invoke/field access 
instructions.

The stars, though, live somewhere completely different: the Preload 
attribute, which is not on the instruction, or the code attribute, or 
the method/field, but on the class.

I would expect ClassVisitor to be enhanced with something like

    visitPreload(String clazz)

So, when reading a classfile, you get a bunch of preload "events", and 
then eventually, when you get to method/field metadata, or instructions, 
you get a bunch of events that have L descriptors in them, with no stars.

ASM commits to delivering certain events before others, so when 
adapting, you might accumulate the visitPreload events into a List, and 
then if you are inserting new instructions that are supposed to use L*, 
if they're not in the list, you'd emit extra visitPreload calls.
(Presumably also ASM would want to filter the Preload values to 
eliminate duplicates.)

Similarly, when writing a classfile, if you want to do a getstatic of a 
field known to be an L* field, you might do something like:

    b.visitPreload(internalName(C))
    b.visitFieldInsn(GETSTATIC, receiverClass, "foo", eLdescriptor(C))

Which is to say, one of the costs of this scheme is that the stars go 
far away from the descriptors they are attached to (not even in 1:1 
correspondence), and classfile manglers will have to keep this mapping 
somewhere.
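
To make that bookkeeping concrete, here is a minimal sketch of the mapping a
classfile adapter would have to carry around. It is self-contained and does not
use ASM. `PreloadTracker` and its methods are hypothetical names, and
visitPreload itself is only the suggestion above, not an existing ASM method.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PreloadTracker {
    // Internal names already covered by a Preload entry, in first-seen order.
    private final Set<String> seen = new LinkedHashSet<>();

    // Record a Preload entry read from the incoming class file.
    // Returns false for a duplicate, which the adapter can drop.
    boolean onPreload(String internalName) {
        return seen.add(internalName);
    }

    // Call before inserting an instruction that uses an L* type.
    // Returns true if the caller must emit an extra preload event for it.
    boolean require(String internalName) {
        return seen.add(internalName);
    }

    // The de-duplicated entries to write into the outgoing Preload attribute.
    List<String> entries() {
        return new ArrayList<>(seen);
    }
}
```

An adapter would call onPreload for each event it reads, require for each L*
type it introduces, and write entries() back out at the end.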

My first instinct is that putting the stars in the ClassDesc is putting 
the bookkeeping in the wrong place.


Let's look at other uses of ClassDesc; one was the constant folding 
example. We want to be able to intrinsify LDC operations, including
condy, and indy calls. Do any of them need preloading to work
properly? (The *s are about preloading constraints.)

LDC'ing a C_Class will already force loading of the class (and besides, 
C_Class has no use for a *.)

Invoking an indy which returns an L*Foo might want Foo preloaded. I
don't know enough about the timing of indy linkage to know whether all 
the classes in the type descriptor are loaded by the time the calling 
convention is set up, but I suspect it may already be?


On 12/16/2021 4:24 PM, Dan Heidinga wrote:
>> Which makes me ask ... if it really is a side channel, does it really go
>> *in* the ClassDesc?
> If it's not in the ClassDesc, then how do we communicate the side
> channel to users - e.g. class file generators?
>
> I recently rewatched your JVMLS talk from 2018 [1] where javac
> converted the jl.constant version of the descriptions into an `ldc`.
> Now none of that is in the language yet but if we drop the stars from
> descriptors now, the info won't be available when/if that vision comes
> to fruition.
>
> --Dan
>
> [1] https://www.youtube.com/watch?v=iSEjlLFCS3E&list=PLX8CzqL3ArzVnxC6PYxMlngEMv3W1pIkn&index=2&t=2s
>

From forax at univ-mlv.fr  Sat Dec 18 11:53:07 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 18 Dec 2021 12:53:07 +0100 (CET)
Subject: We have to talk about "primitive".
In-Reply-To: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
References: <CAGKkBkvWThUhyXkJevCYqUG45yuWTmXAHeJ8e6UWtHzrNhtCvg@mail.gmail.com>
Message-ID: <9352049.3304121.1639828387890.JavaMail.zimbra@u-pem.fr>

> From: "Kevin Bourrillion" <kevinb at google.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Wednesday, 15 December 2021 19:42:55
> Subject: We have to talk about "primitive".

> (Okay, so we're doing this)
> I think the rename to "primitive classes" happened during my outage last year.
> When I came back I made the decision to like it.

> Since then, I've found that in my explanatory model I'm fighting against it
> constantly. I think it may actually be fatally flawed.

> The points I raise here were surely already known at the time, and I know there
> were good reasons for overriding them. But I feel the need to come back and
> push harder on their importance.

> Background: the textbook definition of "primitive" is centered on their nature
> of being elements-not-molecules, and I see no dispute about it. Also, there's
> no disputing the fact that we're allowed to adopt a different meaning if we so
> choose. So that's not even the fatal flaw.

As John already said, they are atoms in terms of user-defined types but not at runtime: unless declared volatile, a long or a double is two 32-bit values.

> The main problem I think we can't escape is that we'll still need some word that
> means only the eight predefined types. (For the sake of argument let's assume
> we can pick one and lean hard on it, whether that's "predefined", "built-in",
> "elemental", "leaf type", or whatever.)

I still hope that we can see a user-defined primitive and a builtin type the same way from a JLS point of view.
Obviously, from the JVMS POV they are different, but i think one of our goals should be that the distinction between a builtin primitive and a user-defined primitive is not visible in the JLS.

> Definitely, our trying to minimize their specialness is virtuous. They should be
> like helium: yes, they are molecules when you want a molecule! But on any
> deeper look they will clearly be "actually" elements, and the distinction will
> matter often enough.

> So we have to attempt to shift users' understanding of "primitive" while at the
> same time injecting a new term to mean exactly what primitive used to mean.
> That's the old Indiana Jones switch and I don't have to tell you how that
> turned out for him.

> It would be difficult to pull off in a world where we were just pushing some new
> server and the whole world gets the new model at once. But in this universe
> where every version of Java ever made all have to coexist, it's looking to me
> like a guaranteed source of never-ending confusion.

> I also think it robs us of our ability to smoothly portray the real changes of
> Valhalla. We want to be able to say "elements are still elements! now we have
> molecules too". Pedagogically that is always preferable to "elements aren't
> really what you thought they were". Okay, the real comparison is a little more
> nuanced than that, but I'll get to that now.

I agree retconning is better pedagogically because a lot of people think in terms of analogy.

> An alternative that seems to work fine, in my mental model at least, is:

>     * Primitive types are examples of value types, and have always been.
>    * Java never supported any other kinds of value types before, so we didn't
>     distinguish the terms before.
>     * Everything you associate with primitive types remains true.
>     * But most of those traits really come from their value-type-ness.

> (I plan to make the above shifts to my model document already.)

>     * Now we have user-defined value types too.
>    * The way we user-define a type is with a class, so a value type is defined by a
>     "value class" (sorry B2).
>     * The primitive types will now each get a value class.
>     * These 8 classes will look as much like user-defined types as Object does.
>    * They, like Object, will have a "cheat" in their source code that no one else
>    gets to use. (Object's is that there is no implied `extends Object` or
>    `super();`; these need no fields because the data they store is magically
>     handled by the VM. These feel like similar cheats.)

> Then mopping up the rest:

>    * Existing classes probably need a term like "reference classes" (in the model
>    I'm going to circulate that doubles down on values-are-not-objects, then this
>     wants to be "object classes", even though that feels weird at first).
>    * I think the term for bucket 2 classes really ought to center on
>    identitylessness, e.g. "noid", "noident", "idfree", or something. Anything else
>    is getting away from the essential meaning of the bucket; plus, we want people
>     to call bucket 1 classes "identity classes", don't we?

> Footnote: for a more concrete manifestation of this problem: I am sure we cannot
> possibly get away with Class.isPrimitive() being true for these classes. Right?

> Thoughts?

I agree, but i don't think we should use "value type" as a term to encompass both user-defined primitives and builtin primitives.

BTW, i think it's very interesting to have this discussion now that we have scrambled the model by introducing the B1/B2/B3 model.

This is how I see things:
technically, we have 4 categories,
B1: user defined object with an identity, used by reference (nullable) 
B2: user defined object with no identity, used by reference (nullable) 
B3: user defined primitive with no identity, used as direct value 
B4: builtin primitive with no identity, used as direct value 

With the previous model of Valhalla, we had only the categories B1, B2 and B3, so the cut was between having an identity or not, to the point where we introduced IdentityObject/ValueObject in the type system.
I believe that introducing B2 changes where we make the cut; we still hope that at the end we have only two categories, right?

I believe we should piggyback on the difference between reference vs direct value and do the cut here. 
After all, introducing B2 means that having identityless objects used by reference is useful. 

So, for me, it seems logical to group B1 and B2 together and to group B3 and B4 together, and see B2 as a special kind of B1 and B4 as a special kind of B3.

So we have objects or primitives; among the objects, we have the ones with identity and the identityless ones; among the primitives, we have the ones with a Ref box and the ones with a historical box (Integer, etc).
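
The two possible cuts can be written down as a small table. This is only a
sketch of the classification above; the names `Bucket`, `hasIdentity` and
`byReference` are invented for illustration and are not proposed API.

```java
import java.util.EnumSet;
import java.util.Set;

enum Bucket {
    B1(true,  true),   // user defined object with an identity, used by reference
    B2(false, true),   // user defined object with no identity, used by reference
    B3(false, false),  // user defined primitive, used as direct value
    B4(false, false);  // builtin primitive, used as direct value

    final boolean hasIdentity;
    final boolean byReference;

    Bucket(boolean hasIdentity, boolean byReference) {
        this.hasIdentity = hasIdentity;
        this.byReference = byReference;
    }

    // Cut on reference vs direct value: the "objects" side.
    static Set<Bucket> objects() {
        EnumSet<Bucket> result = EnumSet.noneOf(Bucket.class);
        for (Bucket b : values()) {
            if (b.byReference) result.add(b);
        }
        return result;
    }

    // Cut on reference vs direct value: the "primitives" side.
    static Set<Bucket> primitives() {
        return EnumSet.complementOf(EnumSet.copyOf(objects()));
    }

    // Cut on identity, as in the IdentityObject/ValueObject split.
    static Set<Bucket> identityless() {
        EnumSet<Bucket> result = EnumSet.noneOf(Bucket.class);
        for (Bucket b : values()) {
            if (!b.hasIdentity) result.add(b);
        }
        return result;
    }
}
```

Cutting on byReference yields two groups of two, while cutting on hasIdentity
leaves B1 alone on one side.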

On the subject of boxes, i think we should go the other way, aka "you never really understood how boxes worked", because most people don't care about how a box works, and rightly so;
once we get better generics, boxes will mostly disappear.

I think we should not introduce the interfaces IdentityObject/ValueObject because they do not seem useful anymore to explain the new model; they are not the center of the model anymore, and their usefulness in terms of typing is low.
(We still need to consider empty abstract classes as special, but it's a detail for people wanting to play with primitive classes + inheritance, so it's fairly specific).

For a regular user of Java that does not care about the JVM details, 
- class/enum/record/lambda are handled by references, a class/enum/record can be declared identityless with a modifier, a lambda is identityless. 
- primitives are handled by direct values (so not nullable), they are tearable, have a default value/a non overridable default constructor,
they are defined by the keyword primitive, the builtins (the ones written in lowercase) have a named box instead of Primitive.Box

Examples: 
String is a class, Optional is an identityless class, Complex is a primitive, int is a builtin primitive 

Rémi

From brian.goetz at oracle.com  Mon Dec 20 17:54:01 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 20 Dec 2021 12:54:01 -0500
Subject: JEP update: Value Objects
In-Reply-To: <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
 <6776971B-F8B1-416D-8A4F-32EAE842AC03@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
 <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
 <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
Message-ID: <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>

I was working on some docs and am not sure if we came to a conclusion on 
the rules about who may, may not, or must declare ValueObject or 
IdentityObject.

Let me see if I can chart the boundaries of the design space. I'll start 
with IdentityObject since it is more constrained.

 - Clearly for legacy classes, the VM is going to have to infer and
inject IdentityObject.
 - Since IdentityObject is an interface, it is inherited; if my super
implements IO, so do I.
 - It seems desirable that a user be *allowed* to name IdentityObject
as a superinterface of an interface or abstract class, which constrains
what subclasses can do. (Alternately we could spell this "value
interface" or "value abstract class"; this is a separate set of tradeoffs.)
 - There is value in having exactly one way to say certain things; it
reduces the space of what has to be specified and tested.
 - I believe our goal is to know everything we need to know at class
load time, and not to have to go back and do complex checks on a
supertype when a subclass is loaded.

The choice space seems to be
   user { must, may, may not } specify IO on concrete classes
   x compiler { must, may, may not } specify IO when ACC_VALUE present
   x VM (and reflection) { mops up }

where "mopping up" minimally includes dealing with legacy classfiles.

Asking the user to say "IdentityObject" on each identity class seems 
ridiculous, so we can drop that one.

   user { may, may not } specify IO on concrete classes
   x compiler { must, may, may not } specify IO when ACC_VALUE present
   x VM (and reflection) { mops up }

 From a user model perspective, it seems arbitrary to say the user may 
not explicitly say IO for concrete classes, but may do so for abstract
classes. So the two consistent user choices are either:

 - User can say "implements IO" anywhere they like
 - User cannot say "implements IO" anywhere, and instead we have an
"identity" modifier which is optional on concrete classes and acts as a
constraint on abstract classes/interfaces.

While having an "identity" modifier is nice from a completeness 
perspective, the fact that it is probably erased to "implements 
IdentityObject" creates complication for reflection (and another 
asymmetry between reflection and javax.lang.model). So it seems that
just letting users say "implements IdentityObject" is reasonable.

Given that the user has a choice, there is little value in "compiler may 
not inject", so the choice for the compiler here is "must" vs "may" 
inject. Which is really asking whether we want to draw the VM line at
legacy vs new classfiles, or merely adding IO as a default when nothing 
else has been selected. Note that asking the compiler to inject based on 
ACC_VALUE is also asking pretty much everything that touches bytecode to 
do this too, and likely to generate more errors from bytecode manglers.
The VM is doing inference either way, what we get to choose here is the 
axis.

Let's put a pin in IO and come back to VO.

The user is already saying "value", and we're stuck with the default 
being "identity". Unless we want to have the user say "value interface"
for a value-only interface (which moves some complexity into reflection, 
but is also a consistent model), I think we're stuck with letting the 
user specify either IO/VO on an abstract class / interface, which sort 
of drags us towards letting the user say it (redundantly) on concrete 
classes too.

The compiler and VM will always type-check the consistency of the value 
keyword/bit and the implements clause. So the real question is where
the inference/injection happens. And the VM will have to do injection
for at least IO at least for legacy classes.

So the choices for VM infer&inject seem to be:

 - Only inject IO for legacy concrete classes, based on classfile
version, otherwise require everything to be explicit;
 - Inject IO for concrete classes when ACC_VALUE is not present,
require VO to be explicit;
 - Inject IO for concrete classes when ACC_VALUE is not present; inject
VO for concrete classes when ACC_VALUE is present

Is infer&inject measurably more costly than just ordinary classfile 
checking? It seems to me that if all things are equal, the simpler
injection rule is preferable (the third), mostly on the basis of what it 
asks of humans who write code to manipulate bytecode, but if there's a 
real cost to the injection, then having the compiler help out is 
reasonable. (But in that case, it probably makes sense for the compiler 
to help out in all cases, not just VO.)
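
As a rough sketch of what the third rule asks of the VM (or of a tool that
mimics it), together with the consistency check mentioned above: the numeric
value of ACC_VALUE and the internal names of the two interfaces are
placeholders assumed for illustration, and inherited interfaces are not
modeled. ACC_INTERFACE and ACC_ABSTRACT are the existing JVMS values.

```java
import java.util.ArrayList;
import java.util.List;

class InferAndInject {
    static final int ACC_INTERFACE = 0x0200; // existing JVMS flag
    static final int ACC_ABSTRACT  = 0x0400; // existing JVMS flag
    static final int ACC_VALUE     = 0x0100; // placeholder value, assumed for this sketch

    static final String IO = "java/lang/IdentityObject"; // assumed internal name
    static final String VO = "java/lang/ValueObject";    // assumed internal name

    // Returns the direct superinterfaces after injection under the third rule:
    // concrete classes get IO when ACC_VALUE is absent and VO when it is present.
    static List<String> effectiveInterfaces(int accessFlags, List<String> declared) {
        boolean isValue = (accessFlags & ACC_VALUE) != 0;
        boolean concrete = (accessFlags & (ACC_INTERFACE | ACC_ABSTRACT)) == 0;

        // Consistency of the value bit with the implements clause.
        if (isValue && declared.contains(IO)) {
            throw new IllegalArgumentException("value class declares IdentityObject");
        }
        if (concrete && !isValue && declared.contains(VO)) {
            throw new IllegalArgumentException("identity class declares ValueObject");
        }

        List<String> result = new ArrayList<>(declared);
        if (concrete) {
            String implied = isValue ? VO : IO;
            if (!result.contains(implied)) {
                result.add(implied); // inject only when not already explicit
            }
        }
        return result;
    }
}
```

Because an already explicit interface is left alone, a classfile produced by a
compiler that "helps out" and one that relies on injection end up with the same
effective interfaces.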


On 12/2/2021 6:11 PM, Dan Smith wrote:
>> On Dec 2, 2021, at 1:04 PM, Dan Heidinga<heidinga at redhat.com>  wrote:
>>
>> On Thu, Dec 2, 2021 at 10:05 AM Dan Smith<daniel.smith at oracle.com>  wrote:
>>> On Dec 2, 2021, at 7:08 AM, Dan Heidinga<heidinga at redhat.com>  wrote:
>>>
>>> When converting back from our internal form to a classfile for the
>>> JVMTI RetransformClasses agents, I need to either filter the interface
>>> out if we injected it or not if it was already there.  JVMTI's
>>> GetImplementedInterfaces call has a similar issue with being
>>> consistent - and that's really the same issue as reflection.
>>>
>>> There's a lot of small places that can easily become inconsistent -
>>> and therefore a lot of places that need to be checked - to hide
>>> injected interfaces.  The easiest solution to that is to avoid
>>> injecting interfaces in cases where javac can do it for us so the VM
>>> has a consistent view.
>>>
>>>
>>> I think you may be envisioning extra complexity that isn't needed here. The plan of record is that we *won't* hide injected interfaces.
>> +1.  I'm 100% on board with this approach.  It cleans up a lot of the
>> potential corner cases.
>>
>>> Our hope is that the implicit/explicit distinction is meaningless, that is, that turning implicit into explicit via JVMTI would be a 100% equivalent change. I don't know JVMTI well, so I'm not sure if there's some reason to think that wouldn't be acceptable...
>> JVMTI's "GetImplementedInterfaces" spec will need some adaptation as
>> it currently states "Return the direct super-interfaces of this class.
>> For a class, this function returns the interfaces declared in its
>> implements clause."
>>
>> The ClassFileLoadHook (CFLH) runs either with the original bytecodes
>> as passed to the VM (the first time) or with "morally equivalent"
>> bytecodes recreated by the VM from its internal classfile formats.
>> The first time through the process the agent may see a value class
>> that doesn't have the VO interface directly listed while after a call
>> to {retransform,redefine}Classes, the VO interface may be directly
>> listed.  The same issues apply to the IO interface with legacy
>> classfiles so with some minor spec updates, we can paper over that.
>>
>> Those are the only two places: GetImplementedInterfaces & CFLH and
>> related redefine/retransform functions, I can find in the JVMTI spec
>> that would be affected.  Some minor spec updates should be able to
>> address both to ensure an inconsistency in the observed behaviour is
>> treated as valid.
> Useful details, thanks.
>
> Would it be a problem if the ClassFileLoadHook gives different answers depending on the timing of the request (derived from original bytecodes vs. JVM-internal data)? If we need consistent answers, it may be that the "original bytecode" approach needs to reproduce the JVM's inference logic. If it's okay for the answers to change, there's less work to do.
>
> To highlight your last point: we *will* need to work this out for inferred IdentityObject, whether we decide to infer ValueObject or not.

From forax at univ-mlv.fr  Mon Dec 20 19:05:58 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 20 Dec 2021 20:05:58 +0100 (CET)
Subject: JEP update: Value Objects
In-Reply-To: <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
 <6776971B-F8B1-416D-8A4F-32EAE842AC03@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
 <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
 <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
 <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
Message-ID: <816087489.174195.1640027158110.JavaMail.zimbra@u-pem.fr>

Brian, 
the last time we talked about IdentityObject and ValueObject, you said that you were aware that introducing those interfaces will break some existing code,
but you wanted to know if it was a lot of code or not.

So i do not understand now why you want to mix IdentityObject/ValueObject with the runtime behavior; it seems risky, and if we need to back out the introduction of those interfaces, it will be more work than it should be.
Decoupling the typing part and the runtime behavior seems a better solution. 

Moreover, the split between IdentityObject and ValueObject makes less sense now that we have 3 kinds of value objects, the identityless reference (B2), the primitive (B3) and the builtin primitive (B4). 
Why do we want these types to be seen in the type system but not, for example, the set containing only B3 and B4?

Rémi

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "daniel smith" <daniel.smith at oracle.com>, "Dan Heidinga"
> <heidinga at redhat.com>
> Cc: "John Rose" <john.r.rose at oracle.com>, "valhalla-spec-experts"
> <valhalla-spec-experts at openjdk.java.net>
> Sent: Monday, 20 December 2021 18:54:01
> Subject: Re: JEP update: Value Objects

> I was working on some docs and am not sure if we came to a conclusion on the
> rules about who may, may not, or must declare ValueObject or IdentityObject.

> Let me see if I can chart the boundaries of the design space. I'll start with
> IdentityObject since it is more constrained.

> - Clearly for legacy classes, the VM is going to have to infer and inject
> IdentityObject.
> - Since IdentityObject is an interface, it is inherited; if my super implements
> IO, so do I.
> - It seems desirable that a user be *allowed* to name IdentityObject as a
> superinterface of an interface or abstract class, which constrains what
> subclasses can do. (Alternately we could spell this "value interface" or "value
> abstract class"; this is a separate set of tradeoffs.)
> - There is value in having exactly one way to say certain things; it reduces the
> space of what has to be specified and tested.
> - I believe our goal is to know everything we need to know at class load time,
> and not to have to go back and do complex checks on a supertype when a subclass
> is loaded.

> The choice space seems to be
> user { must, may, may not } specify IO on concrete classes
> x compiler { must, may, may not } specify IO when ACC_VALUE present
> x VM (and reflection) { mops up }

> where "mopping up" minimally includes dealing with legacy classfiles.

> Asking the user to say "IdentityObject" on each identity class seems ridiculous,
> so we can drop that one.

> user { may, may not } specify IO on concrete classes
> x compiler { must, may, may not } specify IO when ACC_VALUE present
> x VM (and reflection) { mops up }

> From a user model perspective, it seems arbitrary to say the user may not
> explicitly say IO for concrete classes, but may do so for abstract classes. So
> the two consistent user choices are either:

> - User can say "implements IO" anywhere they like
> - User cannot say "implements IO" anywhere, and instead we have an "identity"
> modifier which is optional on concrete classes and acts as a constraint on
> abstract classes/interfaces.

> While having an "identity" modifier is nice from a completeness perspective, the
> fact that it is probably erased to "implements IdentityObject" creates
> complication for reflection (and another asymmetry between reflection and
> javax.lang.model). So it seems that just letting users say "implements
> IdentityObject" is reasonable.

> Given that the user has a choice, there is little value in "compiler may not
> inject", so the choice for the compiler here is "must" vs "may" inject. Which
> is really asking whether we want to draw the VM line at legacy vs new
> classfiles, or merely adding IO as a default when nothing else has been
> selected. Note that asking the compiler to inject based on ACC_VALUE is also
> asking pretty much everything that touches bytecode to do this too, and likely
> to generate more errors from bytecode manglers. The VM is doing inference
> either way, what we get to choose here is the axis.

> Let's put a pin in IO and come back to VO.

> The user is already saying "value", and we're stuck with the default being
> "identity". Unless we want to have the user say "value interface" for a
> value-only interface (which moves some complexity into reflection, but is also
> a consistent model), I think we're stuck with letting the user specify either
> IO/VO on an abstract class / interface, which sort of drags us towards letting
> the user say it (redundantly) on concrete classes too.

> The compiler and VM will always type-check the consistency of the value
> keyword/bit and the implements clause. So the real question is where the
> inference/injection happens. And the VM will have to do injection for at least
> IO at least for legacy classes.

> So the choices for VM infer&inject seem to be:

> - Only inject IO for legacy concrete classes, based on classfile version,
> otherwise require everything to be explicit;
> - Inject IO for concrete classes when ACC_VALUE is not present, require VO to be
> explicit;
> - Inject IO for concrete classes when ACC_VALUE is not present; inject VO for
> concrete classes when ACC_VALUE is present

> Is infer&inject measurably more costly than just ordinary classfile checking? It
> seems to me that if all things are equal, the simpler injection rule is
> preferable (the third), mostly on the basis of what it asks of humans who write
> code to manipulate bytecode, but if there's a real cost to the injection, then
> having the compiler help out is reasonable. (But in that case, it probably
> makes sense for the compiler to help out in all cases, not just VO.)
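
The three candidate infer&inject rules quoted above can be written as one small decision function. This is only a sketch for comparing them: the names `Policy`, `Injected`, `inject` and the cutoff constant `FIRST_VALUE_AWARE_VERSION` are invented for illustration, and it assumes the class does not already declare or inherit either interface.

```java
public class InjectionRules {
    /** The three options from the list above, in order. */
    public enum Policy { LEGACY_ONLY, IO_WHEN_NOT_VALUE, IO_AND_VO }

    /** Which interface (if any) the VM would add. */
    public enum Injected { NONE, IDENTITY_OBJECT, VALUE_OBJECT }

    // Hypothetical placeholder: first classfile version that knows about ACC_VALUE.
    public static final int FIRST_VALUE_AWARE_VERSION = 99;

    public static Injected inject(Policy policy, boolean concrete,
                                  boolean accValue, int classfileVersion) {
        if (!concrete) {
            return Injected.NONE; // all three rules only talk about concrete classes
        }
        switch (policy) {
            case LEGACY_ONLY:
                // Only legacy classfiles get IO; new ones must be explicit.
                return classfileVersion < FIRST_VALUE_AWARE_VERSION
                        ? Injected.IDENTITY_OBJECT : Injected.NONE;
            case IO_WHEN_NOT_VALUE:
                // IO is the default; VO must be spelled out in the classfile.
                return accValue ? Injected.NONE : Injected.IDENTITY_OBJECT;
            case IO_AND_VO:
                // The access flag alone decides which interface is added.
                return accValue ? Injected.VALUE_OBJECT : Injected.IDENTITY_OBJECT;
            default:
                throw new AssertionError(policy);
        }
    }

    public static void main(String[] args) {
        // A concrete value class in a new-format classfile, under each rule.
        for (Policy p : Policy.values()) {
            System.out.println(p + " -> " + inject(p, true, true, FIRST_VALUE_AWARE_VERSION));
        }
    }
}
```

Under the third rule the result never depends on the classfile version, which is the sense in which it asks least of tools that rewrite bytecode.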

> On 12/2/2021 6:11 PM, Dan Smith wrote:

>>> On Dec 2, 2021, at 1:04 PM, Dan Heidinga <heidinga at redhat.com> wrote:

>>> On Thu, Dec 2, 2021 at 10:05 AM Dan Smith <daniel.smith at oracle.com> wrote:

>>>> On Dec 2, 2021, at 7:08 AM, Dan Heidinga <heidinga at redhat.com> wrote:

>>>> When converting back from our internal form to a classfile for the
>>>> JVMTI RetransformClasses agents, I need to either filter the interface
>>>> out if we injected it or not if it was already there.  JVMTI's
>>>> GetImplementedInterfaces call has a similar issue with being
>>>> consistent - and that's really the same issue as reflection.

>>>> There's a lot of small places that can easily become inconsistent -
>>>> and therefore a lot of places that need to be checked - to hide
>>>> injected interfaces.  The easiest solution to that is to avoid
>>>> injecting interfaces in cases where javac can do it for us so the VM
>>>> has a consistent view.

>>>> I think you may be envisioning extra complexity that isn't needed here. The plan
>>>> of record is that we *won't* hide injected interfaces.

>>> +1.  I'm 100% on board with this approach.  It cleans up a lot of the
>>> potential corner cases.

>>>> Our hope is that the implicit/explicit distinction is meaningless - that turning
>>>> implicit into explicit via JVMTI would be a 100% equivalent change. I don't
>>>> know JVMTI well, so I'm not sure if there's some reason to think that wouldn't
>>>> be acceptable...

>>> JVMTI's "GetImplementedInterfaces" spec will need some adaptation as
>>> it currently states "Return the direct super-interfaces of this class.
>>> For a class, this function returns the interfaces declared in its
>>> implements clause."

>>> The ClassFileLoadHook (CFLH) runs either with the original bytecodes
>>> as passed to the VM (the first time) or with "morally equivalent"
>>> bytecodes recreated by the VM from its internal classfile formats.
>>> The first time through the process the agent may see a value class
>>> that doesn't have the VO interface directly listed while after a call
>>> to {retransform,redefine}Classes, the VO interface may be directly
>>> listed.  The same issues apply to the IO interface with legacy
>>> classfiles so with some minor spec updates, we can paper over that.

>>> Those are the only two places: GetImplementedInterfaces & CFLH and
>>> related redefine/retransform functions, I can find in the JVMTI spec
>>> that would be affected.  Some minor spec updates should be able to
>>> address both to ensure an inconsistency in the observed behaviour is
>>> treated as valid.

>> Useful details, thanks.

>> Would it be a problem if the ClassFileLoadHook gives different answers depending
>> on the timing of the request (derived from original bytecodes vs. JVM-internal
>> data)? If we need consistent answers, it may be that the "original bytecode"
>> approach needs to reproduce the JVM's inference logic. If it's okay for the
>> answers to change, there's less work to do.

>> To highlight your last point: we *will* need to work this out for inferred
>> IdentityObject, whether we decide to infer ValueObject or not.

From brian.goetz at oracle.com  Mon Dec 20 19:26:01 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 20 Dec 2021 14:26:01 -0500
Subject: Do we even need IO/VO interfaces? (was: JEP update: Value Objects)
In-Reply-To: <816087489.174195.1640027158110.JavaMail.zimbra@u-pem.fr>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
 <6776971B-F8B1-416D-8A4F-32EAE842AC03@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
 <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
 <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
 <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
 <816087489.174195.1640027158110.JavaMail.zimbra@u-pem.fr>
Message-ID: <d8c6fb44-e6df-2767-db24-0782fcd464f6@oracle.com>

I thought we were wrapping this up; I'm not sure how we got back to "do 
we even need these at all", but OK.  Splitting off a separate (hopefully 
short) thread.

These interfaces serve both a dynamic and static role. Statically, they 
allow us to constrain inputs, such as:

    void runWithLock(IdentityObject lock, Runnable task)

and similar use in generic type bounds.

Dynamically, they allow code to check before doing something partial:

    if (x instanceof IdentityObject) { synchronized(x) { ... } }

rather than trying and dealing with IMSE.
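
Both roles can be tried out today with a stand-in. The sketch below assumes `IdentityObject` is an ordinary marker interface; it is declared locally because no released JDK has it, and the class names `IdentityRoles` and `Lock` are invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class IdentityRoles {
    // Hypothetical stand-in for the proposed interface (no methods).
    public interface IdentityObject {}

    public static final class Lock implements IdentityObject {}

    // Static role: the parameter type rules out non-identity arguments at compile time.
    public static void runWithLock(IdentityObject lock, Runnable task) {
        synchronized (lock) {
            task.run();
        }
    }

    // Dynamic role: test first instead of attempting to synchronize and failing.
    public static boolean runIfLockable(Object x, Runnable task) {
        if (x instanceof IdentityObject) {
            synchronized (x) {
                task.run();
            }
            return true;
        }
        return false; // not known to have identity, so nothing was run
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        runWithLock(new Lock(), counter::incrementAndGet);
        boolean ranOnLock = runIfLockable(new Lock(), counter::incrementAndGet);
        // String does not implement the local stand-in, so the task is skipped.
        boolean ranOnString = runIfLockable("plain", counter::incrementAndGet);
        System.out.println(counter.get() + " " + ranOnLock + " " + ranOnString);
        // prints "2 true false"
    }
}
```

With the real interface every legacy identity class would pass the `instanceof` test; here only classes that name the stand-in explicitly do.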

Introducing new interfaces that have no methods is clearly source- and 
binary compatible, so I am not particularly compelled by "some very 
brittle and badly written code might break."  So far, no one has 
proposed any examples that would make us reconsider that.

As to "value class" vs "primitive class" vs "built in primitive", I see 
no reason to add *additional* mechanisms by which to distinguish these 
in either the static or dynamic type systems; the salient difference is 
identity vs value. (Reflection will almost certainly give us means to 
ask questions about how the class was declared, though.)

As to B3: instanceof operates on reference types, so (at least from a 
pure spec / model perspective), `x instanceof T` gets answered on value 
instances by lifting to the reference type, and answering the question 
there.  So it would not even be a sensible question to ask "are you a 
primitive value vs primitive reference"; subtyping is a "reference 
affordance", and questions about subtyping are answered in the reference 
domain.

And to B4: the goal is to make B3 and B4 as similar as possible; there 
are going to be obvious ways in which we can't do this, but this should 
not be relevant to either the static or dynamic type system.


On 12/20/2021 2:05 PM, Remi Forax wrote:
> Brian,
> the last time we talked about IdentityObject and ValueObject, you said 
> that you were aware that introducing those interfaces will break some 
> existing codes,
> but you wanted to know if it was a lot of codes or not.
>
> So i do not understand now why you want to mix 
> IdentityObject/ValueObject with the runtime behavior, it seems risky 
> and if we need to backout the introduction of those interfaces, it 
> will more work than it should.
> Decoupling the typing part and the runtime behavior seems a better 
> solution.
>
> Moreover, the split between IdentityObject and ValueObject makes less 
> sense now that we have 3 kinds of value objects, the identityless 
> reference (B2), the primitive (B3) and the builtin primitive (B4).
> Why do we want these types to be seen in the type system but not by 
> example the set containing only B3 and B4 ?
>
> Rémi
>
> ------------------------------------------------------------------------
>
>     *From: *"Brian Goetz" <brian.goetz at oracle.com>
>     *To: *"daniel smith" <daniel.smith at oracle.com>, "Dan Heidinga"
>     <heidinga at redhat.com>
>     *Cc: *"John Rose" <john.r.rose at oracle.com>,
>     "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
>     *Sent: *Monday 20 December 2021 18:54:01
>     *Subject: *Re: JEP update: Value Objects
>
>     I was working on some docs and am not sure if we came to a
>     conclusion on the rules about who may, may not, or must declare
>     ValueObject or IdentityObject.
>
>     Let me see if I can chart the boundaries of the design space.
>     I'll start with IdentityObject since it is more constrained.
>
>      - Clearly for legacy classes, the VM is going to have to infer
>        and inject IdentityObject.
>      - Since IdentityObject is an interface, it is inherited; if my
>        super implements IO, so am I.
>      - It seems desirable that a user be *allowed* to name
>        IdentityObject as a superinterface of an interface or abstract
>        class, which constrains what subclasses can do.  (Alternately we
>        could spell this "value interface" or "value abstract class"; this
>        is a separate set of tradeoffs.)
>      - There is value in having exactly one way to say certain things;
>        it reduces the space of what has to be specified and tested.
>      - I believe our goal is to know everything we need to know at
>        class load time, and not to have to go back and do complex checks
>        on a supertype when a subclass is loaded.
>
>     The choice space seems to be
>       user { must, may, may not } specify IO on concrete classes
>       x compiler { must, may, may not } specify IO when ACC_VALUE present
>       x VM (and reflection) { mops up }
>
>     where "mopping up" minimally includes dealing with legacy classfiles.
>
>     Asking the user to say "IdentityObject" on each identity class
>     seems ridiculous, so we can drop that one.
>
>       user { may, may not } specify IO on concrete classes
>       x compiler { must, may, may not } specify IO when ACC_VALUE present
>       x VM (and reflection) { mops up }
>
>     From a user model perspective, it seems arbitrary to say the user
>     may not explicitly say IO for concrete classes, but may so do for
>     abstract classes.  So the two consistent user choices are either:
>
>      - User can say "implements IO" anywhere they like
>      - User cannot say "implements IO" anywhere, and instead we have
>        an "identity" modifier which is optional on concrete classes and
>        acts as a constraint on abstract classes/interfaces.
>
>     While having an "identity" modifier is nice from a completeness
>     perspective, the fact that it is probably erased to "implements
>     IdentityObject" creates complication for reflection (and another
>     asymmetry between reflection and javax.lang.model).  So it seems
>     that just letting users say "implements IdentityObject" is
>     reasonable.
>
>     Given that the user has a choice, there is little value in
>     "compiler may not inject", so the choice for the compiler here is
>     "must" vs "may" inject.  Which is really asking whether we want to
>     draw the VM line at legacy vs new classfiles, or merely adding IO
>     as a default when nothing else has been selected.  Note that
>     asking the compiler to inject based on ACC_VALUE is also asking
>     pretty much everything that touches bytecode to do this too, and
>     likely to generate more errors from bytecode manglers.  The VM is
>     doing inference either way, what we get to choose here is the axis.
>
>     Let's put a pin in IO and come back to VO.
>
>     The user is already saying "value", and we're stuck with the
>     default being "identity".  Unless we want to have the user say
>     "value interface" for a value-only interface (which moves some
>     complexity into reflection, but is also a consistent model), I
>     think we're stuck with letting the user specify either IO/VO on an
>     abstract class / interface, which sort of drags us towards letting
>     the user say it (redundantly) on concrete classes too.
>
>     The compiler and VM will always type-check the consistency of the
>     value keyword/bit and the implements clause.  So the real question
>     is where the inference/injection happens.  And the VM will have to
>     do injection for at least IO at least for legacy classes.
>
>     So the choices for VM infer&inject seem to be:
>
>      - Only inject IO for legacy concrete classes, based on classfile
>        version, otherwise require everything to be explicit;
>      - Inject IO for concrete classes when ACC_VALUE is not present,
>        require VO to be explicit;
>      - Inject IO for concrete classes when ACC_VALUE is not present;
>        inject VO for concrete classes when ACC_VALUE is present
>
>     Is infer&inject measurably more costly than just ordinary
>     classfile checking?  It seems to me that if all things are equal,
>     the simpler injection rule is preferable (the third), mostly on
>     the basis of what it asks of humans who write code to manipulate
>     bytecode, but if there's a real cost to the injection, then having
>     the compiler help out is reasonable.  (But in that case, it
>     probably makes sense for the compiler to help out in all cases,
>     not just VO.)
>
>
>
>     On 12/2/2021 6:11 PM, Dan Smith wrote:
>
>             On Dec 2, 2021, at 1:04 PM, Dan Heidinga<heidinga at redhat.com>  wrote:
>
>             On Thu, Dec 2, 2021 at 10:05 AM Dan Smith<daniel.smith at oracle.com>  wrote:
>
>                 On Dec 2, 2021, at 7:08 AM, Dan Heidinga<heidinga at redhat.com>  wrote:
>
>                 When converting back from our internal form to a classfile for the
>                 JVMTI RetransformClasses agents, I need to either filter the interface
>                 out if we injected it or not if it was already there.  JVMTI's
>                 GetImplementedInterfaces call has a similar issue with being
>                 consistent - and that's really the same issue as reflection.
>
>                 There's a lot of small places that can easily become inconsistent -
>                 and therefore a lot of places that need to be checked - to hide
>                 injected interfaces.  The easiest solution to that is to avoid
>                 injecting interfaces in cases where javac can do it for us so the VM
>                 has a consistent view.
>
>
>                 I think you may be envisioning extra complexity that isn't needed here. The plan of record is that we *won't* hide injected interfaces.
>
>             +1.  I'm 100% on board with this approach.  It cleans up a lot of the
>             potential corner cases.
>
>                 Our hope is that the implicit/explicit distinction is meaningless - that turning implicit into explicit via JVMTI would be a 100% equivalent change. I don't know JVMTI well, so I'm not sure if there's some reason to think that wouldn't be acceptable...
>
>             JVMTI's "GetImplementedInterfaces" spec will need some adaptation as
>             it currently states "Return the direct super-interfaces of this class.
>             For a class, this function returns the interfaces declared in its
>             implements clause."
>
>             The ClassFileLoadHook (CFLH) runs either with the original bytecodes
>             as passed to the VM (the first time) or with "morally equivalent"
>             bytecodes recreated by the VM from its internal classfile formats.
>             The first time through the process the agent may see a value class
>             that doesn't have the VO interface directly listed while after a call
>             to {retransform,redefine}Classes, the VO interface may be directly
>             listed.  The same issues apply to the IO interface with legacy
>             classfiles so with some minor spec updates, we can paper over that.
>
>             Those are the only two places: GetImplementedInterfaces & CFLH and
>             related redefine/retransform functions, I can find in the JVMTI spec
>             that would be affected.  Some minor spec updates should be able to
>             address both to ensure an inconsistency in the observed behaviour is
>             treated as valid.
>
>         Useful details, thanks.
>
>         Would it be a problem if the ClassFileLoadHook gives different answers depending on the timing of the request (derived from original bytecodes vs. JVM-internal data)? If we need consistent answers, it may be that the "original bytecode" approach needs to reproduce the JVM's inference logic. If it's okay for the answers to change, there's less work to do.
>
>         To highlight your last point: we *will* need to work this out for inferred IdentityObject, whether we decide to infer ValueObject or not.
>
>
>

From forax at univ-mlv.fr  Tue Dec 21 00:00:36 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 21 Dec 2021 01:00:36 +0100 (CET)
Subject: Do we even need IO/VO interfaces? (was: JEP update: Value Objects)
In-Reply-To: <d8c6fb44-e6df-2767-db24-0782fcd464f6@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
 <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
 <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
 <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
 <816087489.174195.1640027158110.JavaMail.zimbra@u-pem.fr>
 <d8c6fb44-e6df-2767-db24-0782fcd464f6@oracle.com>
Message-ID: <410768406.203519.1640044836675.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "daniel smith" <daniel.smith at oracle.com>, "Dan Heidinga"
> <heidinga at redhat.com>, "John Rose" <john.r.rose at oracle.com>,
> "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Monday 20 December 2021 20:26:01
> Subject: Do we even need IO/VO interfaces? (was: JEP update: Value Objects)

> I thought we were wrapping this up; I'm not sure how we got back to "do we even
> need these at all", but OK. Splitting off a separate (hopefully short) thread.

> These interfaces serve both a dynamic and static role. Statically, they allow us
> to constrain inputs, such as:

> void runWithLock(IdentityObject lock, Runnable task)

> and similar use in generic type bounds.

> Dynamically, they allow code to check before doing something partial:

> if (x instanceof IdentityObject) { synchronized(x) { ... } }

> rather than trying and dealing with IMSE.
The static role is defeated by the existence of java.lang.Object, a supertype of both IdentityObject and ValueObject.
java.io.Serializable is useless as a type: ObjectOutputStream.writeObject() takes an Object, not a Serializable.
The same goes for Arrays.sort(), which takes an Object[] and not an array of Comparable.
IdentityObject (like Serializable or Comparable) can easily be lost as a type because of the existence of Object.

If the type IdentityObject can be lost, then as a designer there is little point in having a method that takes an IdentityObject as a parameter, because it forces the user of the API to use a cast, so a possible IMSE just becomes a possible CCE.
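
A small sketch of that argument, assuming `IdentityObject` is a plain marker interface (declared locally here, since it is not in any released JDK; `LostStaticType`, `Account`, `lockOn` and `lockOnErased` are made-up names):

```java
public class LostStaticType {
    // Hypothetical stand-in for the proposed interface.
    public interface IdentityObject {}

    public static final class Account implements IdentityObject {}

    // An API that insists on the static type.
    public static String lockOn(IdentityObject lock) {
        synchronized (lock) {
            return "locked";
        }
    }

    // What a caller has to write once the value has travelled through an
    // Object-typed API (a collection of Object, a reflective call, ...).
    public static String lockOnErased(Object o) {
        // lockOn(o);  // does not compile: Object is not an IdentityObject
        try {
            return lockOn((IdentityObject) o); // the cast can fail at run time
        } catch (ClassCastException e) {
            return "CCE";
        }
    }

    public static void main(String[] args) {
        Object erased = new Account();
        System.out.println(lockOnErased(erased));   // the cast succeeds
        System.out.println(lockOnErased("plain"));  // String lacks the marker, the cast fails
    }
}
```

The compile-time guarantee only holds as long as the static type survives; after erasure to Object the check is back at run time, just with a different exception.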

For the dynamic role, x.getClass().isValue() does the same thing more efficiently (unless the VM has a special optimization for IdentityObject).

Moreover, very few methods synchronize on a user-provided object, because doing so makes concurrent code hard to reason about.
Adding a bit to the type system to support code that people should not write is not exactly a win.

> Introducing new interfaces that have no methods is clearly source- and binary
> compatible, so I am not particularly compelled by "some very brittle and badly
> written code might break." So far, no one has proposed any examples that would
> make us reconsider that.
You are forgetting inference; this code will fail to compile:
  class A {}
  class B {}
  var list = List.of(new A(), new B());
  List<Object> list2 = list;
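
The effect can be reproduced with current compilers by giving the two classes an explicit common interface. In this sketch `IdentityObject` is a locally declared stand-in for the injected interface, and the class names are invented; the rejected assignment is left as a comment so that the file still compiles.

```java
import java.util.List;

public class InferenceChange {
    // Hypothetical stand-in for the interface that would be injected.
    public interface IdentityObject {}

    public static class A implements IdentityObject {}
    public static class B implements IdentityObject {}

    // Two classes with nothing in common but Object, as in today's world.
    public static class C {}
    public static class D {}

    public static void main(String[] args) {
        // Only Object in common: the element type is inferred as Object.
        var plain = List.of(new C(), new D());
        List<Object> ok = plain;

        // Shared interface: the inferred element type is now narrower than Object.
        var marked = List.of(new A(), new B());
        // List<Object> broken = marked;  // rejected by javac: incompatible types
        List<? extends Object> viaWildcard = marked;        // a wildcard still works
        List<? extends IdentityObject> narrowed = marked;   // shows the narrower type

        System.out.println(ok.size() + " " + viaWildcard.size() + " " + narrowed.size());
    }
}
```

Code that declares the target type with a wildcard, or names the element type explicitly at the call to List.of, is not affected.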

> As to "value class" vs "primitive class" vs "built in primitive", I see no
> reason to add *additional* mechanisms by which to distinguish these in either
> the static or dynamic type systems; the salient difference is identity vs
> value. (Reflection will almost certainly give us means to ask questions about
> how the class was declared, though.)
Primitives (builtin or not) allow tearing, so we should introduce two interfaces, TearableObject and NonTearableObject, because knowing whether something is tearable clearly changes the algorithms that can be used.

> As to B3: instanceof operates on reference types, so (at least from a pure spec
> / model perspective), `x instanceof T` gets answered on value instances by
> lifting to the reference type, and answering the question there. So it would
> not even be a sensible question to ask "are you a primitive value vs primitive
> reference"; subtyping is a "reference affordance", and questions about
> subtyping are answered in the reference domain.

> And to B4: the goal is to make B3 and B4 as similar as possible; there are going
> to be obvious ways in which we can't do this, but this should not be relevant
> to either the static or dynamic type system.
I agree that B3 and B4 should be as similar as possible, but we still need Class.isPrimitive() to return true only for builtin primitives, to stay backward compatible.

Rémi

> On 12/20/2021 2:05 PM, Remi Forax wrote:

>> Brian,
>> the last time we talked about IdentityObject and ValueObject, you said that you
>> were aware that introducing those interfaces will break some existing codes,
>> but you wanted to know if it was a lot of codes or not.

>> So i do not understand now why you want to mix IdentityObject/ValueObject with
>> the runtime behavior, it seems risky and if we need to backout the introduction
>> of those interfaces, it will more work than it should.
>> Decoupling the typing part and the runtime behavior seems a better solution.

>> Moreover, the split between IdentityObject and ValueObject makes less sense now
>> that we have 3 kinds of value objects, the identityless reference (B2), the
>> primitive (B3) and the builtin primitive (B4).
>> Why do we want these types to be seen in the type system but not by example the
>> set containing only B3 and B4 ?

>> Rémi

>>> From: "Brian Goetz" <brian.goetz at oracle.com>
>>> To: "daniel smith" <daniel.smith at oracle.com>, "Dan Heidinga" <heidinga at redhat.com>
>>> Cc: "John Rose" <john.r.rose at oracle.com>,
>>> "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
>>> Sent: Monday 20 December 2021 18:54:01
>>> Subject: Re: JEP update: Value Objects

>>> I was working on some docs and am not sure if we came to a conclusion on the
>>> rules about who may, may not, or must declare ValueObject or IdentityObject.

>>> Let me see if I can chart the boundaries of the design space. I'll start with
>>> IdentityObject since it is more constrained.

>>> - Clearly for legacy classes, the VM is going to have to infer and inject
>>> IdentityObject.
>>> - Since IdentityObject is an interface, it is inherited; if my super implements
>>> IO, so am I.
>>> - It seems desirable that a user be *allowed* to name IdentityObject as a
>>> superinterface of an interface or abstract class, which constrains what
>>> subclasses can do. (Alternately we could spell this "value interface" or "value
>>> abstract class"; this is a separate set of tradeoffs.)
>>> - There is value in having exactly one way to say certain things; it reduces the
>>> space of what has to be specified and tested.
>>> - I believe our goal is to know everything we need to know at class load time,
>>> and not to have to go back and do complex checks on a supertype when a subclass
>>> is loaded.

>>> The choice space seems to be
>>> user { must, may, may not } specify IO on concrete classes
>>> x compiler { must, may, may not } specify IO when ACC_VALUE present
>>> x VM (and reflection) { mops up }

>>> where "mopping up" minimally includes dealing with legacy classfiles.

>>> Asking the user to say "IdentityObject" on each identity class seems ridiculous,
>>> so we can drop that one.

>>> user { may, may not } specify IO on concrete classes
>>> x compiler { must, may, may not } specify IO when ACC_VALUE present
>>> x VM (and reflection) { mops up }

>>> From a user model perspective, it seems arbitrary to say the user may not
>>> explicitly say IO for concrete classes, but may so do for abstract classes. So
>>> the two consistent user choices are either:

>>> - User can say "implements IO" anywhere they like
>>> - User cannot say "implements IO" anywhere, and instead we have an "identity"
>>> modifier which is optional on concrete classes and acts as a constraint on
>>> abstract classes/interfaces.

>>> While having an "identity" modifier is nice from a completeness perspective, the
>>> fact that it is probably erased to "implements IdentityObject" creates
>>> complication for reflection (and another asymmetry between reflection and
>>> javax.lang.model). So it seems that just letting users say "implements
>>> IdentityObject" is reasonable.

>>> Given that the user has a choice, there is little value in "compiler may not
>>> inject", so the choice for the compiler here is "must" vs "may" inject. Which
>>> is really asking whether we want to draw the VM line at legacy vs new
>>> classfiles, or merely adding IO as a default when nothing else has been
>>> selected. Note that asking the compiler to inject based on ACC_VALUE is also
>>> asking pretty much everything that touches bytecode to do this too, and likely
>>> to generate more errors from bytecode manglers. The VM is doing inference
>>> either way, what we get to choose here is the axis.

>>> Let's put a pin in IO and come back to VO.

>>> The user is already saying "value", and we're stuck with the default being
>>> "identity". Unless we want to have the user say "value interface" for a
>>> value-only interface (which moves some complexity into reflection, but is also
>>> a consistent model), I think we're stuck with letting the user specify either
>>> IO/VO on an abstract class / interface, which sort of drags us towards letting
>>> the user say it (redundantly) on concrete classes too.

>>> The compiler and VM will always type-check the consistency of the value
>>> keyword/bit and the implements clause. So the real question is where the
>>> inference/injection happens. And the VM will have to do injection for at least
>>> IO at least for legacy classes.

>>> So the choices for VM infer&inject seem to be:

>>> - Only inject IO for legacy concrete classes, based on classfile version,
>>> otherwise require everything to be explicit;
>>> - Inject IO for concrete classes when ACC_VALUE is not present, require VO to be
>>> explicit;
>>> - Inject IO for concrete classes when ACC_VALUE is not present; inject VO for
>>> concrete classes when ACC_VALUE is present

>>> Is infer&inject measurably more costly than just ordinary classfile checking? It
>>> seems to me that if all things are equal, the simpler injection rule is
>>> preferable (the third), mostly on the basis of what it asks of humans who write
>>> code to manipulate bytecode, but if there's a real cost to the injection, then
>>> having the compiler help out is reasonable. (But in that case, it probably
>>> makes sense for the compiler to help out in all cases, not just VO.)

>>> On 12/2/2021 6:11 PM, Dan Smith wrote:

>>>>> On Dec 2, 2021, at 1:04 PM, Dan Heidinga <heidinga at redhat.com> wrote:

>>>>> On Thu, Dec 2, 2021 at 10:05 AM Dan Smith <daniel.smith at oracle.com> wrote:

>>>>>> On Dec 2, 2021, at 7:08 AM, Dan Heidinga <heidinga at redhat.com> wrote:

>>>>>> When converting back from our internal form to a classfile for the
>>>>>> JVMTI RetransformClasses agents, I need to either filter the interface
>>>>>> out if we injected it or not if it was already there.  JVMTI's
>>>>>> GetImplementedInterfaces call has a similar issue with being
>>>>>> consistent - and that's really the same issue as reflection.

>>>>>> There's a lot of small places that can easily become inconsistent -
>>>>>> and therefore a lot of places that need to be checked - to hide
>>>>>> injected interfaces.  The easiest solution to that is to avoid
>>>>>> injecting interfaces in cases where javac can do it for us so the VM
>>>>>> has a consistent view.

>>>>>> I think you may be envisioning extra complexity that isn't needed here. The plan
>>>>>> of record is that we *won't* hide injected interfaces.

>>>>> +1.  I'm 100% on board with this approach.  It cleans up a lot of the
>>>>> potential corner cases.

>>>>>> Our hope is that the implicit/explicit distinction is meaningless - that turning
>>>>>> implicit into explicit via JVMTI would be a 100% equivalent change. I don't
>>>>>> know JVMTI well, so I'm not sure if there's some reason to think that wouldn't
>>>>>> be acceptable...

>>>>> JVMTI's "GetImplementedInterfaces" spec will need some adaptation as
>>>>> it currently states "Return the direct super-interfaces of this class.
>>>>> For a class, this function returns the interfaces declared in its
>>>>> implements clause."

>>>>> The ClassFileLoadHook (CFLH) runs either with the original bytecodes
>>>>> as passed to the VM (the first time) or with "morally equivalent"
>>>>> bytecodes recreated by the VM from its internal classfile formats.
>>>>> The first time through the process the agent may see a value class
>>>>> that doesn't have the VO interface directly listed while after a call
>>>>> to {retransform,redefine}Classes, the VO interface may be directly
>>>>> listed.  The same issues apply to the IO interface with legacy
>>>>> classfiles so with some minor spec updates, we can paper over that.

>>>>> Those are the only two places: GetImplementedInterfaces & CFLH and
>>>>> related redefine/retransform functions, I can find in the JVMTI spec
>>>>> that would be affected.  Some minor spec updates should be able to
>>>>> address both to ensure an inconsistency in the observed behaviour is
>>>>> treated as valid.

>>>> Useful details, thanks.

>>>> Would it be a problem if the ClassFileLoadHook gives different answers depending
>>>> on the timing of the request (derived from original bytecodes vs. JVM-internal
>>>> data)? If we need consistent answers, it may be that the "original bytecode"
>>>> approach needs to reproduce the JVM's inference logic. If it's okay for the
>>>> answers to change, there's less work to do.

>>>> To highlight your last point: we *will* need to work this out for inferred
>>>> IdentityObject, whether we decide to infer ValueObject or not.

From brian.goetz at oracle.com  Tue Dec 21 00:07:15 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 20 Dec 2021 19:07:15 -0500
Subject: [External] : Re: Do we even need IO/VO interfaces? (was: JEP
 update: Value Objects)
In-Reply-To: <410768406.203519.1640044836675.JavaMail.zimbra@u-pem.fr>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
 <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
 <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
 <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
 <816087489.174195.1640027158110.JavaMail.zimbra@u-pem.fr>
 <d8c6fb44-e6df-2767-db24-0782fcd464f6@oracle.com>
 <410768406.203519.1640044836675.JavaMail.zimbra@u-pem.fr>
Message-ID: <9cc80909-6f9d-d604-26f7-bb387ccc677b@oracle.com>


>
>
>     Introducing new interfaces that have no methods is clearly source-
>     and binary compatible, so I am not particularly compelled by "some
>     very brittle and badly written code might break." So far, no one
>     has proposed any examples that would make us reconsider that.
>
>
> you are forgetting inference, this code will fail to compile
>   class A {}
>   class B {}
>   var list = List.of(new A(), new B());
>   List<Object> list2 = list;
>

Good catch. There is precedent for leaving certain interfaces out of 
inference, though; I suspect we will want to do this for these 
interfaces too.
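The inference effect can be reproduced today with an ordinary marker interface. The following is only a sketch: `Marker` is a hypothetical stand-in for an interface such as IdentityObject that would be added implicitly, and the class names are made up for illustration.

```java
import java.util.List;

// Sketch: how a shared marker interface changes the type inferred for List.of(...).
final class InferenceDemo {
    interface Marker {}                      // hypothetical stand-in for IdentityObject

    static class A {}
    static class B {}
    static class C implements Marker {}
    static class D implements Marker {}

    static int withoutMarker() {
        var list = List.of(new A(), new B()); // A and B share only Object: List<Object>
        List<Object> list2 = list;            // compiles
        return list2.size();
    }

    static int withMarker() {
        var list = List.of(new C(), new D()); // C and D share Marker: List<Marker>
        // List<Object> list2 = list;         // would be rejected: incompatible types
        List<? extends Object> list2 = list;  // a wildcard is needed instead
        return list2.size();
    }
}
```

If the new interfaces were excluded from inference, as suggested above, the second case would behave like the first.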


From forax at univ-mlv.fr  Tue Dec 21 01:00:05 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 21 Dec 2021 02:00:05 +0100 (CET)
Subject: [External] : Re: Do we even need IO/VO interfaces? (was: JEP
 update: Value Objects)
In-Reply-To: <9cc80909-6f9d-d604-26f7-bb387ccc677b@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
 <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
 <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
 <816087489.174195.1640027158110.JavaMail.zimbra@u-pem.fr>
 <d8c6fb44-e6df-2767-db24-0782fcd464f6@oracle.com>
 <410768406.203519.1640044836675.JavaMail.zimbra@u-pem.fr>
 <9cc80909-6f9d-d604-26f7-bb387ccc677b@oracle.com>
Message-ID: <802119714.214685.1640048405163.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "daniel smith" <daniel.smith at oracle.com>, "Dan Heidinga"
> <heidinga at redhat.com>, "John Rose" <john.r.rose at oracle.com>,
> "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Tuesday, December 21, 2021 01:07:15
> Subject: Re: [External] : Re: Do we even need IO/VO interfaces? (was: JEP
> update: Value Objects)

>>> Introducing new interfaces that have no methods is clearly source- and binary
>>> compatible, so I am not particularly compelled by "some very brittle and badly
>>> written code might break." So far, no one has proposed any examples that would
>>> make us reconsider that.
>> you are forgetting inference, this code will fail to compile
>> class A {}
>> class B {}
>> var list = List.of(new A(), new B());
>> List<Object> list2 = list;

> Good catch. There is precedent for leaving certain interfaces out of inference,
> though; I suspect we will want to do this for these interfaces too.
The problem is that these interfaces are only useful if they are propagated along the expression flow. 
But 
- if something is typed Object or Object[], that information is lost 
- if something is typed with an interface, that information is lost (only the concrete classes implement those interfaces) 
- you are saying that in case of inference, they are removed from the flow too. 

It seems they are only useful once in a blue moon. 

Rémi 

From daniel.smith at oracle.com  Tue Dec 21 20:07:17 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 21 Dec 2021 20:07:17 +0000
Subject: JEP update: Value Objects
In-Reply-To: <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
References: <68250ADC-90BB-43EC-A646-77127091D4BD@oracle.com>
 <CAJq4Gi4gjzZHkbuCfSomZsd7vzSuAHbOXuYRG=Wcbq1DD38n=Q@mail.gmail.com>
 <117E6CD9-9D94-4110-BA40-3778FC207977@oracle.com>
 <8AD4B184-2937-4146-A763-612E31E64683@oracle.com>
 <6776971B-F8B1-416D-8A4F-32EAE842AC03@oracle.com>
 <CAJq4Gi5jDq8jn=p6kXxSPPhg9PaCD7do+gdp=EzHRN8upGKzVQ@mail.gmail.com>
 <82A9C5AA-F0F3-4FB7-BF36-B6557103080E@oracle.com>
 <CAJq4Gi47XDQHNzOL4JYnNAOiDjAGh9r_zQRqGQGjq=i8R8wE7A@mail.gmail.com>
 <FDDC8884-C09A-4008-8E4A-EE3553C09250@oracle.com>
 <e663d0c3-ce48-3dd1-1f1c-2d372ea82b48@oracle.com>
Message-ID: <F1C67BCB-DBAB-49A4-B9BD-6253C093BA0F@oracle.com>

> On Dec 20, 2021, at 10:54 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> 
> So the choices for VM infer&inject seem to be:
> 
>  - Only inject IO for legacy concrete classes, based on classfile version, otherwise require everything to be explicit;
>  - Inject IO for concrete classes when ACC_VALUE is not present, require VO to be explicit;
>  - Inject IO for concrete classes when ACC_VALUE is not present; inject VO for concrete classes when ACC_VALUE is present
> 
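As a rough illustration only, the three options quoted above can be written as a decision function. Everything here is hypothetical: the enum names, the `legacyVersion` flag (standing for "classfile version predates the feature"), and the assumption that non-concrete classes never receive an injected interface.

```java
// Sketch of the three infer-and-inject options as a decision table.
final class InjectionPolicySketch {
    enum Policy { LEGACY_ONLY, IO_ONLY, IO_AND_VO }
    enum Injected { NONE, IDENTITY_OBJECT, VALUE_OBJECT }

    static Injected decide(Policy policy, boolean concrete,
                           boolean legacyVersion, boolean accValue) {
        if (!concrete) {
            return Injected.NONE; // assumption: only concrete classes are candidates
        }
        switch (policy) {
            case LEGACY_ONLY: // option 1: inject IO based on classfile version only
                return legacyVersion ? Injected.IDENTITY_OBJECT : Injected.NONE;
            case IO_ONLY:     // option 2: inject IO when ACC_VALUE is absent; VO explicit
                return accValue ? Injected.NONE : Injected.IDENTITY_OBJECT;
            case IO_AND_VO:   // option 3: inject IO or VO according to ACC_VALUE
                return accValue ? Injected.VALUE_OBJECT : Injected.IDENTITY_OBJECT;
            default:
                throw new IllegalArgumentException("unknown policy: " + policy);
        }
    }
}
```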

One more dimension to this is whether "inject" and "require" are talking about an element in the `interfaces` array of the declaration, or simply the presence of the interface via some combination of inheritance/declaration.

The latter seems more natural. But in "require" cases, it leads to surprising binary incompatibilities (per some comments I made earlier in the thread):

1) declare `interface Foo extends ValueObject` and `value class Bar implements Foo`

2) compile; javac excludes ValueObject from Bar's `interfaces`

3) Modify Foo, removing `extends ValueObject` (turns out I was overly eager when I put in that constraint, and I actually wouldn't mind subclasses that are identity classes)

4) recompile Foo separately, which succeeds

5) Try running, and discover that class Bar refuses to load, with an error saying it doesn't implement ValueObject ("of course it does!" you say, "it's a value class")

Inference is nice in that it will happily paper over these sorts of separate compilation mismatches.
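The five steps can be simulated with a toy model of the load-time check. This is a sketch under stated assumptions, not VM behavior: types are plain strings, `declared` maps each type to its directly declared superinterfaces, and the "require" and "infer" methods are hypothetical simplifications of the two approaches.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of "require" versus "infer" for a value class at load time.
final class LoadCheckSketch {

    // True if `target` is reachable from `type` through declared superinterfaces.
    static boolean implementsInterface(Map<String, List<String>> declared,
                                       String type, String target) {
        Deque<String> work = new ArrayDeque<>(declared.getOrDefault(type, List.of()));
        Set<String> seen = new HashSet<>();
        while (!work.isEmpty()) {
            String t = work.pop();
            if (!seen.add(t)) continue;
            if (t.equals(target)) return true;
            work.addAll(declared.getOrDefault(t, List.of()));
        }
        return false;
    }

    // "require": the value class loads only if ValueObject is already present.
    static boolean loadsUnderRequire(Map<String, List<String>> declared, String valueClass) {
        return implementsInterface(declared, valueClass, "ValueObject");
    }

    // "infer": the VM adds ValueObject when it is missing, so loading succeeds.
    static boolean loadsUnderInfer(Map<String, List<String>> declared, String valueClass) {
        Map<String, List<String>> copy = new HashMap<>(declared);
        if (!implementsInterface(copy, valueClass, "ValueObject")) {
            List<String> ifaces = new ArrayList<>(copy.getOrDefault(valueClass, List.of()));
            ifaces.add("ValueObject");
            copy.put(valueClass, ifaces);
        }
        return implementsInterface(copy, valueClass, "ValueObject");
    }
}
```

With `Foo -> [ValueObject]` and `Bar -> [Foo]`, both checks pass; after step 3 (`Foo -> []`) the "require" check fails for Bar while the "infer" check still passes.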

From brian.goetz at oracle.com  Thu Dec 23 17:14:43 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 23 Dec 2021 17:14:43 +0000
Subject: Updated State of Valhalla documents
Message-ID: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>

Just in time for Christmas, the latest State of Valhalla is available!

https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/01-background
https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/02-object-model
https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/03-vm-model

The main focus for the last year has been finding the right way to expose the Valhalla features in the user model, in a way that is cleanly factored, intuitive, and clearly connects with where the platform has come from.  I am very pleased with where this has landed.

There are several more installments in the works, but these should give plenty to chew on for now!

Simple corrections accepted as PRs against valhalla-docs.


From forax at univ-mlv.fr  Thu Dec 23 18:35:16 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 23 Dec 2021 19:35:16 +0100 (CET)
Subject: Updated State of Valhalla documents
In-Reply-To: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
References: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
Message-ID: <1471584940.1211017.1640284516318.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, December 23, 2021 6:14:43 PM
> Subject: Updated State of Valhalla documents

> Just in time for Christmas, the latest State of Valhalla is available!

> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/01-background
> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/02-object-model
> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/03-vm-model

> The main focus for the last year has been finding the right way to expose the
> Valhalla features in the user model, in a way that is cleanly factored,
> intuitive, and clearly connects with where the platform has come from. I am
> very pleased with where this has landed.

> There are several more installments in the works, but these should give plenty
> to chew on for now!

I've done a rapid reading, 
in the object-model 
primitive class Point implements Serializable 

should be 
primitive Point implements Serializable 

"value" is a modifier but "primitive" is a top level type. 

The design in part 3 is cool, because if i'm not mistaken, you can implement value classes without the support of Qtype in the classfile. 

Rémi 

From john.r.rose at oracle.com  Thu Dec 23 18:51:14 2021
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 23 Dec 2021 18:51:14 +0000
Subject: Updated State of Valhalla documents
In-Reply-To: <1471584940.1211017.1640284516318.JavaMail.zimbra@u-pem.fr>
References: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
 <1471584940.1211017.1640284516318.JavaMail.zimbra@u-pem.fr>
Message-ID: <112398AD-D910-4AF4-8644-844A07DE6539@oracle.com>


On Dec 23, 2021, at 10:35 AM, Remi Forax <forax at univ-mlv.fr> wrote:

>> From: "Brian Goetz" <brian.goetz at oracle.com>
>> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
>> Sent: Thursday, December 23, 2021 6:14:43 PM
>> Subject: Updated State of Valhalla documents
>>
>> Just in time for Christmas, the latest State of Valhalla is available!
>>
>> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/01-background
>> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/02-object-model
>> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/03-vm-model
>>
>> The main focus for the last year has been finding the right way to expose the Valhalla features in the user model, in a way that is cleanly factored, intuitive, and clearly connects with where the platform has come from.  I am very pleased with where this has landed.
>>
>> There are several more installments in the works, but these should give plenty to chew on for now!
>
> I've done a rapid reading,
> in the object-model
>    primitive class Point implements Serializable
>
> should be
>    primitive Point implements Serializable
>
> "value" is a modifier but "primitive" is a top level type.

I call bike shed on that!  Since a primitive class file defines two types we have a choice in how to convey that in the source notation. This may evolve further of course and even to the place you suggest.

> The design in part 3 is cool, because if i'm not mistaken, you can implement value classes without the support of Qtype in the classfile.

Thank you. That is correct!  This is a big result of the refactoring work, and leads to a lower total complexity.

> Rémi

From forax at univ-mlv.fr  Thu Dec 23 19:26:08 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 23 Dec 2021 20:26:08 +0100 (CET)
Subject: Updated State of Valhalla documents
In-Reply-To: <112398AD-D910-4AF4-8644-844A07DE6539@oracle.com>
References: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
 <1471584940.1211017.1640284516318.JavaMail.zimbra@u-pem.fr>
 <112398AD-D910-4AF4-8644-844A07DE6539@oracle.com>
Message-ID: <394037657.1216754.1640287568639.JavaMail.zimbra@u-pem.fr>

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "valhalla-spec-experts"
> <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, December 23, 2021 7:51:14 PM
> Subject: Re: Updated State of Valhalla documents

>> On Dec 23, 2021, at 10:35 AM, Remi Forax <forax at univ-mlv.fr> wrote:

>>> From: "Brian Goetz" <brian.goetz at oracle.com>
>>> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
>>> Sent: Thursday, December 23, 2021 6:14:43 PM
>>> Subject: Updated State of Valhalla documents

>>> Just in time for Christmas, the latest State of Valhalla is available!

>>> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/01-background
>>> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/02-object-model
>>> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/03-vm-model

>>> The main focus for the last year has been finding the right way to expose the
>>> Valhalla features in the user model, in a way that is cleanly factored,
>>> intuitive, and clearly connects with where the platform has come from. I am
>>> very pleased with where this has landed.

>>> There are several more installments in the works, but these should give plenty
>>> to chew on for now!

>> I've done a rapid reading,
>> in the object-model
>> primitive class Point implements Serializable

>> should be
>> primitive Point implements Serializable

>> "value" is a modifier but "primitive" is a top level type.

> I call bike shed on that! Since a primitive class file defines two types we have
> a choice in how to convey that in the source notation. This may evolve further
> of course and even to the place you suggest.

For "value", we know that we want value class and value record, so it's more like a modifier. 
For primitive, do we want a primitive record ? The VM supports it, but do we want to offer that possibility in Java ? 
My gut feeling is that the answer is "No" because of what Kevin said earlier, we should drive users to use value classes instead of primitives. 

>> The design in part 3 is cool, because if i'm not mistaken, you can implement
>> value classes without the support of Qtype in the classfile.

> Thank you. That is correct! This is a big result of the refactoring work, and
> leads to a lower total complexity.

yes ! 

Rémi 

From john.r.rose at oracle.com  Thu Dec 23 19:43:22 2021
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 23 Dec 2021 11:43:22 -0800
Subject: [External] : Re: Updated State of Valhalla documents
In-Reply-To: <394037657.1216754.1640287568639.JavaMail.zimbra@u-pem.fr>
References: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
 <1471584940.1211017.1640284516318.JavaMail.zimbra@u-pem.fr>
 <112398AD-D910-4AF4-8644-844A07DE6539@oracle.com>
 <394037657.1216754.1640287568639.JavaMail.zimbra@u-pem.fr>
Message-ID: <056021F6-4BDF-43C8-A3DC-C9D2B064FC59@oracle.com>

On 23 Dec 2021, at 11:26, forax at univ-mlv.fr wrote:

> For "value", we know that we want value class and value record, so 
> it's more like a modifier.
> For primitive, do we want a primitive record ? The VM supports it, but 
> do we want to offer that possibility in Java ?
> My gut feeling is that the answer is "No" because of what Kevin said 
> earlier, we should drive users to use value classes instead of 
> primitives.

Good points, though not sure if they carry the decision completely the 
other way.  The VM sees primitive as a classfile modifier.  (The 
`ACC_PRIMITIVE` modifier flag!)  You are raising the question of whether 
this is smart for the language as well.  For further discussion and 
perhaps experimentation.

From forax at univ-mlv.fr  Thu Dec 23 19:58:03 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 23 Dec 2021 20:58:03 +0100 (CET)
Subject: [External] : Re: Updated State of Valhalla documents
In-Reply-To: <056021F6-4BDF-43C8-A3DC-C9D2B064FC59@oracle.com>
References: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
 <1471584940.1211017.1640284516318.JavaMail.zimbra@u-pem.fr>
 <112398AD-D910-4AF4-8644-844A07DE6539@oracle.com>
 <394037657.1216754.1640287568639.JavaMail.zimbra@u-pem.fr>
 <056021F6-4BDF-43C8-A3DC-C9D2B064FC59@oracle.com>
Message-ID: <313972565.1229854.1640289483686.JavaMail.zimbra@u-pem.fr>

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "valhalla-spec-experts"
> <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, December 23, 2021 8:43:22 PM
> Subject: Re: [External] : Re: Updated State of Valhalla documents

> On 23 Dec 2021, at 11:26, forax at univ-mlv.fr wrote:

>> For "value", we know that we want value class and value record, so it's more
>> like a modifier.
>> For primitive, do we want a primitive record ? The VM supports it, but do we
>> want to offer that possibility in Java ?
>> My gut feeling is that the answer is "No" because of what Kevin said earlier, we
>> should drive users to use value classes instead of primitives.
> Good points, though not sure if they carry the decision completely the other
> way. The VM sees primitive as a classfile modifier. (The ACC_PRIMITIVE modifier
> flag!) You are raising the question of whether this is smart for the language
> as well. For further discussion and perhaps experimentation.

This ties back into the discussion about where to cut. 

Seen from the VM's point of view, which is concerned with runtime characteristics, 
we have either classical classes or value types, and a value type can be a value class or a primitive class. 

But for Java, I would argue that the model is more like this: 
we have either reference objects or primitives; among reference objects, some have identity and some do not, 
hence "primitive" being a top-level kind while "value" (or a better term) is a modifier. 

Rémi 

From ali.ebrahimi1781 at gmail.com  Fri Dec 24 15:18:06 2021
From: ali.ebrahimi1781 at gmail.com (Ali Ebrahimi)
Date: Fri, 24 Dec 2021 18:48:06 +0330
Subject: Updated State of Valhalla documents
In-Reply-To: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
References: <09EED588-7A6E-4BB8-8DCA-08C29F4E3D73@oracle.com>
Message-ID: <CAA0cW5AxekTjsbDuWRQTsweXx09=H+pEuzF28UJnzc2ZaQu1nQ@mail.gmail.com>

Hi Brian,
Thanks for sharing project's latest direction.

In Identifying identity
<https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/02-object-model#identifying-identity>
part of  02-object-model doc:

Identifying identity

To distinguish between *primitive *and identity classes at compile and run
time, we introduce two restricted interfaces IdentityObject and
ValueObject.


*I think you mean value instead of primitive: *

To distinguish between *value *and *identity *classes .............


On Thu, Dec 23, 2021 at 8:44 PM Brian Goetz <brian.goetz at oracle.com> wrote:

> Just in time for Christmas, the latest State of Valhalla is available!
>
>
> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/01-background
>
> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/02-object-model
>
> https://openjdk.java.net/projects/valhalla/design-notes/state-of-valhalla/03-vm-model
>
> The main focus for the last year has been finding the right way to expose
> the Valhalla features in the user model, in a way that is cleanly factored,
> intuitive, and clearly connects with where the platform has come from.  I
> am very pleased with where this has landed.
>
> There are several more installments in the works, but these should give
> plenty to chew on for now!
>
> Simple corrections accepted as PRs against valhalla-docs.
>
>
>

-- 

Best Regards,
Ali Ebrahimi