From forax at univ-mlv.fr Sat Aug 12 20:39:42 2023
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 12 Aug 2023 22:39:42 +0200 (CEST)
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
Message-ID: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr>

John and I and several others had several interesting discussions yesterday about Valhalla, one of them about the removal of <vnew> and aconst_init/withfield, something I've called in the past solving the last mile issue.

Currently, refactoring from/to a value class/identity class is not a backward compatible move because of the way value classes are initialized using the <vnew> factory method. The reason is that during initialization a class instance is mutable, but the VM considers all value class instances to be immutable.

But at the same time, in order to implement serialization of value classes (more precisely, deserialization), there is a need for a mechanism to tag value class instances as "not yet fully initialized", something John refers to as the object being in the larval state, because deserialization first creates the instance and then populates its fields. In the lw prototypes, this is currently implemented using Unsafe.

I think we now have the opportunity to use the same larval protocol for all value classes, making them binary backward compatible with identity classes, with both of them using the same initialization protocol, the same new/dup/invokespecial dance.

In terms of specification, the idea is that "new" on a value class creates a larval instance and the end of the constructor marks the instance as non-larval/true-value-instance.

I think that using the same initialization protocol at the call site is a good idea. We are re-aligning the bytecode with the Java code. One thing which is currently hard for our users to understand is that the Java code is the same for a value class and an identity class but the generated bytecode is not binary compatible.
This also goes well with the recent move to remove the Q-descriptor; we are moving toward the goal of being fully binary backward compatible.

We may want to modify the verifier to verify that "this" does not escape the constructor in the case of a value class, but I do not think it's a requirement, so it's maybe better not to do that :)

The end of the constructor is already a point where all VMs/JITs are able to emit code (for a store/store barrier or finalizer registration), so we are piggybacking on an existing concept. The only main drawback I see is that the header of a buffered value type has to have one bit to indicate the larval state, and those header bits are really precious.

regards,
Rémi

From john.r.rose at oracle.com Sat Aug 12 22:48:34 2023
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 12 Aug 2023 22:48:34 +0000
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
In-Reply-To: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr>
References: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

The header bit is not a problem because value object headers have no object monitor. Thus, their headers are relatively empty of state. Lots of slack.

In the spirit of minimizing JVMS changes I think this approach might make sense. The new bytecodes paired well with new Q-descriptors and new verifier rules. Not so much now.

Also, we already built the necessary states for serialization. The verifier will also hide them for us in bytecode. That was a new discovery yesterday over burgers.

From heidinga at redhat.com Mon Aug 14 12:53:31 2023
From: heidinga at redhat.com (Dan Heidinga)
Date: Mon, 14 Aug 2023 08:53:31 -0400
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
In-Reply-To:
References: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

Can one of you sketch out the bytecode sequences being proposed? Being able to look at some concrete "before, we used <vnew>/aconst_init/withfield" and "now, we use ....." comparisons would make the proposal clearer.

For a value class like Point, with a "Point changeX(int x)" method, we used withfield to pop an instance off the stack, create a new instance with the single field updated, and put that instance back on the stack, which allows us to preserve immutability without needing a constructor per field we want to change.

Old bytecode:

  Point changeX(int x) {
    aload_0
    iload_1
    withfield "Point:x"
    areturn
  }

What would the new bytecode do?

--Dan

From forax at univ-mlv.fr Mon Aug 14 13:55:43 2023
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 14 Aug 2023 15:55:43 +0200 (CEST)
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
In-Reply-To:
References: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <131730267.12297792.1692021343129.JavaMail.zimbra@univ-eiffel.fr>

> From: "Dan Heidinga"
> To: "John Rose"
> Cc: "Remi Forax", "valhalla-spec-experts"
> Sent: Monday, August 14, 2023 2:53:31 PM
> Subject: Re: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes

> Can one of you sketch out the bytecode sequences being proposed? Being able to
> look at some concrete "before, we used <vnew>/aconst_init/withfield" and "now, we
> use ....." comparisons would make the proposal clearer.
> For a value class like Point, with a "Point changeX(int x)" method, we used
> withfield to pop an instance off the stack, create a new instance with the
> single field updated, and put that instance back on the stack, which allows us
> to preserve immutability without needing a constructor per field we want to
> change.
> Old bytecode:
>   Point changeX(int x) {
>     aload_0
>     iload_1
>     withfield "Point:x"
>     areturn
>   }
> What would the new bytecode do?

Sure. For the body of the constructor of a value class, the bytecode is almost the same as for an identity class; the only difference is that the call to super() is not emitted.

The constructor of Point:

  <init>(int x, int y) {
    aload 0
    iload 1
    putfield Field:x
    aload 0
    iload 2
    putfield Field:y
    return
  } // here, 'this' is not larval anymore

For all the other methods, a value class and an identity class use the same bytecode. For the method Point:changeX:

  Point changeX(int x) {
    new Point   // in larval state
    dup
    iload 1
    aload 0
    getfield Field:y
    invokespecial Method Point.<init>:(II)V
    areturn     // not larval anymore
  }

> --Dan

Rémi
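
For readers who prefer source to bytecode, the constructor and changeX listings above correspond roughly to the Java below. This is only a minimal sketch, not code from the thread: it uses a plain final class as a stand-in for a value class (the `value` modifier from the Valhalla drafts is left out so that it compiles on a stock JDK), and the accessors `x()` and `y()` and the `main` method are assumptions added for illustration.

```java
// Stand-in for a value class: final class, final fields, no identity-dependent code.
final class Point {
    private final int x;
    private final int y;

    // Compiles to an <init> method taking (II)V that stores both fields with putfield.
    // For an identity class javac also emits the implicit super() call first.
    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // "Changing" one field means passing every field through the constructor:
    // new / dup / push new x / push existing y / invokespecial <init> / areturn.
    Point changeX(int x) {
        return new Point(x, this.y);
    }

    int x() { return x; }
    int y() { return y; }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.changeX(10);
        System.out.println(q.x() + "," + q.y()); // prints 10,2
        System.out.println(p.x() + "," + p.y()); // prints 1,2 (p is unchanged)
    }
}
```

The point of the proposal, as described above, is that this same source would produce the same call-site bytecode whether Point is declared as a value class or as an identity class.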

From heidinga at redhat.com Mon Aug 14 14:35:31 2023
From: heidinga at redhat.com (Dan Heidinga)
Date: Mon, 14 Aug 2023 10:35:31 -0400
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
In-Reply-To: <131730267.12297792.1692021343129.JavaMail.zimbra@univ-eiffel.fr>
References: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr> <131730267.12297792.1692021343129.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

On Mon, Aug 14, 2023 at 9:55 AM wrote:

> Sure. For the body of the constructor of a value class, the bytecode is
> almost the same as for an identity class; the only difference is that the
> call to super() is not emitted.
>
> The constructor of Point:
>   <init>(int x, int y) {
>     aload 0
>     iload 1
>     putfield Field:x
>     aload 0
>     iload 2
>     putfield Field:y
>     return
>   } // here, 'this' is not larval anymore
>
> For all the other methods, a value class and an identity class use the same
> bytecode. For the method Point:changeX:
>   Point changeX(int x) {
>     new Point   // in larval state
>     dup
>     iload 1
>     aload 0
>     getfield Field:y
>     invokespecial Method Point.<init>:(II)V
>     areturn     // not larval anymore
>   }

If I'm following correctly, changing one field requires stacking all the fields and calling the constructor?

From a VM perspective withfield is nice, but as I dig through my notes, I don't see anything on how we'd expose it in the language. All the examples I'm finding either generate the withfield bytecode directly or are Java source that shows the fields passing through a constructor. Do we have a model for how javac would generate withfield, or have we already defaulted to the ctor model?

--Dan

From forax at univ-mlv.fr Mon Aug 14 14:54:10 2023
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 14 Aug 2023 16:54:10 +0200 (CEST)
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
In-Reply-To:
References: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr> <131730267.12297792.1692021343129.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <1013000521.12303933.1692024850645.JavaMail.zimbra@univ-eiffel.fr>

> From: "Dan Heidinga"
> To: "Remi Forax"
> Cc: "John Rose", "valhalla-spec-experts"
> Sent: Monday, August 14, 2023 4:35:31 PM
> Subject: Re: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes

> If I'm following correctly, changing one field requires stacking all the fields
> and calling the constructor?

yes,

> From a VM perspective withfield is nice, but as I dig through my notes, I don't
> see anything on how we'd expose it in the language. All the examples I'm
> finding either generate the withfield bytecode directly or are Java source
> that shows the fields passing through a constructor. Do we have a model for how
> javac would generate withfield, or have we already defaulted to the ctor model?

Brian has proposed a syntax for records (as part of Amber) in the past
https://github.com/openjdk/amber-docs/blob/master/eg-drafts/reconstruction-records-and-classes.md
but there was no consensus about it. And the mapping for the compiler is not obvious, given that withfield is limited to the nestmates of a value type. Currently the compiler only emits withfield (or aconst_init) inside a <vnew> method.

> --Dan

Rémi

From daniel.smith at oracle.com Mon Aug 14 23:26:09 2023
From: daniel.smith at oracle.com (Dan Smith)
Date: Mon, 14 Aug 2023 23:26:09 +0000
Subject: JEP 401 revisions: Null-Restricted Value Object Storage
Message-ID:

I've made some revisions to JEP 401 to align with our latest design ideas for expressing flattenability in the language and in class files.

https://openjdk.org/jeps/401

At one point I was considering introducing nullness features in a separate JEP, but the consensus seems to be that we're better off delivering a smaller version of nullness features first, only applicable to value classes. So I've revised the JEP title to be "Null-Restricted Value Object Storage" and eliminated the dependency on a separate nullness JEP.

(Don't forget that the core Value Objects concepts related to identity have been lifted out into their own JEP, https://openjdk.org/jeps/8277163, which is a prerequisite to JEP 401.)

From john.r.rose at oracle.com Tue Aug 15 03:05:28 2023
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 14 Aug 2023 20:05:28 -0700
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
In-Reply-To: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr>
References: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <8EAD1121-7F18-47DA-BD80-6488392E88AE@oracle.com>

On 12 Aug 2023, at 13:39, Remi Forax wrote:

> John and I and several others had several interesting discussions yesterday about Valhalla,
> one of them about the removal of <vnew> and aconst_init/withfield, something I've called in the past solving the last mile issue.

Yes, you've mentioned this, but I wasn't convinced it was possible until Friday.

> We may want to modify the verifier to verify that "this" does not escape the constructor in the case of a value class, but I do not think it's a requirement, so it's maybe better not to do that :)

Actually, for me the whole proposal collapses unless the buffer containing the not-yet-constructed value can be rigorously prevented from escaping. (I call it a buffer because it is not yet a value.) The only valid operations on such a buffer are

a. calling <init> (at the end of which it may be treated as a real value), and
b. during the course of that method, calling putfield on it.

If the buffer is allowed to escape in any other way, then the transition between (larval) buffer and (adult, frozen) value becomes ill-defined. In particular, if the escaped buffer is stored in a heap variable, then race conditions can affect the contents of the finished value, but there is no way to "find and fix" the escaped buffer and promote it to be a real value, unless you allow a racy side effect for the state change. I think requiring such "flexibility" would impede optimization.

But there are four lucky breaks that I realized on Friday:

1. The uninitializedThis state already defined by the verifier is sufficient (as well as necessary) for preventing escape of (larval) buffer references.
2. The existing rules for using uninitializedThis, including the permission to use putfield, are sufficient (as well as necessary) for initializing value buffers.
3. The JMM freeze operation at the end of <init> can be interpreted (for values) as the operation which transitions a mutable (larval) value buffer into a true immutable (adult) value.
4. A special rule that forbids a value constructor from calling a super-constructor (or this-constructor) is sufficient to protect the (larval) value buffer from escaping, all the way up to the freeze.

That last point 4 is the only adjustment necessary to the verifier rules, in order to make the whole scheme work.
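
The escape hazard described above can be reproduced with an ordinary identity class today. The following is a minimal sketch with hypothetical names (`EscapeDemo`, `leaked`, `observedDuringConstruction`), not anything from the prototypes: it only shows that a `this` reference stored in a heap variable before the constructor finishes exposes a final field that has not been written yet, which is the situation the uninitializedThis rules would have to exclude for larval value buffers.

```java
// Hypothetical identity class; nothing here is Valhalla-specific.
final class EscapeDemo {
    static EscapeDemo leaked;               // heap variable that receives the escaped reference
    static int observedDuringConstruction;  // what an observer saw before construction finished

    final int x;

    EscapeDemo(int x) {
        leaked = this;                          // 'this' escapes before the field is written
        observedDuringConstruction = leaked.x;  // reads the default value 0, not the argument
        this.x = x;                             // the final value only appears here
    }

    public static void main(String[] args) {
        EscapeDemo d = new EscapeDemo(42);
        System.out.println(observedDuringConstruction); // prints 0
        System.out.println(d.x);                        // prints 42
    }
}
```

This compiles for an identity class because `this` becomes an ordinary initialized reference once the implicit super() call returns. Under point 4 above, a value constructor would never make that call, so `this` would presumably stay in the uninitializedThis state for the whole constructor body and the store to `leaked` would be rejected by the verifier.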
I am against all but the most necessary verifier changes in general, and I like the fact that I get rid of two bytecodes (and their verifier rules) in exchange for a simple modification to the access rules for a constructor calling another constructor on its uninitializedThis value.

> The end of constructor is already a point where all VMs/JITs are able to emit code (for store/store barrier or finalizer registration), so we are piggybacking on an existing concept. The only main drawback I see is that the header of a buffered value type has to have one bit to indicate the larval state and those header bits are really precious.

Not a problem, I hope. Basically, the states of a value-class heap node header are either larval or adult, while the states of an identity-class heap node header are either locked or unlocked. Value types are not allowed to store complex object monitor states, so it is likely (even with Lilliput) that no new bits are needed. Perhaps the larval value buffer or the adult value itself can be given special inflated states as needed; they would be singleton states, not heap-allocated.

Why explicitly mark the larval state at all? Maybe it's not necessary for bytecode-based value construction, if the reference is so rigidly constrained, but it might gate some future GC optimizations such as de-duplication (or NUMA splitting OTOH). You can't do that stuff to a larval value buffer. Also, Unsafe can use it as an error check (though we often dispense with error checks for Unsafe).

Most importantly, though, there has to be some way to prevent putfield from being accidentally applied to an adult value. The verifier could be asked to help with this, but I don't like adding subtle new verifier rules. It's better just to have a dynamic check, that says "putfield looks for the larval buffer state and throws if it finds a properly frozen value". The dynamic check is relevant only in the interpreter; the JIT can usually elide the whole larval state completely.
One case where it cannot: If the <init> method is compiled for an out of line call, from a caller who has allocated the larval buffer.

-- John

P.S. It is an optional move, but I think we should discuss keeping <vnew>. The <vnew> would be demoted from its current draft role as a constructor, to a simple static factory, which then does the new-dup-init dance. The VM could supply it automagically, or it could be explicit bytecode declarations. It could be extended to identity classes, so all kinds of "new" expressions could link through <vnew> as a static factory.

Armed with <vnew> as the canonical way to build values (outside of the value class itself), we could mandate that, after a certain classfile version, the "new" bytecode on a value class is a private operation (like we treat "withfield"). And maybe the same for identity classes as well, so we retain migration compatibility (in both directions). Old classfiles would be exempt, and would still contain new-dup-init, but when recompiled they would call <vnew>.

Why do this? It would reduce proliferation of the new-dup-init dance, in favor of a simpler and safer vnew dance. There have been many serious bugs in HotSpot from complicated interactions of the uninitializedThis produced by a "new" with other edge cases. There would be many fewer occasions for bugs if only nestmates of a class were able to make actual raw new instances of that class. Then all the parts of the verifier that deal with uninitializedThis would be confined to just a few places, in the class definition itself. The author of that class would have more exact control over what was done with these raw new instances. It would not be necessary to trust the verifier to protect partially constructed instances, which is good because it has not always done a perfect job of that.

I realize folks will say, "we just rehabilitated <init> for values, so why add a new <vnew> to wrap around it?".
So, full disclosure, I wish we had already shifted (in newer classfiles) to a static factory like <vnew> instead of new-dup-init. That's how it seems to me, as I look back on years of verifier bugs.

From john.r.rose at oracle.com Tue Aug 15 03:11:06 2023
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 14 Aug 2023 20:11:06 -0700
Subject: Target of opportunity: remove <vnew> method and aconst_init / withfield opcodes
In-Reply-To:
References: <1652528568.11770992.1691872782490.JavaMail.zimbra@univ-eiffel.fr> <131730267.12297792.1692021343129.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <49C90AE2-81CD-413E-B60B-7EC51104288F@oracle.com>

On 14 Aug 2023, at 7:35, Dan Heidinga wrote:

> If I'm following correctly, changing one field requires stacking all the
> fields and calling the constructor?

Yes. Just as we do today with Java records. I expect the JIT will pay extra attention to constructor calls which pass through previous field values, for all fields but one, and DTRT.

But I also hope that we will build a "reconstructor" mechanism into the VM, where a constructor-like method body executes, not against a blank instance (just produced by "new") but against a freshly buffered clone of an existing instance. (That's pretty much what I mean, at the VM level, whenever I mention "reconstructors".) A "wither" formulated as a reconstructor would have exactly one "putfield" (in the new design), or formulated as a static factory it would have exactly one "withfield" (in the old design). That feels like design parity to me.

> From a VM perspective withfield is nice but as I dig through my notes, I
> don't see any notes on how we'd expose it in the language. All the
> examples I'm finding are either directly generating the withfield bytecode
> or java source that shows the fields passing through a constructor. Do we
> have a model on how javac would generate withfield or have we already
> defaulted in the ctor model?

We have already defaulted to the ctor model.
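For readers less familiar with the "ctor model", this is roughly what passing all fields but one through a constructor looks like in source form with today's records. The names `Point` and `withX` are hypothetical, chosen only for illustration; the sketch says nothing about how a future reconstructor feature would be spelled.

```java
/** Hypothetical example record; any record with two or more components would do. */
record Point(int x, int y) {

    /** Hand-written wither: replaces x and passes the old y through the canonical constructor. */
    Point withX(int newX) {
        return new Point(newX, y);
    }
}
```

Calling `new Point(1, 2).withX(5)` yields a new `Point` with `x` replaced and `y` carried over; the original instance is left unchanged.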
The withfield was seen as a demonstrably clean and incremental operation, useful at present for constructors and in the future for reconstructors (or just withers, which are a special case). But if putfield is just as good, why build a new bytecode?

From john.r.rose at oracle.com Sat Aug 19 04:15:14 2023
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 18 Aug 2023 21:15:14 -0700
Subject: The last miles
In-Reply-To:
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com>
Message-ID: <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>

On 13 Jul 2023, at 14:03, John Rose wrote:

> On Jul 13, 2023, at 1:52 PM, John Rose wrote:
>>
>> The proposed "unification" would require us to somehow simulate larval objects in terms of today's blank identity objects
>
> P.P.S. That's almost possible if you declare that the new opcode makes a larval value, but closing it off is very hard. You need an explicit end-larval transition to adult. The verifier would have to enforce this. Nightmare.

Welcome to my nightmare. As of last Friday (the week of the JVMLS) I now believe the pieces fit together, in a way that is really more like a pleasant dream. I'm surprised.

I've written up in detail how I think Remi's suggestion can work.

https://cr.openjdk.org/~jrose/values/larval-values.html

While this is a rough note, I think all the details are present.

The last tenth of the last mile, which clinched it for me, was realizing that the JVM already defines an execution point, during object creation, where the end-larval transition must take place. It is the JMM freeze operation. Just as new can be overloaded to make both values and identity objects, the JMM freeze (defined as happening at return from <init>) can and should be overloaded to finalize both values and identity objects.
The changes to the verifier (which were frightening me) are quite mild after all, because existing rules carry the burden of separating the larval from the adult phases of the value object under construction.

Take a look at my write-up and see if it makes sense to you too.

-- John

From brian.goetz at oracle.com Mon Aug 21 16:39:26 2023
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 21 Aug 2023 12:39:26 -0400
Subject: Fwd: Superclasses with fields
In-Reply-To:
References:
Message-ID: <249387bd-e140-177c-feb6-82f3ddaea69d@oracle.com>

The below was received on the valhalla-spec-comments list.

The topic of further relaxing the restrictions on abstract class supertypes of value classes has been discussed by the EG. While it is possible that such restrictions could be further relaxed, we are likely facing diminishing returns.

It is certainly possible to define a way that fields and initialization logic from an abstract super is "copied" into the value class. But this would still come with restrictions we might not like; we would then either have to give up putting fields in the value subclass (to eliminate layout polymorphism), or give up flattening of the abstract class type (because there is layout polymorphism). And, we would likely end up creating a new category of abstract class to support the "pushing down" of fields and initialization, which is new language complexity. Overall this seems like something that has a relatively weak return-on-complexity and there are many opportunities for much higher return right now.

-------- Forwarded Message --------
Subject: Superclasses with fields
Date: Mon, 21 Aug 2023 08:03:58 -0300
From: Thiago Henrique Hupner
To: valhalla-spec-comments at openjdk.org

Now that most of the migration issues from classes to value classes have been discovered, would it also be possible to allow maybe a subset of superclasses that contains some fields?
Probably having "value enums" would be great, but it would also enable subclasses of j.u.AbstractList to become value classes.

Probably there are more issues to it than I could remember, but wouldn't it be possible to check at load time if the superclass uses some unsupported features, like synchronized methods, and throw an error?

From daniel.smith at oracle.com Mon Aug 21 17:39:03 2023
From: daniel.smith at oracle.com (Dan Smith)
Date: Mon, 21 Aug 2023 17:39:03 +0000
Subject: The last miles
In-Reply-To: <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
Message-ID:

> On Aug 18, 2023, at 9:15 PM, John Rose wrote:
>
> I've written up in detail how I think Remi's suggestion can work.
>
> https://cr.openjdk.org/~jrose/values/larval-values.html
>
> While this is a rough note, I think all the details are present.

The compatibility wins of this strategy do seem nice. But let me scrutinize a few details, because I think there are some trade-offs:

1) A larval value object is an identity object. This means, in the hand-off between the <init> method and the caller, the object must be heap allocated: the caller and the <init> method need an agreed-upon memory location where state will be set up.

I can see this being optimized away if the <init> method can be inlined. But if not (e.g., the constructor logic is sufficiently complex/large), that's a new cost for value object creation: every 'new' needs a heap allocation. (For <vnew>, we are able to optimize the return value calling convention without needing inlining.)

Am I understanding this correctly? How concerned should we be about the extra allocation cost?
(Our working principle to this point has been that optimal compiled code should have zero heap allocations.)

2) If we *do* inline the <init> call, then at the call site, there can be any number of references from locals/stack to the larval value, and at the end of the <init> call, there's this unusual operation where all of those locals/stack get transformed into the value object. I *think* this all just falls out cleanly (locals become compiler metadata that bottoms out at the same registers, no matter how many references there are), but it's something to think carefully about.

3) The <vnew> approach doesn't have any constraints about leaking 'this', and in particular the javac rule we were envisioning is that the constructor can't leak 'this' until all fields are provably set, but afterwards it's fair game. This strategy is stricter: the verifier disallows leaking 'this' at all from any point in the constructor.

Are we okay with these restrictions? In practice, this is most likely to trip up people trying to do instance method calls, plus those who are doing things like keeping track of constructed objects. (Even printf logging seems tricky, since 'toString' is off limits.)

4) I'm not sure the prohibition on 'super' calls is actually necessary. What if, instead, all non-'identity' <init> methods are understood to be working on larval objects, and prohibited from any leaking of 'this'? Instead of disallowing 'super' calls, the verifier would only transition from 'uninitializedThis' to 'LFoo;' in an identity class constructor. Does that make sense or am I missing something? (If it works, does this mean we get support for super fields "for free"?)

5) Do we really need a header state for larval objects? We don't do anything like that to distinguish between uninitialized identity objects (post-'new') and valid identity objects (post-'super()'). We just let the verifier handle it. Same principle here perhaps?
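To make point (3) concrete, here is a small sketch of the three constructor habits mentioned there: an instance method call, a registry of constructed objects, and printf logging. It is written as an ordinary identity class, where all three are legal today; the class name `Tracked` and its members are hypothetical, and whether a value-class constructor would reject each line depends on rules that are still under discussion in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical identity class whose constructor uses 'this' in three ways. */
final class Tracked {
    static final List<Tracked> REGISTRY = new ArrayList<>();

    private final int id;
    final String label;

    Tracked(int id) {
        this.id = id;
        this.label = describe();                  // instance method call on 'this'
        REGISTRY.add(this);                       // keeps track of constructed objects
        System.out.printf("created %s%n", this);  // logging, which calls this.toString()
    }

    private String describe() {
        return "Tracked#" + id;
    }

    @Override
    public String toString() {
        return label;
    }
}
```

Each of the three commented lines hands the partially constructed object to other code, which is the kind of use a "no leaking 'this' from the constructor" rule would affect.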
From forax at univ-mlv.fr Mon Aug 21 18:39:23 2023
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 21 Aug 2023 20:39:23 +0200 (CEST)
Subject: The last miles
In-Reply-To:
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
Message-ID: <1971969766.15476894.1692643163067.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "daniel smith"
> To: "John Rose"
> Cc: "Remi Forax", "valhalla-spec-experts"
> Sent: Monday, August 21, 2023 7:39:03 PM
> Subject: Re: The last miles

>> On Aug 18, 2023, at 9:15 PM, John Rose wrote:
>>
>> I've written up in detail how I think Remi's suggestion can work.
>>
>> https://cr.openjdk.org/~jrose/values/larval-values.html
>>
>> While this is a rough note, I think all the details are present.
>
> The compatibility wins of this strategy do seem nice. But let me scrutinize a
> few details, because I think there are some trade-offs:
>
> 1) A larval value object is an identity object. This means, in the hand-off
> between the <init> method and the caller, the object must be heap allocated:
> the caller and the <init> method need an agreed-upon memory location where
> state will be set up.
>
> I can see this being optimized away if the <init> method can be inlined. But if
> not (e.g., the constructor logic is sufficiently complex/large), that's a new
> cost for value object creation: every 'new' needs a heap allocation. (For
> <vnew>, we are able to optimize the return value calling convention without
> needing inlining.)

A larval value object is a value object with the larval state set to "on"; it's not an identity object. At the end of the call of <init>, the larval state is set to "off". The larval bit controls whether putfield is allowed or not.

In the interpreter, a larval value object is buffered, so there is a heap allocation.
But in JITed code, if everything is inlined, the larval bit does not even need to be set because the JIT can prove that the value object is in larval state when a putfield occurs. If not everything is inlined, for example if <init> is compiled as one method, the JITed code has to check the larval bit.

I will let the others answer the other questions.

Rémi

> Am I understanding this correctly? How concerned should we be about the extra
> allocation cost? (Our working principle to this point has been that optimal
> compiled code should have zero heap allocations.)
>
> 2) If we *do* inline the <init> call, then at the call site, there can be any
> number of references from locals/stack to the larval value, and at the end of
> the <init> call, there's this unusual operation where all of those locals/stack get
> transformed into the value object. I *think* this all just falls out cleanly
> (locals become compiler metadata that bottoms out at the same registers, no
> matter how many references there are), but it's something to think carefully
> about.
>
> 3) The <vnew> approach doesn't have any constraints about leaking 'this', and in
> particular the javac rule we were envisioning is that the constructor can't
> leak 'this' until all fields are provably set, but afterwards it's fair game.
> This strategy is stricter: the verifier disallows leaking 'this' at all
> from any point in the constructor.
>
> Are we okay with these restrictions? In practice, this is most likely to trip up
> people trying to do instance method calls, plus those who are doing things like
> keeping track of constructed objects. (Even printf logging seems tricky, since
> 'toString' is off limits.)
>
> 4) I'm not sure the prohibition on 'super' calls is actually necessary. What if,
> instead, all non-'identity' <init> methods are understood to be working on
> larval objects, and prohibited from any leaking of 'this'?
> Instead of disallowing 'super' calls, the verifier would only transition from
> 'uninitializedThis' to 'LFoo;' in an identity class constructor. Does that make
> sense or am I missing something? (If it works, does this mean we get support
> for super fields "for free"?)
>
> 5) Do we really need a header state for larval objects? We don't do anything
> like that to distinguish between uninitialized identity objects (post-'new')
> and valid identity objects (post-'super()'). We just let the verifier handle
> it. Same principle here perhaps?

From forax at univ-mlv.fr Mon Aug 21 18:58:40 2023
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 21 Aug 2023 20:58:40 +0200 (CEST)
Subject: Superclasses with fields
In-Reply-To: <249387bd-e140-177c-feb6-82f3ddaea69d@oracle.com>
References: <249387bd-e140-177c-feb6-82f3ddaea69d@oracle.com>
Message-ID: <1337663507.15489954.1692644320317.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "valhalla-spec-experts"
> Sent: Monday, August 21, 2023 6:39:26 PM
> Subject: Fwd: Superclasses with fields

> The below was received on the valhalla-spec-comments list.

> The topic of further relaxing the restrictions on abstract class supertypes of
> value classes has been discussed by the EG. While it is possible that such
> restrictions could be further relaxed, we are likely facing diminishing
> returns.

> It is certainly possible to define a way that fields and initialization logic
> from an abstract super is "copied" into the value class. But this would still
> come with restrictions we might not like; we would then either have to give up
> putting fields in the value subclass (to eliminate layout polymorphism), or
> give up flattening of the abstract class type (because there is layout
> polymorphism.) And, we would likely end up creating a new category of abstract
> class to support the "pushing down" of fields and initialization, which is new
> language complexity.
> Overall this seems like something that has a relatively
> weak return-on-complexity and there are many opportunities for much higher
> return right now.

I will add that in the case of an enum, it's not clear that a "value enum" makes sense in terms of performance. The number of instances of an enum is fixed, and because they are all initialized one after the other, they are all in the same memory page, so the actual representation is already very cache friendly.

Rémi

> -------- Forwarded Message --------
> Subject: Superclasses with fields
> Date: Mon, 21 Aug 2023 08:03:58 -0300
> From: Thiago Henrique Hupner <thihup at gmail.com>
> To: valhalla-spec-comments at openjdk.org

> Now that most of the migration issues from classes to value classes have been
> discovered, would it also be possible to allow maybe a subset of superclasses
> that contains some fields?

> Probably having "value enums" would be great, but also it would enable to have
> subclasses of j.u.AbstractList to also become value classes.

> Probably there are more issues to it than I could remember, but wouldn't it be
> possible to check at load time if the superclass uses some unsupported
> features, like synchronized methods, and throw an error?

From daniel.smith at oracle.com Mon Aug 21 19:12:03 2023
From: daniel.smith at oracle.com (Dan Smith)
Date: Mon, 21 Aug 2023 19:12:03 +0000
Subject: The last miles
In-Reply-To: <1971969766.15476894.1692643163067.JavaMail.zimbra@univ-eiffel.fr>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com> <1971969766.15476894.1692643163067.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <3F31C2CF-29F7-4D32-A56C-BE4F1E1432CA@oracle.com>

On Aug 21, 2023, at 11:39 AM, forax at univ-mlv.fr wrote:

>> 1) A larval value object is an identity object. This means, in the hand-off between the <init> method and the caller, the object must be heap allocated: the caller and the <init> method need an agreed-upon memory location where state will be set up.
>>
>> I can see this being optimized away if the <init> method can be inlined. But if not (e.g., the constructor logic is sufficiently complex/large), that's a new cost for value object creation: every 'new' needs a heap allocation. (For <vnew>, we are able to optimize the return value calling convention without needing inlining.)
>
> A larval value object is a value object with the larval state to "on", it's not an identity object.
> At the end of the call of <init>, the larval state is set to "off".
> The larval bit controls if putfield is allowed or not.

I mean it is an "identity object" in the sense that it must live at a canonical, mutable memory location. E.g., you can't scalarize it across calls, you have to pass it by reference. But, agreed, "larval object" and "identity object" are distinct concepts.
It's just that both of them depend on some sort of "identity" capability.

(Separately, we can dig further into the question of whether you actually need runtime flags to detect this state change, or can leave it to verification. That's my point (5).)

> In the interpreter, a larval value object is buffered, so there is an heap allocation.
> But in JITed code, if everything is inlined

Exactly: "if everything is inlined". My point (1) is that if everything *is not* inlined, there are allocations that <vnew> was able to optimize away.

From forax at univ-mlv.fr Mon Aug 21 20:45:55 2023
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 21 Aug 2023 22:45:55 +0200 (CEST)
Subject: The last miles
In-Reply-To: <3F31C2CF-29F7-4D32-A56C-BE4F1E1432CA@oracle.com>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com> <1971969766.15476894.1692643163067.JavaMail.zimbra@univ-eiffel.fr> <3F31C2CF-29F7-4D32-A56C-BE4F1E1432CA@oracle.com>
Message-ID: <1429953908.15567355.1692650755521.JavaMail.zimbra@univ-eiffel.fr>

> From: "daniel smith"
> To: "Remi Forax"
> Cc: "John Rose", "valhalla-spec-experts"
> Sent: Monday, August 21, 2023 9:12:03 PM
> Subject: Re: The last miles

>> On Aug 21, 2023, at 11:39 AM, forax at univ-mlv.fr wrote:

>>> 1) A larval value object is an identity object. This means, in the hand-off
>>> between the <init> method and the caller, the object must be heap allocated:
>>> the caller and the <init> method need an agreed-upon memory location where
>>> state will be set up.

>>> I can see this being optimized away if the <init> method can be inlined. But if
>>> not (e.g., the constructor logic is sufficiently complex/large), that's a new
>>> cost for value object creation: every 'new' needs a heap allocation. (For
>>> <vnew>, we are able to optimize the return value calling convention without
>>> needing inlining.)

>> A larval value object is a value object with the larval state to "on", it's not
>> an identity object.
>> At the end of the call of <init>, the larval state is set to "off".
>> The larval bit controls if putfield is allowed or not.

> I mean it is an "identity object" in the sense that it must live at a canonical,
> mutable memory location. E.g., you can't scalarize it across calls, you have to
> pass it by reference. But, agreed, "larval object" and "identity object" are
> distinct concepts. It's just that both of them depend on some sort of
> "identity" capability.

> (Separately, we can dig further into the question of whether you actually need
> runtime flags to detect this state change, or can leave it to verification.
> That's my point (5).)

>> In the interpreter, a larval value object is buffered, so there is an heap
>> allocation.
>> But in JITed code, if everything is inlined

> Exactly: "if everything is inlined". My point (1) is that if everything *is not*
> inlined, there are allocations that <vnew> was able to optimize away.

Inside <vnew>, yes, but at the end of <vnew>, if it goes back into the interpreter, the return value will be buffered.

I do not think it is obvious whether there will be a performance regression or not until the new scheme is implemented and tested.

Rémi
From liangchenblue at gmail.com Tue Aug 22 00:30:29 2023
From: liangchenblue at gmail.com (-)
Date: Tue, 22 Aug 2023 08:30:29 +0800
Subject: Superclasses with fields
In-Reply-To: <249387bd-e140-177c-feb6-82f3ddaea69d@oracle.com>
References: <249387bd-e140-177c-feb6-82f3ddaea69d@oracle.com>
Message-ID:

I think Thiago has a good point: this abstract class with fields scenario may look like what we will encounter when we implement generic specialization: the erased type acts like a superclass and the parameterized types (with value types, especially) act like subclasses. I think such a model is also why JEP 181 was introduced. Why can't these 2 processes share at least some of their logic? How are we going to implement generic specialization otherwise?

> And, we would likely end up creating a new category of abstract class to support the "pushing down" of fields and initialization, which is new language complexity.

Isn't this already in the language model, that if an abstract class is not identity (either implicitly or explicitly, such as by declaring synchronized methods, etc.), then it belongs to such a category? Do we need another category to "push down" the fields for only some of these non-identity classes, once the non-identity requirements are relaxed, to avoid the performance overhead?

Chen Liang

On Tue, Aug 22, 2023 at 2:54 AM Brian Goetz wrote:

> The below was received on the valhalla-spec-comments list.
>
> The topic of further relaxing the restrictions on abstract class
> supertypes of value classes has been discussed by the EG. While it is
> possible that such restrictions could be further relaxed, we are likely
> facing diminishing returns.
>
> It is certainly possible to define a way that fields and initialization
> logic from an abstract super is "copied" into the value class. But this
> would still come with restrictions we might not like; we would then either
> have to give up putting fields in the value subclass (to eliminate layout
> polymorphism), or give up flattening of the abstract class type (because
> there is layout polymorphism.) And, we would likely end up creating a new
> category of abstract class to support the "pushing down" of fields and
> initialization, which is new language complexity. Overall this seems like
> something that has a relatively weak return-on-complexity and there are
> many opportunities for much higher return right now.
>
>
> -------- Forwarded Message --------
> Subject: Superclasses with fields
> Date: Mon, 21 Aug 2023 08:03:58 -0300
> From: Thiago Henrique Hupner
> To: valhalla-spec-comments at openjdk.org
>
> Now that most of the migration issues from classes to value classes have
> been discovered, would it also be possible to allow maybe a subset of
> superclasses that contains some fields?
>
> Probably having "value enums" would be great, but also it would enable to
> have subclasses of j.u.AbstractList to also become value classes.
>
> Probably there are more issues to it than I could remember, but wouldn't
> it be possible to check at load time if the superclass uses some
> unsupported features, like synchronized methods, and throw an error?
From john.r.rose at oracle.com Tue Aug 22 15:31:21 2023
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 22 Aug 2023 08:31:21 -0700
Subject: The last miles
In-Reply-To:
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
Message-ID: <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com>

On 21 Aug 2023, at 10:39, Dan Smith wrote:

>> On Aug 18, 2023, at 9:15 PM, John Rose wrote:
>>
>> I've written up in detail how I think Remi's suggestion can work.
>>
>> https://cr.openjdk.org/~jrose/values/larval-values.html
>>
>> While this is a rough note, I think all the details are present.
>
> The compatibility wins of this strategy do seem nice. But let me scrutinize a few details, because I think there are some trade-offs:
>
> 1) A larval value object is an identity object. This means, in the hand-off between the <init> method and the caller, the object must be heap allocated: the caller and the <init> method need an agreed-upon memory location where state will be set up.
>
> I can see this being optimized away if the <init> method can be inlined. But if not (e.g., the constructor logic is sufficiently complex/large), that's a new cost for value object creation: every 'new' needs a heap allocation. (For <vnew>, we are able to optimize the return value calling convention without needing inlining.)
>
> Am I understanding this correctly? How concerned should we be about the extra allocation cost? (Our working principle to this point has been that optimal compiled code should have zero heap allocations.)

Optimal compiled code can still have this feature, if we choose. We can direct the compiled version of <init> (for a value class only) to alter its calling sequence, dropping the input and returning the value. Compiled calls to this guy would omit the input. The interpreter adapter for it would adjust the discrepancy.
There are a number of ways to do this, in detail.

(But, also, if the <init> method is complex enough to fail to inline, we probably won't notice the extra cost of a buffered input. If the method inlines, current JIT optimizations will get rid of the allocation. We can, also, adjust inlining heuristics to greatly favor value-<init>; we do this for certain other kinds of methods already. Note that all of these worries only apply to value classes with non-deprecated constructors. New code will use factory methods, which doesn't need to suffer from failed inlines, again because of an adjusted heuristic, if we need it. As I said, there are a number of ways to address this issue.)

> 2) If we *do* inline the call, then at the call site, there can be any number of references from locals/stack to the larval value, and at the end of the call, there's this unusual operation where all of those locals/stack get transformed into the value object. I *think* this all just falls out cleanly (locals become compiler metadata that bottoms out at the same registers, no matter how many references there are), but it's something to think carefully about.

Let's continue to think about it, but I have done a first pass and I don't see any problem. (This was surprising to me, so I'm not surprised others will wish to think about it more as well.)

> 3) The <vnew> approach doesn't have any constraints about leaking 'this', and in particular the javac rule we were envisioning is that the constructor can't leak 'this' until all fields are provably set, but afterwards it's fair game. This strategy is stricter: the verifier disallows leaking 'this' at all from any point in the constructor.

Yes, easy leaking is a feature of <vnew>; the value is always ready. (This also means the interpreter has to create a new buffer on every state change. I don't care much about interpreter performance, but I think the <init> version of things performs fewer allocations.)

> Are we okay with these restrictions?
> In practice, this is most likely to trip up people trying to do instance method calls, plus those who are doing things like keeping track of constructed objects. (Even printf logging seems tricky, since 'toString' is off limits.)

If we wish to allow the super call after all, it can serve as the freeze point within the constructor. It is still the case that the freeze must be performed before the value is usable as an adult, and there is no way to perform "late" putfields after the freeze.

If the language wishes to fully implement "late" putfields, then we need to use some new machinery in the translation strategy. I'd reach for method/var handles to create withers, in that case. In other words, a direct withfield operation would be spun up in a runtime support API. We have this code today.

> 4) I'm not sure the prohibition on 'super' calls is actually necessary.

No, but it's a move of economy. Defining the meaning of super for values would be extra work. We could do that; I'd prefer not to. Remember that super-constructors for values are already very special animals: They must be empty in a special sense. Forbidding calls to them seems like the clean move.

> What if, instead, all non-'identity' <init> methods are understood to be working on larval objects, and prohibited from any leaking of 'this'?

Indeed, that is the real issue. If the user is allowed to leak 'this' from <init>, the translation strategy must arrange to promote the leaked 'this' to an adult object. It's easiest if this is "in place", which is how the current verifier rules see it. But it could also be done with a method handle built by the runtime, which works like the finishPrivateBuffer method. This method handle could take the larval object and return a fresh adult copy, but without changing the status of the input larval buffer.

> Instead of disallowing 'super' calls, the verifier would only transition from 'uninitializedThis' to 'LFoo;' in an identity class constructor.
> Does that make sense or am I missing something?

You are missing the fact that the JVMS allows putfield at that point, to change the state of an identity object, and this is a bad move for values. That's why I think the simple and safe move is to disallow that state transition; this makes all putfields legitimate, inside the whole <init> method, without further checks.

> (If it works, does this mean we get support for super fields "for free"?)

That is probably true. Do we care?

> 5) Do we really need a header state for larval objects? We don't do anything like that to distinguish between uninitialized identity objects (post-'new') and valid identity objects (post-'super()'). We just let the verifier handle it. Same principle here perhaps?

In most cases the header state is not needed. I mentioned that there are some potential GC optimizations that (if implemented) would need to see the larval state and treat it differently.

For compiled code, I think the larval state would disappear, except for deoptimization logic, which would "put it back" to larval, for methods that must jump back into the interpreter. The relevant JIT optimizations would drop out more or less for free once the compiled version of <init> had its calling sequence adjusted as noted above, to drop the input buffer (then it has no state to worry about!) and just return a value at the end.
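
[Illustrative sketch, not part of the original message.] The calling-sequence adjustment John describes can be modelled in ordinary Java. The sketch below contrasts an "interpreter shape" constructor, which fills a caller-supplied larval buffer, with a "compiled shape", which takes no buffer and simply returns the finished value, plus an adapter that bridges the two. All names here (LarvalBuffer, initInterpreted, initCompiled, interpreterAdapter) are hypothetical and chosen for illustration; none of them is a HotSpot or JVMS API, and a real VM would do this at the level of calling conventions, not Java source.

```java
public class CallingSequenceSketch {
    /** Hypothetical stand-in for a finished (adult) value object. */
    record Point(int x, int y) {}

    /** Hypothetical stand-in for a heap-allocated larval buffer. */
    static final class LarvalBuffer {
        int x;
        int y;
    }

    /** Interpreter shape: the caller allocates the buffer, the constructor fills it. */
    static void initInterpreted(LarvalBuffer self, int x, int y) {
        self.x = x;
        self.y = y;
    }

    /** Compiled shape: no input buffer; the finished value is returned. */
    static Point initCompiled(int x, int y) {
        return new Point(x, y);
    }

    /** Adapter: lets a caller that holds a buffer reach the compiled shape. */
    static void interpreterAdapter(LarvalBuffer self, int x, int y) {
        Point p = initCompiled(x, y);
        self.x = p.x();
        self.y = p.y();
    }

    public static void main(String[] args) {
        LarvalBuffer direct = new LarvalBuffer();
        initInterpreted(direct, 3, 4);

        LarvalBuffer adapted = new LarvalBuffer();
        interpreterAdapter(adapted, 3, 4);

        // Both routes should leave the same state in the buffer.
        System.out.println(direct.x == adapted.x && direct.y == adapted.y);
    }
}
```

The point of the model is only that the compiled shape never needs to see a buffer, so a caller that is itself compiled has nothing to allocate.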
From daniel.smith at oracle.com Tue Aug 22 18:03:11 2023 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 22 Aug 2023 18:03:11 +0000 Subject: The last miles In-Reply-To: <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com> References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com> <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com> Message-ID: <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com> > On Aug 22, 2023, at 8:31 AM, John Rose wrote: > > On 21 Aug 2023, at 10:39, Dan Smith wrote: > >>> On Aug 18, 2023, at 9:15 PM, John Rose wrote: >>> >>> I?ve written up in detail how I think Remi?s suggestion can work. >>> >>> https://cr.openjdk.org/~jrose/values/larval-values.html >>> >>> While this is a rough note, I think all the details are present. >> >> The compatibility wins of this strategy do seem nice. But let me scrutinize a few details, because I think there are some trade-offs: >> >> 1) A larval value object is an identity object. This means, in the hand-off between the method and the caller, the object must be heap allocated: the caller and the method need an agreed-upon memory location where state will be set up. >> >> I can see this being optimized away if the method can be inlined. But if not (e.g., the constructor logic is sufficiently complex/large), that's a new cost for value object creation: every 'new' needs a heap allocation. (For , we are able to optimize the return value calling convention without needing inlining.) >> >> Am I understanding this correctly? How concerned should we be about the extra allocation cost? (Our working principle to this point has been that optimal compiled code should have zero heap allocations.) > > Optimal compiled code can still have this feature, if we choose. 
>
> We can direct the compiled version of <init> (for a value class only)
> to alter its calling sequence, dropping the input and returning the
> value. Compiled calls to this guy would omit the input. The
> interpreter adapter for it would adjust the discrepancy. There are
> a number of ways to do this, in detail.

I would worry about the complexity of such an optimization (the optimized calling convention bears little resemblance to the original, and there needs to be some novel encoding of 'uninitialized' at the call site to express the promise of a value object to be computed later and stored in n different locals/stack positions).

Another thing that could be done is to have a lightweight on-stack encoding of "larval value object" that could be passed by reference and mutated by an <init> method, but without the overhead of a full heap object. New encodings mean new complexity, but maybe this one would be worth it.

Or maybe you're right, no need to worry about this corner case, inlining will be fine...

> (But, also, if the <init> method is complex enough to fail to inline,
> we probably won't notice the extra cost of a buffered input.

Yeah, that's fair. I guess the worrying case would be where the existing <vnew> has a high computation time cost but zero memory impact; the <init> strategy would be bad on both dimensions.

> Note that all of these worries only apply to value classes
> with non-deprecated constructors. New code will use factory methods,
> which doesn't need to suffer from failed inlines, again because of
> an adjusted heuristic, if we need it.

There's no particular reason that new code would favor factories instead. (At least, there doesn't need to be. This compilation strategy makes it even easier for us to say in the language "almost nothing about constructors has changed, carry on as you have before.")

But it's true that, in a performance-sensitive application, an expensive constructor could be rewritten as a static factory with a private constructor that just sets the fields.
And the calling convention for that factory will support scalarization. Such a refactoring shrinks the lifespan of the problematic larval object to the point that inlining & eliminating it should be trivial.

My takeaway is just that we should be cautious here: where before we had a guarantee of no new allocations from value class constructors (modulo some size threshold), now we're in the fuzzy territory of "if everything shakes out okay, you shouldn't notice any impact". This may be fine, but we'll want to keep an eye on it.

>> 3) The <vnew> approach doesn't have any constraints about leaking 'this', and in particular the javac rule we were envisioning is that the constructor can't leak 'this' until all fields are provably set, but afterwards it's fair game. This strategy is stricter: the verifier disallows leaking 'this' at all from any point in the constructor.
>
> Yes, easy leaking is a feature of <vnew>; the value is always ready.
> (This also means the interpreter has to create a new buffer on every
> state change. I don't care much about interpreter performance, but
> I think the <init> version of things performs fewer allocations.)
>
>> Are we okay with these restrictions? In practice, this is most likely to trip up people trying to do instance method calls, plus those who are doing things like keeping track of constructed objects. (Even printf logging seems tricky, since 'toString' is off limits.)
>
> If we wish to allow the super call after all, it can serve as the freeze
> point within the constructor. It is still the case that the freeze must
> be performed before the value is usable as an adult, and there is no way
> to perform "late" putfields after the freeze.

Yeah, you've got me thinking that maybe a rule that says you can set fields before 'super()' but not after would be good enough. (With a language change that says in a value class, the implicit 'super()' call happens at the end rather than the start.
If you want to write any post-super() code, you'll need an explicit super call.)

That sort of bottom-to-top initialization strategy is a change from tradition, but maybe we're mostly equipped to handle it already? (Thanks, JEP 447!)

> If the language wishes to fully implement "late" putfields

No. I don't think publishing 'this' before all value object fields are set is on the table.

>> 4) I'm not sure the prohibition on 'super' calls is actually necessary.
>
> No, but it's a move of economy. Defining the meaning of super for
> values would be extra work. We could do that; I'd prefer not to.
> Remember that super-constructors for values are already very special
> animals: They must be empty in a special sense. Forbidding calls to
> them seems like the clean move.

The rule is that super constructors must be empty because we had no concept of mutable state to communicate changes from parent to child. But now that we have larval objects...

Concretely, what if:

- putfield is a verifier error on non-identity class types, it only works on uninitializedThis
- as usual, every <init> method (for all kinds of classes) must do a super-<init> invokespecial (or this-<init>? still thinking about that)

Then:

- value objects get built bottom-to-top, with fields set before a super() call, and freedom to use 'this' afterwards
- abstract classes can participate too, following the same code shape
- identity classes (abstract and concrete) have a little more freedom, because they can follow the same pattern *or* set their fields after the super() call

I need to think more about this, but it seems to me at the moment that everything falls out cleanly...

>> (If it works, does this mean we get support for super fields "for free"?)
>
> That is probably true. Do we care?

I'd be happy to get rid of special rules that have to do with super fields. (Replacing it with a rule that says certain shapes of abstract class constructors imply identity.)
Not so much because of particular use cases, but because it makes the language more regular. From heidinga at redhat.com Tue Aug 22 18:50:42 2023 From: heidinga at redhat.com (Dan Heidinga) Date: Tue, 22 Aug 2023 14:50:42 -0400 Subject: The last miles In-Reply-To: <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com> References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com> <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com> <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com> Message-ID: On Tue, Aug 22, 2023 at 2:03?PM Dan Smith wrote: > > > > On Aug 22, 2023, at 8:31 AM, John Rose wrote: > > > > On 21 Aug 2023, at 10:39, Dan Smith wrote: > > > >>> On Aug 18, 2023, at 9:15 PM, John Rose wrote: > >>> > >>> I?ve written up in detail how I think Remi?s suggestion can work. > >>> > >>> https://cr.openjdk.org/~jrose/values/larval-values.html > >>> > >>> While this is a rough note, I think all the details are present. > >> > >> The compatibility wins of this strategy do seem nice. But let me > scrutinize a few details, because I think there are some trade-offs: > >> > >> 1) A larval value object is an identity object. This means, in the > hand-off between the method and the caller, the object must be heap > allocated: the caller and the method need an agreed-upon memory > location where state will be set up. > >> > >> I can see this being optimized away if the method can be > inlined. But if not (e.g., the constructor logic is sufficiently > complex/large), that's a new cost for value object creation: every 'new' > needs a heap allocation. (For , we are able to optimize the return > value calling convention without needing inlining.) > >> > >> Am I understanding this correctly? How concerned should we be about the > extra allocation cost? 
(Our working principle to this point has been that > optimal compiled code should have zero heap allocations.) > > > > Optimal compiled code can still have this feature, if we choose. > > > > We can direct the compiled version of (for a value class only) > > to alter its calling sequence, dropping the input and returning the > > value. Compiled calls to this guy would omit the input. The > > interpreter adapter for it would adjust the discrepancy. There are > > a number of ways to do this, in detail. > > I would worry about the complexity of such an optimization (the optimized > calling convention bears little resemblance to the original, and there > needs to be some novel encoding of 'uninitialized' at the call site to > express the promise of a value object to be computed later and stored in n > different locals/stack positions). > > Another thing that could be done is to have a lightweight on-stack > encoding of "larval value object" that could be passed by reference and > mutated by an method, but without the overhead of a full heap > object. New encodings mean new complexity, but maybe this one would be > worth it. > > Or maybe you're right, no need to worry about this corner case, inlining > will be fine... > > > (But, also, if the method is complex enough to fail to inline, > > we probably won?t notice the extra cost of a buffered input. > > Yeah, that's fair. I guess the worrying case would be where the existing > has a high computation time cost but zero memory impact; the > strategy would be bad on both dimensions. > > > Note that all of these worries only apply to value classes > > with non-deprecated constructors. New code will use factory methods, > > which doesn?t need to suffer from failed inlines, again because of > > an adjusted heuristic, if we need it. > > There's no particular reason that new code would favor factories instead. > (At least, there doesn't need to be. 
This compilation strategy makes it > even easier for us to say in the language "almost nothing about > constructors has changed, carry on as you have before.") > > But it's true that, in a performance-sensitive application, an expensive > constructor could be rewritten as a static factory with a private > constructor that just sets the fields. And the calling convention for that > factory will support scalarization. Such a refactoring shrinks the lifespan > of the problematic larval object to the point that inlining & eliminating > it should be trivial. > > My takeaway is just that we should be cautious here: where before we had a > guarantee of no new allocations from value class constructors (modulo some > size threshold), now we're in the fuzzy territory of "if everything shakes > out okay, you shouldn't notice any impact". This may be fine, but we'll > want to keep an eye on it. > > >> 3) The approach doesn't have any constraints about leaking > 'this', and in particular the javac rule we were envisioning is that the > constructor can't leak 'this' until all fields are provably set, but > aftewards it's fair game. This strategy is stricter: the verifier > disallows leaking 'this' at all from any point in the constructor. > > > > Yes, easy leaking is a feature of ; the value is always ready. > > (This also means the interpreter has to create a new buffer on every > > state change. I don?t care much about interpreter performance, but > > I think the version of things performs fewer allocations.) > > > >> Are we okay with these restrictions? In practice, this is most likely > to trip up people trying to do instance method calls, plus those who are > doing things like keeping track of constructed objects. (Even printf > logging seems tricky, since 'toString' is off limits.) > > > > If we wish to allow the super call after all, it can serve as the freeze > > point within the constructor. 
It is still the case that the freeze must > > be performed before the value is usable as an adult, and there is no way > > to perform ?late? putfields after the freeze. > > Yeah, you've got me thinking that maybe a rule that says you can set > fields before 'super()' but not after would be good enough. (With a > language change that says in a value class, the implicit 'super()' call > happens at the end rather than the start. If you want to write any > post-super() code, you'll need an explicit super call.) > > That sort of bottom-to-top initialization strategy is a change from > tradition, but maybe we're mostly equipped to handle it already? (Thanks, > JEP 447!) > > > If the language wishes to fully implement ?late? putfields > > No. I don't think publishing 'this' before all value object fields are set > is on the table. > > >> 4) I'm not sure the prohibition on 'super' calls is actually necessary. > > > > No, but it?s a move of economy. Defining the meaning of super for > > values would be extra work. We could do that; I?d prefer not to. > > Remember that super-constructors for values are already very special > > animals: They must be empty in a special sense. Forbidding calls to > > them seems like the clean move. > > The rule is that super constructors must be empty because we had no > concept of mutable state to communicate changes from parent to child. But > now that we have larval objects... > > Concretely, what if: > > - putfield is a verifier error on non-identity class types, it only works > on uninitializedThis > - as usual, every method (for all kinds of classes) must do a > super- invokespecial (or this-? 
still thinking about that) > > Then: > > - value objects get built bottom-to-top, with fields set before a super() > call, and freedom to use 'this' afterwards > - abstract classes can participate too, following the same code shape > - identity classes (abstract and concrete) have a little more freedom, > because they can follow the same pattern *or* set their fields after the > super() call > Flattened values can't support layout polymorphism which means all uses of a common abstract class will need to rely on pointer polymorphism (this was already true before this change). If abstract classes which are super classes of value classes can now have fields, does that encourage developers to adopt patterns which rely on pointer polymorphism and forfeit flattening? The concern here is less about can we do this but more about should we do it? Does this hold together with our story on implicit constructors (generate both an implicit_creation attribute and a method in the class)? --Dan > > I need to think more about this, but it seems to me at the moment that > everything falls out cleanly... > > >> (If it works, does this mean we get support for super fields "for > free"?) > > > > That is probably true. Do we care? > > I'd be happy to get rid of special rules that have to do with super > fields. (Replacing it with a rule that says certain shapes of abstract > class constructors imply identity.) Not so much because of particular use > cases, but because it makes the language more regular. > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From john.r.rose at oracle.com Tue Aug 22 19:29:21 2023 From: john.r.rose at oracle.com (John Rose) Date: Tue, 22 Aug 2023 12:29:21 -0700 Subject: The last miles In-Reply-To: References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr> <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com> <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com> <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com> <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com> <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com> Message-ID: <57D9C555-744A-4B21-8E70-89AB0CE41CEE@oracle.com> On 22 Aug 2023, at 11:50, Dan Heidinga wrote: ? >>>> 4) I'm not sure the prohibition on 'super' calls is actually necessary. >>> >>> No, but it?s a move of economy. Defining the meaning of super for >>> values would be extra work. We could do that; I?d prefer not to. >>> Remember that super-constructors for values are already very special >>> animals: They must be empty in a special sense. Forbidding calls to >>> them seems like the clean move. >> >> The rule is that super constructors must be empty because we had no >> concept of mutable state to communicate changes from parent to child. But >> now that we have larval objects... >> >> Concretely, what if: >> >> - putfield is a verifier error on non-identity class types, it only works >> on uninitializedThis >> - as usual, every method (for all kinds of classes) must do a >> super- invokespecial (or this-? 
>> still thinking about that)
>>
>> Then:
>>
>> - value objects get built bottom-to-top, with fields set before a super()
>> call, and freedom to use 'this' afterwards
>> - abstract classes can participate too, following the same code shape
>> - identity classes (abstract and concrete) have a little more freedom,
>> because they can follow the same pattern *or* set their fields after the
>> super() call
>>
>
> Flattened values can't support layout polymorphism which means all uses of
> a common abstract class will need to rely on pointer polymorphism (this was
> already true before this change).

That's true, except in the case of inlining, which often removes the need for pointer polymorphism.

In this case, the interpreter needs a real physical buffer on the heap (or stack, maybe) for the larval value so that it can be passed to both Bottom.<init> and then Top.<init>. (Here final Bottom <: abstract Top.) But the JIT can inline Top.<init> and Bottom.<init> into the same place, and then dissolve the buffer into scalar components.

I am very confident that inlining policy tweaks can fix this in nearly all cases, reducing the problem of heap-based larvae, down into the noise. And/or adjusting the calling sequence of <init> can do so as well. The constructor Top.<init> could be adjusted to return the state for just the super fields.

This stuff about inlining or special handling of Top.<init> is assuming we adopt a new feature, not planned for Valhalla, which is non-empty super constructors for values, as required by abstract supers of values which would define their own fields. I have always thought fields in abstract supers of values is a very reasonable ask, as a future feature, but I also think we should not delay Valhalla to do it.

> If abstract classes which are super
> classes of value classes can now have fields, does that encourage
> developers to adopt patterns which rely on pointer polymorphism and
> forfeit flattening? The concern here is less about can we do this but more
> about should we do it?
I think we should consider this feature for a future release of Valhalla. There is nothing special about this feature that would make it defeat the VM in its usual job of optimizing whatever the user throws at it. As usual inlining will probably be sufficient, and if not there are other tricks we can play.

> Does this hold together with our story on implicit constructors (generate
> both an implicit_creation attribute and a method in the class)?

Probably, but that's a question Dan can answer more surely than me.

Again, the interpreter is not my biggest performance concern, but it is worth noting that the current <vnew>-based design has the interpreter creating one new buffered value per withfield operation (which is per non-static field in the class), while the proposed <init>-based design requires one new (larval) buffered value per instance. That means the interpreter (which I don't really care about) will create no fewer buffered values in the old scheme, as long as there is at least one field in the value class. That's a (small) win for the <init>-based scheme.

And for the more important case of the JIT, which has its own IR for value types unrelated to the interpreter, a calling sequence adjustment (which we already do in other cases!) will make the buffer disappear completely.

From john.r.rose at oracle.com Tue Aug 22 20:02:05 2023
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 22 Aug 2023 13:02:05 -0700
Subject: The last miles
In-Reply-To: <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr>
 <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com>
 <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com>
 <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
 <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com>
 <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com>
Message-ID: <77DDA762-00BF-46A3-861A-60A4C81A8CE9@oracle.com>

On 22 Aug 2023, at 11:03, Dan Smith wrote:

...
>>> 4) I'm not sure the prohibition on 'super' calls is actually
>>> necessary.
>>
>> No, but it's a move of economy. Defining the meaning of super for
>> values would be extra work. We could do that; I'd prefer not to.
>> Remember that super-constructors for values are already very special
>> animals: They must be empty in a special sense. Forbidding calls to
>> them seems like the clean move.
>
> The rule is that super constructors must be empty because we had no
> concept of mutable state to communicate changes from parent to child.
> But now that we have larval objects...
>
> Concretely, what if:
>
> - putfield is a verifier error on non-identity class types, it only
> works on uninitializedThis

Alternatively, we don't need to touch the verifier if we use a dynamic larval-bit check on putfield. I like that much better. The effect on the language and the translation strategy (T.S.) is the same as a verifier rule.

Specifically, allow putfield (on a final field) throughout the <init> of the same class, just as we do now. Add a dynamic check on putfield that throws an error if the object has been frozen. The interpreter creates the larval buffer with the "frozen" bit clear, and sets it on exit from the appropriate constructor. (Probably the larval state is a distinct header pattern; I am speaking here of a logical frozen bit.)

> - as usual, every <init> method (for all kinds of classes) must do a
> super-<init> invokespecial (or this-<init>? still thinking about that)

If we keep the rules exactly as they are, then we can make putfield safe for values like this:

- putfield dynamically asserts unfrozen (object header in larval state)
- exit from any constructor (super or not) sets frozen bit if unfrozen

Both of these rules appeal to dynamic state, and so do not require verifier changes.

For calls to a "this" constructor, the implication is that the object is frozen (becomes a mature usable adult value) after invokespecial V.<init> (V is the value class). For calls to a "super"
constructor, the implication is that the object is frozen after invokespecial S.<init> (S is the super).

The freeze operation can be something simple, like this pseudocode:

    if (this->header().is_in_larval_state())
      this->set_header(this->klass()->adult_state());

Or a CAS could be used if there are GC threads lurking nearby.

The return instruction(s) of such a constructor can be "hacked" in the interpreter to perform the freeze operation, as one of the many steps performed by the interpreter when a stack frame is taken down.

> Then:
>
> - value objects get built bottom-to-top, with fields set before a
> super() call, and freedom to use 'this' afterwards
> - abstract classes can participate too, following the same code shape
> - identity classes (abstract and concrete) have a little more freedom,
> because they can follow the same pattern *or* set their fields after
> the super() call

Yes. All of these use cases can be made to work with zero verifier changes.

If we do not forbid the super-<init> call (or even require it, as in today's verifier) then we need a way for the interpreter to execute it. Can we just allow today's verifier and interpreter rules to "bang away" on the larval value? Maybe, if the first constructor return (including Object.<init>) promotes the value to adult. This puts the burden on the coder of values to push the <init> calls (including the whole super chain) to the end of the <init> method, after all putfields.

If this actually works, I suppose it would lead to an even simpler JVMS. We might not need special markings for constructors of abstract supers.

> I need to think more about this, but it seems to me at the moment that
> everything falls out cleanly...
>
>>> (If it works, does this mean we get support for super fields "for
>>> free"?)
>>
>> That is probably true. Do we care?
>
> I'd be happy to get rid of special rules that have to do with super
> fields. (Replacing it with a rule that says certain shapes of abstract
> class constructors imply identity.)
> Not so much because of particular
> use cases, but because it makes the language more regular.

Let's think about it, but also be willing to push this off to later, at the cost of putting in special markings for abstract supers that are value-capable.

From daniel.smith at oracle.com Tue Aug 22 20:27:19 2023
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 22 Aug 2023 20:27:19 +0000
Subject: The last miles
In-Reply-To: <57D9C555-744A-4B21-8E70-89AB0CE41CEE@oracle.com>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr>
 <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com>
 <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com>
 <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
 <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com>
 <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com>
 <57D9C555-744A-4B21-8E70-89AB0CE41CEE@oracle.com>
Message-ID: <12B75E2E-AEA4-4B88-BA45-D183C165F6F2@oracle.com>

On Aug 22, 2023, at 12:29 PM, John Rose wrote:

>> If abstract classes which are super classes of value classes can now have fields, does that encourage developers to adopt patterns which rely on pointer polymorphism and forfeit flattening? The concern here is less about can we do this but more about should we do it?
>
> I think we should consider this feature for a future release of Valhalla. There is nothing special about this feature that would make it defeat the VM in its usual job of optimizing whatever the user throws at it. As usual inlining will probably be sufficient, and if not there are other tricks we can play.

I'll add that I think the moral hazard of writing polymorphic code is equally present in interfaces (default methods) and fieldless abstract classes (concrete methods that don't mention fields). I'm not seeing that dynamic significantly changing in the presence of fields.
Does this hold together with our story on implicit constructors
(generate both an implicit_creation attribute and a method in the
class)?

Probably, but that's a question Dan can answer more surely than me.

Yeah, the "I allow implicit instance creation" metadata is, at the
class file level, orthogonal to constructors.  The flag enables the VM
to create all-zeros instances without running any code.  The
constructor bodies determine what happens when a constructor is
explicitly invoked.

From daniel.smith at oracle.com Tue Aug 22 20:33:45 2023
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 22 Aug 2023 20:33:45 +0000
Subject: The last miles
In-Reply-To: <77DDA762-00BF-46A3-861A-60A4C81A8CE9@oracle.com>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr>
 <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com>
 <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com>
 <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
 <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com>
 <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com>
 <77DDA762-00BF-46A3-861A-60A4C81A8CE9@oracle.com>
Message-ID: <4F71AD80-5C51-4082-8AD5-4AB32095BE9D@oracle.com>

> On Aug 22, 2023, at 1:02 PM, John Rose wrote:
>
>> - putfield is a verifier error on non-identity class types, it only works on uninitializedThis
> Alternatively, we don't need to touch the verifier if we use a dynamic
> larval-bit check on putfield.

Okay, but FWIW, we need verification to restrict putfield anyway:
outside of <init>, putfield on a value class field is a verification
error.

It would probably be more trouble than it's worth to make the verifier
*allow* putfield on a value class type (distinct from an
'uninitializedThis' type) in an <init> method: we'd have a
special-purpose verification rule to allow it, and then a matching
runtime rule to reject it.

From daniel.smith at oracle.com Tue Aug 22 20:37:44 2023
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 22 Aug 2023 20:37:44 +0000
Subject: The last miles
In-Reply-To: <4F71AD80-5C51-4082-8AD5-4AB32095BE9D@oracle.com>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr>
 <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com>
 <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com>
 <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
 <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com>
 <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com>
 <77DDA762-00BF-46A3-861A-60A4C81A8CE9@oracle.com>
 <4F71AD80-5C51-4082-8AD5-4AB32095BE9D@oracle.com>
Message-ID: <8B73D408-718E-41DC-8889-C2863C9F6101@oracle.com>

> On Aug 22, 2023, at 1:33 PM, Dan Smith wrote:
>
>> On Aug 22, 2023, at 1:02 PM, John Rose wrote:
>>
>>> - putfield is a verifier error on non-identity class types, it only works on uninitializedThis
>> Alternatively, we don't need to touch the verifier if we use a dynamic
>> larval-bit check on putfield.
>
> Okay, but FWIW, we need verification to restrict putfield anyway: outside of <init>, putfield on a value class field is a verification error.

Err, spoke too soon, please ignore.  I checked the spec and actually
this is enforced via the 'final' linkage check, not anything in
verification.

From heidinga at redhat.com Wed Aug 23 14:17:18 2023
From: heidinga at redhat.com (Dan Heidinga)
Date: Wed, 23 Aug 2023 10:17:18 -0400
Subject: JEP 401 revisions: Null-Restricted Value Object Storage
In-Reply-To:
References:
Message-ID:

Some comments based on first reading:

> The HotSpot implementation then optimizes these fields and arrays by storing value objects directly in flattened storage, without any object headers, indirections, or null flags.

Should that say "implementation may optimize ..." as there are other
considerations, such as size, which affect flattening?

> As for primitive-typed fields, JVMs could be given permission to create this zero Point object implicitly by simply allocating a field of an appropriate type.

Should this also mention array slots in addition to fields?

> Similarly, some value classes are like long and double, able to interpret values implicitly created by non-atomic reads and writes.

This reads awkwardly.  Maybe adapt as "double, and are able to...."?
Otherwise the fragment following the second comma reads as incomplete.

In the "Null-restricted types" section, there's a
"public Cursor(Point! position)" constructor shown, but the JEP has so
far only talked about type restrictions for fields and array slots.
We've talked about supporting type restrictions in methods as being
similar to erased generics (checked at the callsite) and also about
not allowing them in method declarations yet.  Do we want to avoid
showing the "!" in methods for this JEP?

> A concrete class that implements LooselyConsistentValue (directly or indirectly) must be a value class and must declare an implicit constructor.

Where is this check going to be specified?  I think it needs to occur
as part of verification (5.4.1) so the check is complete before static
fields are prepared.  I think we need to assert the check both ways:
having just one of "implements LooselyConsistentValue" or the
ImplicitCreation[ACC_NON_ATOMIC] flag, without the other, should fail
verification.

> If the value of a field is implicitly set to a value class's initial instance, the named value class must be initialized before the field can be read.

What about before it can be written?  Does code that does a putstatic
of Point.default to a "Point! x" field force Point to be initialized?
Or just a read from the "x" field?

I think the JEP needs to more explicitly handle what happens to method
descriptors using the "!" syntax.  It shows uses of it but doesn't
talk about how it is encoded, erased, or otherwise handled.

--Dan

On Mon, Aug 14, 2023 at 7:26 PM Dan Smith wrote:

> I've made some revisions to JEP 401 to align with our latest design ideas
> for expressing flattenability in the language and in class files.
>
> https://openjdk.org/jeps/401
>
> At one point I was considering introducing nullness features in a separate
> JEP, but the consensus seems to be that we're better off delivering a
> smaller version of nullness features first, only applicable to value
> classes. So I've revised the JEP title to be "Null-Restricted Value Object
> Storage" and eliminated the dependency on a separate nullness JEP.
>
> (Don't forget that the core Value Objects concepts related to identity
> have been lifted out into their own JEP, https://openjdk.org/jeps/8277163,
> which is a prerequisite to JEP 401.)
>

From daniel.smith at oracle.com Wed Aug 23 14:47:55 2023
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 23 Aug 2023 14:47:55 +0000
Subject: EG meeting 2023-08-23
Message-ID:

An EG meeting will be held today, August 23, at 4pm UTC (9am PDT, 12pm
EDT).

To discuss:

- John's document about new/dup/init as a replacement for
  vnew/aconst_init/withfield

- My revised JEP 401

From john.r.rose at oracle.com Wed Aug 23 15:00:38 2023
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 23 Aug 2023 15:00:38 +0000
Subject: The last miles
In-Reply-To: <8B73D408-718E-41DC-8889-C2863C9F6101@oracle.com>
References: <1724794551.103459871.1689233963536.JavaMail.zimbra@univ-eiffel.fr>
 <940032A1-D914-474B-8473-7DAE200ACF40@oracle.com>
 <939E976C-5088-4AE2-987E-D5EFBEF734C1@oracle.com>
 <4B804FDD-E843-44D2-BE66-E915C7ECD258@oracle.com>
 <2781CA35-DED4-4494-B65F-86CD005D3D01@oracle.com>
 <94A14763-DD2D-40A6-A0CB-77E5477B0EE2@oracle.com>
 <77DDA762-00BF-46A3-861A-60A4C81A8CE9@oracle.com>
 <4F71AD80-5C51-4082-8AD5-4AB32095BE9D@oracle.com>
 <8B73D408-718E-41DC-8889-C2863C9F6101@oracle.com>
Message-ID:

Yes, that's not a loophole.
We just need a new dynamic check on putfield against the larval state
to exclude field writes in init methods for values that are frozen.

No new verifier or linkage rule is needed for putfield.

It's a surprise.

> On Aug 22, 2023, at 1:37 PM, Dan Smith wrote:
>
>>
>>> On Aug 22, 2023, at 1:33 PM, Dan Smith wrote:
>>>
>>>> On Aug 22, 2023, at 1:02 PM, John Rose wrote:
>>>
>>>> - putfield is a verifier error on non-identity class types, it only works on uninitializedThis
>>> Alternatively, we don't need to touch the verifier if we use a dynamic
>>> larval-bit check on putfield.
>>
>> Okay, but FWIW, we need verification to restrict putfield anyway: outside of <init>, putfield on a value class field is a verification error.
>
>
> Err, spoke too soon, please ignore. I checked the spec and actually this is enforced via the 'final' linkage check, not anything in verification.
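
The larval-bit protocol discussed in this thread can be modelled in
plain Java.  This is only a sketch under stated assumptions, not how
HotSpot implements it: LarvalSketch, ValueBuffer, LARVAL_BIT and the
method names are hypothetical, the "header" is an AtomicInteger, and
the fields are an int array.  It shows "new" producing a larval
instance, putfield guarded by a dynamic check, and a CAS-based freeze
at constructor return:

```java
// Hypothetical model of the larval protocol; not a real JVM or HotSpot API.
final class LarvalSketch {

    /** Assumed header layout: a single bit marks the larval state. */
    static final int LARVAL_BIT = 1;

    /** Stand-in for a buffered value object. */
    static final class ValueBuffer {
        // "new" on a value class creates the instance in the larval state.
        private final java.util.concurrent.atomic.AtomicInteger header =
                new java.util.concurrent.atomic.AtomicInteger(LARVAL_BIT);
        private final int[] fields;

        ValueBuffer(int fieldCount) {
            this.fields = new int[fieldCount];
        }

        boolean isLarval() {
            return (header.get() & LARVAL_BIT) != 0;
        }

        /** Models putfield: the dynamic check rejects writes to a frozen value. */
        void putfield(int index, int value) {
            if (!isLarval()) {
                throw new IllegalStateException("putfield on a frozen value object");
            }
            fields[index] = value;
        }

        /** Models getfield; reads are allowed in either state. */
        int getfield(int index) {
            return fields[index];
        }

        /** Models the freeze at constructor return, using CAS; idempotent. */
        void freeze() {
            int h;
            do {
                h = header.get();
                if ((h & LARVAL_BIT) == 0) {
                    return;                  // already adult, nothing to do
                }
            } while (!header.compareAndSet(h, h & ~LARVAL_BIT));
        }
    }

    /** Models new/dup/invokespecial for a two-field value class. */
    static ValueBuffer construct(int x, int y) {
        ValueBuffer v = new ValueBuffer(2);  // "new": larval instance
        v.putfield(0, x);                    // constructor body: putfields first
        v.putfield(1, y);
        v.freeze();                          // first constructor return freezes
        return v;
    }

    public static void main(String[] args) {
        ValueBuffer p = construct(3, 4);
        System.out.println("larval after construction: " + p.isLarval());
        try {
            p.putfield(0, 99);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model the putfields come before the freeze, matching the
"fields set before the super() call" shape; a write attempted after
the freeze fails the dynamic check, with no verifier rule involved.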