From andrey.breslav at jetbrains.com  Tue Aug  1 23:17:01 2017
From: andrey.breslav at jetbrains.com (Andrey Breslav)
Date: Tue, 01 Aug 2017 23:17:01 +0000
Subject: [patterns] Are we considering GADTs for pattern matching?
Message-ID:

Hi,

Since we are looking into pattern matching for Java, I think we should
explore carefully even those features that look more exotic and may
never end up in the final design. Generalized Algebraic Data Types
(GADTs) are an example of this kind of feature.

Here's a quick and fairly nice description of the feature and some
issues related to it that arose in Scala:
https://gist.github.com/smarter/2e1c564c83bae58c65b4f3f041bfb15f

To recap, we are basically looking at this kind of code:

class Expr<T>
class IntExpr(Integer i) extends Expr<Integer> {}
class BoolExpr(Boolean b) extends Expr<Boolean> {}

<T> T eval(Expr<T> e) {
    switch (e) {
        case IntExpr ie: return ie.i; // Infer T=Integer, so OK to return i
        case BoolExpr be: return be.b; // Infer T=Boolean, so OK to return b
    }
    ...
}

So, the gist of it is that sometimes we can infer the type variables if
a dynamic type test has succeeded. The same would work for if-match.

I see the following questions here:
1. Should we have this in Java at all? It may be a bit more magic than
   we are used to having in Java.
2. Most issues related to GADTs that I'm aware of arise in the realm of
   structural types, i.e. functional languages, or declaration-site
   variance (which Java luckily doesn't have today), but we'd have to
   look very carefully at Java generics to make sure this sort of thing
   won't blow up anywhere around wildcards and/or recursive type bounds,
   for example.

Observation: if we answer "yes" to 1), then this behaviour should be
added as soon as we add any pattern matching, because even the most
limited form of matching includes it. It seems to me that adding this
later will be a breaking change.
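[Editorial sketch] To make the question concrete, here is roughly what
the eval example looks like in plain Java without any GADT-style
refinement. All class names are hypothetical and follow the snippet
above. Because the compiler learns nothing about T from a successful
type test, each branch needs an unchecked cast; those casts are what
the proposed inference would make unnecessary.

```java
// Hypothetical names, mirroring the example in the message above.
abstract class Expr<T> {}

final class IntExpr extends Expr<Integer> {
    final Integer i;
    IntExpr(Integer i) { this.i = i; }
}

final class BoolExpr extends Expr<Boolean> {
    final Boolean b;
    BoolExpr(Boolean b) { this.b = b; }
}

final class GadtSketch {
    // Without refinement, the compiler does not conclude T=Integer from
    // "e instanceof IntExpr", so every branch casts its result to T.
    @SuppressWarnings("unchecked")
    static <T> T eval(Expr<T> e) {
        if (e instanceof IntExpr) {
            return (T) ((IntExpr) e).i;
        }
        if (e instanceof BoolExpr) {
            return (T) ((BoolExpr) e).b;
        }
        throw new IllegalArgumentException("unknown expression: " + e);
    }

    public static void main(String[] args) {
        Integer n = eval(new IntExpr(42));       // T inferred as Integer at the call site
        Boolean flag = eval(new BoolExpr(true)); // T inferred as Boolean at the call site
        System.out.println(n + " " + flag);
    }
}
```

The casts are safe here only because each subclass fixes the type
argument of Expr; the compiler cannot check that, which is the gap the
GADT discussion is about.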
--
Andrey Breslav
Project Lead of Kotlin
JetBrains
http://kotlinlang.org/
The Drive to Develop

From gavin.bierman at oracle.com  Wed Aug  2 00:02:12 2017
From: gavin.bierman at oracle.com (gavin.bierman at oracle.com)
Date: Tue, 1 Aug 2017 17:02:12 -0700
Subject: [patterns] Are we considering GADTs for pattern matching?
In-Reply-To:
References:
Message-ID: <69D34EBB-CA71-4251-BDFE-CCD58FD7831C@oracle.com>

I agree that we should think about it. You might be interested in this
point in the research/design space:

https://www.microsoft.com/en-us/research/publication/generalized-algebraic-data-types-and-object-oriented-programming/

Regards,
Gavin

From forax at univ-mlv.fr  Thu Aug 17 16:15:59 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 17 Aug 2017 18:15:59 +0200 (CEST)
Subject: Condy bsm should be idempotent
Message-ID: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>

Hi all,
as some of you may know, I've started to implement ConstantDynamic in
ASM, and the spec currently breaks an invariant of ASM that I would
like to keep.

The ASM API does not expose the constants of the constant pool. It
provides methods that decode/encode instructions and that, as a side
effect, decode constant pool constants on demand and share them
automatically when encoding, so there is no need for an API that
directly exposes the constant pool constants.

The problem is that Condy breaks that, because the VM calls the BSM of
a condy constant only once* per constant pool entry. So, from the
user's point of view, there is a difference between a constant pool
containing two Condy entries referencing the same BSM with the same
arguments and a constant pool containing a single such Condy entry:
in the former case the BSM will be called twice, while in the latter
case the BSM will be called once.

There is a way to solve that: mandate that the BSM of a Condy has to be
idempotent, i.e. a call to a BSM with the same arguments should provide
the same result.

If someone does not want to share several Condy, they can specify
different names for each of them, which still respects the idempotence
criterion.

I've mostly implemented the current semantics in ASM but the result is
ugly: I had to specialize the resolution/caching for Condy, and I'm not
able to reuse the code that deals with invokedynamic. And from the API
point of view, it's awkward because sharing works automatically for all
constants but Condy, for which the user has to take extra care.

Rémi

* let's say threads do not exist.

From maurizio.cimadamore at oracle.com  Thu Aug 17 16:30:23 2017
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 17 Aug 2017 17:30:23 +0100
Subject: Condy bsm should be idempotent
In-Reply-To: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
Message-ID:

Well spotted - this would be an issue for my bytecode API too
(currently my API is assuming that if two condy CP entries are the
same, only one entry has to be written in the resulting classfile
stream).

Maurizio

From paul.sandoz at oracle.com  Thu Aug 17 17:01:32 2017
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Thu, 17 Aug 2017 10:01:32 -0700
Subject: Condy bsm should be idempotent
In-Reply-To:
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
Message-ID:

> On 17 Aug 2017, at 09:30, Maurizio Cimadamore wrote:
>
> Well spotted - this would be an issue for my bytecode API too
> (currently my API is assuming that if two condy CP entries are the
> same, only one entry has to be written in the resulting classfile
> stream).
>

What is wrong with that?

If there are two separate condy CP entries with the same name, type and
BSM index then resolution of those two condy CP entries should produce
the same value. So two such entries are redundant and it's beneficial
to only produce one entry.

Paul.
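
[Editorial sketch] The sharing behaviour that Rémi and Maurizio describe
for their bytecode APIs can be pictured as a writer that interns
structurally equal entries. The classes below are a hypothetical model
written for this thread, not the ASM API: CondyKey stands for a condy
entry (name, descriptor, BSM, static arguments) and SharingPoolWriter
hands back the existing index when an equal entry was already added.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of a condy constant; the BSM is just a string here.
final class CondyKey {
    final String name;
    final String descriptor;
    final String bootstrapMethod;
    final Object[] bootstrapArgs;

    CondyKey(String name, String descriptor, String bootstrapMethod, Object... bootstrapArgs) {
        this.name = name;
        this.descriptor = descriptor;
        this.bootstrapMethod = bootstrapMethod;
        this.bootstrapArgs = bootstrapArgs.clone();
    }

    // Structural equality: same name, descriptor, BSM and BSM arguments.
    @Override public boolean equals(Object o) {
        if (!(o instanceof CondyKey)) return false;
        CondyKey k = (CondyKey) o;
        return name.equals(k.name)
            && descriptor.equals(k.descriptor)
            && bootstrapMethod.equals(k.bootstrapMethod)
            && Arrays.equals(bootstrapArgs, k.bootstrapArgs);
    }

    @Override public int hashCode() {
        return 31 * Objects.hash(name, descriptor, bootstrapMethod)
            + Arrays.hashCode(bootstrapArgs);
    }
}

// Hypothetical writer that shares structurally equal entries.
final class SharingPoolWriter {
    private final Map<CondyKey, Integer> indexes = new HashMap<>();
    private final List<CondyKey> entries = new ArrayList<>();

    // Returns the index of an equal existing entry, or appends a new one.
    int add(CondyKey key) {
        Integer existing = indexes.get(key);
        if (existing != null) return existing;
        entries.add(key);
        int index = entries.size(); // this sketch numbers entries from 1
        indexes.put(key, index);
        return index;
    }

    int size() { return entries.size(); }
}
```

With such a writer, two equal condy entries in an input class file
collapse into one on output, so a BSM that was called twice before the
rewrite is called once after it. That is only invisible if the BSM
gives the same result for the same arguments.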

From brian.goetz at oracle.com  Thu Aug 17 17:22:21 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Aug 2017 13:22:21 -0400
Subject: Condy bsm should be idempotent
In-Reply-To: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
Message-ID:

Can you clarify what you mean by "same"? According to equals(), or ==?

On 8/17/2017 12:15 PM, Remi Forax wrote:
> There is a way to solve that: mandate that the BSM of a Condy has to
> be idempotent, i.e. a call to a BSM with the same arguments should
> provide the same result.

From forax at univ-mlv.fr  Thu Aug 17 17:30:39 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Aug 2017 19:30:39 +0200 (CEST)
Subject: Condy bsm should be idempotent
In-Reply-To:
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
Message-ID: <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>

equals,
if they are structurally equivalent (same name, same descriptor, same
bsm and same bsm arguments), they should produce the same result.

Rémi

From brian.goetz at oracle.com  Thu Aug 17 17:35:33 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Aug 2017 13:35:33 -0400
Subject: Condy bsm should be idempotent
In-Reply-To: <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
 <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>
Message-ID: <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com>

This is going to be problematic for bootstraps that produce, e.g.,
arrays; their equals() method delegates to Object.equals(), which means
such bootstraps would have to maintain an (expensive!) cache for
interning.

It is an old problem that constant resolution can be racy, but we would
like for the JVM to manage the race, not the bootstraps.
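
[Editorial sketch] The array case can be checked directly. In the
snippet below, makeTable is a hypothetical stand-in for a bootstrap
that builds an array from its arguments; two calls return arrays with
equal contents, yet neither == nor equals() treats them as the same
value. Only Arrays.equals compares the elements.

```java
import java.util.Arrays;

final class ArrayEqualityDemo {
    // Hypothetical bootstrap-like factory: a fresh array on every call.
    static int[] makeTable() {
        return new int[] {1, 2, 3};
    }

    public static void main(String[] args) {
        int[] first = makeTable();
        int[] second = makeTable();
        System.out.println(first == second);               // identity comparison
        System.out.println(first.equals(second));          // inherited Object.equals, also identity
        System.out.println(Arrays.equals(first, second));  // element-wise comparison
    }
}
```

A requirement phrased in terms of equals() would therefore force such a
bootstrap to cache and return the very same array instance, which is
the interning cost mentioned above.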

From forax at univ-mlv.fr  Thu Aug 17 17:38:22 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Aug 2017 19:38:22 +0200 (CEST)
Subject: Condy bsm should be idempotent
In-Reply-To:
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
Message-ID: <459872089.496904.1502991502358.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Paul Sandoz"
> To: "Maurizio Cimadamore"
> Cc: "Remi Forax", "amber-spec-experts"
> Sent: Thursday, 17 August 2017 19:01:32
> Subject: Re: Condy bsm should be idempotent

>> On 17 Aug 2017, at 09:30, Maurizio Cimadamore wrote:
>>
>> Well spotted - this would be an issue for my bytecode API too
>> (currently my API is assuming that if two condy CP entries are the
>> same, only one entry has to be written in the resulting classfile
>> stream).
>>
>
> What is wrong with that?
>
> If there are two separate condy CP entries with the same name, type
> and BSM index then resolution of those two condy CP entries should
> produce the same value. So two such entries are redundant and it's
> beneficial to only produce one entry.

Nothing wrong, that's the point! It's just that the current spec does
not allow producing only one entry: according to the current spec, two
calls of the same BSM with the same arguments may produce different
results. That's why I propose to change the spec by adding that the
BSM has to be idempotent.

>
> Paul.

Rémi

From forax at univ-mlv.fr  Thu Aug 17 17:50:34 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Aug 2017 19:50:34 +0200 (CEST)
Subject: Condy bsm should be idempotent
In-Reply-To: <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
 <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>
 <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com>
Message-ID: <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: forax at univ-mlv.fr
> Cc: "amber-spec-experts"
> Sent: Thursday, 17 August 2017 19:35:33
> Subject: Re: Condy bsm should be idempotent

> This is going to be problematic for bootstraps that produce, e.g.,
> arrays; their equals() method delegates to Object.equals(), which means
> such bootstraps would have to maintain an (expensive!) cache for
> interning.

fully agree, I forgot the case of arrays:
structurally equivalent <=> == for primitives, Object.equals for
objects and Arrays.equals() for arrays.

>
> It is an old problem that constant resolution can be racy, but we would
> like for the JVM to manage the race, not the bootstraps.

Either we specify that the BSM has to be idempotent, and the VM doesn't
have to enforce that, or we say that the semantics of condy allow the
resolved constant to be the result of a previous call of the BSM with
the same arguments.

Also, if we want to be able to resolve condy at jlink time, we need the
same kind of wording, because currently a BSM can depend on the time of
day.

Rémi

From brian.goetz at oracle.com  Thu Aug 17 18:07:28 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Aug 2017 14:07:28 -0400
Subject: Condy bsm should be idempotent
In-Reply-To: <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
 <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>
 <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com>
 <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr>
Message-ID:

So, this is more of a purity requirement -- that the result of the
bootstrap be consistently derived from its arguments and no other
state. But you are not asking for heroic interning -- just that the
bootstraps not do anything "funny". (This is in the same category as
"the result should be a constant", though neither the language nor VM
can enforce this.) Right?
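
[Editorial sketch] One way to picture the purity requirement is to
compare two bootstrap-shaped methods. Both method names and the int
static argument are hypothetical; the leading (Lookup, String, Class)
parameters follow the usual shape of a condy bootstrap method. The
methods are called directly here rather than through the JVM's
resolution machinery, which is enough to show the difference.

```java
import java.lang.invoke.MethodHandles;

final class BootstrapPurityDemo {
    // Hidden mutable state that the "funny" bootstrap depends on.
    private static int counter = 0;

    // Pure: the result is derived only from the arguments.
    static Object pureBsm(MethodHandles.Lookup lookup, String name, Class<?> type, int base) {
        return name + ":" + (base * 2);
    }

    // Not pure: the result also depends on how often it was called.
    static Object impureBsm(MethodHandles.Lookup lookup, String name, Class<?> type, int base) {
        return name + ":" + (base + counter++);
    }

    public static void main(String[] args) {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        // Same arguments, equal results: sharing two such entries is unobservable.
        System.out.println(pureBsm(lookup, "K", String.class, 21)
            .equals(pureBsm(lookup, "K", String.class, 21)));
        // Same arguments, different results: sharing changes behaviour.
        System.out.println(impureBsm(lookup, "K", String.class, 21)
            .equals(impureBsm(lookup, "K", String.class, 21)));
    }
}
```

Nothing in the language or the VM rejects the second method, which
matches the remark that the requirement can be stated but not enforced.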

From forax at univ-mlv.fr  Thu Aug 17 18:32:24 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 17 Aug 2017 20:32:24 +0200 (CEST)
Subject: Condy bsm should be idempotent
In-Reply-To:
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
 <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>
 <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com>
 <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr>
Message-ID: <2065802560.499623.1502994744611.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: forax at univ-mlv.fr
> Cc: "amber-spec-experts"
> Sent: Thursday, 17 August 2017 20:07:28
> Subject: Re: Condy bsm should be idempotent

> So, this is more of a purity requirement -- that the result of the
> bootstrap be consistently derived from its arguments and no other
> state. But you are not asking for heroic interning -- just that the
> bootstraps not do anything "funny". (This is in the same category as
> "the result should be a constant", though neither the language nor VM
> can enforce this.) Right?

yes!

Rémi

From brian.goetz at oracle.com  Thu Aug 17 18:41:59 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 17 Aug 2017 14:41:59 -0400
Subject: Condy bsm should be idempotent
In-Reply-To: <2065802560.499623.1502994744611.JavaMail.zimbra@u-pem.fr>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
 <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>
 <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com>
 <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr>
 <2065802560.499623.1502994744611.JavaMail.zimbra@u-pem.fr>
Message-ID: <83719a01-5523-92ec-1406-8428302847b2@oracle.com>

I agree, and I think this is already implied by the race-arbitrating
behavior of CP resolution. If two threads race to resolve the same
CP#, the VM will arbitrarily pick a winner, and toss the losing result.
Which means that both results must be, in some sense, equivalent. But
there's no harm in stating it (just as there's no harm in reminding
people that these are supposed to be CONSTANTS.)

From john.r.rose at oracle.com  Thu Aug 17 21:41:15 2017
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 17 Aug 2017 14:41:15 -0700
Subject: Condy bsm should be idempotent
In-Reply-To: <83719a01-5523-92ec-1406-8428302847b2@oracle.com>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr>
 <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr>
 <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com>
 <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr>
 <2065802560.499623.1502994744611.JavaMail.zimbra@u-pem.fr>
 <83719a01-5523-92ec-1406-8428302847b2@oracle.com>
Message-ID:

On Aug 17, 2017, at 11:41 AM, Brian Goetz wrote:
>
> I agree, and I think this is already implied by the race-arbitrating
> behavior of CP resolution. If two threads race to resolve the same
> CP#, the VM will arbitrarily pick a winner, and toss the losing
> result. Which means that both results must be, in some sense,
> equivalent. But there's no harm in stating it (just as there's no
> harm in reminding people that these are supposed to be CONSTANTS.)

We are on tricky ground here, wanting to say something about
equivalent expressions yielding equivalent results. (And yes,
it's like the similar desire to say that of course a condy result
is, somehow, constant.) There's no good way to enforce these
constraints, short of inventing a restricted subset of Java
that can be proven to have the desired properties, and then
requiring that condy expressions use that subset.

What we can do is give advice to users of condy on how
to use it safely. And then surround those good behaviors
with a spec. which does something reasonably predictable
and safe even if the users go off the rails (by accident or
nefarious design).

There are a lot of ways to win at this, without solving the
halting problem for full Java or designing a compile-time
execution mode for Java. (BTW, I'd like to do the latter,
some day, but for today let's suppose that condy BSMs
are completely unpredictable in their actions, unless
their authors take responsibility for them.)

The current position is for the JVM to uphold a very simple
contract: Each CP entry is distinct (as a contract between
the classfile author and the JVM) and has independent
behavior, which is idempotent. The linkage process
*behind* the CP is not, and cannot be, idempotent,
which is why we have to record both normal and
exceptional linkage results.

Despite the inconvenience for either Remi or ASM users
(and likewise with Maurizio) I think this is the best way
to go because it's the simplest for the most delicate part
of the system, the JVMS. (That's where the attackers
attack, and where needless complexity is to be avoided.)

So, I'd prefer to leave the JVMS as it is, and allow bytecode
generation APIs to cater *only* (or mainly) to well-behaved
authors who would never dream of writing non-idempotent
condys. To complicate the JVMS in order to regularize the
user model of ASM would be a mistake.

But I don't advocate complicating ASM either.
Instead, I think it is perfectly reasonable to do any of
three things in ASM (and other tools like it):

A. Continue normalizing all CP entries, including the new
ones. This means that a null translation might de-duplicate
equivalent condy entries. This will only hurt people who
are creating bad class files on purpose, either as negative
tests or to explore the dark corners of the JVMS behavior.
(Remember, the bright center requires human responsibility.)

B. For the new data-type used by ASM to describe a condy
constant, add a 32-bit "stamp" field which participates in
that type's equals/hashCode/toString methods. This "stamp"
field is an arbitrary value serving only to differentiate
otherwise equivalent condy constants. User-built constants
default their stamp to zero. Constants built during class file
reading default their stamp to the CP index at which they
occur. New condy constants are interned, old ones are
retained distinct. And nobody needs to be the wiser, unless
they choose to look very, very closely at the behavior of ASM.

(B2 Variation: Give the stamp value of zero to every
unique condy constant encountered in a class file. For
the edge case of non-unique constants, give them stamps
of their CP indexes. Other variations are possible. I don't
think the effort would be well spent, because it requires
extra stamp-suppressing comparison logic, which goes
against ASM's minimalist design, and may slightly slow
ASM's processing of condy. Perhaps an optional method
could be given to find a pre-existing condy item that
matches a given one? Nobody will use it, I think.)

C. Say that ASM is free to do either of behaviors A (interning)
or B (keeping distinct), as a matter of implementation. If
you need to predict the treatment of equivalent condy
constants, you need to find a workaround: Either don't
use ASM, or add some salt to the name component of the
condy's name-and-type, and remove it as a post-pass.
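
[Editorial sketch] Option B can be pictured with a small value class.
StampedCondy and its two factory methods are hypothetical names chosen
for this note, not ASM types, and the BSM arguments are left out to
keep the sketch short. The stamp takes part in equals, hashCode and
toString, so user-built constants (stamp zero) compare equal and can be
interned, while constants read from distinct CP indexes stay distinct.

```java
import java.util.Objects;

// Hypothetical condy descriptor with the "stamp" field from option B.
final class StampedCondy {
    final String name;
    final String descriptor;
    final String bootstrapMethod;
    final int stamp; // arbitrary 32-bit differentiator

    private StampedCondy(String name, String descriptor, String bootstrapMethod, int stamp) {
        this.name = name;
        this.descriptor = descriptor;
        this.bootstrapMethod = bootstrapMethod;
        this.stamp = stamp;
    }

    // User-built constants default their stamp to zero.
    static StampedCondy userBuilt(String name, String descriptor, String bootstrapMethod) {
        return new StampedCondy(name, descriptor, bootstrapMethod, 0);
    }

    // Constants read from a class file use their CP index as the stamp.
    static StampedCondy readFromClassFile(String name, String descriptor,
                                          String bootstrapMethod, int cpIndex) {
        return new StampedCondy(name, descriptor, bootstrapMethod, cpIndex);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof StampedCondy)) return false;
        StampedCondy c = (StampedCondy) o;
        return stamp == c.stamp
            && name.equals(c.name)
            && descriptor.equals(c.descriptor)
            && bootstrapMethod.equals(c.bootstrapMethod);
    }

    @Override public int hashCode() {
        return Objects.hash(name, descriptor, bootstrapMethod, stamp);
    }

    @Override public String toString() {
        return name + ":" + descriptor + " via " + bootstrapMethod + " #" + stamp;
    }
}
```

Any interning map keyed on such objects keeps duplicate entries from an
existing class file apart without extra logic, which seems to be the
appeal of B over the B2 variation.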
The choice between A/B/C can be adjusted over time in response to bugs. Perhaps C is the best choice to start with, as a contract, with A as an implementation, switching to B or B2 if users run into actual problems with duplicate condys. (They probably won't.)

The JVM must retain the distinction between equivalent condy constants at distinct CP indexes. It cannot do the interning (in A above) because that's too expensive; that's an off-line tool's job. It might specify the equivalent of C (threaten to intern), but I think that is an empty threat, and could only cause harm down the road.

I'll go even farther: For the JVM, we should specifically test that distinct condy constants with equivalent structure *can* evaluate to distinct results. The purpose of this is not to encourage the use case (although it could be used for things like cryptographic nonces) but rather as a sort of edge behavior test, to ensure that there is no "cross-talk" between constant pool entries.

-- John

From forax at univ-mlv.fr Thu Aug 17 23:40:09 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 18 Aug 2017 01:40:09 +0200 (CEST)
Subject: Condy bsm should be idempotent
In-Reply-To:
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr> <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr> <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com> <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr> <2065802560.499623.1502994744611.JavaMail.zimbra@u-pem.fr> <83719a01-5523-92ec-1406-8428302847b2@oracle.com>
Message-ID: <1339351316.512550.1503013209674.JavaMail.zimbra@u-pem.fr>

> From: "John Rose"
> To: "Brian Goetz"
> Cc: "Rémi Forax", "amber-spec-experts"
> Sent: Thursday, 17 August 2017 23:41:15
> Subject: Re: Condy bsm should be idempotent

> On Aug 17, 2017, at 11:41 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
>> I agree, and I think this is already implied by the race-arbitrating behavior of >> CP resolution.
If two threads race to resolve the same CP#, the VM will >> arbitrarily pick a winner, and toss the losing result. Which means that both >> results must be, in some sense, equivalent. But there's no harm in stating it >> (just as there's no harm in reminding people that these are supposed to be >> CONSTANTS.) > We are on tricky ground here, wanting to say something about > equivalent expressions yielding equivalent results. (And yes, > it's like the similar desire to say that of course a condy result > is, somehow, constant.) There's no good way to enforce these > constraints, short of inventing a restricted subset of Java > that can be proven to have the desired properties, and then > requiring that condy expressions use that subset. > What we can do is give advice to users of condy on how > to use it safely. And then surround those good behaviors > with a spec. which does something reasonably predictable > and safe even if the users go off the rails (by accident or > nefarious design). > There are a lot of ways to win at this, without solving the > halting problem for full Java or designing a compile-time > execution mode for Java. (BTW, I'd like to do the latter, > some day, but for today let's suppose that condy BSMs > are completely unpredictable in their actions, unless > their authors take responsibility for them. > The current position is for the JVM to uphold a very simple > contract: Each CP entry is distinct (as a contract between > the classfile author and the JVM) and has independent > behavior, which is idempotent. The linkage process > *behind* the CP is not, and cannot be, idempotent, > which is why we have to record both normal and > exceptional linkage results. > Despite the inconvenience for either Remi or ASM users > (and likewise with Maurizio) I think this is the best way > to go because it's the simplest for the most delicate part > of the system, the JVMS. (That's where the attackers > attack, and where needless complexity is to be avoided.) 
> So, I'd prefer to leave the JVMS as it is, and allow bytecode > generation APIs to cater *only* (or mainly) to well-behaved > authors who would never dream of writing non-idempotent > condys. > To complicate the JVMS in order to regularize the user > model of ASM would be a mistake. But I don't advocate > complicating ASM either. Instead, I think it is perfectly > reasonable to do any of three things in ASM (and other > tools like it): > A. Continue normalizing all CP entries, including the > new ones. This means that a null translation might > de-duplicate equivalent condy entries. This will > only hurt people who are creating bad class files > on purpose, either as negative tests or to explore > the dark corners of the JVMS behavior. (Remember, > the bright center requires human responsibility.) > B. For the new data-type used by ASM to describe > a condy constant, add a 32-bit "stamp" field which > participates in that type's equals/hashCode/toString > methods. This "stamp" field is an arbitrary value > serving only to differentiate otherwise equivalent > condy constants. User-built constants default > their stamp to zero. Constants built during class > file reading default their stamp to the CP index > at which they occur. New condy constants are > interned, old ones are retained distinct. And > nobody needs to be the wiser, unless they choose > to look very, very close at the behavior of ASM. > (B2 Variation: Give the stamp value of zero to > the every unique condy constant encountered > in a class file. For the edge case of non-unique > constants, give them stamps of their CP indexes. > Other variations are possible. I don't think the > effort would be well spent, because it requires > extra stamp-suppressing comparison logic, which > goes against ASM's minimalist design, and > may slightly slow ASM's processing of condy. > Perhaps an optional method could be given > to find a pre-existing condy item that matches > a given one? Nobody will use it, I think.) 
> C. Say that ASM is free to do either of behaviors > A (interning) or B (keeping distinct), as a matter > of implementation. If you need to predict the > treatment of equivalent condy constants, you > need to find a workaround: Either don't use > ASM, or add some salt to the name component > of the condy's name-and-type, and remove it as a > post-pass. > The choice between A/B/C can be adjusted over > time in response to bugs. Perhaps C is the best > choice to start with, as a contract, with A as an > implementation, switching to B or B2 if users > run into actual problems with duplicate condy's. > (They probably won't.) > The JVM must retain the distinction between equivalent > condy constants at distinct CP indexes. It cannot > do the interning (in A above) because that's too > expensive; that's an off-line tool's job. It might specify > the equivalent of C (threaten to intern), but I think > that is an empty threat, and could only cause harm > down the road. > I'll go even farther: For the JVM, we should specifically > test that distinct condy constants with equivalent > structure *can* evaluate to distinct results. The purpose > of this is not to encourage the use case (although it > could be used for things like cryptographic nonces) > but rather as a sort of edge behavior test, to ensure > that there is no "cross-talk" between constant pool entries. > -- John

I first implemented something like B2. After a private discussion with John about how to implement stamps, I used the constant pool index as the stamp when reading, 0 if you want a shared one, and 65536 if you do not want a shared one. I then decided to take a simpler route (A) after remembering that ASM already has that bug (feature?) of de-duplicating constant pool constants against already existing constants, and very few people complain about that.
For the JDK tests that require several structurally equivalent condys, as John said, one can use ASM to generate two slightly different condys (just change the name) and ask Paul; he told me at the JVM Summit that he secretly wants to become a hex editor expert :)

So I agree that the VM should not try to do any interning and should resolve each condy once. I still think the spec should, at least in a discussion section, say that the returned value should be constant and the BSM should be idempotent.

regards,
Rémi

From john.r.rose at oracle.com Fri Aug 18 00:15:57 2017
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 17 Aug 2017 17:15:57 -0700
Subject: Condy bsm should be idempotent
In-Reply-To: <1339351316.512550.1503013209674.JavaMail.zimbra@u-pem.fr>
References: <214203876.490900.1502986559987.JavaMail.zimbra@u-pem.fr> <1418445754.494598.1502991039573.JavaMail.zimbra@u-pem.fr> <7844ba98-92fa-9b08-9538-5f4e8e532902@oracle.com> <1871167599.497506.1502992234363.JavaMail.zimbra@u-pem.fr> <2065802560.499623.1502994744611.JavaMail.zimbra@u-pem.fr> <83719a01-5523-92ec-1406-8428302847b2@oracle.com> <1339351316.512550.1503013209674.JavaMail.zimbra@u-pem.fr>
Message-ID: <2DE812D7-A8D5-4BE8-B88F-92C7B0DECEF7@oracle.com>

On Aug 17, 2017, at 4:40 PM, forax at univ-mlv.fr wrote:
> so i agree that the VM should not try to do any interning and resolve each condy once, i still think the spec should, at least in a discussion section, say that the returned value should be constant and the bsm should be idempotent

I will add this, as a non-normative comment.

BTW, the problem exists in a milder form with CONSTANT_Class, for example, which if you duplicate it can produce different answers. The system ensures that if you get a Class from both, you will get the same Class. But either or both could independently produce a distinct LinkageError. That's milder than two identical condys producing TRUE vs. FALSE, but it is on the slippery slope.

-- John

From forax at univ-mlv.fr Tue Aug 22 20:23:05 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 22 Aug 2017 22:23:05 +0200 (CEST)
Subject: Using Condy to calculate a method/field descriptor
Message-ID: <1062605753.662245.1503433385889.JavaMail.zimbra@u-pem.fr>

How far are we from using Condy as a computable String to represent a field/method descriptor? This may be useful for representing reified generics, if a descriptor can be calculated from something representing a type argument. At the least, it's a good replacement for what I currently do with Unsafe.defineAnonymousClass, i.e. it would replace constant pool patching.

Rémi