From gavin.bierman at oracle.com  Fri Oct  1 12:49:01 2021
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 1 Oct 2021 12:49:01 +0000
Subject: Pattern Matching for switch (Second Preview)
In-Reply-To: <4fef4bed-da6c-f6c0-46df-bbe551408bae@oracle.com>
References: <5321c128-ba4d-e405-255a-72025a002e0d@oracle.com>
 <1bed58ce-015c-289b-5696-6dac3c539f6a@oracle.com>
 <4fef4bed-da6c-f6c0-46df-bbe551408bae@oracle.com>
Message-ID: <7A62F8A6-3946-413F-8CC4-3708176D3EE8@oracle.com>

On 30 Sep 2021, at 23:25, Brian Goetz wrote:

> [ moving to a-s-e ]
>
> I get the concern that a type pattern is no longer "just a variable
> declaration"; that was a nice part of the "patterns aren't really so
> hard to understand" story.  But I think the usability is likely to be
> not very good.  Take this example:
>
>     sealed interface Node<T> { }
>     record AddNode<T>(Node<T> left, Node<T> right) implements Node<T> { }
>     ...
>     Node<Integer> ni = ...
>     switch (ni) {
>         case AddNode(Node left, Node right) -> ...
>
> There's no instantiation of Node possible here *other than*
> Node<Integer>.  Which means we are forcing users to either redundantly
> type out the instantiation (which can get big), or use casts inside the
> body when they pull things out of left and right.  (And patterns were
> supposed to make casts go away.)  There's almost no case where someone
> wants a raw type here.

But surely they should write var here?
Gavin

From forax at univ-mlv.fr  Fri Oct  1 13:12:48 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 1 Oct 2021 15:12:48 +0200 (CEST)
Subject: Pattern Matching for switch (Second Preview)
In-Reply-To: <7A62F8A6-3946-413F-8CC4-3708176D3EE8@oracle.com>
References: <5321c128-ba4d-e405-255a-72025a002e0d@oracle.com>
 <1bed58ce-015c-289b-5696-6dac3c539f6a@oracle.com>
 <4fef4bed-da6c-f6c0-46df-bbe551408bae@oracle.com>
 <7A62F8A6-3946-413F-8CC4-3708176D3EE8@oracle.com>
Message-ID: <537785940.2631200.1633093968865.JavaMail.zimbra@u-pem.fr>

> From: "Gavin Bierman"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Friday, 1 October 2021 14:49:01
> Subject: Re: Pattern Matching for switch (Second Preview)

> But surely they should write var here?

yes, here is another example

    List list = ...
    switch(list) {
      case ArrayList al -> ...

> Gavin

Rémi

From cushon at google.com  Wed Oct  6 18:15:16 2021
From: cushon at google.com (Liam Miller-Cushon)
Date: Wed, 6 Oct 2021 11:15:16 -0700
Subject: [External] : Re: Minor improvement to anonymous classes
In-Reply-To: <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <2ED211EA-3F96-4129-B5BF-9A262C917D9F@oracle.com>
 <1320762308.901257.1627755887681.JavaMail.zimbra@u-pem.fr>
 <32bade08-4697-a2fd-52d2-491822c14d19@oracle.com>
 <1699135460.215325.1627933584253.JavaMail.zimbra@u-pem.fr>
 <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>

Belatedly returning to this, +Joe Darcy helped with some corpus analysis
in the CSR [1] (thanks!).

The analysis didn't reveal any build breakages from optimizing away
this$0, but it did reveal hundreds of textual occurrences of this$0. The
behaviour of those occurrences of this$0 could potentially change if code
is reflectively accessing the enclosing instance, and if it expects to be
able to do that even if the inner class doesn't capture any enclosing
instance state.

As I mentioned earlier we've been using a version of the patch at Google
since 2016 [2]. Rolling it out required a very small amount of cleanup,
and I am not aware of it causing any issues since then (including with
third party libraries that might have been relying on the hack). We have
some remaining occurrences of this$0 in our code, which are not affected
by the change because they only need to handle this$0 in classes that
actually capture their enclosing instance.

So from my (admittedly limited) perspective, this change is beneficial,
and has minor compatibility impact relative to the other breaking changes
we've absorbed.

I'm curious if anyone has suggestions about how to get other data or
perspectives that might help decide how to proceed here?

If we can't get conclusive information on how much code would be affected
by this, maybe it would be sufficient to roll the change out more
conservatively, e.g. by only enabling it for new language levels? Has the
'preview feature' mechanism ever been used for things like this, or is it
intended more for new features that are visible in the spec?
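As an illustration of the compatibility concern described above, here is a minimal sketch of reflective code that tolerates the enclosing-instance field being absent. The class names (EnclosingInstanceLookup, Host, Capturing, Nested) are hypothetical and not taken from the CSR or from any of the affected libraries; the only assumption about javac is the this$ naming of the synthetic field that the thread itself mentions.

```java
import java.lang.reflect.Field;
import java.util.Optional;

/** Sketch: look up an inner object's enclosing instance without assuming the field exists. */
final class EnclosingInstanceLookup {

    /**
     * Returns the value of the first declared field whose name starts with "this$",
     * or empty if there is none (static nested class, or a compiler that omitted
     * the unused enclosing-instance field).
     */
    static Optional<Object> enclosingInstance(Object inner) {
        for (Field field : inner.getClass().getDeclaredFields()) {
            if (field.getName().startsWith("this$")) {
                try {
                    field.setAccessible(true);
                    return Optional.ofNullable(field.get(inner));
                } catch (ReflectiveOperationException | RuntimeException e) {
                    // Field exists but cannot be read: treat it as absent rather than fail.
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Host host = new Host();
        System.out.println(enclosingInstance(host.new Capturing()).isPresent());
        System.out.println(enclosingInstance(new Host.Nested()).isPresent());
    }
}

/** Hypothetical demo classes. */
class Host {
    int counter = 41;

    /** Reads state of the enclosing Host, so it has to keep a reference to it. */
    class Capturing {
        int next() { return counter + 1; }
    }

    /** Static nested class: never has an enclosing instance. */
    static class Nested { }
}
```

Code written this way behaves the same whether or not the compiler drops the field from inner classes that do not use their enclosing instance; only code that insists on finding the field in such classes would notice the change.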
[1]
https://bugs.openjdk.java.net/browse/JDK-8271717?focusedCommentId=14442858&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14442858
[2]
https://bugs.openjdk.java.net/browse/JDK-8271623?focusedCommentId=14439152&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14439152

On Tue, Aug 3, 2021 at 5:41 AM Brian Goetz wrote:

> Yes, local classes too.  Essentially, this is for translation of
> "effectively static" inner classes.
>
> I think this is independent of explicit-static or not; explicit-static
> allows the programmer to capture intent and get more type checking as a
> result.  This is about generating better code.
>
> On 8/3/2021 12:52 AM, Tagir Valeev wrote:
> > Another possible semantics change is the object lifetime. The code
> > might rely on prolonged lifetime of the surrounding object if there
> > are soft/weak/phantom references. E.g., the outer object might be
> > registered via Cleaner, and the change may cause freeing the resource
> > earlier than expected. Likely, this is a very rare scenario but if it
> > happens, it could be quite hard to identify the root cause, as the
> > problem will appear only if the object is collected within the
> > specific timeframe.
> >
> > By the way, are we speaking about anonymous classes only? I think,
> > local classes could be updated in the similar manner. Especially given
> > the fact that now local records don't capture the surrounding "this"
> > but if we convert the record to an equivalent local class, it will
> > capture:
> >
> > public class Test {
> >   void test() {
> >     record R() {} // does not capture Test instance
> >     class C {} // captures Test instance
> >   }
> > }
> >
> > Or should we allow explicit 'static' modifier on local classes?
> >
> > Best regards,
> > Tagir Valeev.
> >
> > On Tue, Aug 3, 2021 at 2:47 AM wrote:
> >> We may have some trouble with the usual suspect, Serialization,
> >> There are classes like exceptions or Swing UI classes that are marked
> >> as Serializable and can be implemented as an anonymous class.
> >> In that case, removing the backpointer if it is not used may change
> >> the serialization format.
> >>
> >> And yes, an anonymous class does not have a "stable" name but people
> >> do not seem to care too much about that ...
> >>
> >> Rémi
> >>
> >> ----- Original Message -----
> >>> From: "Brian Goetz"
> >>> To: "Liam Miller-Cushon"
> >>> Cc: "Remi Forax", "John Rose" <john.r.rose at oracle.com>,
> >>> "amber-spec-experts"
> >>> Sent: Monday, 2 August 2021 20:18:56
> >>> Subject: Re: [External] : Re: Minor improvement to anonymous classes
> >>> FWIW, making this fix not only reduces the memory leak risk, but has a
> >>> number of nice follow-on benefits that can often trigger further
> >>> follow-on benefits:
> >>>
> >>>   - fewer fields, so reduced footprint;
> >>>   - fewer fields might mean more objects fall under the scalarization
> >>>     threshold, when applicable;
> >>>   - less work in constructors;
> >>>   - shorter constructors mean more constructors fall under the inlining
> >>>     threshold;
> >>>   - more inlining might lead to other optimizations.
> >>>
> >>> So it wouldn't surprise me to see macro-level effects even on programs
> >>> without memory leaks.
> >>>
> >>>> I filed https://bugs.openjdk.java.net/browse/JDK-8271623
> >>>> to track that enhancement.

From brian.goetz at oracle.com  Wed Oct  6 21:08:45 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 6 Oct 2021 21:08:45 +0000
Subject: [External] : Re: Minor improvement to anonymous classes
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <2ED211EA-3F96-4129-B5BF-9A262C917D9F@oracle.com>
 <1320762308.901257.1627755887681.JavaMail.zimbra@u-pem.fr>
 <32bade08-4697-a2fd-52d2-491822c14d19@oracle.com>
 <1699135460.215325.1627933584253.JavaMail.zimbra@u-pem.fr>
 <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>

I think you've done pretty good due diligence here.  One more thing we
could do is reach out to the most popular libraries that do this and give
them a heads up that they need to tolerate the field not being there.
But overall, the benefit accrues to the 99.999% of users that follow the
rules.

From maurizio.cimadamore at oracle.com  Thu Oct  7 14:36:55 2021
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 7 Oct 2021 15:36:55 +0100
Subject: Minor improvement to anonymous classes
In-Reply-To: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
Message-ID: <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>

Your proposal is for anon classes, which I think works well.

One related case I found quite often is the desire to combine types e.g.
in a return type:

    Foo & AutoCloseable getCloseableFoo();

This veers into declaration-land, visible to javadoc and all. So it
probably doesn't have a good benefit vs. cost ratio. Also, this is
effectively adding intersection types (I don't think restricting at
return types only will be feasible).

But I'm puzzled by the fact that programmers can, in a way, do something
similar to the above with a generic method:

    <X extends Foo & AutoCloseable> X getCloseableFoo();

Which kind of works, but it's quite a horrible hack (you introduce a type
parameter you don't need - which means the compiler will try to infer
types, etc.)

I'm not suggesting we have to solve this - just wanted to make sure this
was somewhere on the radar.

Maurizio

On 30/07/2021 15:52, Brian Goetz wrote:
> I have been working on a library where I've found myself repeatedly
> refactoring what should be anonymous classes into named (often local)
> classes, for the sole reason that I want to combine interfaces with an
> abstract base class:
>
>     interface Foo { ... lots of stuff .. }
>     abstract class AbstractFoo { ... lots of base implementation ... }
>
>     interface RedFoo extends Foo { void red(); }
>
> and I want a factory that yields a RedFoo that is based on AbstractFoo
> and implements red().  Trivial with a named class, but there's no
> reason I should not be able to do that with an anonymous class, since
> I have no need of the name.
>
> We already address this problem elsewhere; there are several places in
> the grammar where you can append additional _interfaces_ with &, such as:
>
>     class X { ... }
>
> and casts (which can be target types for lambdas.)
>
> These are not full-blown intersection types, but accommodate the
> fact that classes have one superclass and potentially multiple
> interfaces.  It appears simple to extend this to inner class creation
> expressions:
>
>     new AbstractFoo(args) & RedFoo { ... }
>
> This would also smooth out a rough edge refactoring between lambdas
> and anonymous classes.
>
> I suspect there are one or two other places in the spec that could use
> this treatment.
>
> (Note that this is explicitly *not* a call for "let's do full-blown
> intersection types"; this is solely about class declaration.)

From kevinb at google.com  Thu Oct  7 20:37:58 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 7 Oct 2021 13:37:58 -0700
Subject: Minor improvement to anonymous classes
In-Reply-To: <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>

On Thu, Oct 7, 2021 at 7:37 AM Maurizio Cimadamore
<maurizio.cimadamore at oracle.com> wrote:

>     <X extends Foo & AutoCloseable> X getCloseableFoo();
>
> Which kind of works, but it's quite a horrible hack (you introduce a
> type parameter you don't need - which means the compiler will try to
> infer types, etc.)

(Incidentally, we have Error Prone give a warning any time a
method/constructor type parameter is unused in any of the formal parameter
types, and I think the results have been good. A method like `emptySet()`
has to suppress it, but it's a fairly special case.)

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From forax at univ-mlv.fr  Thu Oct  7 21:08:48 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 7 Oct 2021 23:08:48 +0200 (CEST)
Subject: Minor improvement to anonymous classes
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
Message-ID: <1430522104.2385623.1633640928181.JavaMail.zimbra@u-pem.fr>

> From: "Kevin Bourrillion"
> To: "Maurizio Cimadamore"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Thursday, 7 October 2021 22:37:58
> Subject: Re: Minor improvement to anonymous classes

> (Incidentally, we have Error Prone give a warning any time a
> method/constructor type parameter is unused in any of the formal
> parameter types, and I think the results have been good. A method like
> `emptySet()` has to suppress it, but it's a fairly special case.)

Using an "unused" type parameter as the return type is not unusual
either, when returning null or when throwing an exception, given that
neither the type of null nor the "nothing" type can be expressed in Java.

See for example the javadoc of Assertions.fail()
https://junit.org/junit5/docs/current/api/org.junit.jupiter.api/org/junit/jupiter/api/Assertions.html#fail(java.lang.String)

The other usage I can see is to get better type inference of the return
type (avoiding an explicit cast) when using a polymorphic signature, but
I'm not even sure javac supports it.

Rémi

From kevinb at google.com  Thu Oct  7 21:12:21 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 7 Oct 2021 14:12:21 -0700
Subject: Minor improvement to anonymous classes
In-Reply-To: <1430522104.2385623.1633640928181.JavaMail.zimbra@u-pem.fr>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
 <1430522104.2385623.1633640928181.JavaMail.zimbra@u-pem.fr>

I'm sorry that I appeared to be suggesting that there were no other
reasons to suppress it. I was actually giving just one example.
Nevertheless, the check has done more good than "harm" (in the form of
these small suppression costs).

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From amalloy at google.com  Thu Oct  7 21:25:54 2021
From: amalloy at google.com (Alan Malloy)
Date: Thu, 7 Oct 2021 14:25:54 -0700
Subject: Minor improvement to anonymous classes
In-Reply-To: <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>

You can't actually do this: that signature is a promise to return an
instance of *any* class implementing those interfaces, not *some* class
implementing them. If you try to implement your getCloseableFoo method,
you'll find that no implementation compiles. For example:

    import java.io.Serializable;

    final class tmp {
      static <X extends AutoCloseable & Serializable> X getX() {
        class Impl implements AutoCloseable, Serializable {
          public void close() {}
        }
        return new Impl();
      }
    }

    tmp.java:8: error: incompatible types: Impl cannot be converted to X
        return new Impl();
               ^
      where X is a type-variable:
        X extends AutoCloseable,Serializable declared in method <X>getX()

This is because some caller may define their own MyImpl class implementing
those interfaces, and then write MyImpl i = getX(), instantiating X to
MyImpl, and your method doesn't know how to build such an object.
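To make the caller-side problem concrete, here is a minimal sketch of what happens if the method is forced to compile with an unchecked cast. The class names (UncheckedIntersectionReturn, Impl, CallerImpl) are hypothetical; the sketch relies only on standard erasure behaviour, where the compiler inserts a cast at the call site when the caller picks a concrete type for X.

```java
import java.io.Serializable;

/** Sketch: forcing the generic signature to compile with an unchecked cast. */
final class UncheckedIntersectionReturn {

    /** Hypothetical implementation chosen by the author of the method. */
    static class Impl implements AutoCloseable, Serializable {
        @Override public void close() { }
    }

    /** Hypothetical class defined by a caller; it also satisfies both bounds. */
    static class CallerImpl implements AutoCloseable, Serializable {
        @Override public void close() { }
    }

    /** Compiles, but only because the unchecked cast silences the compiler. */
    @SuppressWarnings("unchecked")
    static <X extends AutoCloseable & Serializable> X getX() {
        return (X) new Impl();
    }

    /** Returns true if instantiating X to CallerImpl fails at the call site. */
    static boolean callerChoosingOwnTypeFails() {
        try {
            CallerImpl mine = UncheckedIntersectionReturn.<CallerImpl>getX();
            return mine == null; // not reached: the assignment above throws
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Callers that only rely on the declared bounds are fine.
        try (AutoCloseable ok = getX()) {
            System.out.println("used as AutoCloseable: " + ok.getClass().getSimpleName());
        }
        System.out.println("caller-chosen type fails: " + callerChoosingOwnTypeFails());
    }
}
```

The failure surfaces in the caller rather than in getX itself, which is why this style of signature is easy to get subtly wrong.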
From cushon at google.com  Thu Oct  7 21:57:43 2021
From: cushon at google.com (Liam Miller-Cushon)
Date: Thu, 7 Oct 2021 14:57:43 -0700
Subject: [External] : Re: Minor improvement to anonymous classes
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <2ED211EA-3F96-4129-B5BF-9A262C917D9F@oracle.com>
 <1320762308.901257.1627755887681.JavaMail.zimbra@u-pem.fr>
 <32bade08-4697-a2fd-52d2-491822c14d19@oracle.com>
 <1699135460.215325.1627933584253.JavaMail.zimbra@u-pem.fr>
 <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>

On Wed, Oct 6, 2021 at 2:08 PM Brian Goetz wrote:

> One more thing we could do is reach out to the most popular libraries
> that do this and give them a heads up that they need to tolerate the
> field not being there.

Good idea, I filed bugs against several libraries that are reflecting on
fields named this$. I skipped examples that contain textual occurrences
of this$ that were clearly safe, e.g. because they were filtering fields
with that name out of the results of getDeclaredFields().

https://github.com/classgraph/classgraph/issues/570
https://github.com/robolectric/robolectric/issues/6757
https://github.com/micrometer-metrics/micrometer/issues/2806
https://github.com/awaitility/awaitility/issues/223
https://issues.apache.org/jira/browse/MAPREDUCE-7364
https://issues.apache.org/jira/browse/BEAM-13020
https://issues.apache.org/jira/browse/AVRO-3228

From maurizio.cimadamore at oracle.com  Fri Oct  8 08:06:13 2021
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 8 Oct 2021 09:06:13 +0100
Subject: Minor improvement to anonymous classes
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
 <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>

Fully agree... unless you embrace unchecked casts (e.g. if your impl
casts to X and suppresses the warning, the program compiles).

Which is what Kevin was trying to say: having a generic method that
doesn't use its type-variable in its parameters can be a bit of a smell,
in the sense that implementors can forget that they don't have much
control over what the type parameter might be inferred as.

All this goes back to my original point: we can't express a method that
returns a conjunction of two types today; the alternative is to declare a
new type just for that (but some of the arguments originally discussed in
this thread apply: e.g.
I might not have a great name for it, nor a desire for actually naming
it), or resort to unchecked generic dark arts (which, as you and Kevin
point out, 95% of the time is subtly broken).

Maurizio

From forax at univ-mlv.fr  Wed Oct 13 17:09:55 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 13 Oct 2021 19:09:55 +0200 (CEST)
Subject: String Interpolation
Message-ID: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>

Hi everybody,
I've spent some time thinking about how the String interpolation + Policy
should be specified and implemented.

The goal is to add a syntax specifying a user-defined method to
"interpolate" (for lack of a better word) a string with arguments.

Given that it's a method, the exact semantics of the interpolation, things like how the arguments are escaped and how the formatted string is parsed, are written in Java, which makes it possible to support a wide range of use cases.

This proposal does not differ from the original proposal of Brian and Jim in its goal, but in the way a user declares the interpolation method(s).

TL;DR: you can declare an interpolation method and, optionally, an interpolation bootstrap method if you want more efficient code at the price of having to play with the method handle API.

---

The proposal of Brian and Jim uses an interface to define the policy, but in this case, using an interface is not what we want.
I think there are two main reasons,
- the interpolation method can be an instance method but can also be a factory method, a static method, and an interface cannot constrain a static method.
- we want the signature of the interpolation method to be free to use any number of parameters of any types, something that cannot be specified with type parameters in Java.

So let's take a step back and write some examples; as a user of the interpolation method, we want to
- be able to specify string interpolation,
  you can notice that this is a static method.

    String name = ...
    int age = ...
    String s = String."name: \(name) age: \(age)";

- we also want to be able to instantiate a regex Pattern,
  and have a magic optimisation that creates the Pattern instance only once

    Pattern pattern = Pattern."foo|bar";

- we also want to support instance methods, so the interpolation can escape the arguments differently depending on the context,
  here for example, escaping differently depending on the database driver.

    String username = ...
    Connection connection = ...
    connection."""
      SELECT * FROM users where user == "\(username)"
      """;

I think the simplest way to specify an interpolation method is to have a method with a special name,
I will use __interpolate__ because I don't want to discuss the exact syntax here.
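
As a rough illustration of the idea above (the interpolation semantics live in an ordinary Java method that receives the format and the arguments), here is a minimal sketch in current Java. It is not the proposed syntax or API: the class InterpolationSketch, the methods interpolate and escapeSqlLiteral, and the positional handling of \(name) placeholders are all assumptions made for this example.

```java
import java.util.function.Function;

// Hypothetical sketch: every name below is illustrative, not part of any proposal.
final class InterpolationSketch {

    // Replaces each \(name) placeholder in `format` with the next argument,
    // after passing it through `escaper`. Placeholders are bound by position;
    // the name inside the parentheses is ignored in this sketch.
    static String interpolate(String format, Function<Object, String> escaper, Object... args) {
        StringBuilder builder = new StringBuilder();
        int argIndex = 0;
        int pos = 0;
        while (pos < format.length()) {
            int start = format.indexOf("\\(", pos);
            if (start < 0) {
                // no more placeholders: copy the remaining text
                builder.append(format, pos, format.length());
                break;
            }
            int end = format.indexOf(')', start);
            if (end < 0) {
                throw new IllegalArgumentException("unterminated placeholder at index " + start);
            }
            if (argIndex >= args.length) {
                throw new IllegalArgumentException("more placeholders than arguments");
            }
            builder.append(format, pos, start);              // text before the placeholder
            builder.append(escaper.apply(args[argIndex++])); // escaped argument
            pos = end + 1;
        }
        return builder.toString();
    }

    // Deliberately naive escaper that doubles single quotes. Real database code
    // should bind parameters with PreparedStatement instead of escaping strings.
    static String escapeSqlLiteral(Object value) {
        return String.valueOf(value).replace("'", "''");
    }

    public static void main(String[] args) {
        System.out.println(interpolate("name: \\(name) age: \\(age)", String::valueOf, "Bob", 42));
        System.out.println(interpolate("SELECT * FROM users WHERE user = '\\(username)'",
                InterpolationSketch::escapeSqlLiteral, "O'Brien"));
    }
}
```

The escaper parameter stands in for the "context" mentioned above: a String policy would pass values through unchanged, while a database-specific policy would choose its own escaping.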

This method can be a static method or an instance method and has one restriction: the first parameter has to be a String because the first argument is the formatted string.

Here is an example of how the method __interpolate__ inside java.lang.String can be written.
To avoid everybody re-implementing the parsing of the formatted string, the class java.lang.runtime.InterpolateMetafactory provides a helper method "formatIterator" that returns an iterator splitting the formatted string into text and binding.

  package java.lang;

  public class String {
    ...
    public static String __interpolate__(String format, Object... args) {
      var i = 0;
      var builder = new StringBuilder();
      var iterator = InterpolateMetafactory.formatIterator(format);
      while(iterator.hasNext()) {
        switch(iterator.next()) {
          case Text(var text) -> builder.append(text);
          // a binding consumes the next argument, in order
          case Binding binding -> builder.append(args[i++]);
        }
      }
      return builder.toString();
    }
    ...
  }

While this is nice, you may think that it's just syntactic sugar and it will not be more performant than String.valueOf(), i.e. it will be slow.

That's why the specification allows you to provide a second, more optimised version of the interpolation method using a method __interpolate__bootstrap__.
This method __interpolate__bootstrap__ is not required and cannot replace the method __interpolate__: if __interpolate__bootstrap__ is present, __interpolate__ has to be present too. It's a backward compatible change to add a method __interpolate__bootstrap__ after the fact, there is no need to recompile all the client code.

For that, the compiler translation relies on invokedynamic to call the method bootstrap of the class InterpolateMetafactory, which at runtime decides to trampoline either to the method __interpolate__bootstrap__ or to the method __interpolate__ if no __interpolate__bootstrap__ exists.

Here is an example of how a call to the interpolation method of String is generated by javac.
For the Java code

  String name = ...
  int age = ...
  String s = String."name: \(name) age: \(age)";

the equivalent bytecode is

  aload_1    // load name
  iload_2    // load age
  invokedynamic __interpolate__ (Ljava/lang/String;I)Ljava/lang/String;
    java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType, String, MethodHandle):CallSite
    [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String ]

From the perspective of the compiler, the method __interpolate__ works exactly like a method with a polymorphic signature (a method annotated with @PolymorphicSignature), so the descriptor of invokedynamic is created by collecting the types of the arguments: here the interpolation method is called with a String and an int and the return type is String, so the descriptor is (Ljava/lang/String;I)Ljava/lang/String;

Considering the interpolation method as a polymorphic method is important in terms of performance because it means that no boxing will be done by the compiler; if there is some boxing, it will be done by the runtime, so it is optional if the __interpolate__bootstrap__ does not need to box arguments.

You can also notice that the formatted string is passed as a bootstrap constant, so all the parsing of the format can be done once, outside of the hot path.
A call to invokedynamic also passes as a second bootstrap argument the method handle to the method __interpolate__, so the implementation inside InterpolateMetafactory.bootstrap can call this method if no method __interpolate__bootstrap__ exists.

Here is a raw implementation of the class InterpolateMetafactory.
The method formatIterator() returns an Iterator of Token, which is a sealed interface.
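
To make the descriptor discussion concrete, the following sketch shows one way a runtime could adapt a generic (String, Object[]) implementation to the exact call-site type (String,int)String, so that boxing happens inside the adapted method handle rather than in compiled code. CallSiteAdaptationSketch, genericImpl, adapt and demo are hypothetical names; only the java.lang.invoke calls are existing APIs, and this is not claimed to be how InterpolateMetafactory would be written.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical sketch; class and method names are illustrative only.
final class CallSiteAdaptationSketch {

    // Stand-in for a generic interpolation method: appends every argument to the format.
    static String genericImpl(String format, Object[] args) {
        StringBuilder builder = new StringBuilder(format);
        for (Object arg : args) {
            builder.append('|').append(arg);
        }
        return builder.toString();
    }

    // Adapts genericImpl to an exact call-site type such as (String,int)String.
    static MethodHandle adapt(String format, MethodType callSiteType)
            throws ReflectiveOperationException {
        MethodHandle impl = MethodHandles.lookup().findStatic(
                CallSiteAdaptationSketch.class, "genericImpl",
                MethodType.methodType(String.class, String.class, Object[].class));
        // bind the constant format as the first argument
        MethodHandle bound = MethodHandles.insertArguments(impl, 0, format);
        // gather the call-site arguments into the trailing Object[]
        MethodHandle collected = bound.asCollector(Object[].class, callSiteType.parameterCount());
        // convert to the exact call-site type; primitives are boxed by the adapter
        return collected.asType(callSiteType);
    }

    // Links a (String,int)String site and invokes it once.
    static String demo() {
        try {
            MethodType type = MethodType.methodType(String.class, String.class, int.class);
            MethodHandle handle = adapt("fmt", type);
            return (String) handle.invokeExact("Bob", 42);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The explicit asCollector step is a choice made for this sketch so that the adaptation does not depend on the variable-arity flag of the looked-up handle.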
The method bootstrap() first looks up a method "__interpolate__bootstrap__" in the lookup class that takes a Lookup, a String, a MethodType, the format and the default implementation, and calls it if it exists; otherwise it takes the default implementation, binds the formatted String and adapts the arguments using asType (which asks for boxing, etc).

  package java.lang.runtime;

  public class InterpolateMetafactory {
    public sealed interface Token {
      public record Text(String text) implements Token {}
      public record Binding(String name) implements Token {}
    }

    public static Iterator<Token> formatIterator(String format) {
      ...
    }

    public static CallSite bootstrap(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) throws Throwable {
      // check if there is a bootstrap method
      MethodHandle bootstrap;
      try {
        bootstrap = lookup.findStatic(lookup.lookupClass(), "__interpolate__bootstrap__", MethodType.methodType(CallSite.class, Lookup.class, String.class, MethodType.class, String.class, MethodHandle.class));
      } catch(NoSuchMethodException e) {
        // bind the default implementation
        return new ConstantCallSite(impl.bindTo(format).asType(methodType));
      }
      // delegate to the user-provided bootstrap method
      return (CallSite) bootstrap.invoke(lookup, name, methodType, format, impl);
    }
  }

Here is another example, showing how to declare the methods __interpolate__ and __interpolate__bootstrap__ inside java.util.regex.Pattern.
The "default" implementation calls Pattern.compile() and the optimized one always returns the result of Pattern.compile() as a constant.

  package java.util.regex;

  public class Pattern {
    public static Pattern __interpolate__(String format) {
      // the formatted string can not have arguments
      return Pattern.compile(format);
    }

    private static CallSite __interpolate__bootstrap__(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) {
      return new ConstantCallSite(MethodHandles.constant(Pattern.class, Pattern.compile(format)));
    }
  }

The method __interpolate__ provides, via its signature, the parameter types that are verified by the compiler.
It also provides code that can be used by tools that do static analysis on the bytecode, because those tools cannot see through the method handle returned by a bootstrap method: given that it's a runtime construct, it's usually not available at the time the static analysis is done. This should be enough for tools like GraalVM native image to see through the invokedynamic in a similar way to how they see through the invokedynamic used when creating a lambda.

The fact that all invokedynamic calls go through the method InterpolateMetafactory.bootstrap and trampoline from it means that adding or removing the method __interpolate__bootstrap__ is a binary compatible change, if __interpolate__bootstrap__ is declared private. So implementing __interpolate__bootstrap__ can be an afterthought.

regards,
Rémi

From brian.goetz at oracle.com Wed Oct 13 19:32:19 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 13 Oct 2021 15:32:19 -0400
Subject: String Interpolation
In-Reply-To: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>
Message-ID: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>

The ability to capture per-call-site computation so it could be done exactly once (including generating an MH to describe it) has been part of the goal all along. The JEP is deliberately cagey about this because we didn't want to descend down the translation rabbit hole before we'd achieved consensus on the broad strokes, any more than we wanted to descend down the syntax rabbit hole.
(FWIW, all of these side-paths were ones we already traveled and rejected for various reasons :)

As you correctly point out, without something like type classes, associating a static method like a bootstrap with a class requires committing some sort of sin, such as the "magic names" sins committed by serialization. We surely didn't want to do that either.

> - we also want to be able to instantiate regex Pattern,
>   and have a magic optimisation that creates the Pattern instance only one
>
>   Pattern pattern = Pattern."foo|bar";

You said the magic anti-word, which is "magic". We don't want this to be magic. (Examples like this are better treated as a form of optimistic constant folding, along the lines explored at my JVMLS talk a few years ago.)

Summary: wait for constant folding.

> I think the simplest way to specify an interpolation method is to have a method with a special name,
> i will use __interpolate__ because i don't want to discuss the exact syntax here.

This is committing the same "magic name" sin as serialization. We deliberately avoided this in the design. When we have type classes, we'll be able to use that as a way to bridge from a type name to a witness to a particular class. Our design was crafted so that it could be gracefully extended to such a mechanism, when it is available (using a type name instead of an instance reference at the use site.)

Summary: wait for type classes.

> That's why the specification allow you to provide a second more optimised version of the interpolation method using a method __interpolate__bootstrap__.

This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and also too limited -- and also too advanced to be a language feature. Bootstraps are way too complicated to expose in the source language in this way, especially not this magically. And it's too ad-hoc, since it's specific to the interpolation feature, whereas one could imagine a number of other contexts where it is useful too.
So this is a bad tradeoff in many ways. Jim's implementation very cleverly gets the equivalent of this using a pure library implementation (which leans on MutableCallSite.)

While it is surely a desirable goal to be able to optimize formatter implementation, it is also super-easy to become obsessed with this, and give it a bigger place in the feature than it deserves. For some cases -- notably String::format -- there are huge savings to be had (from a number of sources, not least of which is that scanning the string at every invocation and choosing a strategy based on that is expensive.) But in other cases, it is almost irrelevant. For pure concatenation, it is already pretty fast; for SQL, the cost of constructing the query is a tiny part of the execution time, so it's not even worth optimizing. So this is a "nice to have" rather than the centerpiece of the feature.

To be clear, the centerpiece is the gathering up of a template + parameters so that their combination can be handled by another entity, whether right now, later, or never. Optimizing the case where it is done right now, using a predictable choice of entity, is an optimization, but not the centerpiece.

Let me sketch out how we're envisioning this. The API is something like:

    interface TemplatePolicy<T> {
        T apply(TemplatedString ts);

        // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object
        default MethodHandle asMethodHandle(TemplatedString ts) {
            return MH[TemplatePolicy::apply]
        }
    }

The API specification has a number of constraints on the implementation of asMethodHandle, which I'll get to in a second. When the compiler encounters an immediate application P."...", it generates an indy, which uses a special bootstrap that returns a MutableCallSite. The MutableCallSite initially has as its target a special secondary bootstrap MH, which represents an interpolation site that has not yet seen an actual invocation.
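
The linkage scheme described in this message (a MutableCallSite whose initial target links on first invocation and then installs a guarded fast path) can be sketched with today's java.lang.invoke API. This is only an illustration under stated assumptions, not Jim's implementation: LazyLinkSketch, Policy, Upper, Bracket, link, sameClass, call and linkCount are hypothetical names, the policy takes a plain String instead of a TemplatedString, and the "fast path" is simply the generic apply handle standing in for whatever a policy-specific method handle would be.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.Locale;

// Hypothetical sketch of "link on first call, then guard on the policy class".
final class LazyLinkSketch {

    // Stand-in for a template policy; a String plays the role of the template.
    interface Policy {
        Object apply(String template);
    }

    static final class Upper implements Policy {
        public Object apply(String template) { return template.toUpperCase(Locale.ROOT); }
    }

    static final class Bracket implements Policy {
        public Object apply(String template) { return "[" + template + "]"; }
    }

    private static final MethodType SITE_TYPE =
            MethodType.methodType(Object.class, Policy.class, String.class);

    private final MutableCallSite site;
    private final MethodHandle apply;      // generic path: virtual call to Policy::apply
    private final MethodHandle sameClass;  // (Class, Policy)boolean
    int linkCount;                         // how many times the slow linking path ran

    LazyLinkSketch() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            apply = lookup.findVirtual(Policy.class, "apply",
                    MethodType.methodType(Object.class, String.class));
            sameClass = lookup.findStatic(LazyLinkSketch.class, "sameClass",
                    MethodType.methodType(boolean.class, Class.class, Policy.class));
            // the initial target is the "not yet linked" handler
            site = new MutableCallSite(
                    lookup.findVirtual(LazyLinkSketch.class, "link", SITE_TYPE).bindTo(this));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static boolean sameClass(Class<?> expected, Policy policy) {
        return policy.getClass() == expected;
    }

    // Runs on the first invocation only: installs a guarded target, then completes the call.
    private Object link(Policy policy, String template) throws Throwable {
        linkCount++;
        MethodHandle fastPath = apply; // stand-in for a handle specialized for this policy class
        MethodHandle guard = sameClass.bindTo(policy.getClass());
        // same policy class -> fast path, anything else -> plain virtual call
        site.setTarget(MethodHandles.guardWithTest(guard, fastPath, apply));
        return fastPath.invoke(policy, template);
    }

    // What an invokedynamic instruction bound to `site` would do.
    Object call(Policy policy, String template) {
        try {
            return site.dynamicInvoker().invoke(policy, template);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        LazyLinkSketch sketch = new LazyLinkSketch();
        System.out.println(sketch.call(new Upper(), "foo"));
        System.out.println(sketch.call(new Upper(), "bar"));
        System.out.println(sketch.call(new Bracket(), "baz"));
        System.out.println("linked " + sketch.linkCount + " time(s)");
    }
}
```

Note that the guard compares classes, not instances, so a fresh Upper instance on the second call still takes the fast path, while the Bracket call fails the guard and goes through the generic apply handle without relinking.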
The secondary bootstrap MH has the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> Object), so on first invocation it receives the TP object and the TS. It then calls TP::asMethodHandle, and wraps this MH with a GWT (guardWithTest) which validates the invariants and proceeds to that MH if they hold -- which they will 99.x% of the time.

The invariant is that the dynamic type of the per-instantiation TP be == to the dynamic type of the TP that was present at secondary linkage. That is, it be an instance of the same class, but not necessarily the same instance. By definition, the string will always be the same, as will the types of the parameters, since this is specific to concrete P."..." sites. So the MH can take advantage of that.

The constraint on TP::asMethodHandle is that it not undermine this invariant; that if it generates a MH that is dependent on TP state, it not bake that state into the resulting MH, but instead, treat the TP state as a parameter. Further, the MH must be behaviorally equivalent to calling apply.

If the GWT fails, it means the user is doing something like:

    for (TP p : listOfProcessors) {
        blah blah p."foo \{a}"
    }

in which case the GWT falls back to the "just do an invokevirtual of TP::apply" strategy. (It could get fancier but I don't see any point.)

This lets us rescue indy-based translation without exposing a magic indy-hook in the JLS. (Sorry, I know you wanted the magic indy hook.)

On 10/13/2021 1:09 PM, Remi Forax wrote:
> Hi everybody, i've spend some time to think how the String interpolation + Policy should be specified and implemented.
>
>
> The goal is to add a syntax specifying a user defined method to "interpolate" (for a lack of better word) a string with arguments.
>
> Given that it's a method, the exact semantics of the interpolation, things like how the arguments are escaped, how the formatted string is parsed, is written is Java, this will allow to support a wide range of use cases.
>
> regards,
> Rémi

From forax at univ-mlv.fr Wed Oct 13 20:07:34 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 13 Oct 2021 22:07:34 +0200 (CEST)
Subject: String Interpolation
In-Reply-To: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr> <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
Message-ID: <1556797536.1701993.1634155654113.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax" , "amber-spec-experts"
> Sent: Wednesday 13 October 2021 21:32:19
> Subject: Re: String Interpolation

> The ability to capture per-call-site computation so it could be done exactly
> once (including generating an MH to describe it) has been part of the goal all
> along. The JEP is deliberately cagey about this because we didn't want to
> descend down the translation rabbit hole before we'd achieved consensus on the
> broad strokes, any more than we wanted to descend down the syntax rabbit hole.
> (FWIW, all of these side-paths were ones we already traveled and rejected for
> various reasons :)

> As you correctly point out, without something like type classes, associating a
> static method like a bootstrap with a class requires committing some sort of
> sin, such as the "magic names" sins committed by serialization. We surely
> didn't want to do that either.

What we want here is a "protocol"; a protocol is something that is really like a method call but with extra syntax and extra constraints.
Java uses protocols. We are so used to seeing a method with the class name and no return type as a constructor that we may not even realize that Java uses special names to indicate a protocol.
Unlike Scala, the current syntax does not specify a method name: it's String."a text" and not String.method"a text". That's why I've proposed to use a special name.

Soon we will want to introduce user-defined patterns; this is also a protocol, it's a kind of method call too, with extra syntax and extra constraints.
Instead of using I agree that using a magic name is less than ideal to define a protocol, but

>> - we also want to be able to instantiate regex Pattern,
>> and have a magic optimisation that creates the Pattern instance only one

>> Pattern pattern = Pattern."foo|bar";

> You said the magic anti-word, which is "magic". We don't want this to be magic.
> (Examples like this are better treated as a form of optimistic constant
> folding, along the lines explored at my JVMLS talk a few years ago.)

> Summary: wait for constant folding.

>> I think the simplest way to specify an interpolation method is to have a method
>> with a special name,
>> i will use __interpolate__ because i don't want to discuss the exact syntax
>> here.

> This is committing the same "magic name" sin as serialization. We deliberately
> avoided this in the design. When we have type classes, we'll be able to use
> that as a way to bridge from a type name to a witness to a particular class.
> Our design was crafted so that it could be gracefully extended to such a
> mechanism, when it is available (using a type name instead of an instance
> reference at the use site.)

> Summary: wait for type classes.

>> That's why the specification allow you to provide a second more optimised
>> version of the interpolation method using a method __interpolate__bootstrap__.

> This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and
> also too limited -- and also too advanced to be a language feature. Bootstraps
> are way too complicated to expose in the source language in this way,
> especially not this magically. And its too ad-hoc, since its specific to the
> interpolation feature, whereas one could imagine a number of other contexts
> where it is useful too. So this is a bad tradeoff in many ways. Jim's
> implementation very cleverly gets the equivalent of this using pure library
> implementation (which leans on MutableCallSite.)
> While it is surely a desirable goal to be able to optimize formatter > implementation, it is also super-easy to become obsessed with this, and give it > a bigger place in the feature than it deserves. For some cases -- notably > String::format -- there are huge savings to be had (from a number of sources, > not least of which is that scanning the string at every invocation and choosing > a strategy based on that is expensive.) But in other cases, it is almost > irrelevant. For pure concatenation, it is already pretty fast; for SQL, the > cost of constructing the query is a tiny part of the execution time, so its not > even worth optimizing. So this is a "nice to have" rather than the centerpiece > of the feature. > To be clear, the centerpiece is the gathering up of a template + parameters so > that their combination can be handled by another entity, whether right now, > later, or never. Optimizing the case where it is done right now, using a > predictable choice of entity, is an optimization, but not the centerpiece. > Let me sketch out how we're envisioning this. The API is something like: > interface TemplatePolicy { > T apply(TemplatedString ts); > // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object > default MethodHandle asMethodHandle(TemplatedString ts) { > return MH[TemplatePolicy::apply] > } > } > The API specification has a number of constraints on the implementation of > asMethodHandle, which I'll get to in a second. When the compiler encounters an > immediate application P."...", it generates an indy, which uses a special > bootstrap that returns a MutableCallSite. The MutableCallSite initially has as > its target a special secondary bootstrap MH, which represents an interpolation > site that has not yet seen an actual invocation. The secondary bootstrap MH has > the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> > Object), so on first invocation it receives the TP object and the TS. 
It then > calls TP::asMethodHandle, and wraps this MH with a GWT which validates the > invariants and proceeds to that MH if they hold -- which they will 99.x% of the > time. > The invariant is that the dynamic type of the per-instantiation TP be == to the > dynamic type of the TP that was present at secondary linkage. That is, it be an > instance of the same class, but not the same instance. By definition, the > string will always be the same as will the types of the parameters, since this > is specific to concrete P."..." sites. So the MH can take advantage of that. > The constraint on TP::asMethodHandle is that it not undermine this invariant; > that if it generates a MH that is dependent on TP state, it not bake that state > into the resulting MH, but instead, treat the TP state as a parameter. Further, > the MH must be behaviorally equivalent to calling apply. > If the GWT fails, it means the user is doing something like: > for (TP p : listOfProcessors) { > blah blah p."foo \{a}" > } > in which case the GWT falls back to the "just do an invokevirtual of TP::apply" > strategy. (It could get fancier but I don't see any point.) > This lets us rescue indy-based translation without exposing a magic indy-hook in > the JLS. (Sorry, I know you wanted the magic indy hook.) > On 10/13/2021 1:09 PM, Remi Forax wrote: >> Hi everybody, i've spend some time to think how the String interpolation + >> Policy should be specified and implemented. >> The goal is to add a syntax specifying a user defined method to "interpolate" >> (for a lack of better word) a string with arguments. >> Given that it's a method, the exact semantics of the interpolation, things like >> how the arguments are escaped, how the formatted string is parsed, is written >> is Java, this will allow to support a wide range of use cases. >> This proposal does not differ from the original proposal of Brian and Jim in its >> goal but in the way a user declare the interpolation method(s). 
>> TLDR; you can declare an interpolation method and optionally an interpolation >> bootstrap method if you want a more efficient code at the price of having to >> play with the method handle API. >> --- >> The proposal of Brian and Jim uses an interface to define the policy but in this >> case, using an interface is not what we want. >> I think there are two main reasons, >> - the interpolation method can be an instance method but can also be a factory >> method, a static method, and an interface can not constraint a static method. >> - we want the signature of the interpolation method to be free to use any number >> of parameters of any types, something that can not be specified with type >> parameters in Java. >> So let's take a step back and write some examples, as a user of the >> interpolation method, we want to >> - be able to specify string interpolation, >> you can notice that this is a static method. >> String name = ... >> int value = ... >> String s = String."name: \(name) age: \(age)"; >> - we also want to be able to instantiate regex Pattern, >> and have a magic optimisation that creates the Pattern instance only one >> Pattern pattern = Pattern."foo|bar"; >> - we also want to support instance method, so the interpolation can escape the >> arguments differently depending on the context, >> here by example, escaping differently depending on the database driver. >> String username = ... >> Connection connection = ... >> connection.""" >> SELECT * FROM users where user == "\(username)" >> """; >> I think the simplest way to specify an interpolation method is to have a method >> with a special name, >> i will use __interpolate__ because i don't want to discuss the exact syntax >> here. >> This method can be a static method or an instance method and has a restriction, >> the first parameter has to be a String because the first argument is the >> formatted string. 
>> Here is an example of how the method __interpolate__ inside java.lang.String can be written.
>> To spare everybody from re-implementing the parsing of the formatted string, the class java.lang.runtime.InterpolateMetafactory provides a helper method "formatIterator" that returns an iterator splitting the formatted string into texts and bindings.
>>     package java.lang;
>>     public class String {
>>       ...
>>       public static String __interpolate__(String format, Object... args) {
>>         var i = 0;
>>         var builder = new StringBuilder();
>>         var iterator = InterpolateMetafactory.formatIterator(format);
>>         while(iterator.hasNext()) {
>>           switch(iterator.next()) {
>>             case Text(var text) -> builder.append(text);
>>             case Binding binding -> builder.append(args[i++]);  // append the next argument
>>           }
>>         }
>>         return builder.toString();
>>       }
>>       ...
>>     }
>> While this is nice, you may think that it's just syntactic sugar and that it will not be more performant than String.valueOf(), i.e. it will be slow.
>> That's why the specification allows you to provide a second, more optimised version of the interpolation method using a method __interpolate__bootstrap__.
>> This method __interpolate__bootstrap__ is not required and cannot replace the method __interpolate__: when it is declared, __interpolate__ has to be present too. Adding a method __interpolate__bootstrap__ after the fact is a backward compatible change; there is no need to recompile all the client code.
>> For that, the compiler translation relies on invokedynamic to call the method bootstrap of the class InterpolateMetafactory, which at runtime decides to trampoline either to the method __interpolate__bootstrap__ or to the method __interpolate__ if no __interpolate__bootstrap__ exists.
>> Here is an example of how a call to the interpolation method of String is generated by javac.
>> For the Java code
>>     String name = ...
>>     int age = ...
>>     String s = String."name: \(name) age: \(age)";
>> the equivalent bytecode is
>>     aload_1   // load name
>>     iload_2   // load age
>>     invokedynamic __interpolate__ (Ljava/lang/String;I)Ljava/lang/String;
>>       java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType, String, MethodHandle):CallSite
>>       [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String ]
>> From the perspective of the compiler the method __interpolate__ works exactly like a method with a polymorphic signature (a method annotated with @PolymorphicSignature), so the descriptor of invokedynamic is created by collecting the types of the arguments. Here the interpolation method is called with a String and an int, and the return type is String, so the descriptor is
>>     (Ljava/lang/String;I)Ljava/lang/String;
>> Considering the interpolation method as a polymorphic method is important in terms of performance because it means that no boxing will be done by the compiler; if there is some boxing, it will be done by the runtime, so it is optional if the __interpolate__bootstrap__ does not need to box arguments.
>> You can also notice that the formatted string is passed as a bootstrap constant, so all the parsing of the format can be done once, outside of the hot path.
>> A call to invokedynamic also passes as a second bootstrap argument the method handle to the method __interpolate__, so the implementation inside InterpolateMetafactory.bootstrap can call this method if no method __interpolate__bootstrap__ exists.
>> Here is a raw implementation of the class InterpolateMetafactory.
>> The method formatIterator() returns an Iterator of Token, which is a sealed interface.
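As a rough illustration of the formatIterator helper and the plain __interpolate__ method described above, here is a minimal sketch that runs on Java 17. The names FormatSketch, tokens and interpolate are hypothetical and not part of the proposal. It returns a List instead of an Iterator, uses instanceof instead of a pattern switch, and assumes that a binding is written \(name) with no nested parentheses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed helper; nothing here is an existing JDK API.
final class FormatSketch {
  sealed interface Token {}
  record Text(String text) implements Token {}
  record Binding(String name) implements Token {}

  // Splits a format such as "name: \(name)" into Text and Binding tokens.
  static List<Token> tokens(String format) {
    var tokens = new ArrayList<Token>();
    int pos = 0;
    while (pos < format.length()) {
      int start = format.indexOf("\\(", pos);
      if (start < 0) {                       // no more bindings, the rest is text
        tokens.add(new Text(format.substring(pos)));
        break;
      }
      int end = format.indexOf(')', start);
      if (end < 0) {
        throw new IllegalArgumentException("unterminated binding at index " + start);
      }
      if (start > pos) {
        tokens.add(new Text(format.substring(pos, start)));
      }
      tokens.add(new Binding(format.substring(start + 2, end)));
      pos = end + 1;
    }
    return tokens;
  }

  // Slow-path interpolation: texts are copied, each binding consumes the next argument.
  static String interpolate(String format, Object... args) {
    var builder = new StringBuilder();
    int i = 0;
    for (Token token : tokens(format)) {
      if (token instanceof Text text) {
        builder.append(text.text());
      } else {
        builder.append(args[i++]);
      }
    }
    return builder.toString();
  }

  public static void main(String[] args) {
    // The doubled backslash is only needed because this is an ordinary string literal.
    System.out.println(interpolate("name: \\(name) age: \\(age)", "Bob", 42));
  }
}
```

Note that this version boxes the int argument and re-parses the format on every call, which is exactly the cost the bootstrap variant is meant to remove.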
>> The method bootstrap() first looks up a method "__interpolate__bootstrap__" in the lookup class that takes a Lookup, a String, a MethodType, the format and the default implementation, and calls it if it exists; otherwise it takes the default implementation, binds the formatted String and adapts the arguments using asType (asking for boxing, etc).
>>     package java.lang.runtime;
>>     public class InterpolateMetafactory {
>>       public sealed interface Token {
>>         public record Text(String text) implements Token {}
>>         public record Binding(String name) implements Token {}
>>       }
>>       public static Iterator<Token> formatIterator(String format) {
>>         ...
>>       }
>>       public static CallSite bootstrap(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) throws Throwable {
>>         // check if there is a bootstrap method
>>         MethodHandle bootstrap;
>>         try {
>>           bootstrap = lookup.findStatic(lookup.lookupClass(), "__interpolate__bootstrap__",
>>               MethodType.methodType(CallSite.class, Lookup.class, String.class, MethodType.class, String.class, MethodHandle.class));
>>         } catch(NoSuchMethodException e) {
>>           // bind the default implementation
>>           return new ConstantCallSite(impl.bindTo(format).asType(methodType));
>>         }
>>         return (CallSite) bootstrap.invoke(lookup, name, methodType, format, impl);
>>       }
>>     }
>> Here is another example, showing how to declare the methods __interpolate__ and __interpolate__bootstrap__ inside java.util.regex.Pattern.
>> The "default" implementation calls Pattern.compile() and the optimized one always returns the result of Pattern.compile() as a constant.
>>     package java.util.regex;
>>     public class Pattern {
>>       public static Pattern __interpolate__(String format) {  // the formatted string can not have arguments
>>         return Pattern.compile(format);
>>       }
>>       private static CallSite __interpolate__bootstrap__(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) {
>>         return new ConstantCallSite(MethodHandles.constant(Pattern.class, Pattern.compile(format)));
>>       }
>>     }
>> The method __interpolate__ provides, via its signature, the parameter types that are verified by the compiler.
>> It also provides code that can be used by tools that do static analysis on the bytecode, because those tools cannot see through the method handle returned by a bootstrap method: given that it's a runtime construct, it's usually not available at the time the static analysis is done. This should be enough for tools like GraalVM native image to see through the invokedynamic in a similar way to how they see through the invokedynamic used when creating a lambda.
>> The fact that every invokedynamic goes through the method InterpolateMetafactory.bootstrap and trampolines from it means that adding or removing the method __interpolate__bootstrap__ is a binary compatible change, if __interpolate__bootstrap__ is declared private. So implementing __interpolate__bootstrap__ can be an afterthought.
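The trampoline described above can be tried out with today's java.lang.invoke API. The following is only a sketch: TrampolineSketch, PlainRegex, CachedRegex, link and reusesInstance are hypothetical names, the call site is linked by hand instead of by javac, and it assumes that the Lookup handed to the bootstrap is one for the class declaring __interpolate__, which the mail does not spell out.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.invoke.MethodType;
import java.util.regex.Pattern;

final class TrampolineSketch {
  static final MethodType HOOK_TYPE = MethodType.methodType(CallSite.class,
      Lookup.class, String.class, MethodType.class, String.class, MethodHandle.class);

  // Same shape as the proposed InterpolateMetafactory.bootstrap.
  static CallSite bootstrap(Lookup lookup, String name, MethodType type,
                            String format, MethodHandle impl) throws Throwable {
    MethodHandle hook;
    try {
      hook = lookup.findStatic(lookup.lookupClass(), "__interpolate__bootstrap__", HOOK_TYPE);
    } catch (NoSuchMethodException e) {
      // no hook declared: bind the format to the default implementation
      return new ConstantCallSite(impl.bindTo(format).asType(type));
    }
    return (CallSite) hook.invoke(lookup, name, type, format, impl);
  }

  // Policy without a hook: every call compiles the pattern again.
  static final class PlainRegex {
    static Pattern __interpolate__(String format) { return Pattern.compile(format); }
    static Lookup lookup() { return MethodHandles.lookup(); }
  }

  // Policy with a hook: the pattern is compiled once, at link time.
  static final class CachedRegex {
    static Pattern __interpolate__(String format) { return Pattern.compile(format); }
    private static CallSite __interpolate__bootstrap__(Lookup lookup, String name,
        MethodType type, String format, MethodHandle impl) {
      return new ConstantCallSite(
          MethodHandles.constant(Pattern.class, Pattern.compile(format)));
    }
    static Lookup lookup() { return MethodHandles.lookup(); }
  }

  // Hand-written replacement for the invokedynamic linkage javac would emit.
  static MethodHandle link(Lookup lookup, String format) throws Throwable {
    MethodHandle impl = lookup.findStatic(lookup.lookupClass(), "__interpolate__",
        MethodType.methodType(Pattern.class, String.class));
    MethodType siteType = MethodType.methodType(Pattern.class); // no dynamic arguments
    return bootstrap(lookup, "__interpolate__", siteType, format, impl).dynamicInvoker();
  }

  // True when two invocations of the linked site return the very same Pattern.
  static boolean reusesInstance(Lookup lookup, String format) {
    try {
      MethodHandle site = link(lookup, format);
      return site.invoke() == site.invoke();
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println("plain reuses instance:  " + reusesInstance(PlainRegex.lookup(), "foo|bar"));
    System.out.println("cached reuses instance: " + reusesInstance(CachedRegex.lookup(), "foo|bar"));
  }
}
```

Both policies are linked through the same bootstrap, so adding or removing the private hook changes what the call site does without changing how the caller is compiled.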
>> regards,
>> Rémi

From forax at univ-mlv.fr Wed Oct 13 22:28:05 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 14 Oct 2021 00:28:05 +0200 (CEST)
Subject: String Interpolation
In-Reply-To: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr> <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
Message-ID: <1089189662.1720947.1634164085994.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "amber-spec-experts"
> Sent: Wednesday 13 October 2021 21:32:19
> Subject: Re: String Interpolation

> The ability to capture per-call-site computation so it could be done exactly once (including generating an MH to describe it) has been part of the goal all along. The JEP is deliberately cagey about this because we didn't want to descend down the translation rabbit hole before we'd achieved consensus on the broad strokes, any more than we wanted to descend down the syntax rabbit hole.
> (FWIW, all of these side-paths were ones we already traveled and rejected for various reasons :)
> As you correctly point out, without something like type classes, associating a static method like a bootstrap with a class requires committing some sort of sin, such as the "magic names" sins committed by serialization.

The current syntax is something like
  String."a text"
There is no method name, so we have basically two choices. Either we make the syntax more like a method call, which is what Scala does,
  String.method"a text"
or we specify what I would call a protocol. A protocol is like a method call but enhanced with an ad hoc syntax and constraints. For example, the constructor in Java is a protocol: we name the method with the same name as the class and do not specify a return type and, magically, it becomes a constructor with its own set of rules.
Soon we will introduce user defined pattern methods; this also needs a protocol, and the current proposal for them is to use a special modifier like destructor or pattern. If you prefer to use a special modifier, I'm fine with that; if you think it's better to change the use site to specify a method name, I'm fine with that too.

> We surely didn't want to do that either.

>> - we also want to be able to instantiate regex Pattern,
>> and have a magic optimisation that creates the Pattern instance only one
>>     Pattern pattern = Pattern."foo|bar";

> You said the magic anti-word, which is "magic". We don't want this to be magic. (Examples like this are better treated as a form of optimistic constant folding, along the lines explored at my JVMLS talk a few years ago.)
> Summary: wait for constant folding.

I don't like constant folding for several reasons: it's one size fits all, you cannot specify in the code how you transform the format before it is constant folded, and the dumber the Java compiler is, the better. Constant folding is the kind of feature that tends to interact with all other features (recent example, case Foo foo vs case Foo foo && true vs case Foo foo && 2 == 2).

>> I think the simplest way to specify an interpolation method is to have a method with a special name,
>> i will use __interpolate__ because i don't want to discuss the exact syntax here.

> This is committing the same "magic name" sin as serialization. We deliberately avoided this in the design. When we have type classes, we'll be able to use that as a way to bridge from a type name to a witness to a particular class. Our design was crafted so that it could be gracefully extended to such a mechanism, when it is available (using a type name instead of an instance reference at the use site.)
> Summary: wait for type classes.
Adding type classes may solve how to specify a contract on a static method, but it does not solve the fact that you want the signature of the method (static or not) to be polymorphic.

>> That's why the specification allow you to provide a second more optimised version of the interpolation method using a method __interpolate__bootstrap__.

> This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and also too limited -- and also too advanced to be a language feature. Bootstraps are way too complicated to expose in the source language in this way, especially not this magically. And it's too ad-hoc, since it's specific to the interpolation feature, whereas one could imagine a number of other contexts where it is useful too. So this is a bad tradeoff in many ways. Jim's implementation very cleverly gets the equivalent of this using pure library implementation (which leans on MutableCallSite.)

Using a MutableCallSite as a way to devirtualize something you have arbitrarily specified as virtual is the tail wagging the dog. I've written a library [1] that uses MutableCallSite where it should use ConstantCallSite, to work around the inability of javac to generate an invokedynamic. But at least, I always felt guilty about it. Adding mutable call sites to the runtime of Java is a mistake; the performance model is really tricky.

> While it is surely a desirable goal to be able to optimize formatter implementation, it is also super-easy to become obsessed with this, and give it a bigger place in the feature than it deserves. For some cases -- notably String::format -- there are huge savings to be had (from a number of sources, not least of which is that scanning the string at every invocation and choosing a strategy based on that is expensive.) But in other cases, it is almost irrelevant. For pure concatenation, it is already pretty fast; for SQL, the cost of constructing the query is a tiny part of the execution time, so it's not even worth optimizing. So this is a "nice to have" rather than the centerpiece of the feature.
> To be clear, the centerpiece is the gathering up of a template + parameters so that their combination can be handled by another entity, whether right now, later, or never. Optimizing the case where it is done right now, using a predictable choice of entity, is an optimization, but not the centerpiece.
> Let me sketch out how we're envisioning this. The API is something like:
>     interface TemplatePolicy<T> {
>         T apply(TemplatedString ts);
>         // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object
>         default MethodHandle asMethodHandle(TemplatedString ts) {
>             return MH[TemplatePolicy::apply]
>         }
>     }

Let's not talk about the bootstrap method for a second. This API fails to indicate to the compiler the types of the parameters that are allowed before calling the template; for example, I may want to specify a query as a String but with only expressions of type Expression as arguments. This API forces the implementation to be ready to accept any arguments (and those have to be boxed). And you are re-inventing a strawman way to implement JSR 292; your design is actually quite close to the early designs of Gilad Bracha.
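To make the boxing point concrete, here is a sketch built on the existing java.lang.invoke.StringConcatFactory, which is linked with a MethodType and therefore accepts an int without boxing it. This is only an illustration of what passing a MethodType at link time buys, not what either proposal specifies. The names NoBoxingSketch, link and demo are hypothetical, and translating each \(...) binding into the factory's \1 placeholder is an assumption made for the example.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

final class NoBoxingSketch {
  // Done once per call site: parse the template and build a handle of the exact type.
  static MethodHandle link(String template, MethodType type) {
    // every \(...) binding becomes the \1 argument placeholder of a concat recipe
    String recipe = template.replaceAll("\\\\\\([^)]*\\)", "\1");
    try {
      CallSite site = StringConcatFactory.makeConcatWithConstants(
          MethodHandles.lookup(), "interpolate", type, recipe);
      return site.dynamicInvoker();
    } catch (Exception e) {
      throw new IllegalArgumentException("template does not match " + type, e);
    }
  }

  // A real call site would link only once; this demo links on every call for brevity.
  static String demo(String name, int age) {
    MethodHandle handle = link("name: \\(name) age: \\(age)",
        MethodType.methodType(String.class, String.class, int.class));
    try {
      // exact (String, int) -> String invocation, the int is never boxed
      return (String) handle.invokeExact(name, age);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo("Bob", 42));
  }
}
```

An API of the shape apply(TemplatedString, Object...) cannot express this, because the argument types are erased to Object before the policy ever sees them.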
At some point, you will want to:
- share the same TemplatePolicy without doing inheritance, so you will re-invent the Lookup object
- avoid the unnecessary boxing, so you will pass the MethodType as a parameter
- avoid having a PIC (Polymorphic Inline Cache) for things as simple as always returning the same constant, so you will make the API a function call, not a method call
- avoid waiting until all arguments are on the stack, and segregate between dynamic arguments and constant arguments, so you will re-invent the bootstrap API

The reason you will gravitate toward the bootstrap API is that, fundamentally, it's a way to specify a linker in Java code, which is what you want here.

> The API specification has a number of constraints on the implementation of asMethodHandle, which I'll get to in a second. When the compiler encounters an immediate application P."...", it generates an indy, which uses a special bootstrap that returns a MutableCallSite. The MutableCallSite initially has as its target a special secondary bootstrap MH, which represents an interpolation site that has not yet seen an actual invocation. The secondary bootstrap MH has the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> Object), so on first invocation it receives the TP object and the TS. It then calls TP::asMethodHandle, and wraps this MH with a GWT which validates the invariants and proceeds to that MH if they hold -- which they will 99.x% of the time.
> The invariant is that the dynamic type of the per-instantiation TP be == to the dynamic type of the TP that was present at secondary linkage. That is, it be an instance of the same class, but not necessarily the same instance. By definition, the string will always be the same, as will the types of the parameters, since this is specific to concrete P."..." sites. So the MH can take advantage of that.
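The linkage scheme quoted above (a MutableCallSite whose first target is a secondary bootstrap, later replaced by a guardWithTest on the policy class) can be sketched with the existing java.lang.invoke API. Everything named here is hypothetical: Policy is a simplified stand-in for TemplatePolicy, the template is a plain java.util.Formatter string instead of a TemplatedString, and the arguments travel as an Object[].

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.Locale;

final class GuardedSiteSketch {
  interface Policy {
    String apply(String template, Object[] args);

    // Returns a (Policy, Object[]) -> String handle; the policy stays a parameter
    // of the handle and is never baked into it.
    default MethodHandle asMethodHandle(String template) {
      return MethodHandles.insertArguments(APPLY, 1, template);
    }
  }

  static final MethodType SITE_TYPE =
      MethodType.methodType(String.class, Policy.class, Object[].class);
  static final MethodHandle APPLY, SAME_CLASS, LINK;
  static {
    try {
      var lookup = MethodHandles.lookup();
      APPLY = lookup.findVirtual(Policy.class, "apply",
          MethodType.methodType(String.class, String.class, Object[].class));
      SAME_CLASS = lookup.findStatic(GuardedSiteSketch.class, "sameClass",
          MethodType.methodType(boolean.class, Class.class, Policy.class));
      LINK = lookup.findVirtual(Site.class, "link", SITE_TYPE);
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  static boolean sameClass(Class<?> expected, Policy policy) {
    return policy.getClass() == expected;
  }

  static final class Site extends MutableCallSite {
    final String template;
    int linkCount;                       // only here so the demo can observe linkage

    Site(String template) {
      super(SITE_TYPE);
      this.template = template;
      setTarget(LINK.bindTo(this));      // secondary bootstrap as initial target
    }

    // Runs on the first invocation, then replaces itself with the guarded handle.
    String link(Policy policy, Object[] args) throws Throwable {
      linkCount++;
      MethodHandle fast = policy.asMethodHandle(template);
      MethodHandle slow = MethodHandles.insertArguments(APPLY, 1, template);
      MethodHandle test = SAME_CLASS.bindTo(policy.getClass());
      setTarget(MethodHandles.guardWithTest(test, fast, slow));
      return (String) fast.invoke(policy, args);
    }
  }

  static final class Format implements Policy {
    public String apply(String template, Object[] args) {
      return String.format(template, args);
    }
  }

  static final class Shout implements Policy {
    public String apply(String template, Object[] args) {
      return String.format(template, args).toUpperCase(Locale.ROOT);
    }
  }

  static String call(Site site, Policy policy, Object... args) {
    try {
      return (String) site.dynamicInvoker().invokeExact(policy, args);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    Site site = new Site("hello %s");
    System.out.println(call(site, new Format(), "a"));  // links the site
    System.out.println(call(site, new Format(), "b"));  // guard holds, fast path
    System.out.println(call(site, new Shout(), "c"));   // guard fails, virtual apply
  }
}
```

The sketch also shows the costs discussed in this thread: the arguments arrive boxed in an Object[], and the guard is a class check executed on every call.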
> The constraint on TP::asMethodHandle is that it not undermine this invariant; that if it generates a MH that is dependent on TP state, it not bake that state into the resulting MH, but instead, treat the TP state as a parameter. Further, the MH must be behaviorally equivalent to calling apply.
> If the GWT fails, it means the user is doing something like:
>     for (TP p : listOfProcessors) {
>         blah blah p."foo \{a}"
>     }
> in which case the GWT falls back to the "just do an invokevirtual of TP::apply" strategy. (It could get fancier but I don't see any point.)
> This lets us rescue indy-based translation without exposing a magic indy-hook in the JLS. (Sorry, I know you wanted the magic indy hook.)

The issue is not about me asking you to add a magic hook: once you have an API that returns a MethodHandle used by an invokedynamic, you are providing a magic hook. The issue is that in your attempt to not provide a magic hook, you are providing a Smalltalk-like magic hook, where all type information is lost (no typechecking by the compiler, no way to get the type information at runtime to avoid boxing) and with a crappy performance model (boxing again + a PIC for devirtualizing something which should not be virtual).

Rémi

[1] https://github.com/forax/exotic

From forax at univ-mlv.fr Fri Oct 15 17:09:57 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 15 Oct 2021 19:09:57 +0200 (CEST)
Subject: String Interpolation
In-Reply-To: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr> <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
Message-ID: <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "amber-spec-experts"
> Sent: Wednesday 13 October 2021 21:32:19
> Subject: Re: String Interpolation

After grumbling a lot, let's restart

[...]
>> That's why the specification allow you to provide a second more optimised version of the interpolation method using a method __interpolate__bootstrap__.

> This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and also too limited -- and also too advanced to be a language feature. Bootstraps are way too complicated to expose in the source language in this way, especially not this magically. And it's too ad-hoc, since it's specific to the interpolation feature, whereas one could imagine a number of other contexts where it is useful too. So this is a bad tradeoff in many ways. Jim's implementation very cleverly gets the equivalent of this using pure library implementation (which leans on MutableCallSite.)
> While it is surely a desirable goal to be able to optimize formatter implementation, it is also super-easy to become obsessed with this, and give it a bigger place in the feature than it deserves. For some cases -- notably String::format -- there are huge savings to be had (from a number of sources, not least of which is that scanning the string at every invocation and choosing a strategy based on that is expensive.) But in other cases, it is almost irrelevant. For pure concatenation, it is already pretty fast; for SQL, the cost of constructing the query is a tiny part of the execution time, so it's not even worth optimizing. So this is a "nice to have" rather than the centerpiece of the feature.
> To be clear, the centerpiece is the gathering up of a template + parameters so that their combination can be handled by another entity, whether right now, later, or never. Optimizing the case where it is done right now, using a predictable choice of entity, is an optimization, but not the centerpiece.
> Let me sketch out how we're envisioning this.
> The API is something like:
>     interface TemplatePolicy<T> {
>         T apply(TemplatedString ts);
>         // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object
>         default MethodHandle asMethodHandle(TemplatedString ts) {
>             return MH[TemplatePolicy::apply]
>         }
>     }

I don't understand where you pass the arguments; is it not more something like

  public interface TemplatePolicy<T, E extends Exception> {
    T apply(TemplatedString template, Object... args) throws E;
    // returns a MethodHandle with the signature T(TemplatePolicy, Object...)
    default MethodHandle asMethodHandle(TemplatedString template, MethodType type) { ... }
  }

The second parameter of asMethodHandle is the descriptor of invokedynamic; this ensures that there is no boxing on the fast path, and if the implementation of TemplatePolicy is a final class.

> The API specification has a number of constraints on the implementation of asMethodHandle, which I'll get to in a second. When the compiler encounters an immediate application P."...", it generates an indy, which uses a special bootstrap that returns a MutableCallSite. The MutableCallSite initially has as its target a special secondary bootstrap MH, which represents an interpolation site that has not yet seen an actual invocation. The secondary bootstrap MH has the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> Object), so on first invocation it receives the TP object and the TS. It then calls TP::asMethodHandle, and wraps this MH with a GWT which validates the invariants and proceeds to that MH if they hold -- which they will 99.x% of the time.
> The invariant is that the dynamic type of the per-instantiation TP be == to the dynamic type of the TP that was present at secondary linkage. That is, it be an instance of the same class, but not necessarily the same instance. By definition, the string will always be the same, as will the types of the parameters, since this is specific to concrete P."..." sites.
> So the MH can take advantage of that.
> The constraint on TP::asMethodHandle is that it not undermine this invariant; that if it generates a MH that is dependent on TP state, it not bake that state into the resulting MH, but instead, treat the TP state as a parameter. Further, the MH must be behaviorally equivalent to calling apply.
> If the GWT fails, it means the user is doing something like:
>     for (TP p : listOfProcessors) {
>         blah blah p."foo \{a}"
>     }
> in which case the GWT falls back to the "just do an invokevirtual of TP::apply" strategy. (It could get fancier but I don't see any point.)
> This lets us rescue indy-based translation without exposing a magic indy-hook in the JLS. (Sorry, I know you wanted the magic indy hook.)

As I said, I don't care about having the exact bootstrap API, but I care about the unnecessary boxing / class check / etc that can occur. I believe that if asMethodHandle() takes a MethodType as second parameter, performance should be OK. Is it something that can be negotiated?

I've implemented a prototype to convince myself that with a MethodType as parameter it was not actually that bad.
https://github.com/forax/java-interpolation

(I also suppose that the TemplatedString is created with a constant dynamic?)

Rémi

> On 10/13/2021 1:09 PM, Remi Forax wrote:
>> Hi everybody, I've spent some time thinking about how String interpolation + Policy should be specified and implemented.
>> The goal is to add a syntax specifying a user defined method to "interpolate" (for lack of a better word) a string with arguments.
>> Given that it's a method, the exact semantics of the interpolation, things like how the arguments are escaped and how the formatted string is parsed, are written in Java, which allows supporting a wide range of use cases.
>> This proposal does not differ from the original proposal of Brian and Jim in its >> goal but in the way a user declare the interpolation method(s). >> TLDR; you can declare an interpolation method and optionally an interpolation >> bootstrap method if you want a more efficient code at the price of having to >> play with the method handle API. >> --- >> The proposal of Brian and Jim uses an interface to define the policy but in this >> case, using an interface is not what we want. >> I think there are two main reasons, >> - the interpolation method can be an instance method but can also be a factory >> method, a static method, and an interface can not constraint a static method. >> - we want the signature of the interpolation method to be free to use any number >> of parameters of any types, something that can not be specified with type >> parameters in Java. >> So let's take a step back and write some examples, as a user of the >> interpolation method, we want to >> - be able to specify string interpolation, >> you can notice that this is a static method. >> String name = ... >> int value = ... >> String s = String."name: \(name) age: \(age)"; >> - we also want to be able to instantiate regex Pattern, >> and have a magic optimisation that creates the Pattern instance only one >> Pattern pattern = Pattern."foo|bar"; >> - we also want to support instance method, so the interpolation can escape the >> arguments differently depending on the context, >> here by example, escaping differently depending on the database driver. >> String username = ... >> Connection connection = ... >> connection.""" >> SELECT * FROM users where user == "\(username)" >> """; >> I think the simplest way to specify an interpolation method is to have a method >> with a special name, >> i will use __interpolate__ because i don't want to discuss the exact syntax >> here. 
>> This method can be a static method or an instance method and has a restriction, >> the first parameter has to be a String because the first argument is the >> formatted string. >> Here is an example of how the method __interpolate__ inside java.lang.String can >> be written. >> To avoid everybody to re-implement the parsing of the formatted string, the >> class java.lang.runtime.InterpolateMetafactory provides a helper method >> "formatIterator" that returns an iterator splitting the formatted string into >> text and binding. >> package java.lang; >> public class String { >> ... >> public static String __interpolate__(String format, Object... args) { >> var i = 0; >> var builder = new StringBuilder(); >> var iterator = InterpolateMetafactory.formatIterator(format); >> while(iterator.hasNext()) { >> switch(iterator.next()) { >> case Text(var text) -> builder.append(text); >> case Binding binding -> args[i++]; >> } >> } >> return builder.toString(); >> } >> ... >> } >> While this is nice, you may think that it's just syntactic sugar and it will not >> be more performant that String.valueOf(), i.e. it will be slow. >> That's why the specification allow you to provide a second more optimised >> version of the interpolation method using a method __interpolate__bootstrap__. >> This method __interpolate__bootstrap__ is not required, can not replace the >> method __interpolate__, both __interpolate__ and __interpolate__bootstrap__ >> has to be present and it's a backward compatible change to add a method >> __interpolate__bootstrap__ after the fact, there is no need to recompile >> all the client code. >> For that the compiler translation rely on invokedynamic to call the method >> bootstrap of the class InterpolateMetafactor that at runtime decide >> to trampoline either to the method __interpolate__bootstrap__ or to the method >> __interpolate__ if no __interpolate__bootstrap__ exists. 
>> Here is an example of how a call to the interpolation method of String is >> generated by javac >> For the Java code >> String name = ... >> int value = ... >> String s = String."name: \(name) age: \(age)"; >> the equivalent bytecode is >> aload_1. // load name >> iload_2. // load age >> invokedynamic __interpolate__ (Ljava/lang/StringI)Ljava/lang/String; >> java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType, >> String, MethodHandle):CallSite >> [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String >> ] >> From the perspective of the compiler the method __interpolate__ works exactly >> like a method with a polymorphic method signature (the method annotated with >> @PolymorphicSignature), >> so the descriptor of invokedynamic is created by collecting the type of the >> argument, here the interpolation method is called with a String and an int, so >> the descriptor >> and the return type is String so the descriptor is >> (Ljava/lang/StringI)Ljava/lang/String; >> Considering the interpolation method as a polymorphic method is important in >> term of performance because it means that not boxing will be done by the >> compiler, if there are some boxing, they will be done by the runtime, so are >> optional if the __interpolate__bootstrap__ does not need to box arguments. >> You can also notice that the formatted string is passed as a bootstrap constant >> so all the parsing of the format can be done once outside of the hot path. >> A call to invokedynamic also pass as a second bootstrap argument the method >> handle to the method __interpolate__, so the implementation inside >> InterpolateMetafactory.bootstrap can called this method if no method >> __interpolate__bootstrap__ exists. >> Here is a raw implementation of the class InterpolateMetafactory. >> The method formatIterator() return an Iterator of Token which is a sealed class. 
>> The method bootstrap() first lookup to a method "__interpolate__bootstrap__" in >> the lookup class that takes a Lookup, a String, a MethodType, the format and >> the default implementation and call it if it exists or takes the default >> implementation, bind the formatted String and adapt the arguments using asType >> (ask for boxing, etc). >> package java.lang.runtime; >> public class InterpolateMetafactory { >> public sealed interface Token { >> public record Text(String text) implements Token {} >> public record Binding(String name) implements Token {} >> } >> public static Iterator formatIterator(String format) { >> ... >> } >> public static CallSite bootstrap(Lookup lookup, String name, MethodType >> methodType, String format, MethodHandle impl) throws Throwable { >> // check if there is a bootstrap method >> MethodHandle bootstrap; >> try { >> bootstrap = lookup.findStatic(lookup.lookupClass(), >> "__interpolate__bootstrap__", MethodType.methodType(CallSite.class, >> Lookup.class, String.class, MethodType.class, String.class, >> MethodHandle.class)); >> } catch(NoSuchMethodException e) { >> // bind the default implementation >> return new ConstantCallSite(impl.bindTo(format).asType(methodType)); >> } >> return boostrap.invoke(lookup, name, methodType, format, impl); >> } >> } >> Here is another example, showing how to declare the methods __interpolate__ and >> __interpolate__bootstrap__ inside java.util.regex.Pattern. >> The "default" implementation calls Pattern.compile() and the optimized one >> always returns the result of Pattern.compile() as a constant. >> package java.util.regex; >> public class Pattern { >> public static String __interpolate__(String format) {. 
// the formatted string >> can not have arguments >> return Pattern.compile(format); >> } >> private static CallSite __interpolate__bootstrap__(Lookup lookup, String name, >> MethodType methodType, String format, MethodHandle impl) { >> return new ConstantCallSite(MethodHandles.constant(Pattern.class, >> Pattern.compile(format))); >> } >> } >> The method __interpolate__ provides via its signature, the parameter types that >> are verified by the compiler. >> It also provides a code that can be used by the tools that does static analysis >> on the bytecode because those tools can not see through the method handle >> returned by a bootstrap method given that it's a runtime construct, it's >> usually not available at the time the static analysis is done. This should be >> enough to have tools like Graal VM native image to see through the >> invokedynamic in a similar way it sees through the invokedynamic used when >> creating a lambda. >> The fact that all invokedynamic goes through the method >> InterpolateMetafactory.bootstrap and trampoline from it means that adding or >> removing the method __interpolate__bootstrap__ is a binary compatible change, >> if __interpolate__bootstrap__ is declared private. So implementing >> __interpolate__bootstrap__ can be an afterthought. >> regards, >> R?mi From james.laskey at oracle.com Fri Oct 15 18:34:58 2021 From: james.laskey at oracle.com (Jim Laskey) Date: Fri, 15 Oct 2021 18:34:58 +0000 Subject: String Interpolation In-Reply-To: <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr> References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr> <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com> <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr> Message-ID: <650E55D3-972C-480D-9D58-EA4ABA297EDD@oracle.com> Yes, the methodology we have chosen does avoid boxing (and vararg). We don't need parameter types because those types are accessible from the TemplatedString implementation. 
So we only really need the user chosen return type. But to be honest, we don't even need that (the current prototype doesn't have the argument) because of template erasure. T is just Object from the bootstrap perspective and the policy can glean the return type elsewhere, if necessary, for MethodHandle construction. To allay any fears of performance, FMT."%s\{a} + %s\{b} = %s\{a + b}" is as fast as a + " + " + b + " = " + (a + b). -- Jim On Oct 15, 2021, at 2:09 PM, forax at univ-mlv.fr wrote: ________________________________ From: "Brian Goetz" > To: "Remi Forax" >, "amber-spec-experts" > Sent: Mercredi 13 Octobre 2021 21:32:19 Subject: Re: String Interpolation After grumbling a lot, let's restart [...] That's why the specification allow you to provide a second more optimised version of the interpolation method using a method __interpolate__bootstrap__. This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and also too limited -- and also too advanced to be a language feature. Bootstraps are way too complicated to expose in the source language in this way, especially not this magically. And its too ad-hoc, since its specific to the interpolation feature, whereas one could imagine a number of other contexts where it is useful too. So this is a bad tradeoff in many ways. Jim's implementation very cleverly gets the equivalent of this using pure library implementation (which leans on MutableCallSite.) While it is surely a desirable goal to be able to optimize formatter implementation, it is also super-easy to become obsessed with this, and give it a bigger place in the feature than it deserves. For some cases -- notably String::format -- there are huge savings to be had (from a number of sources, not least of which is that scanning the string at every invocation and choosing a strategy based on that is expensive.) But in other cases, it is almost irrelevant. 
For pure concatenation, it is already pretty fast; for SQL, the cost of constructing the query is a tiny part of the execution time, so its not even worth optimizing. So this is a "nice to have" rather than the centerpiece of the feature. To be clear, the centerpiece is the gathering up of a template + parameters so that their combination can be handled by another entity, whether right now, later, or never. Optimizing the case where it is done right now, using a predictable choice of entity, is an optimization, but not the centerpiece. Let me sketch out how we're envisioning this. The API is something like: interface TemplatePolicy { T apply(TemplatedString ts); // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object default MethodHandle asMethodHandle(TemplatedString ts) { return MH[TemplatePolicy::apply] } } I don't understand where you pass the arguments, is it not more something like public interface TemplatePolicy { T apply(TemplatedString template, Object... args) throws E; // returns a MethodHandle with the signature T(TemplatePolicy, Object...) default MethodHandle asMethodHandle(TemplatedString template, MethodType type) { ... } } The second parameter of asMethodHandle is the descriptor of invokedynamic, this ensure that there is no boxing on the fast path, and if the implementation of TemplatePolicy is a final class. The API specification has a number of constraints on the implementation of asMethodHandle, which I'll get to in a second. When the compiler encounters an immediate application P."...", it generates an indy, which uses a special bootstrap that returns a MutableCallSite. The MutableCallSite initially has as its target a special secondary bootstrap MH, which represents an interpolation site that has not yet seen an actual invocation. The secondary bootstrap MH has the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> Object), so on first invocation it receives the TP object and the TS. 
It then calls TP::asMethodHandle, and wraps this MH with a GWT which validates the invariants and proceeds to that MH if they hold -- which they will 99.x% of the time. The invariant is that the dynamic type of the per-instantiation TP be == to the dynamic type of the TP that was present at secondary linkage. That is, it be an instance of the same class, but not the same instance. By definition, the string will always be the same as will the types of the parameters, since this is specific to concrete P."..." sites. So the MH can take advantage of that. The constraint on TP::asMethodHandle is that it not undermine this invariant; that if it generates a MH that is dependent on TP state, it not bake that state into the resulting MH, but instead, treat the TP state as a parameter. Further, the MH must be behaviorally equivalent to calling apply. If the GWT fails, it means the user is doing something like: for (TP p : listOfProcessors) { blah blah p."foo \{a}" } in which case the GWT falls back to the "just do an invokevirtual of TP::apply" strategy. (It could get fancier but I don't see any point.) This lets us rescue indy-based translation without exposing a magic indy-hook in the JLS. (Sorry, I know you wanted the magic indy hook.) As i said, i don't care about having the exact bootstrap API, but i care about the unnecessary boxing / class check / etc that can occur. I believe that if asMethodHandle() takes a MethodType as second parameter, performance should be Ok. Is it something that can be negotiated ? I've implemented a prototype to convince myself that with a MethodType as parameter is was not actually that bad. https://github.com/forax/java-interpolation (I also suppose that the TemplatedString is created with a constant dynamic ?) R?mi On 10/13/2021 1:09 PM, Remi Forax wrote: Hi everybody, i've spend some time to think how the String interpolation + Policy should be specified and implemented. 
The goal is to add a syntax specifying a user defined method to "interpolate" (for a lack of better word) a string with arguments. Given that it's a method, the exact semantics of the interpolation, things like how the arguments are escaped, how the formatted string is parsed, is written is Java, this will allow to support a wide range of use cases. This proposal does not differ from the original proposal of Brian and Jim in its goal but in the way a user declare the interpolation method(s). TLDR; you can declare an interpolation method and optionally an interpolation bootstrap method if you want a more efficient code at the price of having to play with the method handle API. --- The proposal of Brian and Jim uses an interface to define the policy but in this case, using an interface is not what we want. I think there are two main reasons, - the interpolation method can be an instance method but can also be a factory method, a static method, and an interface can not constraint a static method. - we want the signature of the interpolation method to be free to use any number of parameters of any types, something that can not be specified with type parameters in Java. So let's take a step back and write some examples, as a user of the interpolation method, we want to - be able to specify string interpolation, you can notice that this is a static method. String name = ... int value = ... String s = String."name: \(name) age: \(age)"; - we also want to be able to instantiate regex Pattern, and have a magic optimisation that creates the Pattern instance only one Pattern pattern = Pattern."foo|bar"; - we also want to support instance method, so the interpolation can escape the arguments differently depending on the context, here by example, escaping differently depending on the database driver. String username = ... Connection connection = ... 
connection.""" SELECT * FROM users where user == "\(username)" """; I think the simplest way to specify an interpolation method is to have a method with a special name, i will use __interpolate__ because i don't want to discuss the exact syntax here. This method can be a static method or an instance method and has a restriction, the first parameter has to be a String because the first argument is the formatted string. Here is an example of how the method __interpolate__ inside java.lang.String can be written. To avoid everybody to re-implement the parsing of the formatted string, the class java.lang.runtime.InterpolateMetafactory provides a helper method "formatIterator" that returns an iterator splitting the formatted string into text and binding. package java.lang; public class String { ... public static String __interpolate__(String format, Object... args) { var i = 0; var builder = new StringBuilder(); var iterator = InterpolateMetafactory.formatIterator(format); while(iterator.hasNext()) { switch(iterator.next()) { case Text(var text) -> builder.append(text); case Binding binding -> args[i++]; } } return builder.toString(); } ... } While this is nice, you may think that it's just syntactic sugar and it will not be more performant that String.valueOf(), i.e. it will be slow. That's why the specification allow you to provide a second more optimised version of the interpolation method using a method __interpolate__bootstrap__. This method __interpolate__bootstrap__ is not required, can not replace the method __interpolate__, both __interpolate__ and __interpolate__bootstrap__ has to be present and it's a backward compatible change to add a method __interpolate__bootstrap__ after the fact, there is no need to recompile all the client code. 
For that the compiler translation rely on invokedynamic to call the method bootstrap of the class InterpolateMetafactor that at runtime decide to trampoline either to the method __interpolate__bootstrap__ or to the method __interpolate__ if no __interpolate__bootstrap__ exists. Here is an example of how a call to the interpolation method of String is generated by javac For the Java code String name = ... int value = ... String s = String."name: \(name) age: \(age)"; the equivalent bytecode is aload_1. // load name iload_2. // load age invokedynamic __interpolate__ (Ljava/lang/StringI)Ljava/lang/String; java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType, String, MethodHandle):CallSite [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String ] From the perspective of the compiler the method __interpolate__ works exactly like a method with a polymorphic method signature (the method annotated with @PolymorphicSignature), so the descriptor of invokedynamic is created by collecting the type of the argument, here the interpolation method is called with a String and an int, so the descriptor and the return type is String so the descriptor is (Ljava/lang/StringI)Ljava/lang/String; Considering the interpolation method as a polymorphic method is important in term of performance because it means that not boxing will be done by the compiler, if there are some boxing, they will be done by the runtime, so are optional if the __interpolate__bootstrap__ does not need to box arguments. You can also notice that the formatted string is passed as a bootstrap constant so all the parsing of the format can be done once outside of the hot path. A call to invokedynamic also pass as a second bootstrap argument the method handle to the method __interpolate__, so the implementation inside InterpolateMetafactory.bootstrap can called this method if no method __interpolate__bootstrap__ exists. 
Here is a raw implementation of the class InterpolateMetafactory. The method formatIterator() return an Iterator of Token which is a sealed class. The method bootstrap() first lookup to a method "__interpolate__bootstrap__" in the lookup class that takes a Lookup, a String, a MethodType, the format and the default implementation and call it if it exists or takes the default implementation, bind the formatted String and adapt the arguments using asType (ask for boxing, etc). package java.lang.runtime; public class InterpolateMetafactory { public sealed interface Token { public record Text(String text) implements Token {} public record Binding(String name) implements Token {} } public static Iterator formatIterator(String format) { ... } public static CallSite bootstrap(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) throws Throwable { // check if there is a bootstrap method MethodHandle bootstrap; try { bootstrap = lookup.findStatic(lookup.lookupClass(), "__interpolate__bootstrap__", MethodType.methodType(CallSite.class, Lookup.class, String.class, MethodType.class, String.class, MethodHandle.class)); } catch(NoSuchMethodException e) { // bind the default implementation return new ConstantCallSite(impl.bindTo(format).asType(methodType)); } return boostrap.invoke(lookup, name, methodType, format, impl); } } Here is another example, showing how to declare the methods __interpolate__ and __interpolate__bootstrap__ inside java.util.regex.Pattern. The "default" implementation calls Pattern.compile() and the optimized one always returns the result of Pattern.compile() as a constant. package java.util.regex; public class Pattern { public static String __interpolate__(String format) {. 
// the formatted string can not have arguments return Pattern.compile(format); } private static CallSite __interpolate__bootstrap__(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) { return new ConstantCallSite(MethodHandles.constant(Pattern.class, Pattern.compile(format))); } } The method __interpolate__ provides via its signature, the parameter types that are verified by the compiler. It also provides a code that can be used by the tools that does static analysis on the bytecode because those tools can not see through the method handle returned by a bootstrap method given that it's a runtime construct, it's usually not available at the time the static analysis is done. This should be enough to have tools like Graal VM native image to see through the invokedynamic in a similar way it sees through the invokedynamic used when creating a lambda. The fact that all invokedynamic goes through the method InterpolateMetafactory.bootstrap and trampoline from it means that adding or removing the method __interpolate__bootstrap__ is a binary compatible change, if __interpolate__bootstrap__ is declared private. So implementing __interpolate__bootstrap__ can be an afterthought. 
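A minimal, runnable sketch of the trampoline described in the proposal above may help: look for an optional __interpolate__bootstrap__ and, when there is none, bind the format string to the plain __interpolate__ method. The class TrampolineSketch, the method interpolate, and the extra owner parameter are assumptions made for illustration, and the bootstrap is called directly here rather than through invokedynamic. The sketch also adds an explicit asCollector step before asType so that the individual call-site arguments are gathered into the Object[] parameter.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.invoke.MethodType;

final class TrampolineSketch {
    // Hypothetical stand-in for __interpolate__: reports how many arguments it received.
    static String interpolate(String format, Object... args) {
        return format + " with " + args.length + " argument(s)";
    }

    // Stand-in for InterpolateMetafactory.bootstrap; 'owner' is the class searched for the hook.
    static CallSite bootstrap(Lookup lookup, Class<?> owner, String name, MethodType type,
                              String format, MethodHandle impl) throws Throwable {
        MethodHandle custom;
        try {
            custom = lookup.findStatic(owner, "__interpolate__bootstrap__",
                MethodType.methodType(CallSite.class, Lookup.class, String.class,
                    MethodType.class, String.class, MethodHandle.class));
        } catch (NoSuchMethodException e) {
            // No optimized hook: bind the format, collect the arguments into Object[],
            // then let asType box and widen to match the call-site descriptor.
            MethodHandle target = impl.bindTo(format)
                .asCollector(Object[].class, type.parameterCount())
                .asType(type);
            return new ConstantCallSite(target);
        }
        return (CallSite) custom.invoke(lookup, name, type, format, impl);
    }

    static String demo() throws Throwable {
        Lookup lookup = MethodHandles.lookup();
        MethodHandle impl = lookup.findStatic(TrampolineSketch.class, "interpolate",
            MethodType.methodType(String.class, String.class, Object[].class));
        // The descriptor a compiler would emit for a (String, int) -> String call site.
        MethodType siteType = MethodType.methodType(String.class, String.class, int.class);
        CallSite site = bootstrap(lookup, TrampolineSketch.class, "__interpolate__",
            siteType, "name: \\(name) age: \\(age)", impl);
        // No boxing at the call site itself; the int is boxed by the asType adapter.
        return (String) site.dynamicInvoker().invokeExact("Ana", 42);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(demo());
    }
}
```

Because the call site only ever sees the CallSite returned by the trampoline, declaring an __interpolate__bootstrap__ later would change which branch runs without touching the caller.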
regards,
Rémi

From forax at univ-mlv.fr Fri Oct 15 20:26:16 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 15 Oct 2021 22:26:16 +0200 (CEST)
Subject: String Interpolation
In-Reply-To: <650E55D3-972C-480D-9D58-EA4ABA297EDD@oracle.com>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr> <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com> <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr> <650E55D3-972C-480D-9D58-EA4ABA297EDD@oracle.com>
Message-ID: <1270469810.2777441.1634329576804.JavaMail.zimbra@u-pem.fr>

> From: "Jim Laskey"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Friday, 15 October 2021 20:34:58
> Subject: Re: String Interpolation

> Yes, the methodology we have chosen does avoid boxing (and vararg). We don't
> need parameter types because those types are accessible from the
> TemplatedString implementation.

You're losing two important types if you are using only the types of the bindings:
- the type of the implementation of the TemplatePolicy, which is important because you can check whether the implementation type is final; if it is final, no guard is needed (the JIT can overcome that, but you have created more method handles than necessary and you have to wait for C2 to kick in),
- the return type.

> So we only really need the user chosen return type. But to be honest, we don't
> even need that (the current prototype doesn't have the argument) because of
> template erasure. T is just Object from the bootstrap perspective and the
> policy can glean the return type elsewhere, if necessary, for MethodHandle
> construction.

For the return type, one question is why TemplatePolicy is parameterized by the return type. It forces boxing and discards the contextual type that comes from the call site.
Losing the inferred return type is a big issue, because you lose the ability to write a policy that moves values from the untyped world to the typed world.
For example, let's say you have a policy that works like Map.get(); you currently cannot write

  int value1 = policy."\(key1)";
  String value2 = policy."\(key2)";

because the closest you can come is Object.

This is especially important when you start to throw patterns into the mix:

  switch(new JSONPolicy(jsonText)) {
    case """
      {
        key1: \(int value1),
        key2: \(String value2)
      }
      """ -> { /* value1 is an int and value2 is a String */ }
  }

> To allay any fears of performance, FMT."%s\{a} + %s\{b} = %s\{a + b}" is as fast
> as a + " + " + b + " = " + (a + b).

That is because this is a special case where, for any format, the return type is fixed.

> -- Jim

Rémi

>> On Oct 15, 2021, at 2:09 PM, forax at univ-mlv.fr wrote:
>>> From: "Brian Goetz" <brian.goetz at oracle.com>
>>> To: "Remi Forax" <forax at univ-mlv.fr>,
>>> "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
>>> Sent: Wednesday, 13 October 2021 21:32:19
>>> Subject: Re: String Interpolation

>> After grumbling a lot, let's restart
>> [...]

>>>> That's why the specification allows you to provide a second more optimised
>>>> version of the interpolation method using a method __interpolate__bootstrap__.

>>> This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and
>>> also too limited -- and also too advanced to be a language feature. Bootstraps
>>> are way too complicated to expose in the source language in this way,
>>> especially not this magically. And it's too ad-hoc, since it's specific to the
>>> interpolation feature, whereas one could imagine a number of other contexts
>>> where it is useful too. So this is a bad tradeoff in many ways. Jim's
>>> implementation very cleverly gets the equivalent of this using pure library
>>> implementation (which leans on MutableCallSite.)
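The return-type objection in the reply above (a policy that behaves like Map.get()) can be illustrated in today's Java. This is only a sketch: the interface below is a simplified, hypothetical stand-in for TemplatePolicy that takes a plain String instead of a TemplatedString, and MapPolicy, ageOf and nameOf are illustrative names. It shows that once the policy fixes T, every call yields that one type and the caller has to cast.

```java
import java.util.Map;

final class ReturnTypeSketch {
    // Hypothetical, simplified shape: one return type T for every use of the policy.
    interface TemplatePolicy<T> {
        T apply(String template, Object... args);
    }

    // A Map.get()-like policy: values have different types, so T can only be Object.
    record MapPolicy(Map<String, Object> values) implements TemplatePolicy<Object> {
        @Override
        public Object apply(String template, Object... args) {
            return values.get((String) args[0]);
        }
    }

    // The caller knows it wants an int, but the policy can only hand back Object.
    static int ageOf(MapPolicy policy) {
        return (Integer) policy.apply("\\(key)", "age");   // cast plus unboxing
    }

    static String nameOf(MapPolicy policy) {
        return (String) policy.apply("\\(key)", "name");   // cast
    }

    public static void main(String[] args) {
        MapPolicy policy = new MapPolicy(Map.of("age", 42, "name", "Ana"));
        System.out.println(nameOf(policy) + " is " + ageOf(policy));
    }
}
```

With a call-site descriptor that carries the expected return type, as argued in the reply, the casts in ageOf and nameOf are the part that could move out of user code.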
>>> While it is surely a desirable goal to be able to optimize formatter >>> implementation, it is also super-easy to become obsessed with this, and give it >>> a bigger place in the feature than it deserves. For some cases -- notably >>> String::format -- there are huge savings to be had (from a number of sources, >>> not least of which is that scanning the string at every invocation and choosing >>> a strategy based on that is expensive.) But in other cases, it is almost >>> irrelevant. For pure concatenation, it is already pretty fast; for SQL, the >>> cost of constructing the query is a tiny part of the execution time, so its not >>> even worth optimizing. So this is a "nice to have" rather than the centerpiece >>> of the feature. >>> To be clear, the centerpiece is the gathering up of a template + parameters so >>> that their combination can be handled by another entity, whether right now, >>> later, or never. Optimizing the case where it is done right now, using a >>> predictable choice of entity, is an optimization, but not the centerpiece. >>> Let me sketch out how we're envisioning this. The API is something like: >>> interface TemplatePolicy { >>> T apply(TemplatedString ts); >>> // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object >>> default MethodHandle asMethodHandle(TemplatedString ts) { >>> return MH[TemplatePolicy::apply] >>> } >>> } >> I don't understand where you pass the arguments, is it not more something like >> public interface TemplatePolicy< T , E extends Exception> { >> T apply(TemplatedString template, Object... args) throws E ; >> // returns a MethodHandle with the signature T(TemplatePolicy, Object...) >> default MethodHandle asMethodHandle(TemplatedString template, MethodType type) { >> ... >> } >> } >> The second parameter of asMethodHandle is the descriptor of invokedynamic, this >> ensure that there is no boxing on the fast path, and if the implementation of >> TemplatePolicy is a final class. 
>>> The API specification has a number of constraints on the implementation of >>> asMethodHandle, which I'll get to in a second. When the compiler encounters an >>> immediate application P."...", it generates an indy, which uses a special >>> bootstrap that returns a MutableCallSite. The MutableCallSite initially has as >>> its target a special secondary bootstrap MH, which represents an interpolation >>> site that has not yet seen an actual invocation. The secondary bootstrap MH has >>> the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> >>> Object), so on first invocation it receives the TP object and the TS. It then >>> calls TP::asMethodHandle, and wraps this MH with a GWT which validates the >>> invariants and proceeds to that MH if they hold -- which they will 99.x% of the >>> time. >>> The invariant is that the dynamic type of the per-instantiation TP be == to the >>> dynamic type of the TP that was present at secondary linkage. That is, it be an >>> instance of the same class, but not the same instance. By definition, the >>> string will always be the same as will the types of the parameters, since this >>> is specific to concrete P."..." sites. So the MH can take advantage of that. >>> The constraint on TP::asMethodHandle is that it not undermine this invariant; >>> that if it generates a MH that is dependent on TP state, it not bake that state >>> into the resulting MH, but instead, treat the TP state as a parameter. Further, >>> the MH must be behaviorally equivalent to calling apply. >>> If the GWT fails, it means the user is doing something like: >>> for (TP p : listOfProcessors) { >>> blah blah p."foo \{a}" >>> } >>> in which case the GWT falls back to the "just do an invokevirtual of TP::apply" >>> strategy. (It could get fancier but I don't see any point.) >>> This lets us rescue indy-based translation without exposing a magic indy-hook in >>> the JLS. (Sorry, I know you wanted the magic indy hook.) 
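The linkage scheme quoted above (a MutableCallSite whose first invocation installs a guardWithTest on the policy's class) can be sketched with the existing java.lang.invoke API. This is an illustration under stated assumptions, not the prototype's code: Policy, Prefix, Upper, firstCall and the fallbackCalls counter are hypothetical names, a plain String stands in for TemplatedString, and the handle for Policy::apply stands in for whatever TP::asMethodHandle would return.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.List;

final class RelinkSketch {
    interface Policy { Object apply(String template); }   // hypothetical stand-in for TemplatePolicy

    record Prefix(String prefix) implements Policy {
        public Object apply(String template) { return prefix + template; }
    }
    record Upper() implements Policy {
        public Object apply(String template) { return template.toUpperCase(); }
    }

    static final MethodHandle APPLY, FIRST_CALL, SAME_CLASS, FALLBACK;
    static int fallbackCalls;

    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            APPLY = l.findVirtual(Policy.class, "apply",
                MethodType.methodType(Object.class, String.class));
            FIRST_CALL = l.findStatic(RelinkSketch.class, "firstCall",
                MethodType.methodType(Object.class, MutableCallSite.class, Policy.class, String.class));
            SAME_CLASS = l.findStatic(RelinkSketch.class, "sameClass",
                MethodType.methodType(boolean.class, Class.class, Policy.class));
            FALLBACK = l.findStatic(RelinkSketch.class, "fallback",
                MethodType.methodType(Object.class, Policy.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The invariant: same dynamic class as at linkage, not necessarily the same instance.
    static boolean sameClass(Class<?> expected, Policy p) { return p.getClass() == expected; }

    // Slow path taken when the guard fails: a plain virtual call, counted for the demo.
    static Object fallback(Policy p, String template) { fallbackCalls++; return p.apply(template); }

    // "Secondary bootstrap": runs once, on the first invocation, when the policy is known.
    static Object firstCall(MutableCallSite site, Policy p, String template) throws Throwable {
        MethodHandle fast = APPLY;   // policy state stays a parameter, it is not baked in
        MethodHandle guard = SAME_CLASS.bindTo(p.getClass());
        site.setTarget(MethodHandles.guardWithTest(guard, fast, FALLBACK));
        return fast.invoke(p, template);
    }

    static CallSite bootstrap() {
        MutableCallSite site = new MutableCallSite(
            MethodType.methodType(Object.class, Policy.class, String.class));
        site.setTarget(FIRST_CALL.bindTo(site));
        return site;
    }

    static List<Object> demo() throws Throwable {
        fallbackCalls = 0;
        MethodHandle invoker = bootstrap().dynamicInvoker();
        Object first = invoker.invoke(new Prefix("a:"), "x");    // links the site
        Object second = invoker.invoke(new Prefix("b:"), "x");   // same class: guard passes
        Object third = invoker.invoke(new Upper(), "x");         // other class: fallback
        return List.of(first, second, third, fallbackCalls);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(demo());
    }
}
```

The second call uses a different Prefix instance and still takes the guarded path, which is why the quoted text requires the returned method handle to treat policy state as a parameter rather than capture it.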
>> As I said, I don't care about having the exact bootstrap API, but I care about
>> the unnecessary boxing / class checks / etc. that can occur.
>> I believe that if asMethodHandle() takes a MethodType as second parameter,
>> performance should be OK.
>> Is it something that can be negotiated?
>> I've implemented a prototype to convince myself that, with a MethodType as
>> parameter, it was not actually that bad.
>> https://github.com/forax/java-interpolation
>> (I also suppose that the TemplatedString is created with a constant dynamic?)
>> Rémi
>>> On 10/13/2021 1:09 PM, Remi Forax wrote:
>>>> Hi everybody, I've spent some time thinking about how the String interpolation +
>>>> Policy should be specified and implemented.
>>>> The goal is to add a syntax specifying a user-defined method to "interpolate"
>>>> (for lack of a better word) a string with arguments.
>>>> Given that it's a method, the exact semantics of the interpolation (things like
>>>> how the arguments are escaped, how the formatted string is parsed) are written
>>>> in Java, which allows a wide range of use cases to be supported.
>>>> This proposal does not differ from the original proposal of Brian and Jim in its
>>>> goal, but in the way a user declares the interpolation method(s).
>>>> TL;DR: you can declare an interpolation method and, optionally, an interpolation
>>>> bootstrap method if you want more efficient code at the price of having to
>>>> play with the method handle API.
>>>> ---
>>>> The proposal of Brian and Jim uses an interface to define the policy, but in this
>>>> case, using an interface is not what we want.
>>>> I think there are two main reasons:
>>>> - the interpolation method can be an instance method but can also be a factory
>>>> method, a static method, and an interface cannot constrain a static method.
>>>> - we want the signature of the interpolation method to be free to use any number
>>>> of parameters of any types, something that cannot be specified with type
>>>> parameters in Java.
>>>> So let's take a step back and write some examples. As a user of the
>>>> interpolation method, we want to
>>>> - be able to specify string interpolation;
>>>> you can notice that this is a static method.
>>>>   String name = ...
>>>>   int age = ...
>>>>   String s = String."name: \(name) age: \(age)";
>>>> - we also want to be able to instantiate a regex Pattern,
>>>> and have a magic optimisation that creates the Pattern instance only once
>>>>   Pattern pattern = Pattern."foo|bar";
>>>> - we also want to support instance methods, so the interpolation can escape the
>>>> arguments differently depending on the context,
>>>> here, for example, escaping differently depending on the database driver.
>>>>   String username = ...
>>>>   Connection connection = ...
>>>>   connection."""
>>>>     SELECT * FROM users where user == "\(username)"
>>>>     """;
>>>> I think the simplest way to specify an interpolation method is to have a method
>>>> with a special name;
>>>> I will use __interpolate__ because I don't want to discuss the exact syntax
>>>> here.
>>>> This method can be a static method or an instance method and has one restriction:
>>>> the first parameter has to be a String because the first argument is the
>>>> formatted string.
>>>> Here is an example of how the method __interpolate__ inside java.lang.String can
>>>> be written.
>>>> To avoid everybody re-implementing the parsing of the formatted string, the
>>>> class java.lang.runtime.InterpolateMetafactory provides a helper method
>>>> "formatIterator" that returns an iterator splitting the formatted string into
>>>> texts and bindings.
>>>> package java.lang;
>>>> public class String {
>>>>   ...
>>>>   public static String __interpolate__(String format, Object... args) {
>>>>     var i = 0;
>>>>     var builder = new StringBuilder();
>>>>     var iterator = InterpolateMetafactory.formatIterator(format);
>>>>     while (iterator.hasNext()) {
>>>>       switch (iterator.next()) {
>>>>         case Text(var text) -> builder.append(text);
>>>>         // each binding consumes the next argument, in order
>>>>         case Binding binding -> builder.append(args[i++]);
>>>>       }
>>>>     }
>>>>     return builder.toString();
>>>>   }
>>>>   ...
>>>> }
>>>> While this is nice, you may think that it's just syntactic sugar and that it will
>>>> not be more performant than String.valueOf(), i.e. it will be slow.
>>>> That's why the specification allows you to provide a second, more optimised
>>>> version of the interpolation method using a method __interpolate__bootstrap__.
>>>> This method __interpolate__bootstrap__ is not required and cannot replace the
>>>> method __interpolate__: when __interpolate__bootstrap__ is declared,
>>>> __interpolate__ has to be present too. It's a backward compatible change to add
>>>> a method __interpolate__bootstrap__ after the fact; there is no need to recompile
>>>> all the client code.
>>>> For that, the compiler translation relies on invokedynamic to call the method
>>>> bootstrap of the class InterpolateMetafactory, which at runtime decides
>>>> to trampoline either to the method __interpolate__bootstrap__ or to the method
>>>> __interpolate__ if no __interpolate__bootstrap__ exists.
>>>> Here is an example of how a call to the interpolation method of String is
>>>> generated by javac.
>>>> For the Java code
>>>>   String name = ...
>>>>   int age = ...
>>>>   String s = String."name: \(name) age: \(age)";
>>>> the equivalent bytecode is
>>>>   aload_1 // load name
>>>>   iload_2 // load age
>>>>   invokedynamic __interpolate__ (Ljava/lang/String;I)Ljava/lang/String;
>>>>     java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType,
>>>>     String, MethodHandle):CallSite
>>>>     [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String
>>>>     ]
>>>> From the perspective of the compiler, the method __interpolate__ works exactly
>>>> like a method with a polymorphic method signature (a method annotated with
>>>> @PolymorphicSignature),
>>>> so the descriptor of invokedynamic is created by collecting the types of the
>>>> arguments. Here the interpolation method is called with a String and an int,
>>>> and the return type is String, so the descriptor is
>>>> (Ljava/lang/String;I)Ljava/lang/String;
>>>> Considering the interpolation method as a polymorphic method is important in
>>>> terms of performance because it means that no boxing will be done by the
>>>> compiler; if there is some boxing, it will be done by the runtime, so it is
>>>> optional if the __interpolate__bootstrap__ does not need to box arguments.
>>>> You can also notice that the formatted string is passed as a bootstrap constant,
>>>> so all the parsing of the format can be done once, outside of the hot path.
>>>> A call to invokedynamic also passes, as a second bootstrap argument, the method
>>>> handle to the method __interpolate__, so the implementation inside
>>>> InterpolateMetafactory.bootstrap can call this method if no method
>>>> __interpolate__bootstrap__ exists.
>>>> Here is a raw implementation of the class InterpolateMetafactory.
>>>> The method formatIterator() returns an Iterator of Token, which is a sealed
>>>> interface.
>>>> The method bootstrap() first looks up a method "__interpolate__bootstrap__" in
>>>> the lookup class that takes a Lookup, a String, a MethodType, the format and
>>>> the default implementation, and calls it if it exists; otherwise it takes the
>>>> default implementation, binds the formatted String and adapts the arguments
>>>> using asType (asking for boxing, etc.).
>>>> package java.lang.runtime;
>>>> import java.lang.invoke.*;
>>>> import java.lang.invoke.MethodHandles.Lookup;
>>>> import java.util.Iterator;
>>>> public class InterpolateMetafactory {
>>>>   public sealed interface Token {
>>>>     public record Text(String text) implements Token {}
>>>>     public record Binding(String name) implements Token {}
>>>>   }
>>>>   public static Iterator<Token> formatIterator(String format) {
>>>>     ...
>>>>   }
>>>>   public static CallSite bootstrap(Lookup lookup, String name, MethodType
>>>>       methodType, String format, MethodHandle impl) throws Throwable {
>>>>     // check if there is a bootstrap method
>>>>     MethodHandle bootstrap;
>>>>     try {
>>>>       bootstrap = lookup.findStatic(lookup.lookupClass(),
>>>>           "__interpolate__bootstrap__", MethodType.methodType(CallSite.class,
>>>>           Lookup.class, String.class, MethodType.class, String.class,
>>>>           MethodHandle.class));
>>>>     } catch(NoSuchMethodException e) {
>>>>       // bind the default implementation
>>>>       return new ConstantCallSite(impl.bindTo(format).asType(methodType));
>>>>     }
>>>>     // delegate to the user-provided bootstrap method
>>>>     return (CallSite) bootstrap.invoke(lookup, name, methodType, format, impl);
>>>>   }
>>>> }
>>>> Here is another example, showing how to declare the methods __interpolate__ and
>>>> __interpolate__bootstrap__ inside java.util.regex.Pattern.
>>>> The "default" implementation calls Pattern.compile() and the optimized one
>>>> always returns the result of Pattern.compile() as a constant.
>>>> package java.util.regex;
>>>> public class Pattern {
>>>>   // the formatted string cannot have arguments
>>>>   public static Pattern __interpolate__(String format) {
>>>>     return Pattern.compile(format);
>>>>   }
>>>>   private static CallSite __interpolate__bootstrap__(Lookup lookup, String name,
>>>>       MethodType methodType, String format, MethodHandle impl) {
>>>>     return new ConstantCallSite(MethodHandles.constant(Pattern.class,
>>>>         Pattern.compile(format)));
>>>>   }
>>>> }
>>>> The method __interpolate__ provides, via its signature, the parameter types that
>>>> are verified by the compiler.
>>>> It also provides code that can be used by tools that do static analysis
>>>> on the bytecode, because those tools cannot see through the method handle
>>>> returned by a bootstrap method: given that it's a runtime construct, it's
>>>> usually not available at the time the static analysis is done. This should be
>>>> enough for tools like GraalVM native image to see through the
>>>> invokedynamic in a similar way to how they see through the invokedynamic used
>>>> when creating a lambda.
>>>> The fact that all invokedynamic calls go through the method
>>>> InterpolateMetafactory.bootstrap and trampoline from it means that adding or
>>>> removing the method __interpolate__bootstrap__ is a binary compatible change,
>>>> if __interpolate__bootstrap__ is declared private. So implementing
>>>> __interpolate__bootstrap__ can be an afterthought.
>>>> regards,
>>>> Rémi

From forax at univ-mlv.fr Sun Oct 17 20:54:48 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 17 Oct 2021 22:54:48 +0200 (CEST)
Subject: Templated String and template policies, why the current design is bad
Message-ID: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr>

I've recently proposed another way to implement the templated string/template policies, but I may not have made it clear why I think the current proposal [1] is bad.

First, some vocabulary: a templated string is a string with some unnamed parameters that are filled with the results of expressions.
For example, if we use ${ expr } as the escape sequence to introduce an expression, the code

  var a = 3;
  var b = 4;
  "sum ${ a } + ${ b } = ${ a + b }"

can be decomposed into

- a string template that can be seen either as a string "sum @ + @ = @" with a special character (here '@') denoting a hole for each parameter,
  or as an array of strings ["sum ", " + ", " = ", ""] indicating the strings in between the holes.
- 3 parameters, param0, param1 and param2, initialized respectively with the results of the expressions a, b and a + b

Before talking about the current proposal, let's take a look at the way both JavaScript and Scala implement string interpolation.

For JavaScript [2], you define a function that takes the template as an array and as many parameters as you need
  function foo(templateParts, param0, param1, param2) {
    ...
  }

JavaScript uses backticks `` to delimit the templated strings and ${} as the escape sequence,
so
  var a = 3;
  var b = 4;
  foo`sum ${ a } + ${ b } = ${ a + b }`

is equivalent to

  foo(["sum ", " + ", " = ", ""], a, b, a + b)


In Scala, this mostly works the same way: there is a class StringContext that corresponds to a templated string, and you define the function foo as a method of StringContext that takes the parameters (in Scala, you can add methods to an already existing class by using (abusing) the implicit keyword).

  implicit class FooHelper(val template: StringContext) {  // adds the following methods to StringContext
    def foo(param0: Any, param1: Any, param2: Any) {
      ...
    }
  }

Scala uses quotes "" to delimit the templated string and ${} as the escape sequence,
so
  val a = 3;
  val b = 4;
  foo"sum ${ a } + ${ b } = ${ a + b }"

is equivalent to
  new StringContext("sum ", " + ", " = ", "").foo(a, b, a + b)


In summary, for both JavaScript and Scala, the generalization of string interpolation is a function call which takes the template string parts as first argument and the parameters of the templated string as the other arguments.

So in Java, you would assume that
- there is an object that represents a templated string with the holes
- there is a method that takes the templated string as first parameter, followed by the parameters of the templated string

But this is not how the proposed design works.

The TemplatedString does not represent a string with some holes; it represents the string with some holes plus the values of the holes, as if the arguments of the parameters were partially applied. The TemplatedString acts as a closure on the arguments, a glorified Supplier if you prefer.

Because the arguments are already inside the TemplatedString, the TemplatePolicy, the function that should take the template and the parameters, does not declare the types of the parameters.
This means that there is no way for someone who creates a TemplatePolicy to declare the types of the parameters: any parameter is always valid, so there is no type safety.

This design is not unknown; this is the GString [4] of Groovy. While it makes sense for a dynamic language like Groovy to not have to declare the types of the parameters, it makes no sense for a statically typed language like Java to not have a way to declare the types of the parameters like Scala or TypeScript/JavaScript do.

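A minimal Java sketch of the function-call shape described above, under stated assumptions: Template, interleave and foo are hypothetical names chosen for illustration, and the call in main is written out by hand the way a compiler might desugar the templated string. It only shows that the hole types can be declared in the method signature and checked by javac.

```java
import java.util.List;

// Hypothetical: the template alone, i.e. the text between the holes, without any values.
record Template(List<String> fragments) {
    // Interleaves the fragments with the given values: f0 v0 f1 v1 ... fn.
    String interleave(Object... values) {
        if (values.length != fragments.size() - 1) {
            throw new IllegalArgumentException(
                    "expected " + (fragments.size() - 1) + " values, got " + values.length);
        }
        StringBuilder builder = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.length; i++) {
            builder.append(values[i]).append(fragments.get(i + 1));
        }
        return builder.toString();
    }
}

final class FunctionStylePolicy {
    // The policy is an ordinary static method; its signature declares the hole types.
    static String foo(Template template, int param0, int param1, int param2) {
        return template.interleave(param0, param1, param2);
    }

    public static void main(String[] args) {
        int a = 3;
        int b = 4;
        // Hand-written equivalent of a templated string with three holes.
        Template template = new Template(List.of("sum ", " + ", " = ", ""));
        System.out.println(foo(template, a, b, a + b)); // prints "sum 3 + 4 = 7"
        // foo(template, "3", b, a + b); // rejected by javac: a String is not an int
    }
}
```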
The other issue with the proposed design is that there is no way to declare the template policy as a static method; it has to be an instance method implementing an interface, despite the fact that both JavaScript and Scala* support functions first and let the user add supplementary arguments as a secondary mechanism (using currying in Scala and by adding a property on the function itself in JavaScript).

There is a good reason to support static methods in Java: a lot of use cases do not require the template policy to have additional arguments (storing them in an instance is not necessary), so forcing the template policy to be defined as an instance method means a lot of boilerplate for no good reason.

I hope I've convinced you that the current proposal for string interpolation in Java is not the right one.

regards,
Rémi

* for Scala, it's a method on StringContext that acts as a function that takes a StringContext as first parameter.

[1] https://bugs.openjdk.java.net/browse/JDK-8273943
[2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
[3] https://docs.scala-lang.org/overviews/core/string-interpolation.html
[4] https://docs.groovy-lang.org/docs/latest/html/api/groovy/lang/GString.html

From brian.goetz at oracle.com Mon Oct 18 16:47:00 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Oct 2021 12:47:00 -0400
Subject: Templated String and template policies, why the current design is bad
In-Reply-To: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr>
References: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr>
Message-ID: <741655d7-ca30-2226-f561-60facffacd71@oracle.com>

This seems a very strange argument to me.

Templates are by their nature dynamic -- a template has an unknown number of holes, and the holes are filled with arbitrary expressions. People like templates because they're easy to use, and they're easy to use because they're flexible. Consider String::format:

    String format(String formatString, Object... values)

There are many dynamic conditions that are not statically checked here: that the format string is well-formed, that the number of holes matches the number of values provided, that the types of the values are suitable for filling the holes, etc. Every templating policy will carry its own private interpretation of these requirements, which would require much more complex type systems to capture.

When the templating policy is a well-known constant, such as java.lang.String.FMT, IDEs will be able to provide better checking based on the specification of the formatter, but that's a bonus.

You're saying here that what we should reify is not format+values, but format+types. This is not an unreasonable choice (but it doesn't rise to the bar you've set by "the current design is bad"), and I think your argument is an implementation preference dressed up in theoretical garb. You want the abstraction to serve the implementation (a bootstrap), so you want to shape it like what a bootstrap wants to consume.

The reality is that the current implementation can extract this information perfectly well, and can easily and cheaply test for the invariants that are needed to guard the computation. The design choice here is that the abstraction we are exposing is one that is more useful **to the users**; the format string and associated values can now travel together as they pass through the layers of, say, a logging framework.

So we've deliberately chosen an API that is best for users, and makes a little extra work for implementors, rather than the other way around. (And yes, the decision was informed by roads previously explored by JavaScript and Groovy.)

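A minimal sketch of the "format and values travel together" shape described above. The TemplatedString record, the TemplatePolicy interface and the log method are hypothetical simplifications written for this illustration, not the API under discussion; the point is only that a layer such as a logger can pass the pair along as one object, and that validity is checked when the policy finally runs.

```java
import java.util.List;

// Hypothetical: the fragments and the captured values travel together as one object.
record TemplatedString(List<String> fragments, List<Object> values) {}

// Hypothetical single-method policy interface.
@FunctionalInterface
interface TemplatePolicy<R> {
    R apply(TemplatedString templatedString);
}

final class ValueCarryingSketch {
    // A policy written as a static method; its requirements are checked at run time.
    static String concat(TemplatedString templatedString) {
        List<String> fragments = templatedString.fragments();
        List<Object> values = templatedString.values();
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("holes and values do not match");
        }
        StringBuilder builder = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            builder.append(values.get(i)).append(fragments.get(i + 1));
        }
        return builder.toString();
    }

    // A logging layer: it receives the templated string whole and formats it only if enabled.
    static String log(boolean enabled, TemplatedString message, TemplatePolicy<String> policy) {
        return enabled ? "[INFO] " + policy.apply(message) : "";
    }

    public static void main(String[] args) {
        TemplatedString message = new TemplatedString(
                List.of("user ", " logged in from ", ""),
                List.<Object>of("duke", "10.0.0.1"));
        // The static method is adapted to the policy interface with a method reference.
        System.out.println(log(true, message, ValueCarryingSketch::concat));
    }
}
```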
> The other issue with the proposed design is that there is no way to declare the template policy as a static method, it has to be an instance method implementing an interface despite the fact that both JavaScript and Scala* support function first and lets the user adds supplementary arguments as a secondary mechanism (using currying in Scala and by adding a property on the function itself in JavaScript).

The TemplatePolicy is a SAM interface, so any static method of the right shape can be turned into a template policy with a method reference.

I suspect what you mean by "no way" is "no way to access the super-optimized implementation strategy"? And I'll say again the two answers I've already given to that: (a) many such formatters will not benefit from the low-level implementation strategy anyway, and (b) we should design the API to serve the users, not the implementors. There are many more users.

From forax at univ-mlv.fr Mon Oct 18 19:12:21 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 18 Oct 2021 21:12:21 +0200 (CEST)
Subject: Templated String and template policies, why the current design is bad
In-Reply-To: <741655d7-ca30-2226-f561-60facffacd71@oracle.com>
References: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr> <741655d7-ca30-2226-f561-60facffacd71@oracle.com>
Message-ID: <633810112.552086.1634584341616.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax" , "amber-spec-experts"
> Sent: Monday, 18 October 2021 18:47:00
> Subject: Re: Templated String and template policies, why the current design is bad

> This seems a very strange argument to me.
>
> Templates are by their nature dynamic -- a template has an unknown
> number of holes, and the holes are filled with arbitrary expressions.
> People like templates because they're easy to use, and they're easy to
> use because they're flexible. Consider String::format:
>
>     String format(String formatString, Object... values)
>
> There are many dynamic conditions that are not statically checked here;
> that the format string is well-formed, that the number of holes matches
> the number of values provided, that the types of the values are suitable
> for filling the holes, etc. Every templating policy will carry their
> own private interpretation of these requirements, which would require
> much more complex type systems to capture.

There is a lot of structured text that asks for specific types in a specific order.

For example, a text that starts with a date and then some values:

  new DatedTextTemplatePolicy()."""
    // Date \(LocalDate.now())
    \(key1) : \(value1)
    \(key2) : \(value2)
    """;

If I can declare the parameters like in JavaScript, I can write
  String apply(TemplatedString template, LocalDate date, Object... pairs) { ... }

It also makes all the constructs that rely on target typing unusable.
For example, how do you use lambdas/method references that will be used as projection functions for several record instances?

  List<Person> persons = ...

  // generate all mails
  new MailGeneratorTemplatePolicy(persons)."""
    Dear \(Person::title) \(Person::lastName),
    i hope you enjoy ...
    ...
    """;

As you know, you cannot write this kind of code if the arguments are all typed Object.

Another example is the grammar example of John,
https://github.com/forax/java-interpolation/blob/master/src/test/java/com/github/forax/interpolator/GrammarTemplatePolicyTest.java#L22

Here you want all the arguments to be either a terminal or a non-terminal.
It should be a compile-time error if a user uses something else.

>
> When the templating policy is a well-known constant, such as
> java.lang.String.FMT, IDEs will be able to provide better checking based
> on the specification of the formatter, but that's a bonus.
>
> You're saying here that what we should reify is not format+values, but
> format+types. This is not an unreasonable choice (but, doesn't rise to
> the bar you've set by "the current design is bad"), but I think your
> argument is an implementation preference dressed up in theoretical
> garb. You want the abstraction to serve the implementation (a
> bootstrap), so you want to shape it like what a bootstrap wants to consume.

Nope, I want compile-time safety when it's possible; Object... should be a possible descriptor for the types of the parameters, not the only descriptor.

[...]

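A minimal sketch of the target-typing point above, under stated assumptions: Person, MailSketch and mails are hypothetical names, and the templated string is written out by hand as a list of fragments plus two typed hole arguments. Because the holes are declared as Function parameters, method references are accepted; with Object-typed holes they would have no target type.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical record used only for this illustration.
record Person(String title, String lastName) {}

final class MailSketch {
    // The holes are declared as projection functions, so method references
    // such as Person::title have a target type and compile.
    static List<String> mails(List<Person> persons, List<String> fragments,
                              Function<Person, String> hole0,
                              Function<Person, String> hole1) {
        return persons.stream()
                .map(person -> fragments.get(0) + hole0.apply(person)
                        + fragments.get(1) + hole1.apply(person)
                        + fragments.get(2))
                .toList();
    }

    public static void main(String[] args) {
        List<Person> persons = List.of(new Person("Dr", "Who"), new Person("Ms", "Marple"));
        // Hand-written equivalent of a template with two holes.
        List<String> result = mails(persons, List.of("Dear ", " ", ","),
                Person::title, Person::lastName);
        result.forEach(System.out::println);
        // Object hole = Person::title; // rejected by javac: Object is not a functional interface
    }
}
```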
>
>> The other issue with the proposed design is that there is no way to declare the
>> template policy as a static method, it has to be an instance method
>> implementing an interface despite the fact that both JavaScript and Scala*
>> support function first and lets the user adds supplementary arguments as a
>> secondary mechanism (using currying in Scala and by adding a property on the
>> function itself in JavaScript).
>
> The Template policy is a SAM interface, so any static method of the
> right shape can be turned into a template policy with a method reference.

Yes, that's why I call it a glorified Supplier, but I don't see how it helps.
In terms of writing the code in an IDE, I cannot type Pattern." and press CTRL-SPACE.
As a JDK maintainer, you can cheat and say "do this import static on that class and all the template policies you need are now available", but this approach does not scale.
Otherwise, you have to memorize that FMT is in fact FormatTemplatePolicy.FMT, that PATTERN is in fact PatternTemplatePolicy.PATTERN, etc.

>
> I suspect what you mean by "no way", is "no way to access the
> super-optimized implementation strategy"? And I'll say again the two
> answers I've already given to that: (a) many such formatters will not
> benefit from the low-level implementation strategy anyway, and (b) we
> should design the API to serve the users, not the implementors. There
> are many more users.

It's a false dichotomy; I want both, an API that is easy to use and efficient.
But in this thread I would like us to focus on the type checking part; the efficiency can be discussed in another thread.

Rémi

From brian.goetz at oracle.com Mon Oct 18 19:26:33 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Oct 2021 15:26:33 -0400
Subject: [External] : Re: Templated String and template policies, why the current design is bad
In-Reply-To: <633810112.552086.1634584341616.JavaMail.zimbra@u-pem.fr>
References: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr> <741655d7-ca30-2226-f561-60facffacd71@oracle.com> <633810112.552086.1634584341616.JavaMail.zimbra@u-pem.fr>
Message-ID: <02ca6dc7-efd9-b440-9b76-9df4b6bf2441@oracle.com>

Templates are an *implementation* mechanism; they're not an API building tool.
If you want stronger type checking, use the existing API construction features of the language -- such as exposing a method that takes a list of arguments of specific types, or takes a record; if the template contains lots of repetition, use a builder, etc, and then let the method/builder do the templating.

I agree it would be nice to be able to preflight the invocation at compile time, so that validity errors could be caught at compile time and turned into diagnostics, but this is broader than just type checking -- this includes malformed template strings, etc. That's a desirable feature that I hope we'll be able to layer on later. (Again, this is somewhat related to the compile-time constant folding feature, since that involved the compiler reflectively calling library code at compile time to at least partially validate compile-time constant information.)

On 10/18/2021 3:12 PM, forax at univ-mlv.fr wrote:
> ----- Original Message -----
>> From: "Brian Goetz"
>> To: "Remi Forax", "amber-spec-experts"
>> Sent: Monday, 18 October 2021 18:47:00
>> Subject: Re: Templated String and template policies, why the current design is bad
>> This seems a very strange argument to me.
>>
>> Templates are by their nature dynamic -- a template has an unknown
>> number of holes, and the holes are filled with arbitrary expressions.
>> People like templates because they're easy to use, and they're easy to
>> use because they're flexible. Consider String::format:
>>
>>     String format(String formatString, Object... values)
>>
>> There are many dynamic conditions that are not statically checked here;
>> that the format string is well-formed, that the number of holes matches
>> the number of values provided, that the types of the values are suitable
>> for filling the holes, etc. Every templating policy will carry its
>> own private interpretation of these requirements, which would require
>> much more complex type systems to capture.
> There is a lot of structured text that asks for specific types in a specific order.
>
> For example, a text that starts with a date and then some values
>
>   new DatedTextTemplatePolicy()."""
>     // Date \(LocalDate.now())
>     \(key1) : \(value1)
>     \(key2) : \(value2)
>     """;
>
> If I can declare the parameters like in JavaScript, I can write
>   String apply(TemplatedString template, LocalDate date, Object... pairs) { ... }
>
>
> It also makes all the constructs that rely on target typing unusable.
> For example, how to use lambdas/method references that will be used as projection functions for several record instances.
>
>   List<Person> persons = ...
>
>   // generate all mails
>   new MailGeneratorTemplatePolicy(persons)."""
>     Dear \(Person::title) \(Person::lastName),
>     i hope you enjoy ...
>     ...
>     """;
>
> As you know, you cannot write this kind of code if the arguments are all typed Object.
>
>
> Another example is the grammar example of John,
> https://github.com/forax/java-interpolation/blob/master/src/test/java/com/github/forax/interpolator/GrammarTemplatePolicyTest.java#L22
>
> Here you want all the arguments to be either a terminal or a non-terminal.
> It should be a compile time error if a user uses something else.
>
>
>> When the templating policy is a well-known constant, such as
>> java.lang.String.FMT, IDEs will be able to provide better checking based
>> on the specification of the formatter, but that's a bonus.
>>
>> You're saying here that what we should reify is not format+values, but
>> format+types. This is not an unreasonable choice (but, doesn't rise to
>> the bar you've set by "the current design is bad"), but I think your
>> argument is an implementation preference dressed up in theoretical
>> garb. You want the abstraction to serve the implementation (a
>> bootstrap), so you want to shape it like what a bootstrap wants to consume.
> Nope, I want compile time safety when it's possible, Object... should be a possible descriptor for the types of the parameters, not the only descriptor.
>
> [...]
>
>>> The other issue with the proposed design is that there is no way to declare the
>>> template policy as a static method, it has to be an instance method
>>> implementing an interface despite the fact that both JavaScript and Scala*
>>> support functions first and let the user add supplementary arguments as a
>>> secondary mechanism (using currying in Scala and by adding a property on the
>>> function itself in JavaScript).
>> The Template policy is a SAM interface, so any static method of the
>> right shape can be turned into a template policy with a method reference.
> Yes, that's why I call it a glorified Supplier, but I don't see how it helps.
>
> In terms of writing the code, in an IDE, I cannot take a type and do "Pattern." + CTRL-SPACE.
>
> As a JDK maintainer, you can cheat and say, do this import static on that class and all template policies you need are now available, but this approach does not scale.
>
> Otherwise, you have to memorize that FMT is in fact FormatTemplatePolicy.FMT, that PATTERN is in fact PatternTemplatePolicy.PATTERN, etc.
>
>
>> I suspect what you mean by "no way", is "no way to access the
>> super-optimized implementation strategy"? And I'll say again the two
>> answers I've already given to that: (a) many such formatters will not
>> benefit from the low-level implementation strategy anyway, and (b) we
>> should design the API to serve the users, not the implementors. There
>> are many more users.
> It's a false dichotomy, I want both, an API easy to use and efficient.
>
> But in this thread I would like us to focus on the type checking part, the efficiency can be discussed in another thread.
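To make the type-checking point in the quoted reply concrete, here is a small, hedged sketch in plain Java. It does not use the proposed templated-string syntax or any API from the draft: `TypedPolicySketch`, `applyUntyped`, `applyTyped` and `interleave` are names invented for illustration, and the template is modeled as a simple list of string fragments. The sketch only contrasts a policy whose holes are all `Object` with one whose signature declares the first hole as a `LocalDate`, in the spirit of the `apply(TemplatedString template, LocalDate date, Object... pairs)` signature quoted above.

```java
import java.time.LocalDate;
import java.util.List;

public class TypedPolicySketch {
    // Untyped shape: every hole is an Object, so a wrong first argument
    // is only detected when the policy runs.
    public static String applyUntyped(List<String> fragments, Object... values) {
        if (values.length == 0 || !(values[0] instanceof LocalDate)) {
            throw new IllegalArgumentException("first hole must be a LocalDate");
        }
        return interleave(fragments, values);
    }

    // Typed shape (hypothetical): the signature itself says the first hole
    // is a LocalDate, so the compiler rejects anything else at the call site.
    public static String applyTyped(List<String> fragments, LocalDate date, Object... pairs) {
        Object[] values = new Object[pairs.length + 1];
        values[0] = date;
        System.arraycopy(pairs, 0, values, 1, pairs.length);
        return interleave(fragments, values);
    }

    // Joins fragments and values: f0 v0 f1 v1 ... fn.
    static String interleave(List<String> fragments, Object[] values) {
        if (fragments.size() != values.length + 1) {
            throw new IllegalArgumentException(
                "expected " + (fragments.size() - 1) + " values, got " + values.length);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            sb.append(fragments.get(i)).append(values[i]);
        }
        return sb.append(fragments.get(values.length)).toString();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("// Date ", "\n", " : ", "");
        System.out.println(applyTyped(fragments, LocalDate.of(2021, 10, 18), "key1", "value1"));

        // applyTyped(fragments, "not a date", "key1", "value1");  // would not compile

        try {
            applyUntyped(fragments, "not a date", "key1", "value1");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected only at run time: " + e.getMessage());
        }
    }
}
```

Whether the extra signature is worth it is exactly what the two sides of this thread disagree on; the sketch only shows where the error surfaces in each shape.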
>
> Rémi

From gavin.bierman at oracle.com Wed Oct 20 15:56:54 2021
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Wed, 20 Oct 2021 15:56:54 +0000
Subject: [patterns-switch] Draft Spec for JEP 420: Pattern Matching for switch (Second Preview)
Message-ID:

Dear experts:

The first draft of the spec for JEP 420 (Pattern Matching for switch - Second Preview) is now available at:

http://cr.openjdk.java.net/~gbierman/jep420/latest/

This contains the updates discussed on the list (GADT support, revised dominance rules for constant case labels etc). It doesn't support inference of type arguments (yet). More details on that to follow.

Comments welcome!

Thanks,
Gavin

From forax at univ-mlv.fr Thu Oct 28 17:12:38 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 28 Oct 2021 19:12:38 +0200 (CEST)
Subject: switch expression with not explicit yield value should work ?
In-Reply-To:
References: <84434836.1074099.1630245621532.JavaMail.zimbra@u-pem.fr>
Message-ID: <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr>

I fell in the same trap yet again :(

Am i the only one that want to throw exceptions in all branches of a switch ?
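For readers without the earlier messages at hand, the trap can be sketched as follows. This is an illustrative sketch, not the code from the linked test: the class and method names (`AllThrowSwitch`, `viaThrowSwitch`, `viaHelper`) are made up, and `fail` follows the helper idea mentioned in this thread. A switch expression in which every arm throws has no result expression, which the JLS (15.28.1) makes a compile-time error; the two methods below are ways to express the same intent that should compile on Java 14 or later.

```java
public class AllThrowSwitch {
    // Rejected by javac, because the switch expression has no result expression:
    //
    //   static Object result(int kind) {
    //       return switch (kind) {
    //           default -> throw new AssertionError();
    //       };
    //   }

    // Variant 1: each arm produces the exception as a value,
    // and the switch expression as a whole is thrown.
    public static Object viaThrowSwitch(int kind) {
        throw switch (kind) {
            case 0 -> new IllegalArgumentException("zero");
            default -> new IllegalStateException("other: " + kind);
        };
    }

    // Variant 2: a helper with a return type, so the arm is a result
    // expression as far as the compiler is concerned.
    public static Object fail(String message) {
        throw new AssertionError(message);
    }

    public static Object viaHelper(int kind) {
        return switch (kind) {
            default -> fail("unsupported: " + kind);
        };
    }

    public static void main(String[] args) {
        try {
            viaThrowSwitch(0);
        } catch (RuntimeException e) {
            System.out.println("caught " + e);
        }
        try {
            viaHelper(1);
        } catch (AssertionError e) {
            System.out.println("caught " + e);
        }
    }
}
```

In the first variant the two arms have different exception types, so the thrown expression is typed by their least upper bound, which is an unchecked exception here and therefore needs no `throws` clause.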
https://github.com/forax/write_your_own_java_framework/blob/master/interceptor/src/test/java/org/github/forax/framework/interceptor/InterceptorRegistryTest.java#L523

Rémi

----- Original Message -----
> From: "Tagir Valeev"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Monday, 30 August 2021 04:00:27
> Subject: Re: switch expression with not explicit yield value should work ?

> Hello!
>
> I think this is not related to recent JEPs. This behavior is
> standardised since Java 14 when Switch expression was introduced:
>
>   // Compilation error
>   int x = switch(0) {
>     default -> throw new IllegalArgumentException();
>   };
>
> This is explicitly specified (15.28.1) [1]:
>
>> It is a compile-time error if a switch expression has no result expressions.
>
> There was some discussion about this rule in March, 2019 [2].
> Basically, the idea is to preserve the possibility of normal
> (non-abrupt) execution of every expression.
> I believe, preventing unreachable code has always been in the spirit
> of Java. In your code sample, the execution of the 'return' statement
> itself is unreachable,
> so writing 'return' is redundant. In my sample above, the 'x' variable
> is never assigned to anything, and the subsequent statements (if any)
> are unreachable as well.
>
> I'd vote to keep the current behavior. While it may complicate code
> generation and automatic refactorings, this additional complexity is
> only marginal. The benefit is
> that this behavior may save us from accidental mistakes.
>
> Btw, you may deceive the compiler introducing a method like
>
>   static Object fail() {
>     throw new IllegalArgumentException();
>   }
>
> And use "case Object __ -> fail()"
>
> With best regards,
> Tagir Valeev.
>
> [1] https://docs.oracle.com/javase/specs/jls/se16/html/jls-15.html#jls-15.28.1
> [2] https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-March/001067.html
>
> On Sun, Aug 29, 2021 at 9:00 PM Remi Forax wrote:
>>
>> Another case where the spec is weird,
>> I've converted a project that generates a visitor from a grammar (something like
>> yacc) to use a switch on type instead.
>>
>> Sometimes for a degenerate portion of the grammar I've an empty visitor that
>> always throws an exception,
>> the equivalent code with a switch is
>>
>>   static Object result(Object o) {
>>     return switch (o) {
>>       case Object __ -> throw new AssertionError();
>>     };
>>   }
>>
>>
>> Obviously I can tweak the code generator to generate
>>
>>   static Object result(Object o) {
>>     throw new AssertionError();
>>   }
>>
>> but not being able to compile the former code strikes me as odd.
>>
>> An expression switch is a poly-expression, so the result type is back-propagated
>> from the return type of the method result, so it should be Object.
>>
>> Moreover, if the switch is not a switch expression but a switch statement, the
>> code is also valid
>>
>>   static Object result(Object o) {
>>     switch (o) {
>>       case Object __ -> throw new AssertionError();
>>     }
>>   }
>>
>> Not being able to compile a switch expression when there is no explicit result type
>> but only an implicit type seems arbitrary to me
>> (this change is backward compatible because it only makes more code compile).
>>
>
> Rémi

From brian.goetz at oracle.com Thu Oct 28 17:48:40 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 28 Oct 2021 13:48:40 -0400
Subject: switch expression with not explicit yield value should work ?
In-Reply-To: <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr>
References: <84434836.1074099.1630245621532.JavaMail.zimbra@u-pem.fr> <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr>
Message-ID:

If all branches throw, then you can refactor

    switch (x) {
        case X -> throw e;
        ...
    }

to

    throw switch (x) {
        case X -> e;
    }

From forax at univ-mlv.fr Thu Oct 28 18:04:53 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 28 Oct 2021 20:04:53 +0200 (CEST)
Subject: switch expression with not explicit yield value should work ?
In-Reply-To:
References: <84434836.1074099.1630245621532.JavaMail.zimbra@u-pem.fr> <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr>
Message-ID: <127716797.2419081.1635444293129.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "Tagir Valeev"
> Cc: "amber-spec-experts"
> Sent: Thursday, 28 October 2021 19:48:40
> Subject: Re: switch expression with not explicit yield value should work ?

> If all branches throw, then you can refactor
>
>     switch (x) {
>         case X -> throw e;
>         ...
>     }
>
> to
>
>     throw switch (x) {
>         case X -> e;
>     }

now i feel i've been tricked by a Jedi :)

Rémi

From james.laskey at oracle.com Fri Oct 29 14:10:54 2021
From: james.laskey at oracle.com (Jim Laskey)
Date: Fri, 29 Oct 2021 14:10:54 +0000
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
Message-ID: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com>

For our early templated string prototypes, we restricted embedded expressions to just basic accessors and basic arithmetic. The intent was to keep things easy to read and to prevent side effects. Over time, we began thinking this restriction was unduly harsh. More precisely, we worried that it would result in a complex, difficult-to-defend boundary. But we still would like users to not rely on side-effects.

Consequently, a new proposal for embedded expressions - we would allow any Java expression with the restriction that you can't use single quotes, double quotes or escape sequences.
We opted to keep this restriction to allow tools (e.g., syntax highlighters) to isolate embedded expressions within strings without requiring sophisticated parsing.

Given that an unprocessed templated string involves at least some deferred evaluation, should we frame templated string parameters as being more like method parameters (all parameters evaluated eagerly, left to right), or should we treat them as lambda expressions, which may capture (effectively final) variables from the environment, and evaluate the full parameter expressions when they are needed?

Note too that the effectively final restriction rules out some of the worst side-effect offenders, like:

    int x = 0;
    formatter."One \{x++} plus two \{x++} is three \{x}";

-- even if we intend to then do eager evaluation!

To help understand the issue, let's look at a simplification of how the two different paradigms (method parameter vs. lambda) might be implemented.

Example:

    int x = 0;
    int method1() {
        System.out.println("one");
        return 1;
    }
    int method2() {
        System.out.println("two");
        return 2;
    }

    System.out.println("Before TemplatedString");
    TemplatedString ts = "\{x} and \{method1()} and \{method2()}";
    System.out.println("After TemplatedString");
    System.out.println(CONCAT.apply(ts));
    System.out.println("After Policy");

The method parameter paradigm would generate something like the following for the

    TemplatedString ts = "\{x} and \{method1()} and \{method2()}";

statement. Basically, capture the values of the evaluated expressions in instance fields.

    TemplatedString ts = new TemplatedString() {
        int expr$0 = x;
        int expr$1 = method1();
        int expr$2 = method2();

        String template() {
            return "\uFFFC and \uFFFC and \uFFFC";
        }

        List