From gavin.bierman at oracle.com Fri Oct 1 12:49:01 2021
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 1 Oct 2021 12:49:01 +0000
Subject: Pattern Matching for switch (Second Preview)
In-Reply-To: <4fef4bed-da6c-f6c0-46df-bbe551408bae@oracle.com>
References: <5321c128-ba4d-e405-255a-72025a002e0d@oracle.com> <1bed58ce-015c-289b-5696-6dac3c539f6a@oracle.com> <4fef4bed-da6c-f6c0-46df-bbe551408bae@oracle.com>
Message-ID: <7A62F8A6-3946-413F-8CC4-3708176D3EE8@oracle.com>

On 30 Sep 2021, at 23:25, Brian Goetz wrote:

> [ moving to a-s-e ]
>
> I get the concern that a type pattern is no longer "just a variable
> declaration"; that was a nice part of the "patterns aren't really so hard
> to understand" story. But I think the usability is likely to be not very
> good. Take this example:
>
>     sealed interface Node { }
>     record AddNode(Node left, Node right) extends Node { }
>     ...
>     Node ni = ...
>     switch (ni) {
>         case AddNode(Node left, Node right) -> ...
>
> There's no instantiation of Node possible here *other than* Node. Which
> means we are forcing users to either redundantly type out the
> instantiation (which can get big), or use casts inside the body when they
> pull things out of left and right. (And patterns were supposed to make
> casts go away.) There's almost no case where someone wants a raw type here.

But surely they should write var here?
Gavin

From forax at univ-mlv.fr Fri Oct 1 13:12:48 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 1 Oct 2021 15:12:48 +0200 (CEST)
Subject: Pattern Matching for switch (Second Preview)
In-Reply-To: <7A62F8A6-3946-413F-8CC4-3708176D3EE8@oracle.com>
References: <5321c128-ba4d-e405-255a-72025a002e0d@oracle.com> <1bed58ce-015c-289b-5696-6dac3c539f6a@oracle.com> <4fef4bed-da6c-f6c0-46df-bbe551408bae@oracle.com> <7A62F8A6-3946-413F-8CC4-3708176D3EE8@oracle.com>
Message-ID: <537785940.2631200.1633093968865.JavaMail.zimbra@u-pem.fr>

> From: "Gavin Bierman"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Friday, 1 October 2021 14:49:01
> Subject: Re: Pattern Matching for switch (Second Preview)

>> On 30 Sep 2021, at 23:25, Brian Goetz <brian.goetz at oracle.com> wrote:
>> [ moving to a-s-e ]
>> I get the concern that a type pattern is no longer "just a variable
>> declaration"; that was a nice part of the "patterns aren't really so hard to
>> understand" story. But I think the usability is likely to be not very good.
>> Take this example:
>>     sealed interface Node { }
>>     record AddNode(Node left, Node right) extends Node { }
>>     ...
>>     Node ni = ...
>>     switch (ni) {
>>         case AddNode(Node left, Node right) -> ...
>> There's no instantiation of Node possible here *other than* Node. Which
>> means we are forcing users to either redundantly type out the instantiation
>> (which can get big), or use casts inside the body when they pull things out of
>> left and right. (And patterns were supposed to make casts go away.) There's
>> almost no case where someone wants a raw type here.

> But surely they should write var here?

yes, here is another example

    List list = ...
    switch(list) {
        case ArrayList al -> ...
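[Editor's sketch] The `var` suggestion above can be illustrated as follows. This is only a sketch under stated assumptions: the type parameter `T`, the `Integer` instantiation and the `ConstNode` leaf record are hypothetical (the archived text does not show any type arguments), `implements` is used because a record cannot extend an interface, and the code needs Java 21 or later, where record patterns are final.

```java
public class NodePatterns {
    // Assumed generic shape; the type parameter T is not visible in the archived text.
    sealed interface Node<T> { }

    // Hypothetical leaf node, added only so that the example can be evaluated.
    record ConstNode<T>(T value) implements Node<T> { }

    record AddNode<T>(Node<T> left, Node<T> right) implements Node<T> { }

    static int eval(Node<Integer> node) {
        return switch (node) {
            // With 'var', the compiler infers Integer for value ...
            case ConstNode(var value) -> value;
            // ... and Node<Integer> for left and right, so no cast and no
            // spelled-out instantiation is needed in the body.
            case AddNode(var left, var right) -> eval(left) + eval(right);
        };
    }
}
```

Because `left` and `right` are inferred as `Node<Integer>`, the recursive calls type-check without the user writing the instantiation or a cast.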
> Gavin

Rémi

From cushon at google.com Wed Oct 6 18:15:16 2021
From: cushon at google.com (Liam Miller-Cushon)
Date: Wed, 6 Oct 2021 11:15:16 -0700
Subject: [External] : Re: Minor improvement to anonymous classes
In-Reply-To: <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <2ED211EA-3F96-4129-B5BF-9A262C917D9F@oracle.com> <1320762308.901257.1627755887681.JavaMail.zimbra@u-pem.fr> <32bade08-4697-a2fd-52d2-491822c14d19@oracle.com> <1699135460.215325.1627933584253.JavaMail.zimbra@u-pem.fr> <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>
Message-ID:

Belatedly returning to this, +Joe Darcy helped with some corpus analysis in the CSR [1] (thanks!).

The analysis didn't reveal any build breakages from optimizing away this$0, but it did reveal hundreds of textual occurrences of this$0. The behaviour of those occurrences of this$0 could potentially change if code is reflectively accessing the enclosing instance, and if it expects to be able to do that even if the inner class doesn't capture any enclosing instance state.

As I mentioned earlier we've been using a version of the patch at Google since 2016 [2]. Rolling it out required a very small amount of cleanup, and I am not aware of it causing any issues since then (including with third party libraries that might have been relying on the hack). We have some remaining occurrences of this$0 in our code, which are not affected by the change because they only need to handle this$0 in classes that actually capture their enclosing instance.

So from my (admittedly limited) perspective, this change is beneficial, and has minor compatibility impact relative to the other breaking changes we've absorbed. I'm curious if anyone has suggestions about how to get other data or perspectives that might help decide how to proceed here?
If we can't get conclusive information on how much code would be affected by this, maybe it would be sufficient to roll the change out more conservatively, e.g. by only enabling it for new language levels? Has the 'preview feature' mechanism ever been used for things like this, or is it intended more for new features that are visible in the spec?

[1] https://bugs.openjdk.java.net/browse/JDK-8271717?focusedCommentId=14442858&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14442858
[2] https://bugs.openjdk.java.net/browse/JDK-8271623?focusedCommentId=14439152&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14439152

On Tue, Aug 3, 2021 at 5:41 AM Brian Goetz wrote:

> Yes, local classes too. Essentially, this is for translation of
> "effectively static" inner classes.
>
> I think this is independent of explicit-static or not; explicit-static
> allows the programmer to capture intent and get more type checking as a
> result. This is about generating better code.
>
> On 8/3/2021 12:52 AM, Tagir Valeev wrote:
> > Another possible semantics change is the object lifetime. The code
> > might rely on prolonged lifetime of the surrounding object if there
> > are soft/weak/phantom references. E.g., the outer object might be
> > registered via Cleaner, and the change may cause freeing the resource
> > earlier than expected. Likely, this is a very rare scenario but if it
> > happens, it could be quite hard to identify the root cause, as the
> > problem will appear only if the object is collected within the
> > specific timeframe.
> >
> > By the way, are we speaking about anonymous classes only? I think,
> > local classes could be updated in the similar manner. Especially given
> > the fact that now local records don't capture the surrounding "this"
> > but if we convert the record to an equivalent local class, it will
> > capture:
> >
> > public class Test {
> >   void test() {
> >     record R() {} // does not capture Test instance
> >     class C {} // captures Test instance
> >   }
> > }
> >
> > Or should we allow explicit 'static' modifier on local classes?
> >
> > Best regards,
> > Tagir Valeev.
> >
> > On Tue, Aug 3, 2021 at 2:47 AM wrote:
> >> We may have some trouble with the usual suspect, Serialization,
> >> There are classes like exceptions or Swing UI classes that are marked
> >> as Serializable and can be implemented as an anonymous class.
> >> In that case, removing the backpointer if it is not used may change the
> >> serialization format.
> >>
> >> And yes, an anonymous class do not have a "stable" name but people do
> >> not seem to care too much about that ...
> >>
> >> Rémi
> >>
> >> ----- Original Message -----
> >>> From: "Brian Goetz"
> >>> To: "Liam Miller-Cushon"
> >>> Cc: "Remi Forax", "John Rose" <john.r.rose at oracle.com>, "amber-spec-experts"
> >>> Sent: Monday, 2 August 2021 20:18:56
> >>> Subject: Re: [External] : Re: Minor improvement to anonymous classes
> >>> FWIW, making this fix not only reduces the memory leak risk, but has a
> >>> number of nice follow-on benefits that can often trigger further
> >>> follow-on benefits:
> >>>
> >>> - fewer fields, so reduced footprint;
> >>> - fewer fields might mean more objects fall under the scalarization
> >>>   threshold, when applicable;
> >>> - less work in constructors;
> >>> - shorter constructors mean more constructors fall under the inlining
> >>>   threshold;
> >>> - more inlining might lead to other optimizations.
> >>>
> >>> So it wouldn't surprise me to see macro-level effects even on programs
> >>> without memory leaks.
> >>>
> >>>> I filed https://bugs.openjdk.java.net/browse/JDK-8271623
> >>>> to track that
> >>>> enhancement.
From brian.goetz at oracle.com Wed Oct 6 21:08:45 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 6 Oct 2021 21:08:45 +0000
Subject: [External] : Re: Minor improvement to anonymous classes
In-Reply-To:
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <2ED211EA-3F96-4129-B5BF-9A262C917D9F@oracle.com> <1320762308.901257.1627755887681.JavaMail.zimbra@u-pem.fr> <32bade08-4697-a2fd-52d2-491822c14d19@oracle.com> <1699135460.215325.1627933584253.JavaMail.zimbra@u-pem.fr> <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>
Message-ID:

I think you've done pretty good due diligence here. One more thing we could do is reach out to the most popular libraries that do this and give them a heads up that they need to tolerate the field not being there. But overall, the benefit accrues to the 99.999% of users that follow the rules.
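[Editor's sketch] What "tolerate the field not being there" could look like for code that reflects on the enclosing instance is sketched below. The class and method names are hypothetical, and the sketch assumes javac's usual naming of the synthetic enclosing-instance field (a name starting with `this$`); it is not taken from any of the libraries discussed in this thread.

```java
import java.lang.reflect.Field;
import java.util.Optional;

public class EnclosingInstanceLookup {
    private final String label;

    public EnclosingInstanceLookup(String label) {
        this.label = label;
    }

    // Inner class that really reads outer state, so it must keep its enclosing instance.
    class Capturing {
        String describe() {
            return label;
        }
    }

    // Static nested class: it never has an enclosing instance.
    static class Detached { }

    // Returns the enclosing instance if a this$N field exists, and an empty
    // Optional (rather than an exception) if the compiler did not emit one.
    static Optional<Object> enclosingInstance(Object inner) {
        for (Field field : inner.getClass().getDeclaredFields()) {
            if (field.getName().startsWith("this$")) {
                try {
                    field.setAccessible(true);
                    return Optional.ofNullable(field.get(inner));
                } catch (ReflectiveOperationException | RuntimeException e) {
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

The design point is simply that the absence of the field is treated as a normal outcome, so the lookup keeps working whether or not the compiler drops an unused enclosing-instance field.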
From maurizio.cimadamore at oracle.com Thu Oct 7 14:36:55 2021
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 7 Oct 2021 15:36:55 +0100
Subject: Minor improvement to anonymous classes
In-Reply-To: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com>
Message-ID: <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>

Your proposal is for anon classes, which I think works well.

One related case I found quite often is the desire to combine types e.g. in a return type:

    Foo & AutoCloseable getCloseableFoo();

This veers into declaration-land, visible to javadoc and all. So it probably doesn't have a good benefit vs. cost ratio. Also, this is effectively adding intersection types (I don't think restricting at return types only will be feasible).

But I'm puzzled by the fact that programmers can, in a way, do something similar to the above with a generic method:

    <X extends Foo & AutoCloseable> X getCloseableFoo();

Which kind of works, but it's quite a horrible hack (you introduce a type parameter you don't need - which means compiler will try to infer types, etc.)

I'm not suggesting we have to solve this - just wanted to make sure this was somewhere on the radar.

Maurizio

On 30/07/2021 15:52, Brian Goetz wrote:
> I have been working on a library where I've found myself repeatedly
> refactoring what should be anonymous classes into named (often local)
> classes, for the sole reason that I want to combine interfaces with an
> abstract base class:
>
>     interface Foo { ... lots of stuff .. }
>     abstract class AbstractFoo { ... lots of base implementation ... }
>
>     interface RedFoo extends Foo { void red(); }
>
> and I want a factory that yields a RedFoo that is based on AbstractFoo
> and implements red(). Trivial with a named class, but there's no
> reason I should not be able to do that with an anonymous class, since
> I have no need of the name.
>
> We already address this problem elsewhere; there are several places in
> the grammar where you can append additional _interfaces_ with &, such as:
>
>     class X { ... }
>
> and casts (which can be target types for lambdas.)
>
> These are not full-blown intersection types, but accommodate for the
> fact that classes have one superclass and potentially multiple
> interfaces. It appears simple to extend this to inner class creation
> expressions:
>
>     new AbstractFoo(args) & RedFoo { ... }
>
> This would also smooth out a rough edge refactoring between lambdas
> and anonymous classes.
>
> I suspect there are one or two other places in the spec that could use
> this treatment.
>
> (Note that this is explicitly *not* a call for "let's do full-blown
> intersection types"; this is solely about class declaration.)

From kevinb at google.com Thu Oct 7 20:37:58 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 7 Oct 2021 13:37:58 -0700
Subject: Minor improvement to anonymous classes
In-Reply-To: <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
Message-ID:

On Thu, Oct 7, 2021 at 7:37 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> <X extends Foo & AutoCloseable> X getCloseableFoo();
>
> Which kind of works, but it's quite a horrible hack (you introduce a type
> parameter you don't need - which means compiler will try to infer types,
> etc.)

(Incidentally, we have Error Prone give a warning any time a method/constructor type parameter is unused in any of the formal parameter types, and I think the results have been good. A method like `emptySet()` has to suppress it, but it's a fairly special case.)
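[Editor's sketch] For reference, the named-local-class workaround described at the start of this thread, together with one of the existing `&` forms (an intersection cast used as a lambda target type), can be written in current Java roughly as below. The method bodies and the `String` return type of `red()` are assumptions made so the example can be exercised; the original messages only give `void red()` and elided class bodies.

```java
import java.io.Serializable;
import java.util.function.Supplier;

public class NamedLocalClassWorkaround {
    interface Foo {
        String name();
    }

    static abstract class AbstractFoo implements Foo {
        private final String name;

        AbstractFoo(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }
    }

    interface RedFoo extends Foo {
        String red(); // returns a String here only to make the result observable
    }

    // Today a *named* local class is needed to combine the abstract base
    // class with the extra interface; an anonymous class cannot do both.
    static RedFoo redFoo(String name) {
        class Impl extends AbstractFoo implements RedFoo {
            Impl(String n) {
                super(n);
            }

            @Override
            public String red() {
                return "red " + name();
            }
        }
        return new Impl(name);
    }

    // Existing use of '&': an intersection cast as the target type of a lambda.
    static Supplier<String> serializableGreeting() {
        return (Supplier<String> & Serializable) () -> "hello";
    }
}
```

Under the proposal, the body of `redFoo` would presumably shrink to a single `new AbstractFoo(name) & RedFoo { ... }` expression; that syntax is not valid Java today, so it is not shown as code.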
--
Kevin Bourrillion | Java Librarian | Google, Inc.
| kevinb at google.com

From forax at univ-mlv.fr Thu Oct 7 21:08:48 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 7 Oct 2021 23:08:48 +0200 (CEST)
Subject: Minor improvement to anonymous classes
In-Reply-To:
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
Message-ID: <1430522104.2385623.1633640928181.JavaMail.zimbra@u-pem.fr>

> From: "Kevin Bourrillion"
> To: "Maurizio Cimadamore"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Thursday, 7 October 2021 22:37:58
> Subject: Re: Minor improvement to anonymous classes

> On Thu, Oct 7, 2021 at 7:37 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>> X getCloseableFoo();
>> Which kind of works, but it's quite an horrible hack (you introduce a type
>> parameter you don't need - which means compiler will try to infer types, etc.)

> (Incidentally, we have Error Prone give a warning any time a method/constructor
> type parameter is unused in any of the formal parameter types, and I think the
> results have been good. A method like `emptySet()` has to suppress it, but it's
> a fairly special case.)

Using an "unused" type parameter as the return type is not unusual either when returning null or when throwing an exception, given that neither the type of null nor the "nothing" type can be expressed in Java.

See for example the javadoc of Assertions.fail():
https://junit.org/junit5/docs/current/api/org.junit.jupiter.api/org/junit/jupiter/api/Assertions.html#fail(java.lang.String)

The other usage I can see is to get better type inference of the return type (avoiding an explicit cast) when using a polymorphic signature, but I'm not even sure javac supports it.
Rémi

> --
> Kevin Bourrillion | Java Librarian | Google, Inc.
> | kevinb at google.com

From kevinb at google.com Thu Oct 7 21:12:21 2021
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 7 Oct 2021 14:12:21 -0700
Subject: Minor improvement to anonymous classes
In-Reply-To: <1430522104.2385623.1633640928181.JavaMail.zimbra@u-pem.fr>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com> <1430522104.2385623.1633640928181.JavaMail.zimbra@u-pem.fr>
Message-ID:

I'm sorry that I appeared to be suggesting that there were no other reasons to suppress it. I was actually giving just one example.

Nevertheless, the check has done more good than "harm" (in the form of these small suppression costs).

On Thu, Oct 7, 2021 at 2:08 PM Remi Forax wrote:

> From: "Kevin Bourrillion"
> To: "Maurizio Cimadamore"
> Cc: "Brian Goetz", "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Thursday, 7 October 2021 22:37:58
> Subject: Re: Minor improvement to anonymous classes
>
> On Thu, Oct 7, 2021 at 7:37 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>
>> X getCloseableFoo();
>>
>> Which kind of works, but it's quite an horrible hack (you introduce a
>> type parameter you don't need - which means compiler will try to infer
>> types, etc.)
>
> (Incidentally, we have Error Prone give a warning any time a
> method/constructor type parameter is unused in any of the formal parameter
> types, and I think the results have been good. A method like `emptySet()`
> has to suppress it, but it's a fairly special case.)
>
> Using an "unused" parameter types as return type is not unusual either
> when returning null or when throwing an exception given that both the type
> of null and the "nothing" type can not be expressed in Java.
>
> See by example the javadoc of Assertions.fail()
> https://junit.org/junit5/docs/current/api/org.junit.jupiter.api/org/junit/jupiter/api/Assertions.html#fail(java.lang.String)
>
> The other usage i can see is to have a better type inference of the return
> type (avoid an explicit cast) when using a polymorphic signature but i'm
> not even sure javac support it.
>
> Rémi
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc.
> | kevinb at google.com

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From amalloy at google.com Thu Oct 7 21:25:54 2021
From: amalloy at google.com (Alan Malloy)
Date: Thu, 7 Oct 2021 14:25:54 -0700
Subject: Minor improvement to anonymous classes
In-Reply-To: <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
Message-ID:

You can't actually do this: that signature is a promise to return an instance of *any* class implementing those interfaces, not *some* class implementing them. If you try to implement your getCloseableFoo method, you'll find that no implementation compiles. For example:

import java.io.Serializable;

final class tmp {
  static <X extends AutoCloseable & Serializable> X getX() {
    class Impl implements AutoCloseable, Serializable {
      public void close() {}
    }
    return new Impl();
  }
}

tmp.java:8: error: incompatible types: Impl cannot be converted to X
    return new Impl();
           ^
  where X is a type-variable:
    X extends AutoCloseable,Serializable declared in method <X>getX()

This is because some caller may define their own MyImpl class implementing those interfaces, and then write MyImpl i = getX(), instantiating X to MyImpl, and your method doesn't know how to build such an object.

On Thu, Oct 7, 2021 at 7:37 AM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> But I'm puzzled by the fact that programmers can, in a way, do something
> similar to the above with a generic method:
>
> <X extends Foo & AutoCloseable> X getCloseableFoo();
>
> Which kind of works, but it's quite an horrible hack (you introduce a type
> parameter you don't need - which means compiler will try to infer types,
> etc.)
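[Editor's sketch] The failure mode described above can be made concrete. The sketch below forces the method to compile with an unchecked cast (an assumption for illustration, not something the message above recommends) and shows that a caller who lets `X` be inferred as its own class then fails at run time. `MyImpl` and the two helper methods are hypothetical names.

```java
import java.io.Serializable;

public class UncheckedIntersectionReturn {
    // Compiles only because of the unchecked cast: the caller, not this
    // method, decides what X is.
    @SuppressWarnings("unchecked")
    static <X extends AutoCloseable & Serializable> X getX() {
        class Impl implements AutoCloseable, Serializable {
            @Override
            public void close() { }
        }
        return (X) new Impl();
    }

    // A caller-defined class that also satisfies both bounds.
    static class MyImpl implements AutoCloseable, Serializable {
        @Override
        public void close() { }
    }

    // Using the result only through one of the bounds works.
    static boolean boundsOnlyCallerWorks() {
        Serializable s = getX();
        return s != null;
    }

    // Here X is inferred as MyImpl, so the compiler-inserted cast at the
    // call site throws ClassCastException.
    static boolean specificCallerFails() {
        try {
            MyImpl mine = getX();
            return mine == null;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The method body never sees the failure; it surfaces at the call site, which is what makes this pattern easy to get subtly wrong.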
From cushon at google.com Thu Oct 7 21:57:43 2021
From: cushon at google.com (Liam Miller-Cushon)
Date: Thu, 7 Oct 2021 14:57:43 -0700
Subject: [External] : Re: Minor improvement to anonymous classes
In-Reply-To:
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <2ED211EA-3F96-4129-B5BF-9A262C917D9F@oracle.com> <1320762308.901257.1627755887681.JavaMail.zimbra@u-pem.fr> <32bade08-4697-a2fd-52d2-491822c14d19@oracle.com> <1699135460.215325.1627933584253.JavaMail.zimbra@u-pem.fr> <49676f3c-100e-ccb1-ee9a-ef999f9f4a0d@oracle.com>
Message-ID:

On Wed, Oct 6, 2021 at 2:08 PM Brian Goetz wrote:

> One more thing we could do is reach out to the most popular libraries that
> do this and give them a heads up that they need to tolerate the field not
> being there.

Good idea, I filed bugs against several libraries that are reflecting on fields named this$. I skipped examples that contain textual occurrences of this$ that were clearly safe, e.g. because they were filtering fields with that name out of the results of getDeclaredFields().

https://github.com/classgraph/classgraph/issues/570
https://github.com/robolectric/robolectric/issues/6757
https://github.com/micrometer-metrics/micrometer/issues/2806
https://github.com/awaitility/awaitility/issues/223
https://issues.apache.org/jira/browse/MAPREDUCE-7364
https://issues.apache.org/jira/browse/BEAM-13020
https://issues.apache.org/jira/browse/AVRO-3228

From maurizio.cimadamore at oracle.com Fri Oct 8 08:06:13 2021
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 8 Oct 2021 09:06:13 +0100
Subject: Minor improvement to anonymous classes
In-Reply-To:
References: <424ad976-6f0c-5ada-ca22-f5a3d9c76dc1@oracle.com> <1630506b-f16e-7a47-0fd8-ea7980a80e7a@oracle.com>
Message-ID:

Fully agree... unless you embrace unchecked casts (e.g. if your impl casts to X and suppresses the warning, the program compiles).
Which is what Kevin was trying to say: having a generic method that doesn't use its type-variable in its parameters can be a bit of a smell, in the sense that implementors can forget that they don't have much control over what the type parameter might be inferred to.

All this goes back to my original point: we can't express a method that returns a conjunction of two types today; the alternative is to declare a new type just for that (but some of the arguments originally discussed in this thread apply: e.g. I might not have a great name for it, nor a desire for actually naming it), or resort to unchecked generic dark arts (which, as you and Kevin point out, are subtly broken 95% of the time).

Maurizio

On 07/10/2021 22:25, Alan Malloy wrote:
> You can't actually do this: that signature is a promise to return an
> instance of /any/ class implementing those interfaces, not
> /some/ class implementing them. If you try to implement your
> getCloseableFoo method, you'll find that no implementation compiles.
> For example:
>
> final class tmp {
>   static <X extends AutoCloseable & Serializable> X getX() {
>     class Impl implements AutoCloseable, Serializable {
>       public void close() {}
>     }
>     return new Impl();
>   }
> }
>
> tmp.java:8: error: incompatible types: Impl cannot be converted to X
>     return new Impl();
>            ^
>   where X is a type-variable:
>     X extends AutoCloseable,Serializable declared in method <X>getX()
>
> This is because some caller may define their own MyImpl class
> implementing those interfaces, and then write MyImpl i = getX(),
> instantiating X to MyImpl, and your method doesn't know how to build
> such an object.
>
> On Thu, Oct 7, 2021 at 7:37 AM Maurizio Cimadamore wrote:
>
>     But I'm puzzled by the fact that programmers can, in a way, do
>     something similar to the above with a generic method:
>
>     X getCloseableFoo();
>
>     Which kind of works, but it's quite an horrible hack (you
>     introduce a type parameter you don't need - which means compiler
>     will try to infer types, etc.)
>

From forax at univ-mlv.fr Wed Oct 13 17:09:55 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 13 Oct 2021 19:09:55 +0200 (CEST)
Subject: String Interpolation
Message-ID: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>

Hi everybody, I've spent some time thinking about how the String interpolation + Policy should be specified and implemented.

The goal is to add a syntax specifying a user-defined method to "interpolate" (for lack of a better word) a string with arguments.

Given that it's a method, the exact semantics of the interpolation, things like how the arguments are escaped and how the formatted string is parsed, are written in Java; this allows us to support a wide range of use cases.

This proposal does not differ from the original proposal of Brian and Jim in its goal but in the way a user declares the interpolation method(s).

TLDR; you can declare an interpolation method and optionally an interpolation bootstrap method if you want more efficient code at the price of having to play with the method handle API.

---

The proposal of Brian and Jim uses an interface to define the policy, but in this case using an interface is not what we want.
I think there are two main reasons,
- the interpolation method can be an instance method but can also be a factory method, a static method, and an interface cannot constrain a static method.
- we want the signature of the interpolation method to be free to use any number of parameters of any types, something that cannot be specified with type parameters in Java.

So let's take a step back and write some examples. As a user of the interpolation method, we want to
- be able to specify string interpolation;
  notice that this is a static method.

    String name = ...
    int age = ...
    String s = String."name: \(name) age: \(age)";

- be able to instantiate a regex Pattern,
  and have a magic optimisation that creates the Pattern instance only once

    Pattern pattern = Pattern."foo|bar";

- support instance methods, so the interpolation can escape the arguments differently depending on the context,
  here, for example, escaping differently depending on the database driver.

    String username = ...
    Connection connection = ...
    connection."""
      SELECT * FROM users where user == "\(username)"
      """;

I think the simplest way to specify an interpolation method is to have a method with a special name;
I will use __interpolate__ because I don't want to discuss the exact syntax here.

This method can be a static method or an instance method and has one restriction: the first parameter has to be a String, because the first argument is the formatted string.

Here is an example of how the method __interpolate__ inside java.lang.String can be written.
To avoid everybody re-implementing the parsing of the formatted string, the class java.lang.runtime.InterpolateMetafactory provides a helper method "formatIterator" that returns an iterator splitting the formatted string into texts and bindings.

    package java.lang;

    public class String {
      ...
      public static String __interpolate__(String format, Object... args) {
        var i = 0;
        var builder = new StringBuilder();
        var iterator = InterpolateMetafactory.formatIterator(format);
        while(iterator.hasNext()) {
          switch(iterator.next()) {
            case Text(var text) -> builder.append(text);
            // a binding is replaced by the next argument
            case Binding binding -> builder.append(args[i++]);
          }
        }
        return builder.toString();
      }
      ...
    }

While this is nice, you may think that it's just syntactic sugar and it will not be more performant than String.valueOf(), i.e. it will be slow.
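A minimal runnable sketch of the loop above, assuming a hypothetical stand-in for formatIterator (the proposal leaves its body elided) and a plain method name instead of __interpolate__. It uses instanceof patterns rather than the switch so that it needs nothing beyond Java 17:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; InterpolateSketch and its members are illustrative names.
class InterpolateSketch {
    // Token model mirroring the Text/Binding split described in the proposal.
    sealed interface Token permits Text, Binding {}
    record Text(String text) implements Token {}
    record Binding(String name) implements Token {}

    // Assumed stand-in for InterpolateMetafactory.formatIterator:
    // splits "a \(x) b" into Text("a "), Binding("x"), Text(" b").
    static Iterator<Token> formatIterator(String format) {
        List<Token> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < format.length()) {
            int start = format.indexOf("\\(", pos);
            int end = start < 0 ? -1 : format.indexOf(')', start);
            if (end < 0) {                        // no complete binding left
                tokens.add(new Text(format.substring(pos)));
                break;
            }
            if (start > pos) {
                tokens.add(new Text(format.substring(pos, start)));
            }
            tokens.add(new Binding(format.substring(start + 2, end)));
            pos = end + 1;
        }
        return tokens.iterator();
    }

    // Same shape as the __interpolate__ example: text is copied,
    // each binding consumes the next argument.
    static String interpolate(String format, Object... args) {
        int i = 0;
        StringBuilder builder = new StringBuilder();
        Iterator<Token> iterator = formatIterator(format);
        while (iterator.hasNext()) {
            Token token = iterator.next();
            if (token instanceof Text t) {
                builder.append(t.text());
            } else if (token instanceof Binding) {
                builder.append(args[i++]);
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(interpolate("name: \\(name) age: \\(age)", "Ana", 42));
    }
}
```

Note that the format is re-parsed on every call here, which is exactly the cost the bootstrap variant is meant to avoid.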

That's why the specification allows you to provide a second, more optimised version of the interpolation method using a method __interpolate__bootstrap__.
This method __interpolate__bootstrap__ is not required; it cannot replace the method __interpolate__, so when it is declared both __interpolate__ and __interpolate__bootstrap__ have to be present. It's a backward compatible change to add a method __interpolate__bootstrap__ after the fact; there is no need to recompile all the client code.

For that, the compiler translation relies on invokedynamic to call the method bootstrap of the class InterpolateMetafactory, which at runtime decides to trampoline either to the method __interpolate__bootstrap__ or to the method __interpolate__ if no __interpolate__bootstrap__ exists.

Here is an example of how a call to the interpolation method of String is generated by javac.
For the Java code

    String name = ...
    int age = ...
    String s = String."name: \(name) age: \(age)";

the equivalent bytecode is

    aload_1   // load name
    iload_2   // load age
    invokedynamic __interpolate__ (Ljava/lang/String;I)Ljava/lang/String;
      java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType, String, MethodHandle):CallSite
      [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String ]

From the perspective of the compiler, the method __interpolate__ works exactly like a method with a polymorphic signature (a method annotated with @PolymorphicSignature), so the descriptor of the invokedynamic is created by collecting the types of the arguments. Here the interpolation method is called with a String and an int and the return type is String, so the descriptor is (Ljava/lang/String;I)Ljava/lang/String;

Considering the interpolation method as a polymorphic method is important in terms of performance because it means that no boxing will be done by the compiler; if there is some boxing, it will be done by the runtime, so it is optional if the __interpolate__bootstrap__ does not need to
box arguments.

You can also notice that the formatted string is passed as a bootstrap constant, so all the parsing of the format can be done once, outside of the hot path.
A call to invokedynamic also passes, as a second bootstrap argument, the method handle to the method __interpolate__, so the implementation inside InterpolateMetafactory.bootstrap can call this method if no method __interpolate__bootstrap__ exists.

Here is a raw implementation of the class InterpolateMetafactory.
The method formatIterator() returns an Iterator of Token, which is a sealed interface.
The method bootstrap() first looks up a method "__interpolate__bootstrap__" in the lookup class that takes a Lookup, a String, a MethodType, the format and the default implementation, and calls it if it exists; otherwise it takes the default implementation, binds the formatted String and adapts the arguments using asType (for boxing, etc).

    package java.lang.runtime;

    public class InterpolateMetafactory {
      public sealed interface Token {
        public record Text(String text) implements Token {}
        public record Binding(String name) implements Token {}
      }

      public static Iterator<Token> formatIterator(String format) {
        ...
      }

      public static CallSite bootstrap(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) throws Throwable {
        // check if there is a bootstrap method
        MethodHandle bootstrap;
        try {
          bootstrap = lookup.findStatic(lookup.lookupClass(), "__interpolate__bootstrap__", MethodType.methodType(CallSite.class, Lookup.class, String.class, MethodType.class, String.class, MethodHandle.class));
        } catch(NoSuchMethodException e) {
          // bind the default implementation
          return new ConstantCallSite(impl.bindTo(format).asType(methodType));
        }
        // invoke() is signature polymorphic, so the result has to be cast to CallSite
        return (CallSite) bootstrap.invoke(lookup, name, methodType, format, impl);
      }
    }

Here is another example, showing how to declare the methods __interpolate__ and __interpolate__bootstrap__ inside java.util.regex.Pattern.
The "default" implementation calls Pattern.compile() and the optimized one always returns the result of Pattern.compile() as a constant.

    package java.util.regex;

    public class Pattern {
      public static Pattern __interpolate__(String format) {  // the formatted string can not have arguments
        return Pattern.compile(format);
      }

      private static CallSite __interpolate__bootstrap__(Lookup lookup, String name, MethodType methodType, String format, MethodHandle impl) {
        return new ConstantCallSite(MethodHandles.constant(Pattern.class, Pattern.compile(format)));
      }
    }

The method __interpolate__ provides, via its signature, the parameter types that are verified by the compiler.
It also provides code that can be used by tools that do static analysis on the bytecode, because those tools cannot see through the method handle returned by a bootstrap method: given that it's a runtime construct, it's usually not available at the time the static analysis is done. This should be enough to let tools like GraalVM native image see through the invokedynamic, in a similar way to how they see through the invokedynamic used when creating a lambda.

The fact that all invokedynamic calls go through the method InterpolateMetafactory.bootstrap and trampoline from it means that adding or removing the method __interpolate__bootstrap__ is a binary compatible change, if __interpolate__bootstrap__ is declared private. So implementing __interpolate__bootstrap__ can be an afterthought.
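A minimal sketch of the trampoline idea, assuming hypothetical holder classes (Plain, Regex) and ordinary method names in place of __interpolate__ and __interpolate__bootstrap__. The bootstrap is called by hand instead of through invokedynamic, the holder class is passed explicitly rather than taken from lookup.lookupClass(), and the fallback path adds an explicit asCollector step (my assumption) so that the trailing arguments are gathered into the Object[] parameter:

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the proposal.
class BootstrapFallbackSketch {
    // Holder with only the plain method: exercises the fallback path.
    static class Plain {
        static String interpolate(String format, Object... args) {
            return format + Arrays.toString(args);
        }
    }

    // Holder that also declares the optional bootstrap hook.
    static class Regex {
        static Pattern interpolate(String format) {
            return Pattern.compile(format);
        }
        static CallSite interpolateBootstrap(Lookup lookup, String name, MethodType type,
                                             String format, MethodHandle impl) {
            // compile once at link time, hand back the same Pattern on every call
            return new ConstantCallSite(
                MethodHandles.constant(Pattern.class, Pattern.compile(format)));
        }
    }

    static final MethodType HOOK_TYPE = MethodType.methodType(CallSite.class,
        Lookup.class, String.class, MethodType.class, String.class, MethodHandle.class);

    static CallSite bootstrap(Lookup lookup, Class<?> holder, MethodType type,
                              String format, MethodHandle impl) throws Throwable {
        MethodHandle hook;
        try {
            hook = lookup.findStatic(holder, "interpolateBootstrap", HOOK_TYPE);
        } catch (NoSuchMethodException e) {
            // no hook: bind the format, collect the remaining arguments into
            // the Object[] parameter, then adapt (boxing etc.) to the call-site type
            MethodHandle bound = impl.bindTo(format)
                .asCollector(Object[].class, type.parameterCount())
                .asType(type);
            return new ConstantCallSite(bound);
        }
        return (CallSite) hook.invoke(lookup, "interpolate", type, format, impl);
    }

    static String demoFallback() {
        try {
            Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(Plain.class, "interpolate",
                MethodType.methodType(String.class, String.class, Object[].class));
            MethodType callType = MethodType.methodType(String.class, String.class, int.class);
            CallSite site = bootstrap(lookup, Plain.class, callType, "fmt", impl);
            return (String) site.dynamicInvoker().invokeExact("Ana", 42);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    // True if two calls through the hooked call site return the very same Pattern.
    static boolean demoHook() {
        try {
            Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(Regex.class, "interpolate",
                MethodType.methodType(Pattern.class, String.class));
            CallSite site = bootstrap(lookup, Regex.class,
                MethodType.methodType(Pattern.class), "foo|bar", impl);
            MethodHandle target = site.dynamicInvoker();
            Pattern first = (Pattern) target.invokeExact();
            Pattern second = (Pattern) target.invokeExact();
            return first == second;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoFallback());
        System.out.println(demoHook());
    }
}
```

Because the decision between hook and fallback is taken at link time inside bootstrap(), adding or removing the hook does not change what the caller was compiled against.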

regards,
Rémi

From brian.goetz at oracle.com Wed Oct 13 19:32:19 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 13 Oct 2021 15:32:19 -0400
Subject: String Interpolation
In-Reply-To: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>
Message-ID: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>

The ability to capture per-call-site computation so it could be done exactly once (including generating an MH to describe it) has been part of the goal all along. The JEP is deliberately cagey about this because we didn't want to descend down the translation rabbit hole before we'd achieved consensus on the broad strokes, any more than we wanted to descend down the syntax rabbit hole.

(FWIW, all of these side-paths were ones we already traveled and rejected for various reasons :)

As you correctly point out, without something like type classes, associating a static method like a bootstrap with a class requires committing some sort of sin, such as the "magic names" sins committed by serialization. We surely didn't want to do that either.

> - we also want to be able to instantiate regex Pattern,
> and have a magic optimisation that creates the Pattern instance only one
>
> Pattern pattern = Pattern."foo|bar";

You said the magic anti-word, which is "magic". We don't want this to be magic. (Examples like this are better treated as a form of optimistic constant folding, along the lines explored at my JVMLS talk a few years ago.)

Summary: wait for constant folding.

> I think the simplest way to specify an interpolation method is to have a method with a special name,
> i will use __interpolate__ because i don't want to discuss the exact syntax here.

This is committing the same "magic name" sin as serialization. We deliberately avoided this in the design. When we have type classes, we'll be able to use that as a way to bridge from a type name to a witness to a particular class.
Our design was crafted so that it could be gracefully extended to such a mechanism, when it is available (using a type name instead of an instance reference at the use site.)

Summary: wait for type classes.

> That's why the specification allow you to provide a second more optimised version of the interpolation method using a method __interpolate__bootstrap__.

This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and also too limited -- and also too advanced to be a language feature. Bootstraps are way too complicated to expose in the source language in this way, especially not this magically. And it's too ad-hoc, since it's specific to the interpolation feature, whereas one could imagine a number of other contexts where it is useful too. So this is a bad tradeoff in many ways. Jim's implementation very cleverly gets the equivalent of this using a pure library implementation (which leans on MutableCallSite.)

While it is surely a desirable goal to be able to optimize formatter implementation, it is also super-easy to become obsessed with this, and give it a bigger place in the feature than it deserves. For some cases -- notably String::format -- there are huge savings to be had (from a number of sources, not least of which is that scanning the string at every invocation and choosing a strategy based on that is expensive.) But in other cases, it is almost irrelevant. For pure concatenation, it is already pretty fast; for SQL, the cost of constructing the query is a tiny part of the execution time, so it's not even worth optimizing. So this is a "nice to have" rather than the centerpiece of the feature.

To be clear, the centerpiece is the gathering up of a template + parameters so that their combination can be handled by another entity, whether right now, later, or never. Optimizing the case where it is done right now, using a predictable choice of entity, is an optimization, but not the centerpiece.
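A minimal sketch of that "gather template + parameters, let some other entity combine them" idea, assuming hypothetical names (Template, Policy, CONCAT, QUOTED) that are not the API under discussion. The point is only that the capture is a value which can be processed now, later, or never, by whichever policy the caller picks:

```java
import java.util.List;

// Hypothetical sketch; these types are illustrative, not the proposed API.
class TemplateCaptureSketch {
    // The literal fragments and the values found in the holes,
    // kept apart instead of being concatenated eagerly.
    record Template(List<String> fragments, List<Object> values) {}

    // A policy turns a captured template into a result of its choosing.
    interface Policy<T> {
        T apply(Template template);
    }

    // Plain concatenation.
    static final Policy<String> CONCAT = t -> {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < t.fragments().size(); i++) {
            sb.append(t.fragments().get(i));
            if (i < t.values().size()) {
                sb.append(t.values().get(i));
            }
        }
        return sb.toString();
    };

    // Same capture, different handling: every value is quoted and escaped.
    static final Policy<String> QUOTED = t -> {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < t.fragments().size(); i++) {
            sb.append(t.fragments().get(i));
            if (i < t.values().size()) {
                String value = String.valueOf(t.values().get(i)).replace("'", "''");
                sb.append('\'').append(value).append('\'');
            }
        }
        return sb.toString();
    };

    // What a use site like "name: \{name} age: \{age}" might capture.
    static Template sample() {
        return new Template(List.of("name: ", " age: ", ""), List.of("Ana", 42));
    }

    public static void main(String[] args) {
        Template captured = sample();                  // nothing is combined yet
        System.out.println(CONCAT.apply(captured));    // combined right now
        System.out.println(QUOTED.apply(captured));    // or later, by someone else
    }
}
```

Nothing in this sketch is optimized; it re-walks the fragments on every apply, which is the part the call-site machinery is meant to improve for the "right now" case.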

Let me sketch out how we're envisioning this. The API is something like:

    interface TemplatePolicy {
        T apply(TemplatedString ts);

        // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object
        default MethodHandle asMethodHandle(TemplatedString ts) {
            return MH[TemplatePolicy::apply]
        }
    }

The API specification has a number of constraints on the implementation of asMethodHandle, which I'll get to in a second. When the compiler encounters an immediate application P."...", it generates an indy, which uses a special bootstrap that returns a MutableCallSite. The MutableCallSite initially has as its target a special secondary bootstrap MH, which represents an interpolation site that has not yet seen an actual invocation. The secondary bootstrap MH has the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> Object), so on first invocation it receives the TP object and the TS. It then calls TP::asMethodHandle, and wraps this MH with a GWT (guardWithTest) which validates the invariants and proceeds to that MH if they hold -- which they will 99.x% of the time.

The invariant is that the dynamic type of the per-instantiation TP be == to the dynamic type of the TP that was present at secondary linkage. That is, it be an instance of the same class, but not necessarily the same instance. By definition, the string will always be the same, as will the types of the parameters, since this is specific to concrete P."..." sites. So the MH can take advantage of that.

The constraint on TP::asMethodHandle is that it not undermine this invariant; that if it generates a MH that is dependent on TP state, it not bake that state into the resulting MH, but instead, treat the TP state as a parameter. Further, the MH must be behaviorally equivalent to calling apply.

If the GWT fails, it means the user is doing something like:

    for (TP p : listOfProcessors) {
        blah blah p."foo \{a}"
    }

in which case the GWT falls back to the "just do an invokevirtual of TP::apply" strategy. (It could get fancier but I don't see any point.)

This lets us rescue indy-based translation without exposing a magic indy-hook in the JLS. (Sorry, I know you wanted the magic indy hook.)

From forax at univ-mlv.fr Wed Oct 13 20:07:34 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 13 Oct 2021 22:07:34 +0200 (CEST)
Subject: String Interpolation
In-Reply-To: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr> <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
Message-ID: <1556797536.1701993.1634155654113.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "amber-spec-experts"
> Sent: Wednesday 13 October 2021 21:32:19
> Subject: Re: String Interpolation

> The ability to capture per-call-site computation so it could be done exactly
> once (including generating an MH to describe it) has been part of the goal all
> along. The JEP is deliberately cagey about this because we didn't want to
> descend down the translation rabbit hole before we'd achieved consensus on the
> broad strokes, any more than we wanted to descend down the syntax rabbit hole.
> (FWIW, all of these side-paths were ones we already traveled and rejected for
> various reasons :)

> As you correctly point out, without something like type classes, associating a
> static method like a bootstrap with a class requires committing some sort of
> sin, such as the "magic names" sins committed by serialization. We surely
> didn't want to do that either.

What we want here is a "protocol": a protocol is something that is really like a method call but with extra syntax and extra constraints.
Java uses protocols. We are so used to seeing a method with the class name and no return type as a constructor that we may not even realize that Java uses special names to indicate a protocol.

Unlike Scala, the current syntax does not specify a method name: it's String."a text" and not String.method"a text".
That's why I've proposed to use a special name.

Soon we will want to introduce user-defined patterns; this is also a protocol, it's a kind of method call too, with extra syntax and extra constraints.
Instead of using

I agree that using a magic name is less than ideal to define a protocol, but

>> - we also want to be able to instantiate regex Pattern,
>> and have a magic optimisation that creates the Pattern instance only one
>> Pattern pattern = Pattern."foo|bar";

> You said the magic anti-word, which is "magic". We don't want this to be magic.
> (Examples like this are better treated as a form of optimistic constant
> folding, along the lines explored at my JVMLS talk a few years ago.)

> Summary: wait for constant folding.

>> I think the simplest way to specify an interpolation method is to have a method
>> with a special name,
>> i will use __interpolate__ because i don't want to discuss the exact syntax
>> here.

> This is committing the same "magic name" sin as serialization. We deliberately
> avoided this in the design. When we have type classes, we'll be able to use
> that as a way to bridge from a type name to a witness to a particular class.
> Our design was crafted so that it could be gracefully extended to such a > mechanism, when it is available (using a type name instead of an instance > reference at the use site.) > Summary: wait for type classes. >> That's why the specification allow you to provide a second more optimised >> version of the interpolation method using a method __interpolate__bootstrap__. > This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and > also too limited -- and also too advanced to be a language feature. Bootstraps > are way too complicated to expose in the source language in this way, > especially not this magically. And its too ad-hoc, since its specific to the > interpolation feature, whereas one could imagine a number of other contexts > where it is useful too. So this is a bad tradeoff in many ways. Jim's > implementation very cleverly gets the equivalent of this using pure library > implementation (which leans on MutableCallSite.) > While it is surely a desirable goal to be able to optimize formatter > implementation, it is also super-easy to become obsessed with this, and give it > a bigger place in the feature than it deserves. For some cases -- notably > String::format -- there are huge savings to be had (from a number of sources, > not least of which is that scanning the string at every invocation and choosing > a strategy based on that is expensive.) But in other cases, it is almost > irrelevant. For pure concatenation, it is already pretty fast; for SQL, the > cost of constructing the query is a tiny part of the execution time, so its not > even worth optimizing. So this is a "nice to have" rather than the centerpiece > of the feature. > To be clear, the centerpiece is the gathering up of a template + parameters so > that their combination can be handled by another entity, whether right now, > later, or never. 
Optimizing the case where it is done right now, using a > predictable choice of entity, is an optimization, but not the centerpiece. > Let me sketch out how we're envisioning this. The API is something like: > interface TemplatePolicy { > T apply(TemplatedString ts); > // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object > default MethodHandle asMethodHandle(TemplatedString ts) { > return MH[TemplatePolicy::apply] > } > } > The API specification has a number of constraints on the implementation of > asMethodHandle, which I'll get to in a second. When the compiler encounters an > immediate application P."...", it generates an indy, which uses a special > bootstrap that returns a MutableCallSite. The MutableCallSite initially has as > its target a special secondary bootstrap MH, which represents an interpolation > site that has not yet seen an actual invocation. The secondary bootstrap MH has > the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> > Object), so on first invocation it receives the TP object and the TS. It then > calls TP::asMethodHandle, and wraps this MH with a GWT which validates the > invariants and proceeds to that MH if they hold -- which they will 99.x% of the > time. > The invariant is that the dynamic type of the per-instantiation TP be == to the > dynamic type of the TP that was present at secondary linkage. That is, it be an > instance of the same class, but not the same instance. By definition, the > string will always be the same as will the types of the parameters, since this > is specific to concrete P."..." sites. So the MH can take advantage of that. > The constraint on TP::asMethodHandle is that it not undermine this invariant; > that if it generates a MH that is dependent on TP state, it not bake that state > into the resulting MH, but instead, treat the TP state as a parameter. Further, > the MH must be behaviorally equivalent to calling apply. 
> If the GWT fails, it means the user is doing something like:
>
>     for (TP p : listOfProcessors) {
>         blah blah p."foo \{a}"
>     }
>
> in which case the GWT falls back to the "just do an invokevirtual of TP::apply"
> strategy. (It could get fancier but I don't see any point.)
>
> This lets us rescue indy-based translation without exposing a magic indy-hook in
> the JLS. (Sorry, I know you wanted the magic indy hook.)
>
> On 10/13/2021 1:09 PM, Remi Forax wrote:
>> Hi everybody, I've spent some time thinking about how String interpolation +
>> Policy should be specified and implemented.
>>
>> The goal is to add a syntax specifying a user-defined method to "interpolate"
>> (for lack of a better word) a string with arguments.
>> Given that it's a method, the exact semantics of the interpolation, things like
>> how the arguments are escaped and how the formatted string is parsed, are
>> written in Java; this allows it to support a wide range of use cases.
>>
>> This proposal does not differ from the original proposal of Brian and Jim in its
>> goal, but in the way a user declares the interpolation method(s).
>>
>> TL;DR: you can declare an interpolation method and optionally an interpolation
>> bootstrap method if you want more efficient code at the price of having to
>> play with the method handle API.
>>
>> ---
>>
>> The proposal of Brian and Jim uses an interface to define the policy, but in this
>> case, using an interface is not what we want.
>> I think there are two main reasons:
>> - the interpolation method can be an instance method but can also be a factory
>>   method, a static method, and an interface cannot constrain a static method.
>> - we want the signature of the interpolation method to be free to use any number
>>   of parameters of any types, something that cannot be specified with type
>>   parameters in Java.
>>
>> So let's take a step back and write some examples. As a user of the
>> interpolation method, we want to
>> - be able to specify string interpolation;
>>   notice that this is a static method.
>>     String name = ...
>>     int age = ...
>>     String s = String."name: \(name) age: \(age)";
>> - we also want to be able to instantiate a regex Pattern,
>>   and have a magic optimisation that creates the Pattern instance only once
>>     Pattern pattern = Pattern."foo|bar";
>> - we also want to support instance methods, so the interpolation can escape the
>>   arguments differently depending on the context,
>>   here for example, escaping differently depending on the database driver.
>>     String username = ...
>>     Connection connection = ...
>>     connection."""
>>       SELECT * FROM users where user == "\(username)"
>>       """;
>>
>> I think the simplest way to specify an interpolation method is to have a method
>> with a special name;
>> I will use __interpolate__ because I don't want to discuss the exact syntax
>> here.
>> This method can be a static method or an instance method and has one restriction:
>> the first parameter has to be a String, because the first argument is the
>> formatted string.
>>
>> Here is an example of how the method __interpolate__ inside java.lang.String can
>> be written.
>> To avoid everybody re-implementing the parsing of the formatted string, the
>> class java.lang.runtime.InterpolateMetafactory provides a helper method
>> "formatIterator" that returns an iterator splitting the formatted string into
>> text and bindings.
>>
>> package java.lang;
>>
>> public class String {
>>   ...
>>   public static String __interpolate__(String format, Object... args) {
>>     var i = 0;
>>     var builder = new StringBuilder();
>>     var iterator = InterpolateMetafactory.formatIterator(format);
>>     while (iterator.hasNext()) {
>>       switch (iterator.next()) {
>>         case Text(var text) -> builder.append(text);
>>         // each binding consumes the next argument, in order
>>         case Binding binding -> builder.append(args[i++]);
>>       }
>>     }
>>     return builder.toString();
>>   }
>>   ...
>> }
>>
>> While this is nice, you may think that it's just syntactic sugar and it will not
>> be more performant than String.valueOf(), i.e. it will be slow.
>> That's why the specification allows you to provide a second, more optimised
>> version of the interpolation method using a method __interpolate__bootstrap__.
>> This method __interpolate__bootstrap__ is not required and cannot replace the
>> method __interpolate__: when it is declared, __interpolate__ has to be present
>> as well. It's a backward compatible change to add a method
>> __interpolate__bootstrap__ after the fact; there is no need to recompile
>> all the client code.
>>
>> For that, the compiler translation relies on invokedynamic to call the method
>> bootstrap of the class InterpolateMetafactory, which at runtime decides
>> to trampoline either to the method __interpolate__bootstrap__ or to the method
>> __interpolate__ if no __interpolate__bootstrap__ exists.
>>
>> Here is an example of how a call to the interpolation method of String is
>> generated by javac.
>> For the Java code
>>     String name = ...
>>     int age = ...
>>     String s = String."name: \(name) age: \(age)";
>> the equivalent bytecode is
>>     aload_1   // load name
>>     iload_2   // load age
>>     invokedynamic __interpolate__ (Ljava/lang/String;I)Ljava/lang/String;
>>       java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType,
>>       String, MethodHandle):CallSite
>>       [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String
>>       ]
>>
>> From the perspective of the compiler, the method __interpolate__ works exactly
>> like a method with a polymorphic signature (a method annotated with
>> @PolymorphicSignature),
>> so the descriptor of invokedynamic is created by collecting the types of the
>> arguments. Here the interpolation method is called with a String and an int,
>> and the return type is String, so the descriptor is
>> (Ljava/lang/String;I)Ljava/lang/String;
>>
>> Considering the interpolation method as a polymorphic method is important in
>> terms of performance because it means that no boxing will be done by the
>> compiler; if there is some boxing, it will be done by the runtime, so it is
>> optional if the __interpolate__bootstrap__ does not need to box arguments.
>>
>> You can also notice that the formatted string is passed as a bootstrap constant,
>> so all the parsing of the format can be done once, outside of the hot path.
>>
>> A call to invokedynamic also passes as a second bootstrap argument the method
>> handle to the method __interpolate__, so the implementation inside
>> InterpolateMetafactory.bootstrap can call this method if no method
>> __interpolate__bootstrap__ exists.
>>
>> Here is a raw implementation of the class InterpolateMetafactory.
>> The method formatIterator() returns an Iterator of Token, which is a sealed
>> interface.
>> The method bootstrap() first looks up a method "__interpolate__bootstrap__" in
>> the lookup class that takes a Lookup, a String, a MethodType, the format and
>> the default implementation, and calls it if it exists; otherwise it takes the
>> default implementation, binds the formatted String and adapts the arguments
>> using asType (asking for boxing, etc).
>>
>> package java.lang.runtime;
>>
>> public class InterpolateMetafactory {
>>   public sealed interface Token {
>>     public record Text(String text) implements Token {}
>>     public record Binding(String name) implements Token {}
>>   }
>>
>>   public static Iterator<Token> formatIterator(String format) {
>>     ...
>>   }
>>
>>   public static CallSite bootstrap(Lookup lookup, String name, MethodType
>>       methodType, String format, MethodHandle impl) throws Throwable {
>>     // check if there is a bootstrap method
>>     MethodHandle bootstrap;
>>     try {
>>       bootstrap = lookup.findStatic(lookup.lookupClass(),
>>           "__interpolate__bootstrap__", MethodType.methodType(CallSite.class,
>>           Lookup.class, String.class, MethodType.class, String.class,
>>           MethodHandle.class));
>>     } catch (NoSuchMethodException e) {
>>       // bind the default implementation; adapters such as bindTo() do not
>>       // preserve variable arity, so restore it before adapting with asType
>>       return new ConstantCallSite(impl.bindTo(format)
>>           .withVarargs(impl.isVarargsCollector()).asType(methodType));
>>     }
>>     return (CallSite) bootstrap.invoke(lookup, name, methodType, format, impl);
>>   }
>> }
>>
>> Here is another example, showing how to declare the methods __interpolate__ and
>> __interpolate__bootstrap__ inside java.util.regex.Pattern.
>> The "default" implementation calls Pattern.compile() and the optimized one
>> always returns the result of Pattern.compile() as a constant.
>>
>> package java.util.regex;
>>
>> public class Pattern {
>>   // the formatted string can not have arguments
>>   public static Pattern __interpolate__(String format) {
>>     return Pattern.compile(format);
>>   }
>>
>>   private static CallSite __interpolate__bootstrap__(Lookup lookup, String name,
>>       MethodType methodType, String format, MethodHandle impl) {
>>     return new ConstantCallSite(MethodHandles.constant(Pattern.class,
>>         Pattern.compile(format)));
>>   }
>> }
>>
>> The method __interpolate__ provides, via its signature, the parameter types that
>> are verified by the compiler.
>> It also provides code that can be used by tools that do static analysis
>> on the bytecode, because those tools cannot see through the method handle
>> returned by a bootstrap method: given that it's a runtime construct, it's
>> usually not available at the time the static analysis is done. This should be
>> enough for tools like GraalVM native image to see through the
>> invokedynamic in a similar way to how they see through the invokedynamic used
>> when creating a lambda.
>>
>> The fact that all invokedynamic calls go through the method
>> InterpolateMetafactory.bootstrap and trampoline from it means that adding or
>> removing the method __interpolate__bootstrap__ is a binary compatible change,
>> if __interpolate__bootstrap__ is declared private. So implementing
>> __interpolate__bootstrap__ can be an afterthought.
>>
>> regards,
>> Rémi

From forax at univ-mlv.fr Wed Oct 13 22:28:05 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 14 Oct 2021 00:28:05 +0200 (CEST)
Subject: String Interpolation
In-Reply-To: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>
 <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
Message-ID: <1089189662.1720947.1634164085994.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "amber-spec-experts"
> Sent: Wednesday 13 October 2021 21:32:19
> Subject: Re: String Interpolation

> The ability to capture per-call-site computation so it could be done exactly
> once (including generating an MH to describe it) has been part of the goal all
> along. The JEP is deliberately cagey about this because we didn't want to
> descend down the translation rabbit hole before we'd achieved consensus on the
> broad strokes, any more than we wanted to descend down the syntax rabbit hole.
> (FWIW, all of these side-paths were ones we already traveled and rejected for
> various reasons :)

> As you correctly point out, without something like type classes, associating a
> static method like a bootstrap with a class requires committing some sort of
> sin, such as the "magic names" sins committed by serialization.

The current syntax is something like
  String."a text"

There is no method name, so we have basically two choices. Either we make the
syntax more like a method call, which is what Scala does,
  String.method"a text"

or we specify what I would call a protocol. A protocol is like a method call
but enhanced with an ad hoc syntax and constraints. For example, the constructor
in Java is a protocol: we name the method with the same name as the class and
do not specify a return type and, magically, it becomes a constructor with its
own set of rules.
Soon we will introduce user-defined pattern methods; this also needs a
protocol, and the current proposal for them is to use a special modifier like
destructor or pattern.

If you prefer to use a special modifier, I'm fine with that; if you think it's
better to change the use site to specify a method name, I'm fine with that too.

> We surely didn't want to do that either.

>> - we also want to be able to instantiate a regex Pattern,
>> and have a magic optimisation that creates the Pattern instance only once
>> Pattern pattern = Pattern."foo|bar";

> You said the magic anti-word, which is "magic". We don't want this to be magic.
> (Examples like this are better treated as a form of optimistic constant
> folding, along the lines explored at my JVMLS talk a few years ago.)

> Summary: wait for constant folding.

I don't like constant folding for several reasons: it's one size fits all, you
cannot specify in the code how you transform the format before it is constant
folded, and the dumber the Java compiler is, the better.
Constant folding is the kind of feature that tends to interact with all other
features (recent example: case Foo foo vs case Foo foo && true vs
case Foo foo && 2 == 2).

>> I think the simplest way to specify an interpolation method is to have a method
>> with a special name;
>> I will use __interpolate__ because I don't want to discuss the exact syntax
>> here.

> This is committing the same "magic name" sin as serialization. We deliberately
> avoided this in the design. When we have type classes, we'll be able to use
> that as a way to bridge from a type name to a witness to a particular class.
> Our design was crafted so that it could be gracefully extended to such a
> mechanism, when it is available (using a type name instead of an instance
> reference at the use site.)

> Summary: wait for type classes.

Adding type classes may solve how to specify a contract on a static method, but
it does not solve the fact that you want the signature of the method (static or
not) to be polymorphic.

>> That's why the specification allows you to provide a second, more optimised
>> version of the interpolation method using a method __interpolate__bootstrap__.

> This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and
> also too limited -- and also too advanced to be a language feature. Bootstraps
> are way too complicated to expose in the source language in this way,
> especially not this magically. And it's too ad-hoc, since it's specific to the
> interpolation feature, whereas one could imagine a number of other contexts
> where it is useful too. So this is a bad tradeoff in many ways. Jim's
> implementation very cleverly gets the equivalent of this using pure library
> implementation (which leans on MutableCallSite.)

Using a MutableCallSite as a way to devirtualize something you have arbitrarily
specified as virtual is the tail wagging the dog.
I've written a library [1] that uses MutableCallSite where it should use
ConstantCallSite, to bypass the inability of javac to generate an
invokedynamic. But at least, I always felt guilty about it.
Adding mutable call sites to the runtime of Java is a mistake; the performance
model is really tricky.

> While it is surely a desirable goal to be able to optimize formatter
> implementation, it is also super-easy to become obsessed with this, and give it
> a bigger place in the feature than it deserves. For some cases -- notably
> String::format -- there are huge savings to be had (from a number of sources,
> not least of which is that scanning the string at every invocation and choosing
> a strategy based on that is expensive.) But in other cases, it is almost
> irrelevant. For pure concatenation, it is already pretty fast; for SQL, the
> cost of constructing the query is a tiny part of the execution time, so it's not
> even worth optimizing. So this is a "nice to have" rather than the centerpiece
> of the feature.

> To be clear, the centerpiece is the gathering up of a template + parameters so
> that their combination can be handled by another entity, whether right now,
> later, or never. Optimizing the case where it is done right now, using a
> predictable choice of entity, is an optimization, but not the centerpiece.

> Let me sketch out how we're envisioning this. The API is something like:

>     interface TemplatePolicy<T> {
>         T apply(TemplatedString ts);
>
>         // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object
>         default MethodHandle asMethodHandle(TemplatedString ts) {
>             return MH[TemplatePolicy::apply]
>         }
>     }

Let's not talk about the bootstrap method for a second.
This API fails to indicate to the compiler the types of the parameters that are
allowed before calling the template; for example, I may want to specify a query
as a String but with only expressions of type Expression as arguments.
This API forces the implementation to be ready to accept any arguments (and
those have to be boxed).

And you are re-inventing a strawman way to implement JSR 292; your design is
actually quite close to the early designs of Gilad Bracha.
At some point, you will want:
- to share the same TemplatePolicy without doing inheritance: you will
  re-invent the Lookup object
- to avoid the unnecessary boxing: you will pass the MethodType as a parameter
- to avoid having a PIC (Polymorphic Inlining Cache) for things as simple as
  always returning the same constant: you will make the API a function call,
  not a method call
- to avoid waiting until all arguments are on the stack, and to segregate
  dynamic arguments from constant arguments: you will re-invent the bootstrap
  API

The reason you will gravitate toward the bootstrap API is that, fundamentally,
it's a way to specify a linker in Java code, which is what you want here.

> The API specification has a number of constraints on the implementation of
> asMethodHandle, which I'll get to in a second. When the compiler encounters an
> immediate application P."...", it generates an indy, which uses a special
> bootstrap that returns a MutableCallSite. The MutableCallSite initially has as
> its target a special secondary bootstrap MH, which represents an interpolation
> site that has not yet seen an actual invocation. The secondary bootstrap MH has
> the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) ->
> Object), so on first invocation it receives the TP object and the TS. It then
> calls TP::asMethodHandle, and wraps this MH with a GWT which validates the
> invariants and proceeds to that MH if they hold -- which they will 99.x% of the
> time.

> The invariant is that the dynamic type of the per-instantiation TP be == to the
> dynamic type of the TP that was present at secondary linkage. That is, it be an
> instance of the same class, but not the same instance. By definition, the
> string will always be the same as will the types of the parameters, since this
> is specific to concrete P."..." sites. So the MH can take advantage of that.

> The constraint on TP::asMethodHandle is that it not undermine this invariant;
> that if it generates a MH that is dependent on TP state, it not bake that state
> into the resulting MH, but instead, treat the TP state as a parameter. Further,
> the MH must be behaviorally equivalent to calling apply.

> If the GWT fails, it means the user is doing something like:

>     for (TP p : listOfProcessors) {
>         blah blah p."foo \{a}"
>     }

> in which case the GWT falls back to the "just do an invokevirtual of TP::apply"
> strategy. (It could get fancier but I don't see any point.)

> This lets us rescue indy-based translation without exposing a magic indy-hook in
> the JLS. (Sorry, I know you wanted the magic indy hook.)

The issue is not about me asking you to add a magic hook: once you have an API
that returns a MethodHandle used by an invokedynamic, you are providing a magic
hook.
The issue is that, in your attempt not to provide a magic hook, you are
providing a Smalltalk-like magic hook, where all type information is lost (no
typechecking by the compiler, no way to get the type information at runtime to
avoid boxing) and with a crappy performance model (boxing again + a PIC for
devirtualizing something which should not be virtual).

Rémi

[1] https://github.com/forax/exotic

> On 10/13/2021 1:09 PM, Remi Forax wrote:
>> Hi everybody, I've spent some time thinking about how String interpolation +
>> Policy should be specified and implemented.
>>
>> The goal is to add a syntax specifying a user-defined method to "interpolate"
>> (for lack of a better word) a string with arguments.
>> [...]
>> regards, >> R?mi From forax at univ-mlv.fr Fri Oct 15 17:09:57 2021 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Fri, 15 Oct 2021 19:09:57 +0200 (CEST) Subject: String Interpolation In-Reply-To: <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com> References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr> <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com> Message-ID: <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" , "amber-spec-experts" > > Sent: Mercredi 13 Octobre 2021 21:32:19 > Subject: Re: String Interpolation After grumbling a lot, let's restart [...] >> That's why the specification allow you to provide a second more optimised >> version of the interpolation method using a method __interpolate__bootstrap__. > This is an obviously attractive goal, but the mechanism is way too ad-hoc -- and > also too limited -- and also too advanced to be a language feature. Bootstraps > are way too complicated to expose in the source language in this way, > especially not this magically. And its too ad-hoc, since its specific to the > interpolation feature, whereas one could imagine a number of other contexts > where it is useful too. So this is a bad tradeoff in many ways. Jim's > implementation very cleverly gets the equivalent of this using pure library > implementation (which leans on MutableCallSite.) > While it is surely a desirable goal to be able to optimize formatter > implementation, it is also super-easy to become obsessed with this, and give it > a bigger place in the feature than it deserves. For some cases -- notably > String::format -- there are huge savings to be had (from a number of sources, > not least of which is that scanning the string at every invocation and choosing > a strategy based on that is expensive.) But in other cases, it is almost > irrelevant. 
For pure concatenation, it is already pretty fast; for SQL, the > cost of constructing the query is a tiny part of the execution time, so its not > even worth optimizing. So this is a "nice to have" rather than the centerpiece > of the feature. > To be clear, the centerpiece is the gathering up of a template + parameters so > that their combination can be handled by another entity, whether right now, > later, or never. Optimizing the case where it is done right now, using a > predictable choice of entity, is an optimization, but not the centerpiece. > Let me sketch out how we're envisioning this. The API is something like: > interface TemplatePolicy { > T apply(TemplatedString ts); > // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object > default MethodHandle asMethodHandle(TemplatedString ts) { > return MH[TemplatePolicy::apply] > } > } I don't understand where you pass the arguments, is it not more something like public interface TemplatePolicy< T , E extends Exception> { T apply(TemplatedString template, Object... args) throws E ; // returns a MethodHandle with the signature T(TemplatePolicy, Object...) default MethodHandle asMethodHandle(TemplatedString template, MethodType type) { ... } } The second parameter of asMethodHandle is the descriptor of invokedynamic, this ensure that there is no boxing on the fast path, and if the implementation of TemplatePolicy is a final class. > The API specification has a number of constraints on the implementation of > asMethodHandle, which I'll get to in a second. When the compiler encounters an > immediate application P."...", it generates an indy, which uses a special > bootstrap that returns a MutableCallSite. The MutableCallSite initially has as > its target a special secondary bootstrap MH, which represents an interpolation > site that has not yet seen an actual invocation. 
The secondary bootstrap MH has > the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> > Object), so on first invocation it receives the TP object and the TS. It then > calls TP::asMethodHandle, and wraps this MH with a GWT which validates the > invariants and proceeds to that MH if they hold -- which they will 99.x% of the > time. > The invariant is that the dynamic type of the per-instantiation TP be == to the > dynamic type of the TP that was present at secondary linkage. That is, it be an > instance of the same class, but not the same instance. By definition, the > string will always be the same as will the types of the parameters, since this > is specific to concrete P."..." sites. So the MH can take advantage of that. > The constraint on TP::asMethodHandle is that it not undermine this invariant; > that if it generates a MH that is dependent on TP state, it not bake that state > into the resulting MH, but instead, treat the TP state as a parameter. Further, > the MH must be behaviorally equivalent to calling apply. > If the GWT fails, it means the user is doing something like: > for (TP p : listOfProcessors) { > blah blah p."foo \{a}" > } > in which case the GWT falls back to the "just do an invokevirtual of TP::apply" > strategy. (It could get fancier but I don't see any point.) > This lets us rescue indy-based translation without exposing a magic indy-hook in > the JLS. (Sorry, I know you wanted the magic indy hook.) As i said, i don't care about having the exact bootstrap API, but i care about the unnecessary boxing / class check / etc that can occur. I believe that if asMethodHandle() takes a MethodType as second parameter, performance should be Ok. Is it something that can be negotiated ? I've implemented a prototype to convince myself that with a MethodType as parameter is was not actually that bad. 
https://github.com/forax/java-interpolation

(I also suppose that the TemplatedString is created with a constant dynamic?)

Rémi

> On 10/13/2021 1:09 PM, Remi Forax wrote:
>> Hi everybody, I've spent some time thinking about how the String interpolation +
>> Policy should be specified and implemented.

>> The goal is to add a syntax specifying a user defined method to "interpolate"
>> (for lack of a better word) a string with arguments.
>> Given that it's a method, the exact semantics of the interpolation, things like
>> how the arguments are escaped and how the formatted string is parsed, are
>> written in Java; this will allow us to support a wide range of use cases.

>> This proposal does not differ from the original proposal of Brian and Jim in its
>> goal but in the way a user declares the interpolation method(s).

>> TLDR; you can declare an interpolation method and optionally an interpolation
>> bootstrap method if you want more efficient code at the price of having to
>> play with the method handle API.

>> ---

>> The proposal of Brian and Jim uses an interface to define the policy but in this
>> case, using an interface is not what we want.
>> I think there are two main reasons,
>> - the interpolation method can be an instance method but can also be a factory
>>   method, a static method, and an interface cannot constrain a static method.
>> - we want the signature of the interpolation method to be free to use any number
>>   of parameters of any types, something that cannot be specified with type
>>   parameters in Java.

>> So let's take a step back and write some examples. As a user of the
>> interpolation method, we want to
>> - be able to specify string interpolation,
>>   you can notice that this is a static method.
>>     String name = ...
>>     int age = ...
>>     String s = String."name: \(name) age: \(age)";
>> - we also want to be able to instantiate a regex Pattern,
>>   and have a magic optimisation that creates the Pattern instance only once
>>     Pattern pattern = Pattern."foo|bar";
>> - we also want to support instance methods, so the interpolation can escape the
>>   arguments differently depending on the context,
>>   here for example, escaping differently depending on the database driver.
>>     String username = ...
>>     Connection connection = ...
>>     connection."""
>>       SELECT * FROM users where user == "\(username)"
>>       """;

>> I think the simplest way to specify an interpolation method is to have a method
>> with a special name,
>> I will use __interpolate__ because I don't want to discuss the exact syntax
>> here.

>> This method can be a static method or an instance method and has a restriction,
>> the first parameter has to be a String because the first argument is the
>> formatted string.

>> Here is an example of how the method __interpolate__ inside java.lang.String can
>> be written.
>> To spare everybody from re-implementing the parsing of the formatted string, the
>> class java.lang.runtime.InterpolateMetafactory provides a helper method
>> "formatIterator" that returns an iterator splitting the formatted string into
>> text and binding.

>> package java.lang;
>> public class String {
>>   ...
>>   public static String __interpolate__(String format, Object... args) {
>>     var i = 0;
>>     var builder = new StringBuilder();
>>     var iterator = InterpolateMetafactory.formatIterator(format);
>>     while(iterator.hasNext()) {
>>       switch(iterator.next()) {
>>         case Text(var text) -> builder.append(text);
>>         case Binding binding -> builder.append(args[i++]);
>>       }
>>     }
>>     return builder.toString();
>>   }
>>   ...
>> }

>> While this is nice, you may think that it's just syntactic sugar and it will not
>> be more performant than String.valueOf(), i.e. it will be slow.
>> That's why the specification allows you to provide a second, more optimised
>> version of the interpolation method using a method __interpolate__bootstrap__.
>> The method __interpolate__bootstrap__ is optional but cannot replace the
>> method __interpolate__: when __interpolate__bootstrap__ is declared,
>> __interpolate__ has to be present too. Adding a method
>> __interpolate__bootstrap__ after the fact is a backward compatible change;
>> there is no need to recompile the client code.

>> For that, the compiler translation relies on invokedynamic to call the method
>> bootstrap of the class InterpolateMetafactory, which at runtime decides
>> to trampoline either to the method __interpolate__bootstrap__ or to the method
>> __interpolate__ if no __interpolate__bootstrap__ exists.

>> Here is an example of how a call to the interpolation method of String is
>> generated by javac.
>> For the Java code
>>   String name = ...
>>   int age = ...
>>   String s = String."name: \(name) age: \(age)";
>> the equivalent bytecode is
>>   aload_1   // load name
>>   iload_2   // load age
>>   invokedynamic __interpolate__ (Ljava/lang/String;I)Ljava/lang/String;
>>     java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType,
>>       String, MethodHandle):CallSite
>>     [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String ]

>> From the perspective of the compiler, the method __interpolate__ works exactly
>> like a method with a polymorphic method signature (a method annotated with
>> @PolymorphicSignature),
>> so the descriptor of invokedynamic is created by collecting the types of the
>> arguments. Here the interpolation method is called with a String and an int,
>> and the return type is String, so the descriptor is
>>   (Ljava/lang/String;I)Ljava/lang/String;

>> Considering the interpolation method as a polymorphic method is important in
>> terms of performance because it means that no boxing will be done by the
>> compiler; if there is some boxing, it will be done by the runtime, so it is
>> optional if the __interpolate__bootstrap__ does not need to box arguments.

>> You can also notice that the formatted string is passed as a bootstrap constant
>> so all the parsing of the format can be done once, outside of the hot path.

>> A call to invokedynamic also passes as a second bootstrap argument the method
>> handle to the method __interpolate__, so the implementation inside
>> InterpolateMetafactory.bootstrap can call this method if no method
>> __interpolate__bootstrap__ exists.

>> Here is a raw implementation of the class InterpolateMetafactory.
>> The method formatIterator() returns an Iterator of Token, which is a sealed
>> interface.
>> The method bootstrap() first looks up a method "__interpolate__bootstrap__" in
>> the lookup class that takes a Lookup, a String, a MethodType, the format and
>> the default implementation, and calls it if it exists; otherwise it takes the
>> default implementation, binds the formatted String and adapts the arguments
>> using asType (asking for boxing, etc).
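A minimal sketch of the fallback branch described above, for readers who want to try it in isolation. It assumes a hypothetical FallbackLinkSketch.interpolate as the default implementation and a deliberately simplistic scanner for the \(...) holes; neither is part of the proposal. One detail worth noting: method handle transforms such as bindTo return fixed-arity handles, so the sketch re-collects the trailing Object[] with asCollector before asType adapts (and boxes) the arguments to the call-site descriptor.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class FallbackLinkSketch {
    /** Hypothetical default implementation: each "\(...)" hole takes the next argument. */
    static String interpolate(String format, Object... args) {
        StringBuilder builder = new StringBuilder();
        int arg = 0;
        int pos = 0;
        while (pos < format.length()) {
            int open = format.indexOf("\\(", pos);
            int close = open < 0 ? -1 : format.indexOf(')', open);
            if (close < 0) { // no further hole: copy the rest as plain text
                builder.append(format, pos, format.length());
                break;
            }
            builder.append(format, pos, open).append(args[arg++]);
            pos = close + 1;
        }
        return builder.toString();
    }

    /** Fallback branch: bind the format, then adapt to the call-site descriptor. */
    static CallSite link(String format, MethodType descriptor) {
        try {
            MethodHandle impl = MethodHandles.lookup().findStatic(
                    FallbackLinkSketch.class, "interpolate",
                    MethodType.methodType(String.class, String.class, Object[].class));
            MethodHandle target = impl.bindTo(format)               // (Object[])String
                    .asCollector(Object[].class, descriptor.parameterCount())
                    .asType(descriptor);                            // boxing happens here
            return new ConstantCallSite(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Links a (String, int) -> String site and calls it without boxing at the caller. */
    static String demo() {
        try {
            CallSite site = link("name: \\(name) age: \\(age)",
                    MethodType.methodType(String.class, String.class, int.class));
            return (String) site.dynamicInvoker().invokeExact("Ana", 42);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // name: Ana age: 42
    }
}
```

The caller passes an int directly; the boxing to Integer is inserted by asType inside the linked handle, which is the "boxing done by the runtime" case mentioned above.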
>> package java.lang.runtime;
>> public class InterpolateMetafactory {
>>   public sealed interface Token {
>>     public record Text(String text) implements Token {}
>>     public record Binding(String name) implements Token {}
>>   }
>>   public static Iterator<Token> formatIterator(String format) {
>>     ...
>>   }
>>   public static CallSite bootstrap(Lookup lookup, String name, MethodType methodType,
>>       String format, MethodHandle impl) throws Throwable {
>>     // check if there is a bootstrap method
>>     MethodHandle bootstrap;
>>     try {
>>       bootstrap = lookup.findStatic(lookup.lookupClass(), "__interpolate__bootstrap__",
>>           MethodType.methodType(CallSite.class, Lookup.class, String.class,
>>               MethodType.class, String.class, MethodHandle.class));
>>     } catch(NoSuchMethodException e) {
>>       // bind the default implementation; bindTo drops variable arity, so restore it
>>       return new ConstantCallSite(impl.bindTo(format)
>>           .withVarargs(impl.isVarargsCollector()).asType(methodType));
>>     }
>>     return (CallSite) bootstrap.invoke(lookup, name, methodType, format, impl);
>>   }
>> }

>> Here is another example, showing how to declare the methods __interpolate__ and
>> __interpolate__bootstrap__ inside java.util.regex.Pattern.
>> The "default" implementation calls Pattern.compile() and the optimized one
>> always returns the result of Pattern.compile() as a constant.

>> package java.util.regex;
>> public class Pattern {
>>   // the formatted string cannot have arguments
>>   public static Pattern __interpolate__(String format) {
>>     return Pattern.compile(format);
>>   }
>>   private static CallSite __interpolate__bootstrap__(Lookup lookup, String name,
>>       MethodType methodType, String format, MethodHandle impl) {
>>     return new ConstantCallSite(MethodHandles.constant(Pattern.class,
>>         Pattern.compile(format)));
>>   }
>> }

>> The method __interpolate__ provides, via its signature, the parameter types that
>> are verified by the compiler.
>> It also provides code that can be used by tools that do static analysis
>> on the bytecode, because those tools cannot see through the method handle
>> returned by a bootstrap method: given that it's a runtime construct, it's
>> usually not available at the time the static analysis is done. This should be
>> enough to let tools like GraalVM native image see through the
>> invokedynamic in a similar way to how they see through the invokedynamic used
>> when creating a lambda.

>> The fact that all invokedynamic calls go through the method
>> InterpolateMetafactory.bootstrap and trampoline from it means that adding or
>> removing the method __interpolate__bootstrap__ is a binary compatible change,
>> if __interpolate__bootstrap__ is declared private. So implementing
>> __interpolate__bootstrap__ can be an afterthought.

>> regards,
>> Rémi

From james.laskey at oracle.com Fri Oct 15 18:34:58 2021
From: james.laskey at oracle.com (Jim Laskey)
Date: Fri, 15 Oct 2021 18:34:58 +0000
Subject: String Interpolation
In-Reply-To: <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>
 <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
 <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr>
Message-ID: <650E55D3-972C-480D-9D58-EA4ABA297EDD@oracle.com>

Yes, the methodology we have chosen does avoid boxing (and varargs). We don't
need parameter types because those types are accessible from the
TemplatedString implementation.

So we only really need the user-chosen return type. But to be honest, we don't
even need that (the current prototype doesn't have the argument) because of
template erasure. T is just Object from the bootstrap perspective and the
policy can glean the return type elsewhere, if necessary, for MethodHandle
construction.

To allay any fears of performance, FMT."%s\{a} + %s\{b} = %s\{a + b}" is as fast
as a + " + " + b + " = " + (a + b).
-- Jim

From forax at univ-mlv.fr Fri Oct 15 20:26:16 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 15 Oct 2021 22:26:16 +0200 (CEST)
Subject: String Interpolation
In-Reply-To: <650E55D3-972C-480D-9D58-EA4ABA297EDD@oracle.com>
References: <583208882.1665266.1634144995832.JavaMail.zimbra@u-pem.fr>
 <6bb17702-969d-a749-e438-c920d07ab4a4@oracle.com>
 <1348841301.2745456.1634317797747.JavaMail.zimbra@u-pem.fr>
 <650E55D3-972C-480D-9D58-EA4ABA297EDD@oracle.com>
Message-ID: <1270469810.2777441.1634329576804.JavaMail.zimbra@u-pem.fr>

> From: "Jim Laskey"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Friday, 15 October 2021 20:34:58
> Subject: Re: String Interpolation

> Yes, the methodology we have chosen does avoid boxing (and vararg). We don't
> need parameter types because those types are accessible from the
> TemplatedString implementation.

You're losing two important types if you are using only the types of the
bindings,
- the type of the implementation of the TemplatePolicy (which is important
  because you can check if the implementation type is final or not); if it's
  final, no guard is needed (the JIT can overcome that, but you have created
  more method handles than necessary and you have to wait for C2 to kick in).
- the return type

> So we only really need the user chosen return type. But to be honest, we don't
> even need that (the current prototype doesn't have the argument) because of
> template erasure. T is just Object from the bootstrap perspective and the
> policy can glean the return type elsewhere, if necessary, for MethodHandle
> construction.

For the return type, one question is why the TemplatePolicy is parameterized by
the return type. It forces boxing and discards the contextual type that comes
from the call site.

Losing the inferred return type is a big issue, because you are losing the
ability to write policies that move values from the untyped world to the typed
world.
For example, let's say you have a policy that works like Map.get(); currently
you cannot write

    int value1 = policy."\(key1)";
    String value2 = policy."\(key2)";

because the closest you can come is Object.

This is especially important when you start to throw patterns into the mix,

    switch(new JSONPolicy(jsonText)) {
      case """
        {
          key1: \(int value1),
          key2: \(String value2)
        }
        """ -> { /* value1 is an int and value2 is a String */ }
    }

> To allay any fears of performance, FMT."%s\{a} + %s\{b} = %s\{a + b}" is as fast
> as a + " + " + b + " = " + (a + b).

because this is a special case where, for any format, the return type is fixed.

> -- Jim

Rémi
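
To make the point about the return type concrete, here is a small sketch that adapts Map::get to two different call-site descriptors with asType. It only shows that a MethodType carries enough information for the linked handle to unbox or cast the result; ReturnTypeSketch and its methods are hypothetical names and are not part of either proposal.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

final class ReturnTypeSketch {
    /** Adapts Map::get, typed (Map, Object) -> Object, to the requested descriptor. */
    static MethodHandle lookupAs(MethodType descriptor) {
        try {
            MethodHandle get = MethodHandles.lookup().findVirtual(Map.class, "get",
                    MethodType.methodType(Object.class, Object.class));
            // asType inserts the unboxing or the cast on the return value.
            return get.asType(descriptor);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Call site whose descriptor asks for an int result. */
    static int intValue(Map<String, Object> values, String key) {
        try {
            MethodHandle mh = lookupAs(
                    MethodType.methodType(int.class, Map.class, String.class));
            return (int) mh.invokeExact(values, key);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** Call site whose descriptor asks for a String result. */
    static String stringValue(Map<String, Object> values, String key) {
        try {
            MethodHandle mh = lookupAs(
                    MethodType.methodType(String.class, Map.class, String.class));
            return (String) mh.invokeExact(values, key);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> values = new HashMap<>();
        values.put("key1", 42);
        values.put("key2", "hello");
        int value1 = intValue(values, "key1");
        String value2 = stringValue(values, "key2");
        System.out.println(value1 + " " + value2);
    }
}
```

With an interface method that returns T erased to Object, the caller would have to cast or unbox by hand; here the conversion is driven by the descriptor, and a value of the wrong runtime type fails with a ClassCastException inside the handle.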
Bootstraps >>> are way too complicated to expose in the source language in this way, >>> especially not this magically. And its too ad-hoc, since its specific to the >>> interpolation feature, whereas one could imagine a number of other contexts >>> where it is useful too. So this is a bad tradeoff in many ways. Jim's >>> implementation very cleverly gets the equivalent of this using pure library >>> implementation (which leans on MutableCallSite.) >>> While it is surely a desirable goal to be able to optimize formatter >>> implementation, it is also super-easy to become obsessed with this, and give it >>> a bigger place in the feature than it deserves. For some cases -- notably >>> String::format -- there are huge savings to be had (from a number of sources, >>> not least of which is that scanning the string at every invocation and choosing >>> a strategy based on that is expensive.) But in other cases, it is almost >>> irrelevant. For pure concatenation, it is already pretty fast; for SQL, the >>> cost of constructing the query is a tiny part of the execution time, so its not >>> even worth optimizing. So this is a "nice to have" rather than the centerpiece >>> of the feature. >>> To be clear, the centerpiece is the gathering up of a template + parameters so >>> that their combination can be handled by another entity, whether right now, >>> later, or never. Optimizing the case where it is done right now, using a >>> predictable choice of entity, is an optimization, but not the centerpiece. >>> Let me sketch out how we're envisioning this. 
The API is something like: >>> interface TemplatePolicy { >>> T apply(TemplatedString ts); >>> // returns MethodHandle (TemplatePolicy, TemplatedString) -> Object >>> default MethodHandle asMethodHandle(TemplatedString ts) { >>> return MH[TemplatePolicy::apply] >>> } >>> } >> I don't understand where you pass the arguments, is it not more something like >> public interface TemplatePolicy< T , E extends Exception> { >> T apply(TemplatedString template, Object... args) throws E ; >> // returns a MethodHandle with the signature T(TemplatePolicy, Object...) >> default MethodHandle asMethodHandle(TemplatedString template, MethodType type) { >> ... >> } >> } >> The second parameter of asMethodHandle is the descriptor of invokedynamic, this >> ensure that there is no boxing on the fast path, and if the implementation of >> TemplatePolicy is a final class. >>> The API specification has a number of constraints on the implementation of >>> asMethodHandle, which I'll get to in a second. When the compiler encounters an >>> immediate application P."...", it generates an indy, which uses a special >>> bootstrap that returns a MutableCallSite. The MutableCallSite initially has as >>> its target a special secondary bootstrap MH, which represents an interpolation >>> site that has not yet seen an actual invocation. The secondary bootstrap MH has >>> the shape of TemplatePolicy::apply (e.g., (TemplatePolicy, TemplatedString) -> >>> Object), so on first invocation it receives the TP object and the TS. It then >>> calls TP::asMethodHandle, and wraps this MH with a GWT which validates the >>> invariants and proceeds to that MH if they hold -- which they will 99.x% of the >>> time. >>> The invariant is that the dynamic type of the per-instantiation TP be == to the >>> dynamic type of the TP that was present at secondary linkage. That is, it be an >>> instance of the same class, but not the same instance. 
By definition, the >>> string will always be the same as will the types of the parameters, since this >>> is specific to concrete P."..." sites. So the MH can take advantage of that. >>> The constraint on TP::asMethodHandle is that it not undermine this invariant; >>> that if it generates a MH that is dependent on TP state, it not bake that state >>> into the resulting MH, but instead, treat the TP state as a parameter. Further, >>> the MH must be behaviorally equivalent to calling apply. >>> If the GWT fails, it means the user is doing something like: >>> for (TP p : listOfProcessors) { >>> blah blah p."foo \{a}" >>> } >>> in which case the GWT falls back to the "just do an invokevirtual of TP::apply" >>> strategy. (It could get fancier but I don't see any point.) >>> This lets us rescue indy-based translation without exposing a magic indy-hook in >>> the JLS. (Sorry, I know you wanted the magic indy hook.) >> As i said, i don't care about having the exact bootstrap API, but i care about >> the unnecessary boxing / class check / etc that can occur. >> I believe that if asMethodHandle() takes a MethodType as second parameter, >> performance should be Ok. >> Is it something that can be negotiated ? >> I've implemented a prototype to convince myself that with a MethodType as >> parameter is was not actually that bad. >> [ https://github.com/forax/java-interpolation | >> https://github.com/forax/java-interpolation ] >> (I also suppose that the TemplatedString is created with a constant dynamic ?) >> R?mi >>> On 10/13/2021 1:09 PM, Remi Forax wrote: >>>> Hi everybody, i've spend some time to think how the String interpolation + >>>> Policy should be specified and implemented. >>>> The goal is to add a syntax specifying a user defined method to "interpolate" >>>> (for a lack of better word) a string with arguments. 
>>>> Given that it's a method, the exact semantics of the interpolation, things like >>>> how the arguments are escaped, how the formatted string is parsed, is written >>>> is Java, this will allow to support a wide range of use cases. >>>> This proposal does not differ from the original proposal of Brian and Jim in its >>>> goal but in the way a user declare the interpolation method(s). >>>> TLDR; you can declare an interpolation method and optionally an interpolation >>>> bootstrap method if you want a more efficient code at the price of having to >>>> play with the method handle API. >>>> --- >>>> The proposal of Brian and Jim uses an interface to define the policy but in this >>>> case, using an interface is not what we want. >>>> I think there are two main reasons, >>>> - the interpolation method can be an instance method but can also be a factory >>>> method, a static method, and an interface can not constraint a static method. >>>> - we want the signature of the interpolation method to be free to use any number >>>> of parameters of any types, something that can not be specified with type >>>> parameters in Java. >>>> So let's take a step back and write some examples, as a user of the >>>> interpolation method, we want to >>>> - be able to specify string interpolation, >>>> you can notice that this is a static method. >>>> String name = ... >>>> int value = ... >>>> String s = String."name: \(name) age: \(age)"; >>>> - we also want to be able to instantiate regex Pattern, >>>> and have a magic optimisation that creates the Pattern instance only one >>>> Pattern pattern = Pattern."foo|bar"; >>>> - we also want to support instance method, so the interpolation can escape the >>>> arguments differently depending on the context, >>>> here by example, escaping differently depending on the database driver. >>>> String username = ... >>>> Connection connection = ... 
>>>> connection."""
>>>>   SELECT * FROM users where user == "\(username)"
>>>>   """;
>>>> I think the simplest way to specify an interpolation method is to have a method
>>>> with a special name,
>>>> i will use __interpolate__ because i don't want to discuss the exact syntax
>>>> here.
>>>> This method can be a static method or an instance method and has a restriction,
>>>> the first parameter has to be a String because the first argument is the
>>>> formatted string.
>>>> Here is an example of how the method __interpolate__ inside java.lang.String can
>>>> be written.
>>>> So that not everybody has to re-implement the parsing of the formatted string, the
>>>> class java.lang.runtime.InterpolateMetafactory provides a helper method
>>>> "formatIterator" that returns an iterator splitting the formatted string into
>>>> text and binding.
>>>> package java.lang;
>>>> public class String {
>>>>   ...
>>>>   public static String __interpolate__(String format, Object... args) {
>>>>     var i = 0;
>>>>     var builder = new StringBuilder();
>>>>     var iterator = InterpolateMetafactory.formatIterator(format);
>>>>     while(iterator.hasNext()) {
>>>>       switch(iterator.next()) {
>>>>         case Text(var text) -> builder.append(text);
>>>>         // each binding consumes the next argument, in order
>>>>         case Binding binding -> builder.append(args[i++]);
>>>>       }
>>>>     }
>>>>     return builder.toString();
>>>>   }
>>>>   ...
>>>> }
>>>> While this is nice, you may think that it's just syntactic sugar and it will not
>>>> be more performant than String.valueOf(), i.e. it will be slow.
>>>> That's why the specification allows you to provide a second more optimised
>>>> version of the interpolation method using a method __interpolate__bootstrap__.
>>>> This method __interpolate__bootstrap__ is optional and can not replace the
>>>> method __interpolate__: when it is declared, __interpolate__ has to be present
>>>> too. It's a backward compatible change to add a method
>>>> __interpolate__bootstrap__ after the fact, there is no need to recompile
>>>> all the client code.
>>>> For that the compiler translation relies on invokedynamic to call the method
>>>> bootstrap of the class InterpolateMetafactory that at runtime decides
>>>> to trampoline either to the method __interpolate__bootstrap__ or to the method
>>>> __interpolate__ if no __interpolate__bootstrap__ exists.
>>>> Here is an example of how a call to the interpolation method of String is
>>>> generated by javac
>>>> For the Java code
>>>>   String name = ...
>>>>   int age = ...
>>>>   String s = String."name: \(name) age: \(age)";
>>>> the equivalent bytecode is
>>>>   aload_1 // load name
>>>>   iload_2 // load age
>>>>   invokedynamic __interpolate__ (Ljava/lang/String;I)Ljava/lang/String;
>>>>     java.lang.runtime.InterpolateMetafactory.bootstrap(Lookup, String, MethodType,
>>>>     String, MethodHandle):CallSite
>>>>     [ "name: \(name) age: \(age)", String::__interpolate__(String, Object[]):String
>>>>     ]
>>>> From the perspective of the compiler the method __interpolate__ works exactly
>>>> like a method with a polymorphic method signature (the method annotated with
>>>> @PolymorphicSignature),
>>>> so the descriptor of invokedynamic is created by collecting the types of the
>>>> arguments, here the interpolation method is called with a String and an int, so
>>>> the parameter types are (String, int),
>>>> and the return type is String so the descriptor is
>>>> (Ljava/lang/String;I)Ljava/lang/String;
>>>> Considering the interpolation method as a polymorphic method is important in
>>>> terms of performance because it means that no boxing will be done by the
>>>> compiler, if some boxing is needed, it will be done by the runtime, so it is
>>>> optional if the __interpolate__bootstrap__ does not need to box arguments.
>>>> You can also notice that the formatted string is passed as a bootstrap constant
>>>> so all the parsing of the format can be done once outside of the hot path.
>>>> A call to invokedynamic also passes as a second bootstrap argument the method
>>>> handle to the method __interpolate__, so the implementation inside
>>>> InterpolateMetafactory.bootstrap can call this method if no method
>>>> __interpolate__bootstrap__ exists.
>>>> Here is a raw implementation of the class InterpolateMetafactory.
>>>> The method formatIterator() returns an Iterator of Token which is a sealed interface.
>>>> The method bootstrap() first looks up a method "__interpolate__bootstrap__" in
>>>> the lookup class that takes a Lookup, a String, a MethodType, the format and
>>>> the default implementation and calls it if it exists or takes the default
>>>> implementation, binds the formatted String and adapts the arguments using asType
>>>> (ask for boxing, etc).
>>>> package java.lang.runtime;
>>>> import java.lang.invoke.*;
>>>> import java.lang.invoke.MethodHandles.Lookup;
>>>> import java.util.Iterator;
>>>> public class InterpolateMetafactory {
>>>>   public sealed interface Token {
>>>>     public record Text(String text) implements Token {}
>>>>     public record Binding(String name) implements Token {}
>>>>   }
>>>>   public static Iterator<Token> formatIterator(String format) {
>>>>     ...
>>>>   }
>>>>   public static CallSite bootstrap(Lookup lookup, String name, MethodType
>>>>       methodType, String format, MethodHandle impl) throws Throwable {
>>>>     // check if there is a bootstrap method
>>>>     MethodHandle bootstrap;
>>>>     try {
>>>>       bootstrap = lookup.findStatic(lookup.lookupClass(),
>>>>           "__interpolate__bootstrap__", MethodType.methodType(CallSite.class,
>>>>           Lookup.class, String.class, MethodType.class, String.class,
>>>>           MethodHandle.class));
>>>>     } catch(NoSuchMethodException e) {
>>>>       // bind the default implementation
>>>>       return new ConstantCallSite(impl.bindTo(format).asType(methodType));
>>>>     }
>>>>     // delegate to the user-provided bootstrap method
>>>>     return (CallSite) bootstrap.invoke(lookup, name, methodType, format, impl);
>>>>   }
>>>> }
>>>> Here is another example, showing how to declare the methods __interpolate__ and
>>>> __interpolate__bootstrap__ inside java.util.regex.Pattern.
>>>> The "default" implementation calls Pattern.compile() and the optimized one
>>>> always returns the result of Pattern.compile() as a constant.
>>>> package java.util.regex;
>>>> public class Pattern {
>>>>   // the formatted string can not have arguments
>>>>   public static Pattern __interpolate__(String format) {
>>>>     return Pattern.compile(format);
>>>>   }
>>>>   private static CallSite __interpolate__bootstrap__(Lookup lookup, String name,
>>>>       MethodType methodType, String format, MethodHandle impl) {
>>>>     return new ConstantCallSite(MethodHandles.constant(Pattern.class,
>>>>         Pattern.compile(format)));
>>>>   }
>>>> }
>>>> The method __interpolate__ provides via its signature, the parameter types that
>>>> are verified by the compiler.
>>>> It also provides code that can be used by the tools that do static analysis
>>>> on the bytecode because those tools can not see through the method handle
>>>> returned by a bootstrap method given that it's a runtime construct, it's
>>>> usually not available at the time the static analysis is done. This should be
>>>> enough to have tools like Graal VM native image see through the
>>>> invokedynamic in a similar way they see through the invokedynamic used when
>>>> creating a lambda.
>>>> The fact that all invokedynamic calls go through the method
>>>> InterpolateMetafactory.bootstrap and trampoline from it means that adding or
>>>> removing the method __interpolate__bootstrap__ is a binary compatible change,
>>>> if __interpolate__bootstrap__ is declared private. So implementing
>>>> __interpolate__bootstrap__ can be an afterthought.
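A minimal sketch of the two call-site flavours described in the proposal above, using only existing java.lang.invoke APIs. The class CallSiteSketch, the stand-in method interpolate and the helper call are hypothetical names chosen for illustration; they are not part of the proposal. The fallback path binds the format and adapts the handle to the call-site type (the array collection is made explicit with asCollector here, which is an assumption about how the adaptation would be done), and the constant path always returns the same precomputed Pattern.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.regex.Pattern;

final class CallSiteSketch {
    // Hypothetical stand-in for a default __interpolate__ method.
    static String interpolate(String format, Object... args) {
        return format + " " + Arrays.toString(args);
    }

    // Fallback path: bind the format, collect the remaining arguments into an
    // Object[] and adapt to the call-site type (asType inserts the boxing).
    static CallSite fallback(String format, MethodType type) {
        try {
            MethodHandle impl = MethodHandles.lookup().findStatic(
                    CallSiteSketch.class, "interpolate",
                    MethodType.methodType(String.class, String.class, Object[].class));
            return new ConstantCallSite(impl.bindTo(format)
                    .asCollector(Object[].class, type.parameterCount())
                    .asType(type));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Optimized path: the Pattern is compiled once, at link time.
    static CallSite constantPattern(String format) {
        return new ConstantCallSite(
                MethodHandles.constant(Pattern.class, Pattern.compile(format)));
    }

    // Helper that invokes a call site reflectively, for demonstration only.
    static Object call(CallSite site, Object... args) {
        try {
            return site.dynamicInvoker().invokeWithArguments(args);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        var site = fallback("hello",
                MethodType.methodType(String.class, String.class, int.class));
        System.out.println(call(site, "Ana", 42));            // the int is boxed by asType
        var pattern = constantPattern("foo|bar");
        System.out.println(call(pattern) == call(pattern));   // same Pattern instance each time
    }
}
```

In a real translation the call would go through invokedynamic rather than invokeWithArguments, so the primitive argument would not be boxed on the caller side.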
>>>> regards,
>>>> Rémi

From forax at univ-mlv.fr Sun Oct 17 20:54:48 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 17 Oct 2021 22:54:48 +0200 (CEST)
Subject: Templated String and template policies, why the current design is bad
Message-ID: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr>

I've recently proposed another way to implement the templated string/template policies but i may not have made it clear why i think the current proposal [1] is bad.

First, some vocabulary, a templated string is a string with some unnamed parameters that are filled with the result of expressions
for example, if we use ${ expr } as escape sequence to introduce an expression
the code

  var a = 3;
  var b = 4;
  "sum ${ a } + ${ b } = ${ a + b }"

can be decomposed into

- a string template that can be seen either as a string "sum @ + @ = @" with a special character (here '@') denoting a hole for each parameter
  or an array of strings ["sum ", " + ", " = ", ""] indicating the strings in between holes.
- 3 parameters, param0, param1 and param2 initialized respectively with the results of the expressions a, b and a + b

Before talking about the current proposal, let's take a look at the way both JavaScript and Scala implement the string interpolation.

For JavaScript [2], you define a function that takes the template as an array and as many parameters as you need
  function foo(templateParts, param0, param1, param2) {
    ...
  }

JavaScript uses backticks `` to delimit the templated strings and ${} as escape sequence
so
  var a = 3;
  var b = 4;
  foo`sum ${ a } + ${ b } = ${ a + b }`

is equivalent to

  foo(["sum ", " + ", " = ", ""], a, b, a + b)

In Scala, this mostly works the same way, there is a class StringContext that corresponds to a templated string and you define the function foo as a method of StringContext that takes the parameters (in Scala, you can add methods on an already existing class using (abusing of) the implicit keyword).

  implicit class FooHelper(val template: StringContext) { // adds the following methods to StringContext
    def foo(param0: Any, param1: Any, param2: Any) {
      ...
    }
  }

Scala uses quotes "" to delimit the templated string and ${} as escape sequence
so
  val a = 3;
  val b = 4;
  foo"sum ${ a } + ${ b } = ${ a + b }"

is equivalent to
  new StringContext("sum ", " + ", " = ", "").foo(a, b, a + b)

In summary, for both JavaScript and Scala, the generalization of string interpolation is a function call which takes the template string parts as first argument and the parameters of the templated string as the other parameters.

So in Java, you would assume that
- there is an object that represents a templated string with the holes
- there is a method that takes the templated string as first parameter and the parameters of the templated string

But this is not how the proposed design works.

The TemplatedString does not represent a string with some holes, it represents the string with some holes plus the values of the holes, as if the arguments of the parameters were partially applied. The TemplatedString acts as a closure on the arguments, a glorified Supplier if you prefer.

Because the arguments are already inside the TemplatedString, the TemplatePolicy, the function that should take the template and the parameters does not declare the types of the parameters.
Which means that there is no way for someone that creates a TemplatePolicy to declare the types of the parameters, any parameter is always valid, so there is no type safety.

This design is not unknown, this is the GString [4] of Groovy. While it makes sense for a dynamic language like Groovy to not have to declare the type of the parameters, it makes no sense for a language like Java which is statically typed to not have a way to declare the types of the parameters like Scala or TypeScript/JavaScript do.
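
A minimal Java sketch of the two shapes contrasted above, under the assumption that a templated string is modelled as a list of fragments. All names here (TemplateShapes, Bundled, interleave, sum, sumBundled) are hypothetical and only serve the illustration; none of them come from the proposal. The first shape receives the fragments followed by declared parameters, so the compiler checks the holes; the second receives the values already captured as Object, so any value is accepted.

```java
import java.util.List;

final class TemplateShapes {
    // Closure-like shape: fragments plus already captured, untyped values.
    record Bundled(List<String> fragments, List<Object> values) {}

    // Shared helper: fragment, value, fragment, value, ..., last fragment.
    static String interleave(List<String> fragments, Object... values) {
        var builder = new StringBuilder();
        for (int i = 0; i < fragments.size(); i++) {
            builder.append(fragments.get(i));
            if (i < values.length) {
                builder.append(values[i]);
            }
        }
        return builder.toString();
    }

    // Function-call shape: the hole types are declared, so they are checked at compile time.
    static String sum(List<String> fragments, int a, int b, int total) {
        return interleave(fragments, a, b, total);
    }

    // Closure shape: nothing constrains what the holes contain.
    static String sumBundled(Bundled template) {
        return interleave(template.fragments(), template.values().toArray());
    }

    public static void main(String[] args) {
        var fragments = List.of("sum ", " + ", " = ", "");
        System.out.println(sum(fragments, 3, 4, 3 + 4));
        // sum(fragments, 3, "four", 7);   // rejected by the compiler
        System.out.println(sumBundled(new Bundled(fragments, List.of(3, "four", 7))));
    }
}
```

The second call compiles and runs even though one hole holds a String, which is the lack of type safety the paragraph above describes.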

The other issue with the proposed design is that there is no way to declare the template policy as a static method, it has to be an instance method implementing an interface despite the fact that both JavaScript and Scala* support functions first and let the user add supplementary arguments as a secondary mechanism (using currying in Scala and by adding a property on the function itself in JavaScript).

There is a good reason to support static methods in Java, a lot of use-cases do not require the template policy to have additional arguments (storing them in an instance is not necessary) so forcing the template policy to be defined as an instance method means a lot of boilerplate for no good reason.

I hope i've convinced you that the current proposal for string interpolation in Java is not the right one.

regards,
Rémi

* for Scala, it's a method on StringContext that acts as a function that takes a StringContext as first parameter.

[1] https://bugs.openjdk.java.net/browse/JDK-8273943
[2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
[3] https://docs.scala-lang.org/overviews/core/string-interpolation.html
[4] https://docs.groovy-lang.org/docs/latest/html/api/groovy/lang/GString.html

From brian.goetz at oracle.com Mon Oct 18 16:47:00 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Oct 2021 12:47:00 -0400
Subject: Templated String and template policies, why the current design is bad
In-Reply-To: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr>
References: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr>
Message-ID: <741655d7-ca30-2226-f561-60facffacd71@oracle.com>

This seems a very strange argument to me.

Templates are by their nature dynamic -- a template has an unknown number of holes, and the holes are filled with arbitrary expressions. People like templates because they're easy to use, and they're easy to use because they're flexible. Consider String::format:

    String format(String formatString, Object... values)

There are many dynamic conditions that are not statically checked here; that the format string is well-formed, that the number of holes matches the number of values provided, that the types of the values are suitable for filling the holes, etc. Every templating policy will carry its own private interpretation of these requirements, which would require much more complex type systems to capture.

When the templating policy is a well-known constant, such as java.lang.String.FMT, IDEs will be able to provide better checking based on the specification of the formatter, but that's a bonus.

You're saying here that what we should reify is not format+values, but format+types. This is not an unreasonable choice (but, doesn't rise to the bar you've set by "the current design is bad"), but I think your argument is an implementation preference dressed up in theoretical garb. You want the abstraction to serve the implementation (a bootstrap), so you want to shape it like what a bootstrap wants to consume.

The reality is that the current implementation can extract this information perfectly well, and can easily and cheaply test for the invariants that are needed to guard the computation. The design choice here is that the abstraction we are exposing is one that is more useful **to the users**; the format string and associated values can now travel together as they pass through the layers of, say, a logging framework.

So we've deliberately chosen an API that is best for users, and makes a little extra work for implementors, rather than the other way around. (And yes, the decision was informed by roads previously explored by JavaScript and Groovy.)
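
To make the String::format point concrete, here is a small sketch (the class FormatChecks and the method tryFormat are made up for this illustration) showing that a missing value, a value of the wrong type and an unknown conversion all compile and are only detected when the call runs:

```java
import java.util.IllegalFormatException;

final class FormatChecks {
    // Returns the formatted text, or the simple name of the exception raised at run time.
    static String tryFormat(String format, Object... values) {
        try {
            return String.format(format, values);
        } catch (IllegalFormatException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryFormat("%s is %d", "age", 42));  // well-formed call
        System.out.println(tryFormat("%s is %d", "age"));      // one value is missing
        System.out.println(tryFormat("%d", "forty-two"));      // a String cannot fill %d
        System.out.println(tryFormat("%q", 42));               // %q is not a known conversion
    }
}
```

None of the last three calls is rejected by javac; each one throws a subclass of java.util.IllegalFormatException.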

> The other issue with the proposed design is that there is no way to declare the template policy as a static method, it has to be an instance method implementing an interface despite the fact that both JavaScript and Scala* support function first and lets the user adds supplementary arguments as a secondary mechanism (using currying in Scala and by adding a property on the function itself in JavaScript).

The Template policy is a SAM interface, so any static method of the right shape can be turned into a template policy with a method reference.

I suspect what you mean by "no way", is "no way to access the super-optimized implementation strategy"? And I'll say again the two answers I've already given to that: (a) many such formatters will not benefit from the low-level implementation strategy anyway, and (b) we should design the API to serve the users, not the implementors. There are many more users.

On 10/17/2021 4:54 PM, Remi Forax wrote:
> [...]

From forax at univ-mlv.fr Mon Oct 18 19:12:21 2021
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 18 Oct 2021 21:12:21 +0200 (CEST)
Subject: Templated String and template policies, why the current design is bad
In-Reply-To: <741655d7-ca30-2226-f561-60facffacd71@oracle.com>
References: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr> <741655d7-ca30-2226-f561-60facffacd71@oracle.com>
Message-ID: <633810112.552086.1634584341616.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax" , "amber-spec-experts"
> Sent: Monday, 18 October 2021 18:47:00
> Subject: Re: Templated String and template policies, why the current design is bad

> This seems a very strange argument to me.
>
> Templates are by their nature dynamic -- a template has an unknown
> number of holes, and the holes are filled with arbitrary expressions.
> People like templates because they're easy to use, and they're easy to
> use because they're flexible. Consider String::format:
>
>     String format(String formatString, Object... values)
>
> There are many dynamic conditions that are not statically checked here;
> that the format string is well-formed, that the number of holes matches
> the number of values provided, that the types of the values are suitable
> for filling the holes, etc. Every templating policy will carry their
> own private interpretation of these requirements, which would require
> much more complex type systems to capture.

There is a lot of structured text that asks for specific types in a specific order.

For example, a text that starts with a date and then some values

  new DatedTextTemplatePolicy()."""
    // Date \(LocalDate.now())
    \(key1) : \(value1)
    \(key2) : \(value2)
    """;

If i can declare the parameters like in JavaScript, i can write
  String apply(TemplatedString template, LocalDate date, Object... pairs) { ... }

It also makes all the constructs that rely on target typing unusable.
For example, how to use lambdas/method references that will be used as projection functions for several record instances.

  List<Person> persons = ...

  // generate all mails
  new MailGeneratorTemplatePolicy(persons)."""
    Dear \(Person::title) \(Person::lastName),
    i hope you enjoy ...
    ...
    """;

As you know, you can not write this kind of code if the arguments are all typed Object.

Another example is the grammar example of John,
https://github.com/forax/java-interpolation/blob/master/src/test/java/com/github/forax/interpolator/GrammarTemplatePolicyTest.java#L22

Here you want all the arguments to be either a terminal or a non-terminal.
It should be a compile time error if a user uses something else.

>
> When the templating policy is a well-known constant, such as
> java.lang.String.FMT, IDEs will be able to provide better checking based
> on the specification of the formatter, but that's a bonus.
>
> You're saying here that what we should reify is not format+values, but
> format+types. This is not an unreasonable choice (but, doesn't rise to
> the bar you've set by "the current design is bad"), but I think your
> argument is an implementation preference dressed up in theoretical
> garb. You want the abstraction to serve the implementation (a
> bootstrap), so you want to shape it like what a bootstrap wants to consume.

Nope, i want compile time safety when it's possible, Object... should be a possible descriptor for the types of the parameters, not the only descriptor.

[...]
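
The target typing argument above can be illustrated with a short sketch. It assumes a hypothetical policy method render whose holes are declared as Function<Person, String>; the names MailSketch, Person and render are invented for this example and do not appear in the proposal. Because the parameter type is a functional interface, method references are accepted, whereas a parameter typed Object gives them no target type.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class MailSketch {
    record Person(String title, String lastName) {}

    // Hypothetical policy method: each hole is a projection applied to every person.
    @SafeVarargs
    static List<String> render(List<String> fragments, List<Person> persons,
                               Function<Person, String>... holes) {
        return persons.stream().map(person -> {
            var builder = new StringBuilder();
            for (int i = 0; i < fragments.size(); i++) {
                builder.append(fragments.get(i));
                if (i < holes.length) {
                    builder.append(holes[i].apply(person));
                }
            }
            return builder.toString();
        }).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        var persons = List.of(new Person("Dr", "Who"), new Person("Ms", "Marple"));
        // Object hole = Person::title;   // does not compile: Object is not a functional interface
        System.out.println(render(List.of("Dear ", " ", ","), persons,
                Person::title, Person::lastName));
    }
}
```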

>
>> The other issue with the proposed design is that there is no way to declare the
>> template policy as a static method, it has to be an instance method
>> implementing an interface despite the fact that both JavaScript and Scala*
>> support function first and lets the user adds supplementary arguments as a
>> secondary mechanism (using currying in Scala and by adding a property on the
>> function itself in JavaScript).
>
> The Template policy is a SAM interface, so any static method of the
> right shape can be turned into a template policy with a method reference.

Yes, that's why i call it a glorified Supplier, but i don't see how it helps.

In terms of writing the code, in an IDE, i can not type Pattern." + CTRL-SPACE.
As a JDK maintainer, you can cheat and say, do this import static on that class and all template policies you need are now available, but this approach does not scale.
Otherwise, you have to memorize that FMT is in fact FormatTemplatePolicy.FMT, that PATTERN is in fact PatternTemplatePolicy.PATTERN, etc.

>
> I suspect what you mean by "no way", is "no way to access the
> super-optimized implementation strategy"? And I'll say again the two
> answers I've already given to that: (a) many such formatters will not
> benefit from the low-level implementation strategy anyway, and (b) we
> should design the API to serve the users, not the implementors. There
> are many more users.

It's a false dichotomy, i want both, an API easy to use and efficient.
But in this thread i would like us to focus on the type checking part, the efficiency can be discussed in another thread.

Rémi

> On 10/17/2021 4:54 PM, Remi Forax wrote:
>> [...]

From brian.goetz at oracle.com Mon Oct 18 19:26:33 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Oct 2021 15:26:33 -0400
Subject: [External] : Re: Templated String and template policies, why the current design is bad
In-Reply-To: <633810112.552086.1634584341616.JavaMail.zimbra@u-pem.fr>
References: <327759777.1307.1634504088937.JavaMail.zimbra@u-pem.fr> <741655d7-ca30-2226-f561-60facffacd71@oracle.com> <633810112.552086.1634584341616.JavaMail.zimbra@u-pem.fr>
Message-ID: <02ca6dc7-efd9-b440-9b76-9df4b6bf2441@oracle.com>

Templates are an *implementation* mechanism; they're not an API building tool.
If you want stronger type checking, use the existing API construction features of the language -- such as exposing a method that takes a list of arguments of specific types, or takes a record; if the template contains lots of repetition, use a builder, etc, and then let the method/builder do the templating. I agree it would be nice to be able to preflight the invocation at compile time, so that validity errors could be caught at compile time and turned into diagnostics, but this is broader than just type checking -- this includes malformed template strings, etc.? That's a desirable feature that I hope we'll be able to layer on later. (Again, this is somewhat related to the compile-time constant folding feature, since that involved the compiler reflectively calling library code at compile time to at least partially validate compile-time constant information.) On 10/18/2021 3:12 PM, forax at univ-mlv.fr wrote: > ----- Original Message ----- >> From: "Brian Goetz" >> To: "Remi Forax" , "amber-spec-experts" >> Sent: Lundi 18 Octobre 2021 18:47:00 >> Subject: Re: Templated String and template policies, why the current design is bad >> This seems a very strange argument to me. >> >> Templates are by their nature dynamic -- a template has an unknown >> number of holes, and the holes are filled with arbitrary expressions. >> People like templates because they're easy to use, and they're easy to >> use because they're flexible.? Consider String::format: >> >> ??? String format(String formatString, Object... values) >> >> There are many dynamic conditions that are not statically checked here; >> that the format string is well-formed, that the number of holes matches >> the number of values provided, that the types of the values are suitable >> for filling the holes, etc.? Every templating policy will carry their >> own private interpretation of these requirements, which would require >> much more complex type systems to capture. 
> There is a lot of structured text that ask for specific types in a specific order. > > By example, if a text that starts with a date and then some values > > new DatedTextTemplatePolicy().""" > // Date \(LocalDate.now()) > \(key1) : \(value1) > \(key2) : \(value2) > """; > > If i can declare the parameters like in JavaScript, i can write > String apply(TemplatedString template, LocalDate date, Object... pairs) { ... } > > > It also make all the constructs that are target typing, unusable. > By example, how to use lambdas/method references that will be used as projection functions for several record instances. > > List persons = ... > > // generate all mails > new MailGeneratorTemplatePolicy(persons).""" > Dear \(Person::title) \(Person::lastName), > i hope you enjoy ... > ... > """; > > As you know, you can not write this kind of code if the arguments are all typed Object. > > > Another example is there grammar example of John, > https://urldefense.com/v3/__https://github.com/forax/java-interpolation/blob/master/src/test/java/com/github/forax/interpolator/GrammarTemplatePolicyTest.java*L22__;Iw!!ACWV5N9M2RV99hQ!asJ_WOx0QOyBjnKlGynqHgivYFVsbSTL8xDyi-0kCyI_qHiDdLU_IZ4tPbvTT0ua2w$ > > Here you want all the arguments to be either a terminal or a non-terminal. > It should be a compile time error if a user uses something else. > > >> When the templating policy is a well-known constant, such as >> java.lang.String.FMT, IDEs will be able to provide better checking based >> on the specification of the formatter, but that's a bonus. >> >> You're saying here that what we should reify is not format+values, but >> format+types.? This is not an unreasonable choice (but, doesn't rise to >> the bar you've set by "the current design is bad"), but I think your >> argument is an implementation preference dressed up in theoretical >> garb.? You want the abstraction to serve the implementation (a >> bootstrap), so you want to shape it like what a bootstrap wants to consume. 
> Nope, i want compile time safety when it's possible, Object.. should be a possible descriptor for the types of the parameters not the only descriptor. > > [...] > >>> The other issue with the proposed design is that there is no way to declare the >>> template policy as a static method, it has to be an instance method >>> implementing an interface despite the fact that both JavaScript and Scala* >>> support function first and lets the user adds supplementary arguments as a >>> secondary mechanism (using currying in Scala and by adding a property on the >>> function itself in JavaScript). >> The Template policy is a SAM interface, so any static method of the >> right shape can be turned into a template policy with a method reference. > Yes, that why i call it a glorified Supplier, but i don't see how it helps. > > In term of writing the code, in an IDE, i can not take a type Pattern." + CTRL-SPACE. > > As a JDK maintainer, you can cheat and say, do this import static on that class and all template policies you need are now available, but this approach does not scale. > > Otherwise, you have to memorize that FMT is in fact FormatTemplatePolicy.FMT, that PATTERN is in Fact PatternTemplatePolicy.PATTERN, etc. > > >> I suspect what you mean by "no way", is "no way to access the >> super-optimized implementation strategy"?? And I'll say again the two >> answers I've already given to that: (a) many such formatters will not >> benefit from the low-level implementation strategy anyway, and (b) we >> should design the API to serve the users, not the implementors.? THere >> are many more users. > It's a false dichotomy, i want both, an API easy to use and efficient. > > But in this thread i would like us to focus on the type checking part, the efficiency can be discussed in another thread. 
> > R?mi > >> >> >> >> >> On 10/17/2021 4:54 PM, Remi Forax wrote: >>> I've recently proposed another way to implement the templated string/template >>> policies but i may not have made it clear why i think the current proposal [1] >>> is bad. >>> >>> First, some vocabulary, a templated string is a string with some unnamed >>> parameters that are filled with the result of expressions >>> by example, if we use ${ expr } as escape sequence to introduce an expression >>> the code >>> >>> var a = 3; >>> var b = 4; >>> "sum ${ a } + ${ b } = ${ a + b }" >>> >>> can be decomposed into >>> >>> - a string template that can be seen either as a string "sum @ + @ = @" with a >>> special character (here '@') denoting a hole for each parameter >>> or an array of strings ["sum ", " + ", " = ", ""] indicating the strings in >>> between holes. >>> - 3 parameters, param0, param1 and param2 initialized respectively with the >>> results of the expressions a, b and a + b >>> >>> Before talking about the current proposal, let's take a look to the way both >>> JavaScript and Scala, implement the string interpolation. >>> >>> For JavaScript [2], you define a function that the template as an array and as >>> many parameters you need >>> function foo(templateParts, param0, param1, param2) { >>> ... >>> } >>> >>> JavaScript uses backticks `` to delimit the templated strings and ${} as escape >>> sequence >>> so >>> var a = 3; >>> var b = 4; >>> foo.`sum ${ a } + ${ b } = ${ a + b }` >>> >>> is equivalent to >>> >>> foo(["sum ", " + ", " = ", ""], a, b, a + b) >>> >>> >>> In Scala, this mostly works the same way, there is a class StringContext that >>> correspond to a templated string and you define the function foo as a method of >>> StringContext that takes the parameters (in Scala, you can add methods on an >>> already existing class using (abusing of) the implicit keyword). 
>>> >>> implicit class FooHelper(val template: StringContext) { // adds the following >>> methods to StringContext >>> def foo(param0: Any, param1: Any, param2: Any) { >>> ... >>> } >>> } >>> >>> Scala uses quotes "" to delimit the templated string and ${} as escape sequence >>> so >>> val a = 3; >>> val b = 4; >>> foo."sum ${ a } + ${ b } = ${ a + b }" >>> >>> is equivalent to >>> new StringContext("sum ", " + ", " = ", "").foo(a, b, a + b) >>> >>> >>> >>> In summary, for both JavaScript and Scala, the generalization of string >>> interpolation is a function call which takes the templates string parts as >>> first argument and the parameters of the templated string as the other >>> parameters. >>> >>> So in Java, you would assume that >>> - there is an object that represents a templated string with the holes >>> - there is a method that takes the templated string as first parameter and the >>> parameters of the templated string >>> >>> But this is not how the proposed design works. >>> >>> The TemplateString does not represent a string with some holes, it represents >>> the string with some holes plus the values of the holes, as if the arguments of >>> the parameters were partially applied. The TemplateString acts as a closure on >>> the arguments, a glorified Supplier if you prefer. >>> >>> Because the arguments are already inside the TemplatedString, the >>> TemplatePolicy, the function that should take the template and the parameters >>> does not declare the types of the parameters. >>> Which means that there is no way for someone that creates a TemplatePolicy to >>> declare the types of the parameters, any parameters is always valid, so there >>> is no type safety. >>> >>> This design is not unknown, this is the GString [4] of Groovy. 
While it makes >>> sense for a dynamic language like Groovy to not have to declare the type of the >>> parameters, it makes no sense for a language like Java which is statically >>> typed to not have a way to declare the types of the parameters like Scala or >>> TypeScript/JavaScript do. >>> >>> The other issue with the proposed design is that there is no way to declare the >>> template policy as a static method, it has to be an instance method >>> implementing an interface despite the fact that both JavaScript and Scala* >>> support function first and lets the user adds supplementary arguments as a >>> secondary mechanism (using currying in Scala and by adding a property on the >>> function itself in JavaScript). >>> >>> There is a good reason to support static methods in Java, a lot of use-cases >>> does not requires the template policy to have additional arguments (storing >>> them in an instance is not necessary) so forcing the template policy to be >>> defined as an instance method means a lot of boilerplate for no good reason. >>> >>> I hope i've convinced you that the current proposal for string interpolation in >>> Java is not the right one. >>> >>> regards, >>> R?mi >>> >>> * for Scala, it's a method on StringContext that acts as a function that takes a >>> StringContext as first parameter. 
>>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8273943 >>> [3] >>> https://urldefense.com/v3/__https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals__;!!ACWV5N9M2RV99hQ!asJ_WOx0QOyBjnKlGynqHgivYFVsbSTL8xDyi-0kCyI_qHiDdLU_IZ4tPbsWqtWjzg$ >>> [4] https://urldefense.com/v3/__https://docs.scala-lang.org/overviews/core/string-interpolation.html__;!!ACWV5N9M2RV99hQ!asJ_WOx0QOyBjnKlGynqHgivYFVsbSTL8xDyi-0kCyI_qHiDdLU_IZ4tPbtRBYzatg$ >>> [2] https://urldefense.com/v3/__https://docs.groovy-lang.org/docs/latest/html/api/groovy/lang/GString.html__;!!ACWV5N9M2RV99hQ!asJ_WOx0QOyBjnKlGynqHgivYFVsbSTL8xDyi-0kCyI_qHiDdLU_IZ4tPbsju7OQhw$ From gavin.bierman at oracle.com Wed Oct 20 15:56:54 2021 From: gavin.bierman at oracle.com (Gavin Bierman) Date: Wed, 20 Oct 2021 15:56:54 +0000 Subject: [patterns-switch] Draft Spec for JEP 420: Pattern Matching for switch (Second Preview) Message-ID: Dear experts: The first draft of the spec for JEP 420 (Pattern Matching for switch - Second Preview) is now available at: http://cr.openjdk.java.net/~gbierman/jep420/latest/ This contains the updates discussed on the list (GADT support, revised dominance rules for constant case labels etc). It doesn?t support inference of type arguments (yet). More details on that to follow. Comments welcome! Thanks, Gavin From forax at univ-mlv.fr Thu Oct 28 17:12:38 2021 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Thu, 28 Oct 2021 19:12:38 +0200 (CEST) Subject: switch expression with not explicit yield value should work ? In-Reply-To: References: <84434836.1074099.1630245621532.JavaMail.zimbra@u-pem.fr> Message-ID: <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr> I fell in the same trap yet again :( Am i the only one that want to throw exceptions in all branches of a switch ? 
https://github.com/forax/write_your_own_java_framework/blob/master/interceptor/src/test/java/org/github/forax/framework/interceptor/InterceptorRegistryTest.java#L523 R?mi ----- Original Message ----- > From: "Tagir Valeev" > To: "Remi Forax" > Cc: "amber-spec-experts" > Sent: Lundi 30 Ao?t 2021 04:00:27 > Subject: Re: switch expression with not explicit yield value should work ? > Hello! > > I think this is not related to recent JEPs. This behavior is > standardised since Java 14 when Switch expression was introduced: > > // Compilation error > int x = switch(0) { > default -> throw new IllegalArgumentException(); > }; > > This is explicitly specified (15.28.1) [1]: > >> It is a compile-time error if a switch expression has no result expressions. > > There was some discussion about this rule in March, 2019 [2]. > Basically, the idea is to preserve the possibility of normal > (non-abrupt) execution of every expression. > I believe, preventing unreachable code has always been in the spirit > of Java. In your code sample, the execution of the 'return' statement > itself is unreachable, > so writing 'return' is redundant. In my sample above, the 'x' variable > is never assigned to anything, and the subsequent statements (if any) > are unreachable as well. > > I'd vote to keep the current behavior. While it may complicate code > generation and automatic refactorings, this additional complexity is > only marginal. The benefit is > that this behavior may save us from accidental mistakes. > > Btw, you may deceive the compiler introducing a method like > > static Object fail() { > throw new IllegalArgumentException(); > } > > And use "case Object __ -> fail()" > > With best regards, > Tagir Valeev. 
> > [1] https://docs.oracle.com/javase/specs/jls/se16/html/jls-15.html#jls-15.28.1 > [2] > https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-March/001067.html > > On Sun, Aug 29, 2021 at 9:00 PM Remi Forax wrote: >> >> Another case where the spec is weird, >> i've converted a project that generate a visitor from a grammar (something like >> yacc) to use a switch on type instead. >> >> Sometimes for a degenerate portion of the grammar i've an empty visitor that >> always throw an exception, >> the equivalent code with a switch is >> >> static Object result(Object o) { >> return switch (o) { >> case Object __ -> throw new AssertionError(); >> }; >> } >> >> >> Obviously i can tweak the code generator to generate >> >> static Object result(Object o) { >> throw new AssertionError(); >> } >> >> but not be able to compile the former code strike me as odd. >> >> An expression switch is a poly-expression, so the result type is back-propagated >> from the return type of the method result, so it should be Object. >> >> Moreover, if the switch is not a switch expression but a switch statement, the >> code is also valid >> >> static Object result(Object o) { >> switch (o) { >> case Object __ -> throw new AssertionError(); >> } >> } >> >> Not be able to compile a switch expression when there is no explicit result type >> but only an implicit type seems arbitrary to me >> (this change is backward compatible because it only makes more codes compiling). >> > > R?mi From brian.goetz at oracle.com Thu Oct 28 17:48:40 2021 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 28 Oct 2021 13:48:40 -0400 Subject: switch expression with not explicit yield value should work ? In-Reply-To: <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr> References: <84434836.1074099.1630245621532.JavaMail.zimbra@u-pem.fr> <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr> Message-ID: If all branches throw, then you can refactor ??? switch (x) { ??????? 
        case X -> throw e;
        ...
    }

to

    throw switch (x) {
        case X -> e;
    }

On 10/28/2021 1:12 PM, forax at univ-mlv.fr wrote:
> I fell in the same trap yet again :(
>
> Am i the only one that want to throw exceptions in all branches of a switch ?
>
> https://github.com/forax/write_your_own_java_framework/blob/master/interceptor/src/test/java/org/github/forax/framework/interceptor/InterceptorRegistryTest.java#L523
>
> Rémi
>
> ----- Original Message -----
>> From: "Tagir Valeev"
>> To: "Remi Forax"
>> Cc: "amber-spec-experts"
>> Sent: Lundi 30 Août 2021 04:00:27
>> Subject: Re: switch expression with not explicit yield value should work ?
>> Hello!
>>
>> I think this is not related to recent JEPs. This behavior is
>> standardised since Java 14 when Switch expression was introduced:
>>
>> // Compilation error
>> int x = switch(0) {
>>     default -> throw new IllegalArgumentException();
>> };
>>
>> This is explicitly specified (15.28.1) [1]:
>>
>>> It is a compile-time error if a switch expression has no result expressions.
>> There was some discussion about this rule in March, 2019 [2].
>> Basically, the idea is to preserve the possibility of normal
>> (non-abrupt) execution of every expression.
>> I believe, preventing unreachable code has always been in the spirit
>> of Java. In your code sample, the execution of the 'return' statement
>> itself is unreachable,
>> so writing 'return' is redundant. In my sample above, the 'x' variable
>> is never assigned to anything, and the subsequent statements (if any)
>> are unreachable as well.
>>
>> I'd vote to keep the current behavior. While it may complicate code
>> generation and automatic refactorings, this additional complexity is
>> only marginal. The benefit is
>> that this behavior may save us from accidental mistakes.
>> >> Btw, you may deceive the compiler introducing a method like >> >> static Object fail() { >> throw new IllegalArgumentException(); >> } >> >> And use "case Object __ -> fail()" >> >> With best regards, >> Tagir Valeev. >> >> [1]https://docs.oracle.com/javase/specs/jls/se16/html/jls-15.html#jls-15.28.1 >> [2] >> https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-March/001067.html >> >> On Sun, Aug 29, 2021 at 9:00 PM Remi Forax wrote: >>> Another case where the spec is weird, >>> i've converted a project that generate a visitor from a grammar (something like >>> yacc) to use a switch on type instead. >>> >>> Sometimes for a degenerate portion of the grammar i've an empty visitor that >>> always throw an exception, >>> the equivalent code with a switch is >>> >>> static Object result(Object o) { >>> return switch (o) { >>> case Object __ -> throw new AssertionError(); >>> }; >>> } >>> >>> >>> Obviously i can tweak the code generator to generate >>> >>> static Object result(Object o) { >>> throw new AssertionError(); >>> } >>> >>> but not be able to compile the former code strike me as odd. >>> >>> An expression switch is a poly-expression, so the result type is back-propagated >>> from the return type of the method result, so it should be Object. >>> >>> Moreover, if the switch is not a switch expression but a switch statement, the >>> code is also valid >>> >>> static Object result(Object o) { >>> switch (o) { >>> case Object __ -> throw new AssertionError(); >>> } >>> } >>> >>> Not be able to compile a switch expression when there is no explicit result type >>> but only an implicit type seems arbitrary to me >>> (this change is backward compatible because it only makes more codes compiling). >>> >>> R?mi From forax at univ-mlv.fr Thu Oct 28 18:04:53 2021 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Thu, 28 Oct 2021 20:04:53 +0200 (CEST) Subject: switch expression with not explicit yield value should work ? 
In-Reply-To: References: <84434836.1074099.1630245621532.JavaMail.zimbra@u-pem.fr> <1998745512.2410609.1635441158261.JavaMail.zimbra@u-pem.fr> Message-ID: <127716797.2419081.1635444293129.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" , "Tagir Valeev" > Cc: "amber-spec-experts" > Sent: Jeudi 28 Octobre 2021 19:48:40 > Subject: Re: switch expression with not explicit yield value should work ? > If all branches throw, then you can refactor > switch (x) { > case X -> throw e; > ... > } > to > throw switch (x) { > case X -> e; > } now i feel i've been tricked by a Jedi :) R?mi > On 10/28/2021 1:12 PM, [ mailto:forax at univ-mlv.fr | forax at univ-mlv.fr ] wrote: >> I fell in the same trap yet again :( >> Am i the only one that want to throw exceptions in all branches of a switch ? [ >> https://github.com/forax/write_your_own_java_framework/blob/master/interceptor/src/test/java/org/github/forax/framework/interceptor/InterceptorRegistryTest.java#L523 >> | >> https://github.com/forax/write_your_own_java_framework/blob/master/interceptor/src/test/java/org/github/forax/framework/interceptor/InterceptorRegistryTest.java#L523 >> ] R?mi >> ----- Original Message ----- >>> From: "Tagir Valeev" [ mailto:amaembo at gmail.com | ] To: >>> "Remi Forax" [ mailto:forax at univ-mlv.fr | ] Cc: >>> "amber-spec-experts" [ mailto:amber-spec-experts at openjdk.java.net | >>> ] Sent: Lundi 30 Ao?t 2021 04:00:27 >>> Subject: Re: switch expression with not explicit yield value should work ? >>> Hello! >>> I think this is not related to recent JEPs. This behavior is >>> standardised since Java 14 when Switch expression was introduced: >>> // Compilation error >>> int x = switch(0) { >>> default -> throw new IllegalArgumentException(); >>> }; >>> This is explicitly specified (15.28.1) [1]: >>>> It is a compile-time error if a switch expression has no result expressions. >>> There was some discussion about this rule in March, 2019 [2]. 
>>> Basically, the idea is to preserve the possibility of normal >>> (non-abrupt) execution of every expression. >>> I believe, preventing unreachable code has always been in the spirit >>> of Java. In your code sample, the execution of the 'return' statement >>> itself is unreachable, >>> so writing 'return' is redundant. In my sample above, the 'x' variable >>> is never assigned to anything, and the subsequent statements (if any) >>> are unreachable as well. >>> I'd vote to keep the current behavior. While it may complicate code >>> generation and automatic refactorings, this additional complexity is >>> only marginal. The benefit is >>> that this behavior may save us from accidental mistakes. >>> Btw, you may deceive the compiler introducing a method like >>> static Object fail() { >>> throw new IllegalArgumentException(); >>> } >>> And use "case Object __ -> fail()" >>> With best regards, >>> Tagir Valeev. >>> [1] [ https://docs.oracle.com/javase/specs/jls/se16/html/jls-15.html#jls-15.28.1 >>> | https://docs.oracle.com/javase/specs/jls/se16/html/jls-15.html#jls-15.28.1 ] >>> [2] [ >>> https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-March/001067.html >>> | >>> https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-March/001067.html >>> ] On Sun, Aug 29, 2021 at 9:00 PM Remi Forax [ mailto:forax at univ-mlv.fr | >>> ] wrote: >>>> Another case where the spec is weird, >>>> i've converted a project that generate a visitor from a grammar (something like >>>> yacc) to use a switch on type instead. 
>>>> Sometimes for a degenerate portion of the grammar i've an empty visitor that >>>> always throw an exception, >>>> the equivalent code with a switch is >>>> static Object result(Object o) { >>>> return switch (o) { >>>> case Object __ -> throw new AssertionError(); >>>> }; >>>> } >>>> Obviously i can tweak the code generator to generate >>>> static Object result(Object o) { >>>> throw new AssertionError(); >>>> } >>>> but not be able to compile the former code strike me as odd. >>>> An expression switch is a poly-expression, so the result type is back-propagated >>>> from the return type of the method result, so it should be Object. >>>> Moreover, if the switch is not a switch expression but a switch statement, the >>>> code is also valid >>>> static Object result(Object o) { >>>> switch (o) { >>>> case Object __ -> throw new AssertionError(); >>>> } >>>> } >>>> Not be able to compile a switch expression when there is no explicit result type >>>> but only an implicit type seems arbitrary to me >>>> (this change is backward compatible because it only makes more codes compiling). >>>> R?mi From james.laskey at oracle.com Fri Oct 29 14:10:54 2021 From: james.laskey at oracle.com (Jim Laskey) Date: Fri, 29 Oct 2021 14:10:54 +0000 Subject: Are templated string embedded expressions "method parameters" or "lambdas"? Message-ID: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> For our early templated string prototypes, we restricted embedded expressions to just basic accessors and basic arithmetic. The intent was to keep things easy to read and to prevent side effects. Over time, we began thinking this restriction was unduly harsh. More precisely, we worried it that it would result in a complex, difficult-to-defend boundary. But we still would like users to not rely on side-effects. Consequently, a new proposal for embedded expressions - we would allow any Java expression with the restriction that you can't use single quotes, double quotes or escape sequences. 
We opted to keep this restriction to allow tools (e.g., syntax
highlighters) to isolate embedded expressions within strings without
requiring sophisticated parsing.

Given that an unprocessed templated string involves at least some
deferred evaluation, should we frame templated string parameters as
being more like method parameters (all parameters evaluated eagerly,
left to right), or should we treat them as lambda expressions, which may
capture (effectively final) variables from the environment, and evaluate
the full parameter expressions when they are needed?

Note too that the effectively final restriction rules out some of the
worst side-effect offenders, like:

    int x = 0;
    formatter."One \{x++} plus two \{x++} is three \{x}";

-- even if we intend to then do eager evaluation!

To help understand the issue, let's look at a simplification of how the
two different paradigms (method parameter vs. lambda) might be
implemented.

Example:

    int x = 0;

    int method1() {
        System.out.println("one");
        return 1;
    }

    int method2() {
        System.out.println("two");
        return 2;
    }

    System.out.println("Before TemplatedString");
    TemplatedString ts = "\{x} and \{method1()} and \{method2()}";
    System.out.println("After TemplatedString");
    System.out.println(CONCAT.apply(ts));
    System.out.println("After Policy");

The method parameter paradigm would generate something like the
following for the statement
TemplatedString ts = "\{x} and \{method1()} and \{method2()}";
Basically, capture the values of the evaluated expressions in instance
fields.

    TemplatedString ts = new TemplatedString() {
        int expr$0 = x;
        int expr$1 = method1();
        int expr$2 = method2();

        String template() {
            return "\uFFFC and \uFFFC and \uFFFC";
        }

        List values() {
            return List.of(expr$0, expr$1, expr$2);
        }

        String concat() {
            return expr$0 + " and " + expr$1 + " and " + expr$2;
        }

        List vars() {
            return List.of(lookupGetter("expr$0"),
                           lookupGetter("expr$1"),
                           lookupGetter("expr$2"));
        }
    };

The lambda paradigm would generate something like the following.
Basically, wrap each expression in an instance method and capture the
effectively final values used by those methods in instance fields
(à la lambda).

    TemplatedString ts = new TemplatedString() {
        int var$x = x;

        int expr$0() {
            return var$x;
        }

        int expr$1() {
            return method1();
        }

        int expr$2() {
            return method2();
        }

        String template() {
            return "\uFFFC and \uFFFC and \uFFFC";
        }

        List values() {
            return List.of(expr$0(), expr$1(), expr$2());
        }

        String concat() {
            return expr$0() + " and " + expr$1() + " and " + expr$2();
        }

        List vars() {
            return List.of(lookupMethod("expr$0"),
                           lookupMethod("expr$1"),
                           lookupMethod("expr$2"));
        }
    };

The output from the method parameter paradigm would be:

    Before TemplatedString
    one
    two
    After TemplatedString
    0 and 1 and 2
    After Policy

The output from the lambda paradigm would be:

    Before TemplatedString
    After TemplatedString
    one
    two
    0 and 1 and 2
    After Policy

To help us evaluate the tradeoffs between the two paradigms, our
question to the experts is, "What are the ramifications of each?" Please
resist the temptation to express a preference for one or the other.

Thank you.

Cheers,

-- Jim

From forax at univ-mlv.fr Fri Oct 29 16:53:23 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 29 Oct 2021 18:53:23 +0200 (CEST)
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com>
Message-ID: <1507501316.2837186.1635526403446.JavaMail.zimbra@u-pem.fr>

> From: "Jim Laskey"
> To: "amber-spec-experts"
> Sent: Vendredi 29 Octobre 2021 16:10:54
> Subject: Are templated string embedded expressions "method parameters" or
> "lambdas"?
> For our early templated string prototypes, we restricted embedded expressions
> to just basic accessors and basic arithmetic. The intent was to keep things
> easy to read and to prevent side effects. Over time, we began thinking this
> restriction was unduly harsh.
More precisely, we worried it that it would > result in a complex, difficult-to-defend boundary. But we still would like > users to not rely on side-effects. > Consequently, a new proposal for embedded expressions - we would allow any Java > expression with the restriction that you can't use single quotes, double quotes > or escape sequences. We opted to keep this restriction to allow tools (ex., > syntax highlighters) to isolate embedded expressions within strings without > requiring sophisticated parsing. > Given that an unprocessed templated string involves at least some deferred > evaluation, should we frame templated string parameters as being more like > method parameters (all parameters evaluated eagerly, left to right), or should > we treat them as lambda expressions, which may capture (effectively final) > variables from the environment, and evaluate the full parameters expressions > when they are needed? > Note too that the effectively final restriction rules out some of the worst > side-effect offenders, like: > int x = 0; > formatter."One \{x++} plus two \{x++} is three \{x}"; > -- even if we intend to then do eager evaluation! > To help understand the issue, let's look at a simplification of how the two > different paradigms ( method parameter vs. lambda) might be implemented. > Example: > int x = 0; > int method1() { > System.out.println("one"); > return 1; > } > int method2() { > System.out.println("two"); > return 2; > } > System.out.println("Before TemplatedString"); > TemplatedString ts = "\{x} and \{method1()} and \{method2()}"; > System.out.println("After TemplatedString"); > System.out.println(CONCAT.apply(ts)); > System.out.println("After Policy"); Here, i suppose there is a typo, "CONCAT" should be "formatter". The whole point of the syntax is that you can not introduce side effects in between the reification of the templated string to an object and the method call so from the POV of the user, there is only one semantics. 
> To help us evaluating the tradeoffs between the two paradigms, our question to > the experts is, " What are the ramifications of each ? " Please resist the > temptation to express a preference for one or the other. For me, "deferred execution" is not the right way to think about the sub expressions of a templated string. After all, a guard of a pattern is also a sub expression and we require the local variables to be captured, despite the fact that there is no "deferred execution". We do not want side effects in the guard of a pattern, we can not fully guaranteed that in Java but at least we can guarantee that the local variable will not change. I think the same argument apply to a sub expression of a templated string, we do not want side effect in it, so we should ask the local variables used inside a sub expression to be effectively final. > Thank you. > Cheers, > -- Jim regards, R?mi From brian.goetz at oracle.com Fri Oct 29 17:20:20 2021 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 29 Oct 2021 13:20:20 -0400 Subject: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: <1507501316.2837186.1635526403446.JavaMail.zimbra@u-pem.fr> References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <1507501316.2837186.1635526403446.JavaMail.zimbra@u-pem.fr> Message-ID: <278bd79f-5b1e-a59a-8da8-5ed56bdfa2a8@oracle.com> > For me, "deferred execution" is not the right way to think about the > sub expressions of a templated string. I think there really are two (related) cases here. The first (and more common) case is when you provide a formatter object directly: ??? 
Foo f = F."Hello \{name()}" Here, there are several interesting constraints: - The template is formatted exactly once, consuming each parameter expression exactly once (*) - The timing of consuming the parameter expressions is the same as the timing of formatting Under these constraints, the eager-vs-lazy interpretation doesn't really matter, because there's no difference in the timing or arity of expression evaluation. The second case is when you are capturing an "unprocessed" template for later use: TemplatedString ts = "Hello \{name()}"; Now, there's no guarantee as to whether the template will be processed at all, or once, or more than once. Again, this only makes a difference if (a) the expressions have side-effects or (b) the expressions are "stateful", acting on state that might have changed since capture time, such as static state, current time, etc. (Which sounds very much like the things that Streams tells you not to do in behavioral parameters.) The sad thing is we would rather people not do these things at all, in which case it makes less of a difference. Treating expressions as lazy offers real performance benefits for the case where the template will not be processed at all (as in logging frameworks); it minimizes the cost of capturing the TS. Treating expressions as eager is simpler to reason about (though, in the absence of side effects, doesn't really matter), but creates a two-stage evaluation where the expressions are evaluated at one point and combined at another. Performance is a side-effect too. *It may not actually be the case that all are consumed exactly once. Imagine a framework where you can specify localized messages in a way that lets them reorder / reuse parameters; it's possible some parameters need not be evaluated at all. Again, lazy evaluation defers the costs until they are needed.
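The timing difference between the two interpretations can be sketched in plain Java, without any templated-string syntax. This is only an illustration under stated assumptions: EvalOrderDemo, eager(), lazy() and the EVENTS log are hypothetical names, and Supplier stands in for whatever mechanism a real implementation would use to defer an embedded expression. It mirrors the method1()/method2() example from the start of the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical demo class; it models the two paradigms, not the real feature.
class EvalOrderDemo {
    // Records the order in which things happen.
    static final List<String> EVENTS = new ArrayList<>();

    static int method1() { EVENTS.add("one"); return 1; }
    static int method2() { EVENTS.add("two"); return 2; }

    // "Method parameter" paradigm: expressions are evaluated at capture time.
    static List<String> eager() {
        EVENTS.clear();
        EVENTS.add("before capture");
        int v0 = 0;
        int v1 = method1();
        int v2 = method2();
        EVENTS.add("after capture");
        // Processing later only combines values that already exist.
        EVENTS.add(v0 + " and " + v1 + " and " + v2);
        return List.copyOf(EVENTS);
    }

    // "Lambda" paradigm: expressions are wrapped and run only when processed.
    static List<String> lazy() {
        EVENTS.clear();
        EVENTS.add("before capture");
        Supplier<Object> e0 = () -> 0;
        Supplier<Object> e1 = EvalOrderDemo::method1;
        Supplier<Object> e2 = EvalOrderDemo::method2;
        EVENTS.add("after capture");
        // The side effects of method1()/method2() happen here, at processing time.
        String text = e0.get() + " and " + e1.get() + " and " + e2.get();
        EVENTS.add(text);
        return List.copyOf(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(eager());
        System.out.println(lazy());
    }
}
```

In the eager variant "one" and "two" are logged between the two capture markers; in the lazy variant they are logged only after "after capture", and would not be logged at all if the template were never processed.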
From forax at univ-mlv.fr Fri Oct 29 18:18:39 2021 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Fri, 29 Oct 2021 20:18:39 +0200 (CEST) Subject: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: <278bd79f-5b1e-a59a-8da8-5ed56bdfa2a8@oracle.com> References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <1507501316.2837186.1635526403446.JavaMail.zimbra@u-pem.fr> <278bd79f-5b1e-a59a-8da8-5ed56bdfa2a8@oracle.com> Message-ID: <1267650847.2850767.1635531519273.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Brian Goetz" > To: "Remi Forax" , "Jim Laskey" > Cc: "amber-spec-experts" > Sent: Vendredi 29 Octobre 2021 19:20:20 > Subject: Re: Are templated string embedded expressions "method parameters" or "lambdas"? >> For me, "deferred execution" is not the right way to think about the >> sub expressions of a templated string. > > I think there really are two (related) cases here. > > The first (and more common) case is when you provide a formatter object > directly: > > ??? Foo f = F."Hello \{name()}" > > Here, there are several interesting constraints: > > ?- The template is formatted exactly once, consuming each parameter > expression exactly once (*) > ?- The timing of consuming the parameter expressions is the same as the > timing of formatting > > Under these constraints, the eager-vs-lazy interpretation doesn't really > matter, because there's no difference in the timing or arity of > expression evaluation. > > The second case is when you are capturing an "unprocessed" template for > later use: > > ??? TemplatedString ts = "Hello \{name()}"; > > Now, there's no guarantee as to whether the template will be processed > at all, or once, or more than once.? Again, this only makes a difference > if (a) the expressions have side-effects or (b) the expressions are > "stateful", acting on state that might have changed since capture time, > such as static state, current time, etc.? 
(Which sounds very much like > the things that Streams tells you not to do in behavioral parameters.) > The sad thing is we would rather people not do these things at all, in > which case it makes less of a difference. Do we really need to support the second case at all ? Instead of "Hello \{name()}" it can be written to something like () -> F."Hello \{name()}" piggy backing on the semantics of lambdas instead of inventing a new kind of "lambda but not exactly a lambda" thing. > > Treating expressions as lazy offers real performance benefits for the > case where the template will not be processed at all (as in logging > frameworks); it minimizes the cost of capturing the TS. Treating > expressions as eager is simpler to reason about (though, in the absence > of side effects, doesn't really matter), but creates a two-stage > evaluation where the expressions are evaluated at one point and combined > at another.? Performance is a side-effect too. I think the choice between eager and lazy should be reflected in the syntax, if a sub-expression should be evaluated lazily, the expression can be a lambda or a method reference, like GENERATOR."Hello \(Employee::name)" regards, R?mi > > > > *It may not actually be the case that all are consumed exactly once. > Imagine a framework where you can specifiy localized messages in a way > that lets them reorder / reuse parameters; its possible some parameters > need not be evaluated at all.? Again, lazy evaluation defers the costs > until they are needed. From brian.goetz at oracle.com Fri Oct 29 18:26:28 2021 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 29 Oct 2021 14:26:28 -0400 Subject: [External] : Re: Are templated string embedded expressions "method parameters" or "lambdas"? 
In-Reply-To: <1267650847.2850767.1635531519273.JavaMail.zimbra@u-pem.fr> References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <1507501316.2837186.1635526403446.JavaMail.zimbra@u-pem.fr> <278bd79f-5b1e-a59a-8da8-5ed56bdfa2a8@oracle.com> <1267650847.2850767.1635531519273.JavaMail.zimbra@u-pem.fr> Message-ID: > Do we really need to support the second case at all ? Yes :) > Instead of > "Hello \{name()}" > > it can be written to something like > () -> F."Hello \{name()}" That deprives your callee of the opportunity to choose the formatter for you.? Whoops, now the feature is way less expressive.? One of the goals here is to enable APIs to accept TSs as parameters, and be in control of the when and how of formatting. > > I think the choice between eager and lazy should be reflected in the syntax, Oh great, two subtly different sets of semantics.? Way to overspend the complexity budget. From forax at univ-mlv.fr Fri Oct 29 18:44:33 2021 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Fri, 29 Oct 2021 20:44:33 +0200 (CEST) Subject: [External] : Re: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <1507501316.2837186.1635526403446.JavaMail.zimbra@u-pem.fr> <278bd79f-5b1e-a59a-8da8-5ed56bdfa2a8@oracle.com> <1267650847.2850767.1635531519273.JavaMail.zimbra@u-pem.fr> Message-ID: <191598355.2856117.1635533073115.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Brian Goetz" > To: "Remi Forax" > Cc: "Jim Laskey" , "amber-spec-experts" > Sent: Vendredi 29 Octobre 2021 20:26:28 > Subject: Re: [External] : Re: Are templated string embedded expressions "method parameters" or "lambdas"? >> Do we really need to support the second case at all ? > > Yes :) > >> Instead of >> "Hello \{name()}" >> >> it can be written to something like >> () -> F."Hello \{name()}" > > That deprives your callee of the opportunity to choose the formatter for > you.? 
Whoops, now the feature is way less expressive. One of the goals > here is to enable APIs to accept TSs as parameters, and be in control of > the when and how of formatting. If you want the API to choose the formatter, you can take it as a parameter, like a normal lambda: formatter -> formatter."Hello \{name()}" You do not need to provide access to a TemplatedString explicitly in the language, because you can have an identity TemplatePolicy that returns the one taken as argument. > >> >> I think the choice between eager and lazy should be reflected in the syntax, > > Oh great, two subtly different sets of semantics. Way to overspend the > complexity budget. I don't understand; those expressions, lambdas and method references, already exist in Java. Rémi From amaembo at gmail.com Sat Oct 30 05:57:52 2021 From: amaembo at gmail.com (Tagir Valeev) Date: Sat, 30 Oct 2021 12:57:52 +0700 Subject: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> Message-ID: Hello! I think that deferred semantics could be confusing and it provides too little benefit to justify its use. For example: DebugLogger."The number of objects: \{counter.incrementAndGet()}"; We saw a number of bugs like this in assert statements. The unpleasant thing is that in unit-tests assertions are usually turned on. It's possible that debug logging is also turned on for tests. So we can have successful tests and buggy production. Also, it's unclear whether the template processor can invoke the embedded expression several times, pass the MethodHandle to another thread, invoke two embedded expressions concurrently. It's likely that it can. In this case, the results could be totally unpredictable. Another problem is the IDE refactoring.
Taking the example above, we cannot extract the variable without changing the semantics: var count = counter.incrementAndGet(); DebugLogger."The number of objects: \{count}"; As static analysis is somewhat limited, an IDE cannot always predict whether the given template processor always executes the embedded expression and whether the embedded expression has a side effect. Thus, an IDE cannot guarantee refactoring safety. Finally, the inability to use non-effectively-final variables would be very limiting. Note that the most classic way to iterate numbers in Java is still the for loop: for(int i=0; i < array.length; i++) { System.out.println("array[\{i}] = \{array[i]}"); } Inability to do this simple thing would be frustrating. [...] DebugLogger."The number of objects: \{() -> counter.incrementAndGet()}"; // adding `() -> ` I explicitly state that I want the side-effect to be executed only when logging is on. In this case, block lambdas would also be possible, so I could extract the variable while preserving the semantics: DebugLogger."The number of objects: \{() -> {var count = counter.incrementAndGet(); return count;}}"; This lazy evaluation feature can be implemented as a separate iteration; it's not necessary to deliver it together with the eager evaluation one. With best regards, Tagir Valeev. On Fri, Oct 29, 2021 at 9:11 PM Jim Laskey wrote: > > For our early templated string prototypes, we restricted embedded expressions to just basic accessors and basic arithmetic. The intent was to keep things easy to read and to prevent side effects. Over time, we began thinking this restriction was unduly harsh. More precisely, we worried that it would result in a complex, difficult-to-defend boundary. But we still would like users to not rely on side-effects. > > Consequently, a new proposal for embedded expressions - we would allow any Java expression with the restriction that you can't use single quotes, double quotes or escape sequences. We opted to keep this restriction to allow tools (ex., syntax highlighters) to isolate embedded expressions within strings without requiring sophisticated parsing.
> > Given that an unprocessed templated string involves at least some deferred evaluation, should we frame templated string parameters as being more like method parameters (all parameters evaluated eagerly, left to right), or should we treat them as lambda expressions, which may capture (effectively final) variables from the environment, and evaluate the full parameters expressions when they are needed? > > Note too that the effectively final restriction rules out some of the worst side-effect offenders, like: > > int x = 0; > formatter."One \{x++} plus two \{x++} is three \{x}"; > > -- even if we intend to then do eager evaluation! > > To help understand the issue, let's look at a simplification of how the two different paradigms (method parameter vs. lambda) might be implemented. Example: > > int x = 0; > > int method1() { > System.out.println("one"); > return 1; > } > > int method2() { > System.out.println("two"); > return 2; > } > > System.out.println("Before TemplatedString"); > TemplatedString ts = "\{x} and \{method1()} and \{method2()}"; > System.out.println("After TemplatedString"); > System.out.println(CONCAT.apply(ts)); > System.out.println("After Policy"); > > The method parameter paradigm would generate something like following for TemplatedString ts = "\{x} and \{method1()} and \{method2()}"; statement. Basically, capture the values of the evaluated expressions in instance fields. > > TemplatedString ts = new TemplatedString() { > int expr$0 = x; > int expr$1 = method1(); > int expr$2 = method2(); > > String template() { > return "\uFFFC and \uFFFC and \uFFFC"; > } > > List values() { > return List.of(expr$0, expr$1, expr$2); > } > > String concat() { > return expr$0 + " and " + expr$1 + " and " + expr$2; > } > > List vars() { > return List.of(lookupGetter("expr$0"), lookupGetter("expr$1"), lookupGetter("expr$2")); > } > } > > The lambda paradigm would generate something like following. 
Basically, wrap the expression in an instance method and capturing effectively final values used by the methods in instance fields (ala lambda.) > > TemplatedString ts = new TemplatedString() { > int var$x = x; > > int expr$0() { > return var$x; > } > > int expr$1() { > return method1(); > } > > int expr$2() { > return method2(); > } > > String template() { > return "\uFFFC and \uFFFC and \uFFFC"; > } > > List values() { > return List.of(expr$0(), expr$1(), expr$2()); > } > > String concat() { > return expr$0() + " and " + expr$1() + " and " + expr$2(); > } > > List vars() { > return List.of(lookupMethod("expr$0"), lookupMethod("expr$1"), lookupMethod("expr$2")); > } > } > > The output from the method parameter paradigm would be: > > Before TemplatedString > one > two > After TemplatedString > 0 and 1 and 2 > After Policy > > From the lambda paradigm would be: > > Before TemplatedString > After TemplatedString > one > two > 0 and 1 and 2 > After Policy > > To help us evaluating the tradeoffs between the two paradigms, our question to the experts is, "What are the ramifications of each?" Please resist the temptation to express a preference for one or the other. > > Thank you. > > Cheers, > > -- Jim From brian.goetz at oracle.com Sat Oct 30 17:04:46 2021 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 30 Oct 2021 13:04:46 -0400 Subject: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> Message-ID: <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> > I think that deferred semantics could be confusing and it provides too > little benefit to justify its use. For example: > > DebugLogger."The number of objects: \{counter.incrementAndGet()}"; > > We saw a number of bugs like this in assert statements. The unpleasant > thing is that in unit-tests assertions are usually turned on. It's > possible that debug logging is also turned on for tests. 
So we can > have successful tests and buggy production. > > Also, it's unclear whether the template processor can invoke the > embedded expression several times, pass the MethodHandle to another > thread, invoke two embedded expressions concurrently. It's likely that > it can. In this case, the results could be totally unpredictable. > > Another problem is the IDE refactoring. Taking the example above, we > cannot extract the variable without changing the semantics: > > var count = counter.incrementAndGet(); > DebugLogger."The number of objects: \{count}"; > > As static analysis is somewhat limited, IDE cannot always predict > whether the given template processor always executes the embedded > expression and whether the embedded expression has a side effect. > Thus, IDE cannot guarantee refactoring safety. I'd characterize these arguments as "this deferral is too magic and raises too many questions for too little benefit; if you want deferral, do it explicitly with ->." > Finally, the inability to use non-effectively-final variables would be > very limiting. Note that the most classic way to iterate numbers in > Java is still for loop: > > for(int i=0; i < array.length; i++) { > System.out.println("array[\{i}] = \{array[i]}"); > } > > Inability to do this simple thing would be frustrating. I see this as more a deficiency of loop induction variables than lambdas / other capturing constructs. For the old-school for-loop, we're kind of hosed because the induction variable is intrinsically mutable, but for the foreach loop, this is probably fixable; for (T x : source) could treat x as being freshly declared in each block iteration. We visited this during Lambda; perhaps this could be revisited. So I'm not yet convinced that the effectively-final restriction is unreasonable. And, we can always relax from there, but can't go stricter.
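For comparison, here is how the effectively-final rule plays out for ordinary lambdas in loops in current Java. This is a minimal sketch with hypothetical names (LoopCaptureDemo, capture); whether embedded expressions in templated strings would be subject to the same rule is exactly the open question in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical demo class showing lambda capture inside loops.
class LoopCaptureDemo {
    static List<String> capture(int[] array) {
        List<Supplier<String>> deferred = new ArrayList<>();

        // Classic for loop: 'i' is modified by i++, so it is not effectively
        // final and a lambda cannot capture it. A per-iteration copy is needed.
        for (int i = 0; i < array.length; i++) {
            final int index = i;
            deferred.add(() -> "array[" + index + "] = " + array[index]);
        }

        // Enhanced for loop: 'x' is a distinct variable on each iteration and is
        // never reassigned, so a lambda may capture it directly.
        for (int x : array) {
            deferred.add(() -> "value " + x);
        }

        // Run the deferred expressions only now, after both loops have finished.
        List<String> out = new ArrayList<>();
        for (Supplier<String> s : deferred) {
            out.add(s.get());
        }
        return out;
    }

    public static void main(String[] args) {
        capture(new int[] {7, 8}).forEach(System.out::println);
    }
}
```

Removing the `index` copy and using `i` inside the first lambda is rejected by the compiler, which is the kind of friction Tagir's for-loop example would hit under a lambda-style capture rule.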
From forax at univ-mlv.fr Sat Oct 30 17:30:47 2021 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 30 Oct 2021 19:30:47 +0200 (CEST) Subject: Yet another proposal of a templated string design Message-ID: <1230830734.2960737.1635615047968.JavaMail.zimbra@u-pem.fr> Happy Halloween, I spent some time again thinking about how to implement templated strings. From the comments on my previous attempt, the main issues with the previous iteration were that it was a little too magical and that, by using a bootstrap method, it forced the BSM design into the JLS. So I have tried to borrow some aspects of the proposal from Brian and Jim and make them mine. I still think that Brian and Jim's proposal to use an interface and to bind the values into the TemplatedString object is far from good enough. A templated string is fundamentally a string with holes (the template part) and some arguments (the sub-expressions part), so it should be modeled by a method call. In a way, a templated string is similar to a varargs call: at the definition site, we want a special marker like the symbol "..." for varargs, and at the call site the compiler does some transformation (boxing / array creation in the case of varargs). A templated string is in fact more like the opposite of varargs (a spread operator?), because a templated string is expanded into several parameters (a constant TemplatedString and the values of the sub-expressions), while varargs collects several arguments into one parameter. The other thing to remark is that the current syntax, something like Format."name: \(name) age: \(age)", omits the method name, so there is a need for a convention for the compiler to link a templated string call to an actual method definition, in a similar way to how the name "value" is used when declaring an annotation without mentioning a method name.
I think we can group those two constraints by using that a method with a special name, i will use the hyphenated name "template-policy" in the rest of the document, obviously it can be any name. Using an hyphenated name has the advantage to be clear at definition site that the method is special and acts as a kind of spread operator. So i propose that Format."name: \(name) age: \(age)" is semantically equivalent to Format.template-policy(new TemplatedString("name: \uFFFC age: \uFFFC", ...), name, age).result() You can notice that there is a call to result() on the returned value, it's because the returned value can be either a value or a value and a policy factory (a lambda to call to optimize the call, in a very similar way TemplatePolicy.asMethodHandle works). So at declaration site the method template-policy looks like that public class Format { public static TemplatePolicyResult template-policy(TemplatedString templatedString, Object... args) { if (templatedString.parameters().size() != args.length) { throw new IllegalArgumentException(templatedString + " does not accept " + Arrays.toString(args)); } var builder = new StringBuilder(); for(var segment: templatedString.segments()) { builder.append(switch(segment) { case Text text -> text.text(); case Parameter parameter -> args[parameter.index()]; }); } var text = builder.toString(); return TemplatePolicyResult.result(text); } } The compiler can check that the expressions of the templated string are correctly typed, here they have to be assignable to Object. The return type, is the type argument of TemplatePolicyResult, so the result is a String. If we want to optimize the template-policy to use the StringConcatFactory, instead of just specifying a result as return value, we can also specify a policy factory. public class Format { public static TemplatePolicyResult template-policy(TemplatedString templatedString, Object... args) { ... 
// see above var text = builder.toString(); return TemplatePolicyResult.resultAndPolicyFactory(text, StringConcat::policyFactory); } private static MethodHandle policyFactory(TemplatedString templatedString, MethodType methodType) throws StringConcatException { var recipe = templatedString.template().replace(TemplatedString.OBJECT_REPLACEMENT_CHARACTER, '\u0001'); return StringConcatFactory.makeConcatWithConstants(MethodHandles.lookup(), "concat", methodType, recipe) .dynamicInvoker(); } } The semantics is the following, the first time the method template-policy is called, if the result also comes with a policy factory, all subsequent calls will use the method handle returned by the policy factory lambda. Internally at runtime, it means using a MutableCallSite but with the guarantee that after one call the target will never change again (and obviously there is no runtime check needed). The runtime implementation is available here https://github.com/forax/java-interpolation/tree/master/policy-method/src/main/java/com/github/forax/policymethod regards, R?mi From brian.goetz at oracle.com Sat Oct 30 17:49:49 2021 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 30 Oct 2021 13:49:49 -0400 Subject: Yet another proposal of a templated string design In-Reply-To: <1230830734.2960737.1635615047968.JavaMail.zimbra@u-pem.fr> References: <1230830734.2960737.1635615047968.JavaMail.zimbra@u-pem.fr> Message-ID: <684aca6d-9ecc-77ae-12ad-9790983368bd@oracle.com> > The other thing to remark is that the current syntax, something like Format."name: \(name) age: \(age)" omits the method name, so there is need for a convention for the compiler to linked a templated string call to an actual method definition in a similar way the name "value" is used when declaring an annotation without mentioning a method name. If what you're saying is you want a convention for inferring a magic method name when a type is presented, absolutely not.? 
That's the road serialization took, and we're not taking that road again. The path to get from types to behavior is type classes. I don't want to dive into the details of type classes now, but we *can* eventually get to where you want, in a more disciplined way, once we have type classes. In the meantime, the thing to the left side of the dot is a receiver object. > I think we can group those two constraints by using that a method with a special name, i will use the hyphenated name "template-policy" in the rest of the document, obviously it can be any name. Using an hyphenated name has the advantage to be clear at definition site that the method is special and acts as a kind of spread operator. > > So i propose that > Format."name: \(name) age: \(age)" I think using type names on the LHS is a distraction, though, because it's a weaker feature than having an object, because objects have state. Take the SQL example. A Connection could be (or have) a policy object that does quote escaping _according to DB-specific rules_, while still letting users interact through a Connection (rather than a FooBaseConnection.) We don't want to give that up. > is semantically equivalent to > Format.template-policy(new TemplatedString("name: \uFFFC age: \uFFFC", ...), name, age).result() Where we are now is quite close: receiver."Hi \{name}" is basically equivalent to receiver.applyTemplatePolicy("Hi \uFFFC", name) except that we group together the string and the parameters in a single object. This has both implementation and API design benefits; the template policy deals in the same TemplatedString type that you'd get if you didn't specify a policy. I'm still fuzzy, though, on what problem you're trying to solve. Could we uplevel a bit?
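The point about the receiver being an object with state can be sketched as follows. Everything here is hypothetical and only for illustration: Template is a stand-in for the TemplatedString type discussed in the thread (a string and its parameters grouped in one object), SqlPolicy stands in for a policy carried by something like a Connection, and the quote-doubling routine is a toy, not a safe way to build SQL.

```java
import java.util.List;

// Hypothetical sketch: a policy object whose state affects how a template is processed.
class PolicySketch {
    // Stand-in for a captured template: text fragments plus the evaluated values.
    record Template(List<String> fragments, List<Object> values) {}

    // Stand-in for a DB-specific policy; the quote character is per-instance state.
    static final class SqlPolicy {
        private final char quote;

        SqlPolicy(char quote) {
            this.quote = quote;
        }

        // Interleaves fragments and quoted values, in order.
        String apply(Template t) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < t.fragments().size(); i++) {
                sb.append(t.fragments().get(i));
                if (i < t.values().size()) {
                    sb.append(quoted(String.valueOf(t.values().get(i))));
                }
            }
            return sb.toString();
        }

        // Toy escaping: wrap in the quote character and double any embedded one.
        private String quoted(String raw) {
            String q = String.valueOf(quote);
            return q + raw.replace(q, q + q) + q;
        }
    }

    public static void main(String[] args) {
        Template t = new Template(List.of("WHERE name = ", ""), List.of("O'Brien"));
        System.out.println(new SqlPolicy('\'').apply(t));
        System.out.println(new SqlPolicy('"').apply(t));
    }
}
```

The same Template yields different text depending on which policy instance processes it, which is what a static method on a type name could not express without extra parameters.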
From forax at univ-mlv.fr Sat Oct 30 19:56:52 2021 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 30 Oct 2021 21:56:52 +0200 (CEST) Subject: Yet another proposal of a templated string design In-Reply-To: <684aca6d-9ecc-77ae-12ad-9790983368bd@oracle.com> References: <1230830734.2960737.1635615047968.JavaMail.zimbra@u-pem.fr> <684aca6d-9ecc-77ae-12ad-9790983368bd@oracle.com> Message-ID: <222583044.2967995.1635623812889.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Brian Goetz" > To: "Remi Forax" , "amber-spec-experts" > Sent: Samedi 30 Octobre 2021 19:49:49 > Subject: Re: Yet another proposal of a templated string design >> The other thing to remark is that the current syntax, something like >> Format."name: \(name) age: \(age)" omits the method name, so there is need for >> a convention for the compiler to linked a templated string call to an actual >> method definition in a similar way the name "value" is used when declaring an >> annotation without mentioning a method name. > > If what you're saying is you want a convention for inferring a magic > method name when a type is presented, absolutely not.? That's the road > serialization took, and we're not taking that road again. no it's not, one problem of the serialization is the use private methods with normal name that changes the semantics of the serialization. If the method is non normal like with a constructor or used an hyphenated name as i propose, the is no such issue. > > The path to get from types to behavior is type classes.? I don't want to > dive into the details of type classes now, but we *can* eventually get > to where you want, in a more disciplined way, once we have type > classes.? In the meantime, the thing to the left side of the dot is a > receiver object. Having a type classes is not what we need here. A type classes like an interface is a way to constrain the signature of *one* method or a group of methods that works together. 
As a user, i will want to declare several overloads, by example, one that takes values and one that takes functional interfaces, like in the case described by Tagir. A templated string is a kind of spread operator, so we should be able to support several overloads as well as both static and instance methods. > >> I think we can group those two constraints by using that a method with a special >> name, i will use the hyphenated name "template-policy" in the rest of the >> document, obviously it can be any name. Using an hyphenated name has the >> advantage to be clear at definition site that the method is special and acts as >> a kind of spread operator. >> >> So i propose that >> Format."name: \(name) age: \(age)" > > I think using type names on the LHS is a distraction, though, because > its a weaker feature than having an object, because objects have state. > Take the SQL example.? A Connection could be (or have) a policy object > that does quote escaping _according to DB-specific rules_, while still > letting users interact through a Connection (rather than a > FooBaseConnection.)? We don't want to give that up. I don't propose to give anything up, i propose the opposite, it should work with both instance methods and static methods. So yes, obviously, it should work with a BD Connection. > >> is semantically equivalent to >> Format.template-policy(new TemplatedString("name: \uFFFC age: \uFFFC", ...), >> name, age).result() > > Where we are now is quite close: > > ??? receiver."Hi \{name}" > > is basically equivalent to > > ??? receiver.applyTemplatePolicy("Hi \uFFFc", name) > > except that we group together the string and the parameters in a single > object.? This has both implementation and API design benefits; the > template policy deals in the same TemplatedString type that you'd get if > you didn't specify a policy. You have painted yourself into a corner by wanting it to be implemented using an interface. 
You have a problem that you can resolve with a method call and you resolve it by doing a partial application before doing that call, so it works with an interface. Your design reify the arguments into a record-like object implementing TemplatedString generated at runtime (or worst at compile time) and hopes that the escape analysis of c2 will see through the maze of metadata you have just created to turn the whole thing into a method call. That's a lot of fluff for a method call. Sadly, i believe it will only work well if the template policy is specified as a method handle (you need the arguments to be accessed using method handle accessors). If this is true, then it is somewhat better to just have the method apply that returns a method handle, because at least the performance model is clear. But it also means that it will fracture the ecosystem because tools like graal native image that does a static analysis will not be able to see through that method handle. > > I'm still fuzzy, though, on what problem you're trying to solve. Could > we uplevel a bit? I'm trying to explain to you that you can solve a problem that can be solved by a method call by a method call. regards, R?mi From guy.steele at oracle.com Sat Oct 30 21:11:52 2021 From: guy.steele at oracle.com (Guy Steele) Date: Sat, 30 Oct 2021 21:11:52 +0000 Subject: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> Message-ID: On Oct 30, 2021, at 1:04 PM, Brian Goetz > wrote: I think that deferred semantics could be confusing and it provides too little benefit to justify its use. . . . I'd characterize these argument as "this deferral is too magic and raises too many questions for too little benefit; if you want deferral, do it explicitly with ->. 
I will go a step further: the proposed deferral semantics is too _surprising_. Joe Programmer will look at FOO."I have \{n+1} apples in my basket and \{c.size()} elements in my collection." and fully expect that the expressions `n+1` and `c.size()` will be evaluated right away, before any of the magic associated with that dot happens. There are languages for which deferred evaluation is the norm, the expected way of doing things. Haskell is one of them. Java is NOT one of them. Java now has a standard strategy for user-specified deferred evaluation: ->. So it would also be surprising to introduce another, unrelated mechanism for user-specified deferred evaluation without a REALLY compelling reason. (I say "user-specified" because there are certain other mechanisms that are built-in, such as the ? : operator and statements such as `if` and `while`.) Joe Programmer would expect that if FOO wants to provide for deferred evaluation, it would do so by accepting lambdas, and Joe would expect to provide them. -- Guy From guy.steele at oracle.com Sat Oct 30 21:13:08 2021 From: guy.steele at oracle.com (Guy Steele) Date: Sat, 30 Oct 2021 21:13:08 +0000 Subject: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> Message-ID: On Oct 29, 2021, at 10:10 AM, Jim Laskey wrote: For our early templated string prototypes, we restricted embedded expressions to just basic accessors and basic arithmetic. The intent was to keep things easy to read and to prevent side effects. Over time, we began thinking this restriction was unduly harsh. More precisely, we worried that it would result in a complex, difficult-to-defend boundary. But we still would like users to not rely on side-effects.
Consequently, a new proposal for embedded expressions - we would allow any Java expression with the restriction that you can't use single quotes, double quotes or escape sequences. We opted to keep this restriction to allow tools (ex., syntax highlighters) to isolate embedded expressions within strings without requiring sophisticated parsing.

An interesting restriction. Would comments also be forbidden within such expressions (for the same reason)?

--Guy

From john.r.rose at oracle.com Sat Oct 30 21:44:46 2021
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 30 Oct 2021 21:44:46 +0000
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To:
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com>
Message-ID: <635336FF-72B7-4745-B708-203A4A281437@oracle.com>

Brian will not be surprised to hear that I agree with Guy here.

The Logger use case is the main one, IIUC, which is driving the insertion of pass-by-name (automatic quoting as if by ()->) into the current ST design.

If we want something like Log.note("foo = "+foo()) to auto-quote to Log.note(() -> "foo = "+foo()) we should do it as its own avowed language feature (akin to auto boxing and varargs) and not hide it inside of string templates.

(Turning x to ()->x is indeed a kind of quoting, since it allows the user access to a useful name of x without elaborating x. If and when we add lambda cracking it will also become a way to literally quote expression trees. If the term auto-quoting is off the mark, may I suggest auto-delaying or pass-by-name.)
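The difference between the eager call and its explicitly delayed form can be seen in today's Java. The following is only a minimal sketch: the `Log` class, its `enabled` flag and the counter are hypothetical stand-ins for illustration, not an API proposed anywhere in this thread.

```java
import java.util.function.Supplier;

public class DeferredLogDemo {
    // Counts how often the "expensive" expression is actually evaluated.
    static int calls = 0;

    // Stand-in for an expensive embedded expression such as foo().
    static String foo() {
        calls++;
        return "expensive";
    }

    // Hypothetical logger with an eager and a deferred overload.
    static final class Log {
        static boolean enabled = false;

        // Eager: the caller has already built the message.
        static void note(String message) {
            if (enabled) System.out.println(message);
        }

        // Deferred: the message is built only if logging is enabled.
        static void note(Supplier<String> message) {
            if (enabled) System.out.println(message.get());
        }
    }

    // Returns the evaluation count after the eager call and after the deferred call.
    static int[] run() {
        calls = 0;
        Log.note("foo = " + foo());        // foo() runs although logging is off
        int afterEager = calls;
        Log.note(() -> "foo = " + foo());  // foo() does not run
        int afterDeferred = calls;
        return new int[] { afterEager, afterDeferred };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println(counts[0] + " " + counts[1]); // prints "1 1"
    }
}
```

With logging disabled, the eager form still pays for foo(), while the form written with -> does not. That cost difference is what the auto-quoting idea would hide from the reader.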
I would be delighted if we could define string templates without rolling in an auto-delay feature for their arguments, then later add auto-delay of arguments (to arbitrary methods) as a proper language feature, and then, triumphantly, cross-apply those two features to create Logger APIs that (a) make great use of STs, and (b) smoothly delay their operands until the Logger has decided it really wants them.

On Oct 30, 2021, at 2:11 PM, Guy Steele wrote:

On Oct 30, 2021, at 1:04 PM, Brian Goetz wrote:

I think that deferred semantics could be confusing and it provides too little benefit to justify its use.

. . .

I'd characterize these arguments as "this deferral is too magic and raises too many questions for too little benefit; if you want deferral, do it explicitly with ->."

I will go a step further: the proposed deferral semantics is too _surprising_.

Joe Programmer will look at

FOO."I have \{n+1} apples in my basket and \{c.size()} elements in my collection."

and fully expect that the expressions `n+1` and `c.size()` will be evaluated right away, before any of the magic associated with that dot happens.

There are languages for which deferred evaluation is the norm, the expected way of doing things. Haskell is one of them. Java is NOT one of them.

Java now has a standard strategy for user-specified deferred evaluation: ->. So it would also be surprising to introduce another, unrelated mechanism for user-specified deferred evaluation without a REALLY compelling reason. (I say "user-specified" because there are certain other mechanisms that are built-in, such as the ? : operator and statements such as `if` and `while`.)

Joe Programmer would expect that if FOO wants to provide for deferred evaluation, it would do so by accepting lambdas, and Joe would expect to provide them.
--Guy

From guy.steele at oracle.com Sat Oct 30 21:58:33 2021
From: guy.steele at oracle.com (Guy Steele)
Date: Sat, 30 Oct 2021 21:58:33 +0000
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To: <635336FF-72B7-4745-B708-203A4A281437@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> <635336FF-72B7-4745-B708-203A4A281437@oracle.com>
Message-ID: <4AA2D9FA-013A-409C-A673-111272B19BCC@oracle.com>

And Brian will not be surprised to hear that I agree with John's further elaborations here. If it can be done, it would be far preferable to have two separate, orthogonal, general features that interact smoothly.

On Oct 30, 2021, at 5:44 PM, John Rose wrote:

Brian will not be surprised to hear that I agree with Guy here.

The Logger use case is the main one, IIUC, which is driving the insertion of pass-by-name (automatic quoting as if by ()->) into the current ST design.

If we want something like Log.note("foo = "+foo()) to auto-quote to Log.note(() -> "foo = "+foo()) we should do it as its own avowed language feature (akin to auto boxing and varargs) and not hide it inside of string templates.

(Turning x to ()->x is indeed a kind of quoting, since it allows the user access to a useful name of x without elaborating x. If and when we add lambda cracking it will also become a way to literally quote expression trees. If the term auto-quoting is off the mark, may I suggest auto-delaying or pass-by-name.)

I would be delighted if we could define string templates without rolling in an auto-delay feature for their arguments, then later add auto-delay of arguments (to arbitrary methods) as a proper language feature, and then, triumphantly, cross-apply those two features to create Logger APIs that (a) make great use of STs, and (b) smoothly delay their operands until the Logger has decided it really wants them.
On Oct 30, 2021, at 2:11 PM, Guy Steele wrote:

On Oct 30, 2021, at 1:04 PM, Brian Goetz wrote:

I think that deferred semantics could be confusing and it provides too little benefit to justify its use.

. . .

I'd characterize these arguments as "this deferral is too magic and raises too many questions for too little benefit; if you want deferral, do it explicitly with ->."

I will go a step further: the proposed deferral semantics is too _surprising_.

Joe Programmer will look at

FOO."I have \{n+1} apples in my basket and \{c.size()} elements in my collection."

and fully expect that the expressions `n+1` and `c.size()` will be evaluated right away, before any of the magic associated with that dot happens.

There are languages for which deferred evaluation is the norm, the expected way of doing things. Haskell is one of them. Java is NOT one of them.

Java now has a standard strategy for user-specified deferred evaluation: ->. So it would also be surprising to introduce another, unrelated mechanism for user-specified deferred evaluation without a REALLY compelling reason. (I say "user-specified" because there are certain other mechanisms that are built-in, such as the ? : operator and statements such as `if` and `while`.)

Joe Programmer would expect that if FOO wants to provide for deferred evaluation, it would do so by accepting lambdas, and Joe would expect to provide them.

--Guy

From john.r.rose at oracle.com Sat Oct 30 22:22:14 2021
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 30 Oct 2021 22:22:14 +0000
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To: <635336FF-72B7-4745-B708-203A4A281437@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> <635336FF-72B7-4745-B708-203A4A281437@oracle.com>
Message-ID: <8EA56F3B-9615-471F-8C79-C84952B5FBE9@oracle.com>

On Oct 30, 2021, at 2:44 PM, John Rose wrote:

its own avowed language feature

And, full disclosure, besides pass-by-name I see two other language features that could be distinguished from the current design, whether or not they ever appear separately (as I hope pass-by-name/autoquote will some day appear, for expression tree processing more than loggers).

2. Reified argument types: There is a nifty way that javac captures the static types of the "hole" arguments of the template and makes them available, in case the template processor would find it useful. (This is not reified type arguments, but could be a building block.) This happens also for many uses of method handles and invokedynamic, but always "under the covers". This is the first time static argument types have been reified as part of an API design. Perhaps that deserves some pathfinding discussion at this point. Perhaps string templates intrinsically "own the turf" of reifying static argument types, but I don't see this yet; I think they have a natural "ask" for reified types, but *are not alone*. If S.T.s are an early customer of such things, it would be wise to take a breath and think ahead, to other customers, if possible aligning the way S.T.s do it from the first with some set of likely futures.

3. Static validation: This is a biggie. We want a hook that allows (though does not require) a S.T. provider to validate a string template, once for each distinct context. What is the static context? Well, in x."y\{z}" that would be at least the string containing y plus (see above) the static type of z, plus something derived from x.
Plus maybe -- I hope -- an indy-like context, so that the validation logic has the option (but not requirement) to look up the stack for permission checks. Plus, probably, the entire contents of x (which is not static if x is a proper variable!).

I call it "static" validation because clearly we want to play out the validation logic the first time some given context is evaluated, but we don't want to replay it for later evaluations in the same context. The string is clearly constant as are the static types of the arguments. The rest is not so clearly constant, so the prototype uses some tricks with MutableCallSite to make the evaluation mostly static. I think this needs scrutiny, and I would also like to consider other ways (without appealing to an MCS) to gain the required static evaluation hooks.

Some of the use cases defer static validation until a database connection is opened and an SQL idiom is determined. In the more problematic case, the validation is not static because the database connection is not a constant, and the idiom may vary from call to call. The best you can do in such a circumstance is a cache. A MCS is the wrong tool in that circumstance; it is best to allow the SQL template designer to create whatever cache is most appropriate to the expected set of SQL idioms. The ST provider, per se, will in this case do minimal checking, or none at all, in its "static validation" hook.

In any case, I think SQL idiom validation is very far from the central use case to design around. I think format strings and regular expressions are much more likely as use cases, and those have truly static validation stories that make sense. They may require static types to do the best job, but much validation can be done without argument types as well. Clearly, the design will succeed for regular expressions if and only if Pattern.compile can be executed reliably once before the first evaluation of the template as a whole.
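The "compile once, evaluate many times" requirement for regular expressions can be approximated in plain library code. The sketch below is only an illustration under stated assumptions: `validate` and `apply` are hypothetical names, the template body is treated as a regex with no holes, and a map keyed on the body stands in for the per-call-site state that an indy-based design would provide.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

public class CompileOnceDemo {
    // Counts real compilations, to make the "once" property observable.
    static final AtomicInteger compilations = new AtomicInteger();

    // One compiled Pattern per distinct template body.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    // Hypothetical static hook: validates and compiles a body at most once.
    // Pattern.compile throws PatternSyntaxException for an invalid body.
    static Pattern validate(String body) {
        return CACHE.computeIfAbsent(body, b -> {
            compilations.incrementAndGet();
            return Pattern.compile(b);
        });
    }

    // Dynamic part: reuses the precompiled Pattern on every evaluation.
    static boolean apply(String body, String input) {
        return validate(body).matcher(input).matches();
    }

    public static void main(String[] args) {
        String body = "[a-z]+\\d*";
        System.out.println(apply(body, "abc123"));  // true
        System.out.println(apply(body, "ABC"));     // false
        System.out.println(compilations.get());     // 1
    }
}
```

A cache like this pays for a lookup on every evaluation, which is the cost a truly static hook would avoid.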
To me this suggests that we should be thinking about what it means to execute a "static hook" for an expression, which prepares for the subsequent execution of a method call that finishes the call. Maybe this is just a design pattern, but there is evidence that it needs language support. The language support comes in when the compiler separates arguments meant only for the static hook from arguments used to finish the whole expression evaluation. It would seem that we need a notion of "static arguments" as opposed to regular "dynamic arguments", for expressions that have a "static hook".

I don't want to dilute the S.T. discussion any more than I already have, so I'll just say that S.T.'s have static arguments that are (a) the string itself with holes marked and (b) maybe the argument types for the expressions for those holes.

I suggest that, until we roll out more of the machinery we intend to roll out, such as type classes, that we restrict the operand x (the receiver LHS of the S.T.) to be a statically constant expression. If we do that, then we can hand three things to a bootstrap method that can do static validation:

1. the (constant) receiver
2. the string body (with holes marked)
3. maybe the static types of the arguments (this is very natural for indy)

If we allow the receiver to vary dynamically, we have boxed ourselves into a corner, regarding validation.

By "constant x" I simply mean the name of a static final field. There are not many other reference-valued expressions which are "constant" enough to permit static validation.

This would give us a story for validation from day one (using indy) and would also preserve room to grow to more complex receivers, though not all at once. If we roll out a story for constant folding, the set of possible x naturally grows, without further trouble. And when we roll out type classes, then we can (again without much more trouble) support non-constant "x" receivers, if their type is able to nominate the appropriate S.T.
handler witness. In this I am assuming that witnesses will always be constants relative to a given request for a witness. The Parametric VM design provides a way to do this even when the witnessed type is a type parameter. From james.laskey at oracle.com Sat Oct 30 22:25:22 2021 From: james.laskey at oracle.com (Jim Laskey) Date: Sat, 30 Oct 2021 22:25:22 +0000 Subject: Are templated string embedded expressions "method parameters" or "lambdas"? In-Reply-To: References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> Message-ID: Though I agree it's not a strong argument, but comment content is arbitrary. It's either /* arbitrary characters */ or // arbitrary characters . String content has to be contextual interpreted, i.e. have knowledge of valid escape sequences or translation of eoln. String s = """ \{"abc\"}"} """; Also, do we allow String t = """ \{"\{\"abc\"}"} """; or String t = """ \""" \{"\{\"abc\"}"} \""" """; It's not impossible to parse, just a little more complicated than just counting braces. On Oct 30, 2021, at 6:13 PM, Guy Steele > wrote: On Oct 29, 2021, at 10:10 AM, Jim Laskey > wrote: For our early templated string prototypes, we restricted embedded expressions to just basic accessors and basic arithmetic. The intent was to keep things easy to read and to prevent side effects. Over time, we began thinking this restriction was unduly harsh. More precisely, we worried it that it would result in a complex, difficult-to-defend boundary. But we still would like users to not rely on side-effects. Consequently, a new proposal for embedded expressions - we would allow any Java expression with the restriction that you can't use single quotes, double quotes or escape sequences. We opted to keep this restriction to allow tools (ex., syntax highlighters) to isolate embedded expressions within strings without requiring sophisticated parsing. An interesting restriction. Would comments also be forbidden within such expressions (for the same reason)? 
--Guy

From john.r.rose at oracle.com Sat Oct 30 22:37:42 2021
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 30 Oct 2021 22:37:42 +0000
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com>
Message-ID:

On Oct 29, 2021, at 7:10 AM, Jim Laskey wrote:
>
> Given that an unprocessed templated string involves at least some deferred evaluation, should we frame templated string parameters as being more like method parameters (all parameters evaluated eagerly, left to right), or should we treat them as lambda expressions, which may capture (effectively final) variables from the environment, and evaluate the full parameters expressions when they are needed?

If we take away that given, that deferred evaluation is "baked into" S.T., we are left (at least, I hope we will be left) with no natural constraints on expressions, just the usual practical ones.

It would be comforting to me if I found that, in the end, all the "hole arguments" to S.T.'s are nothing more or less than garden-variety method arguments, with no special processing and no special restrictions.

It is natural to want to forestall "bad form" uses of a new feature, and make parsing easier, and make it harder to write puzzlers, and all the rest. But there is a limit to such things, and eventually you have to let the user decide whether to misuse the language, according to the tastes of its designers.

In any case, my design heuristics prefer few or no limitations on what can "go in the hole", just as there are now few or no limitations on what can be passed as a method argument. Anything more restrictive feels like the Language Nannies have arrived.

(Also, when I *do* use a Nanny, I prefer it to be wired into the IDE, not the language.)

--
John

From john.r.rose at oracle.com Sat Oct 30 23:18:38 2021
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 30 Oct 2021 23:18:38 +0000
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To: <8EA56F3B-9615-471F-8C79-C84952B5FBE9@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> <635336FF-72B7-4745-B708-203A4A281437@oracle.com> <8EA56F3B-9615-471F-8C79-C84952B5FBE9@oracle.com>
Message-ID: <9A81A710-EDC1-43C2-91A1-89E6FEE8BCC9@oracle.com>

On Oct 30, 2021, at 3:22 PM, John Rose wrote:

restrict the operand x (the receiver LHS of the S.T.) to be a statically constant expression. If we do that, then we can hand three things to a bootstrap method that can do static validation:

1. the (constant) receiver
2. the string body (with holes marked)
3. maybe the static types of the arguments (this is very natural for indy)

Completing the design is pretty straightforward, but I might as well write out more of my work. Here's *one possible* design in which the terminal "apply" operation is performed under the name "MethodHandle.invokeExact".

X."y...\{z...}" translates to an invokedynamic instruction

The static arguments to the indy instruction are X (formed as a CONSTANT_Dynamic constant as necessary) and the string body containing y with hole indicators.

Thus, the indy BSM gets the following:

1. a Lookup
2. a name (ignored)
3. a method-type (composed of the static types of z, returning R the expression type)
4. X (via condy)
5. "y..." where the holes are appropriately marked

It returns a CallSite, which is then used for all evaluations of the expression. Applications will use a ConstantCallSite.

That is the mechanism. It does not say what is the logic of the BSM or the type R. That is where the language rules come in.

The type of X must contain, directly or not, two or three methods, validate, apply, asMethodHandle.
The methods are declared as abstracts using one or two API types. (Logically, they could also be left "hanging" outside of any interface as the magic methods Brian detests.)

I will show one-interface and two-interface potential designs.

interface ST_A { // 1 type with 3 methods
  ST12 validate(Lookup, String, MethodType);
  apply(E...);
  MethodHandle asMethodHandle();
}

interface ST_B { // 2 types with 1 or 2 methods
  Applier validate(Lookup, String, MethodType);
  interface Applier {
    apply(Object... E);
    MethodHandle asMethodHandle();
  }
  //default R validateAndApply(Lookup, String, MethodType) { ... }
}

interface ST_C { // 1 type with 2 methods, plus MH
  MethodHandle validate(Lookup, String, MethodType);
  R validateAndInvoke(Lookup, String, MethodType, Object...);
}
// "apply" here is MethodHandle::invokeExact; asMethodHandle is a nop

The language infers R as usual, as if the call were going through apply (A), validate then apply (B) or validateAndInvoke (C). But the BSM uses drives the same API points to obtain the needed MethodHandle, which is then installed in a CCS.

Further variations: Have a static hook to produce, not a CCS but a general CS such as a MCS. Drop the Lookup argument from the APIs, because who wants that? You can add it later.

The oddity here, as in existing prototypes, is that there are "two sets of books", one for indy with its static bootstrap logic that produces a method handle, and one "for the police" which shows how all the types (including R) fit together.

All of the above APIs allow implementations (subtypes) of the interface to supply different calling sequences for the eventual apply (or invoke). This is important, because a logger-oriented Applier wants to accept delayed evaluation lambdas if possible, while other simpler uses of the S.T. mechanism will be happy to get along with just the Object arguments of apply(Object...).
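The linkage mechanism described above (a bootstrap method that validates the constant parts once and returns a ConstantCallSite) can be imitated with the existing java.lang.invoke API. This is a rough sketch, not the prototype's code: the class, the `interpolate` helper and the U+FFFC hole marker are assumptions made for illustration, and the bootstrap method is called by hand where javac would emit an invokedynamic instruction. The receiver X is left out.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BootstrapSketch {
    // Assumed hole marker; the thread does not say how holes are encoded.
    static final String HOLE = "\uFFFC";

    // Counts bootstrap calls, to show that linkage happens once.
    static int bootstraps = 0;

    // Stands in for the linked call site of one template expression.
    private static final MethodHandle SITE = link();

    // Dynamic part: interleaves the constant fragments with the hole values.
    static String interpolate(String[] fragments, Object[] args) {
        StringBuilder sb = new StringBuilder(fragments[0]);
        for (int i = 0; i < args.length; i++) {
            sb.append(args[i]).append(fragments[i + 1]);
        }
        return sb.toString();
    }

    // Shaped like an indy bootstrap method: Lookup, name, MethodType,
    // plus the template body as an extra static argument.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name,
                              MethodType type, String body)
            throws ReflectiveOperationException {
        bootstraps++;
        String[] fragments = body.split(HOLE, -1);
        // Static validation: runs once, at link time.
        if (fragments.length - 1 != type.parameterCount()) {
            throw new IllegalArgumentException("hole count != argument count");
        }
        MethodHandle target = lookup.findStatic(
                BootstrapSketch.class, "interpolate",
                MethodType.methodType(String.class, String[].class, Object[].class));
        target = target.bindTo(fragments)                        // fix the constant part
                .asCollector(Object[].class, type.parameterCount())
                .asType(type);                                    // adapt to the static types
        return new ConstantCallSite(target);
    }

    // Manual linkage; with indy the JVM would do this on first execution.
    private static MethodHandle link() {
        try {
            MethodType type = MethodType.methodType(String.class, int.class, int.class);
            String body = "I have " + HOLE + " apples and " + HOLE + " elements.";
            return bootstrap(MethodHandles.lookup(), "ignored", type, body).dynamicInvoker();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Each evaluation of the expression goes through the already linked handle.
    static String basket(int apples, int elements) {
        try {
            return (String) SITE.invokeExact(apples, elements);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(basket(3, 5));
        System.out.println(basket(4, 0));
        System.out.println("bootstraps = " + bootstraps);
    }
}
```

The method type passed to the bootstrap method carries the static types of the holes (two ints here), which is the "reified argument types" ingredient mentioned earlier in the thread.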
One of the fine points of this design is whether and how to statically type the *hole arguments* and whether the static type of the receiver (x in x."...") can affect the subsequent static typing of the hole arguments. With a separate Applier type, the degrees of freedom in hole type checking are, maybe, a little easier to manage, but all of the API types above are malleable to some degree. Ultimately, I think we will be pushed to allow some amount of overloading on the "apply" method, if use cases demand static checking of argument lists. I've put in the "E" parameter above as a stop gap to allow (at least) the necessary distinction between Object and Supplier for distinct use cases.

If we ever do "Varargs 2.0" (better varargs, with richer argument type patterns encoded into the VA receiver), that will naturally add value to the above APIs, if they can be retrofitted or replaced with VA2.0 APIs situated on apply.

That last one (ST_C) is nice and simple. Maybe that's a good one to start with, maybe sans Lookup. The others can be layered on later on.

A final word: If you said "that's a curried function" to yourself at some point reading the above, you are not wrong.

From amaembo at gmail.com Sun Oct 31 04:21:04 2021
From: amaembo at gmail.com (Tagir Valeev)
Date: Sun, 31 Oct 2021 11:21:04 +0700
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To: <635336FF-72B7-4745-B708-203A4A281437@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> <635336FF-72B7-4745-B708-203A4A281437@oracle.com>
Message-ID:

+1 to Guy and John

With best regards,
Tagir Valeev

On Sun, Oct 31, 2021 at 4:44 AM John Rose wrote:
>
> Brian will not be surprised to hear that I agree with Guy here.
>
> The Logger use case is the main one, IIUC, which is driving
> the insertion of pass-by-name (automatic quoting as if by ()->)
> into the current ST design.
> > If we want something like Log.note("foo = "+foo()) to auto-quote to > Log.note(() -> "foo = "+foo()) we should do it as its own avowed > language feature (akin to auto boxing and varargs) and not hide > it inside of string templates. > > (Turning x to ()->x is indeed a kind of quoting, since it allows the > user access to a useful name of x without elaborating x. > If and when we add lambda cracking it will also become a way > to literally quote expression trees. If the term auto-quoting > is off the mark, may I suggest auto-delaying or pass-by-name.) > > I would be delighted if we could define string templates without > rolling in an auto-delay feature for their arguments, then later > add auto-delay of arguments (to arbitrary methods) as a proper > language feature, and then, triumphantly, cross-apply those > two features to create Logger APIs that (a) make great use of > STs, and (b) smoothly delay their operands until the Logger > has decided it really wants them. > > On Oct 30, 2021, at 2:11 PM, Guy Steele wrote: > > > On Oct 30, 2021, at 1:04 PM, Brian Goetz wrote: > > > I think that deferred semantics could be confusing and it provides too > little benefit to justify its use. . . . > > > I'd characterize these argument as "this deferral is too magic and raises too many questions for too little benefit; if you want deferral, do it explicitly with ->. > > > I will go a step further: the proposed deferral semantics is too _surprising_. > > Joe Programmer will look at > > FOO."I have \{n+1} apples in my basket and \{c.size()} elements in my collection." > > and fully expect that the expressions `n+1` and `c.size()` will be evaluated right away, before any of the magic associated with that dot happens. > > There are languages for which deferred evaluation is the norm, the expected way of doing things. Haskell is one of them. Java is NOT one of them. > > Java now has a standard strategy for user-specified deferred evaluation: ->. 
So it would also be surprising to introduce another, unrelated mechanism for user-specified deferred evaluation without a REALLY compelling reason. (I say "user-specified" because there are certain other mechanisms that are built-in, such as the ? : operator and statements such as `if` and `while`.)
>
> Joe Programmer would expect that if FOO wants to provide for deferred evaluation, it would do so by accepting lambdas, and Joe would expect to provide them.
>
> --Guy
>
>

From amaembo at gmail.com Sun Oct 31 04:30:04 2021
From: amaembo at gmail.com (Tagir Valeev)
Date: Sun, 31 Oct 2021 11:30:04 +0700
Subject: Effectively final loop counter (was: Are templated string embedded expressions "method parameters" or "lambdas"?)
In-Reply-To: <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com>
Message-ID:

> > Finally, the inability to use non-effectively-final variables would be
> > very limiting. Note that the most classic way to iterate numbers in
> > Java is still for loop:
> >
> > for(int i=0; i > System.out.println("array[\{i}] = \{array[i]}");
> > }
> >
> > Inability to do this simple thing would be frustrating.
>
> I see this as more a deficiency of loop induction variables than lambdas
> / other capturing constructs. For the old-school for-loop, we're kind
> of hosed because the induction variable is intrinsically mutable, but
> for the foreach loop, this is probably fixable;
>
> for (T x : source)
>
> could treat x as being freshly declared in each block iteration. We
> visited this during Lambda; perhaps this could be revisited. So I'm not
> yet convinced that the effectively-final restriction is unreasonable.
> And, we can always relax from there, but can't go stricter.
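For reference, here is how the effectively-final rule already treats the two loop forms when the capturing construct is a lambda. This is a small sketch in current Java; the class and variable names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LoopCaptureDemo {
    static List<String> run() {
        int[] array = {10, 20, 30};
        List<Supplier<String>> deferred = new ArrayList<>();

        // Enhanced for: x is never reassigned in the body, so it is
        // effectively final and can be captured directly.
        for (int x : array) {
            deferred.add(() -> "x = " + x);
        }

        // Classic for: i is mutated by i++, so capturing it directly is a
        // compile-time error; it has to be copied into a fresh variable first.
        for (int i = 0; i < array.length; i++) {
            int index = i; // effectively final copy, one per iteration
            deferred.add(() -> "array[" + index + "] = " + array[index]);
        }

        // Evaluate the deferred expressions after both loops have finished.
        List<String> results = new ArrayList<>();
        for (Supplier<String> s : deferred) {
            results.add(s.get());
        }
        return results;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The manual `index` copy is the boilerplate that treating embedded expressions like lambdas would force onto the classic counting loop.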
This is a somewhat separate topic but I would be glad to see some improvement here with respect to good old for loops, as I often saw copying to a new variable for the sake of lambda capture inside the loop. Like if a variable is declared at the for loop initializer and it's never modified inside the `for` loop body, then let's assume that before body entry, a fresh variable is created with the same name and assigned to the original variable shadowing the original variable. This way, the counter variable of the classic counting loop will be considered as effectively final inside the loop body (but not inside the condition and update expressions). This will make for (int i=0; i References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> Message-ID: <89d60b81-fe2a-815e-24b6-37110e908513@oracle.com> > This is a somewhat separate topic but I would be glad to see some > improvement here with respect to good old for loops, as I often saw > copying to a new variable for the sake of lambda capture inside the > loop. As a separate separate topic, I'd like to make good-old-for-loops more of a thing of the past. For example, there's a reasonable stacking of features atop Valhalla that gets us to: ??? for (int i : 0.. References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> <635336FF-72B7-4745-B708-203A4A281437@oracle.com> <8EA56F3B-9615-471F-8C79-C84952B5FBE9@oracle.com> <9A81A710-EDC1-43C2-91A1-89E6FEE8BCC9@oracle.com> Message-ID: <1045013382.3039141.1635694733285.JavaMail.zimbra@u-pem.fr> > From: "John Rose" > To: "Guy Steele" > Cc: "Brian Goetz" , "Tagir Valeev" , > "Jim Laskey" , "amber-spec-experts" > > Sent: Dimanche 31 Octobre 2021 01:18:38 > Subject: Re: Are templated string embedded expressions "method parameters" or > "lambdas"? 
> On Oct 30, 2021, at 3:22 PM, John Rose <john.r.rose at oracle.com> wrote:

>> restrict the operand x (the receiver LHS of the S.T.)
>> to be a statically constant expression. If we do that,
>> then we can hand three things to a bootstrap method
>> that can do static validation:

>> 1. the (constant) receiver
>> 2. the string body (with holes marked)
>> 3. maybe the static types of the arguments (this is very natural for indy)

> Completing the design is pretty straightforward, but I might
> as well write out more of my work. Here's *one possible* design
> in which the terminal "apply" operation is performed under
> the name "MethodHandle.invokeExact".

> X."y...\{z...}" translates to an invokedynamic instruction

> The static arguments to the indy instruction are X (formed
> as a CONSTANT_Dynamic constant as necessary) and the
> string body containing y with hole indicators.

> Thus, the indy BSM gets the following:

> 1. a Lookup
> 2. a name (ignored)
> 3. a method-type (composed of the static types of z, returning R the expression
> type)
> 4. X (via condy)
> 5. "y..." where the holes are appropriately marked

> It returns a CallSite, which is then used for all evaluations
> of the expression. Applications will use a ConstantCallSite.

> That is the mechanism. It does not say what is the logic of the BSM
> or the type R. That is where the language rules come in.

> The type of X must contain, directly or not, two or three methods,
> validate, apply, asMethodHandle. The methods are declared as abstracts
> using one or two API types. (Logically, they could also be left "hanging"
> outside of any interface as the magic methods Brian detests.)

One missing point is that it should be possible to do a static analysis of the code so asMethodHandle is a way to improve the performance, not to specify or change the semantics.

> I will show one-interface and two-interface potential designs.
> interface ST_A { // 1 type with 3 methods
> ST12 validate(Lookup, String, MethodType);
> apply(E...);
> MethodHandle asMethodHandle();
> }

It's more of an implementation detail, but if you have the method apply(E...), the compiler will generate a bridge method, but the implementation will have no way to find it; you also need to provide the reified argument used as E. We had the same issue when generating the lambda proxy.

> interface ST_B { // 2 types with 1 or 2 methods
> Applier validate(Lookup, String, MethodType);
> interface Applier {
> apply(Object... E);
> MethodHandle asMethodHandle();
> }
> //default R validateAndApply(Lookup, String, MethodType) { ... }
> }

> interface ST_C { // 1 type with 2 methods, plus MH
> MethodHandle validate(Lookup, String, MethodType);
> R validateAndInvoke(Lookup, String, MethodType, Object...);
> }
> // "apply" here is MethodHandle::invokeExact; asMethodHandle is a nop

> The language infers R as usual, as if the call were going through
> apply (A), validate then apply (B) or validateAndInvoke (C).
> But the BSM uses drives the same API points to obtain the
> needed MethodHandle, which is then installed in a CCS.

> Further variations: Have a static hook to produce, not a CCS
> but a general CS such as a MCS. Drop the Lookup argument
> from the APIs, because who wants that? You can add it later.

I think that having the Lookup argument is actually harmful. Unlike the way we use BSMs for lambdas, string concatenation or records, where they are defined in the JDK, here the equivalent of the bootstrap method is implemented in library code. As a user, I don't want to pass a Lookup on my class to a library just because I'm using a templated string; this is too much power.

> The oddity here, as in existing prototypes, is that there are
> "two sets of books", one for indy with its static bootstrap
> logic that produces a method handle, and one "for the
> police" which shows how all the types (including R) fit
> together.
> All of the above APIs allow implementations (subtypes) of the
> interface to supply different calling sequences for the eventual
> apply (or invoke). This is important, because a logger-oriented
> Applier wants to accept delayed evaluation lambdas if possible,
> while other simpler uses of the S.T. mechanism will be happy
> to get along with just the Object arguments of apply(Object...).

If you want to design a template policy for a logger, you want to allow both direct evaluation and delayed evaluation by providing two overloaded methods, like
void log(TemplatedString template, Object... args)
and
void log(TemplatedString template, Supplier... args)

> One of the fine points of this design is whether and how
> to statically type the *hole arguments* and whether the
> static type of the receiver (x in x."...") can affect the
> subsequent static typing of the hole arguments. With
> a separate Applier type, the degrees of freedom in hole
> type checking are, maybe, a little easier to manage,
> but all of the API types above are malleable to some
> degree. Ultimately, I think we will be pushed to allow
> some amount of overloading on the "apply" method,

yes !

> if use cases demand static checking of argument
> lists. I've put in the "E" parameter above as a stop
> gap to allow (at least) the necessary distinction between
> Object and Supplier for distinct use cases.
> If we ever do "Varargs 2.0" (better varargs, with
> richer argument type patterns encoded into the VA
> receiver), that will naturally add value to the above
> APIs, if they can be retrofitted or replaced with VA2.0
> APIs situated on apply.
> That last one (ST_C) is nice and simple. Maybe that's
> a good one to start with, maybe sans Lookup. The others
> can be layered on later on.

There is a fourth variation; I will show it using an interface, but it also works without one. You can merge validate and apply in one method and provide a way to transfer state between that method and asMethodHandle.
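[Sketch added for illustration; not part of the original message.] One way to picture a merged validate/apply method in plain Java is shown below. Everything here is an assumption made for the sketch: the names PolicySketch, ConstantInfo, Result, Concat and join are invented, a record stands in for a tuple return (Java has none), and Object... replaces a generic varargs parameter to keep the example free of heap-pollution warnings. The first call validates the constant parts, computes the value, and hands back a factory that builds a method handle over the already validated data.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical names throughout; this is not a proposed or existing JDK API.
final class PolicySketch {
    // Constant part of a templated string: fragments around the holes, plus hole types.
    record ConstantInfo(List<String> fragments, List<Class<?>> holeTypes) {}

    // Stand-in for a tuple of (value, optional method-handle factory).
    record Result<R>(R value, Optional<Function<MethodType, MethodHandle>> fastPath) {}

    interface Policy<R, E extends Exception> {
        Result<R> validateAndApply(ConstantInfo info, Object... args) throws E;
    }

    // A policy that simply concatenates fragments and hole values.
    static final class Concat implements Policy<String, IllegalArgumentException> {
        @Override
        public Result<String> validateAndApply(ConstantInfo info, Object... args) {
            // Step 1, validate: check the constant data and precompute a structure from it.
            List<String> fragments = List.copyOf(info.fragments());
            if (fragments.size() != info.holeTypes().size() + 1) {
                throw new IllegalArgumentException("fragments must outnumber holes by one");
            }
            // Step 2, apply: compute the value for this first call.
            String value = join(fragments, args);
            // Step 3, asMethodHandle: a factory capturing the validated structure.
            Function<MethodType, MethodHandle> factory = type -> {
                try {
                    MethodHandle joinHandle = MethodHandles.lookup().findStatic(
                            PolicySketch.class, "join",
                            MethodType.methodType(String.class, List.class, Object[].class));
                    return joinHandle.bindTo(fragments)
                            .asCollector(Object[].class, type.parameterCount())
                            .asType(type);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            };
            return new Result<>(value, Optional.of(factory));
        }
    }

    // Interleaves fragments and arguments: f0 a0 f1 a1 ... fn.
    static String join(List<String> fragments, Object[] args) {
        if (args.length != fragments.size() - 1) {
            throw new IllegalArgumentException("wrong number of hole values");
        }
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < args.length; i++) {
            sb.append(args[i]).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Throwable {
        var info = new ConstantInfo(List.of("Hello ", ", you are ", "!"),
                                    List.of(String.class, int.class));
        Result<String> first = new Concat().validateAndApply(info, "Bob", 42);
        System.out.println(first.value()); // prints "Hello Bob, you are 42!"

        // Later calls skip validation and go through the method handle.
        MethodHandle mh = first.fastPath().orElseThrow()
                .apply(MethodType.methodType(String.class, String.class, int.class));
        System.out.println((String) mh.invokeExact("Ana", 7)); // prints "Hello Ana, you are 7!"
    }
}
```

A policy that returns Optional.empty() would simply be called again on every evaluation, so the method handle only affects performance and not the result.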
If we have a method that conceptually returns a tuple
interface Policy {
(R, Optional<Function<MethodType, MethodHandle>>) validateAndApply(ConstantInfo info, P... args) throws E
}
with ConstantInfo containing the string with holes and the type of each hole. The return type carries the value and, optionally, a function that for a method type returns a method handle.

The semantics are the following: the method validateAndApply is called; it validates the ConstantInfo, precomputes a data structure from those constant arguments, applies the dynamic arguments to that data structure, and returns a value and optionally a lambda that, for a method type and the data structure (captured by the lambda), returns a method handle.

You have all 3 steps, validate, apply and asMethodHandle, in one method. If no lambda is provided, it works like String.valueOf() works; if a lambda is provided, the call to validateAndApply precomputes the data structure used by the method handle and provides only the result of the first call; all subsequent calls will use the method handle returned by the lambda.

> A final word: If you said "that's a curried function" to
> yourself at some point reading the above, you are not
> wrong.

Rémi

From brian.goetz at oracle.com Sun Oct 31 15:52:44 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 31 Oct 2021 11:52:44 -0400
Subject: Are templated string embedded expressions "method parameters" or "lambdas"?
In-Reply-To: <8EA56F3B-9615-471F-8C79-C84952B5FBE9@oracle.com>
References: <2F2E1E93-2B84-434E-B446-672270048C0A@oracle.com> <6c2e73bb-7fde-e1a0-d36c-0c8caf13d7f0@oracle.com> <635336FF-72B7-4745-B708-203A4A281437@oracle.com> <8EA56F3B-9615-471F-8C79-C84952B5FBE9@oracle.com>
Message-ID:

> I suggest that, until we roll out more of the machinery
> we intend to roll out, such as type classes, that we
> restrict the operand x (the receiver LHS of the S.T.)
> to be a statically constant expression.

I think this is taking it way too far.
*If* the receiver is a statically constant expression, *then* it should be possible to get better type checking / translation. But isn't constraining the receiver to be a static constant just more of the same sort of nannyism that you've been objecting to?