From kevinb9n at gmail.com Fri Mar 1 19:09:08 2024
From: kevinb9n at gmail.com (Kevin Bourrillion)
Date: Fri, 1 Mar 2024 11:09:08 -0800
Subject: Draft JEP: Derived Record Creation (Preview)
Message-ID:

Hi Gavin,

My response is mostly just to add grist to the mill for a feature that looks great already. You might perhaps feel some of the points are worth working into the proposal.

I have some angst over the term "derived" for this. It's not wrong, but is an entirely different meaning from the one I encounter regularly: a "derived field" being one that caches a value computed deterministically from the other field values (a feature that record classes notably don't support... sadly).

I think the more basic term is "modified", and I think it works: "creating modified records". In the vernacular I think most people do understand that a genetically "modified" soybean doesn't necessarily mean that any particular bean was changed, only that it is an altered version of what it would otherwise have been. "You can't *modify* a record instance, but you can get a *modified* instance based on it." This feels to me like a good reuse of existing terminology.

> Suppose we want to evolve the state by doubling the x coordinate of a Point
> oldLoc, resulting in Point newLoc:
>
> Point newLoc = new Point(oldLoc.x()*2, oldLoc.y(), oldLoc.z());
>
> This code, while straightforward, is laborious. Deriving newLoc from
> oldLoc means extracting every component of oldLoc, whether it changes or
> not, and providing a value for every component of newLoc, even if unchanged
> from oldLoc. It would be a constant tax on productivity if developers had
> to repeatedly deconstruct one record value (extract all its components) in
> order to instantiate a new record value with mostly the same components.

It's also bug-prone in multiple ways.

It also is the worst-case maintenance scenario, the "any-every".
When adding, removing, or renaming *any* record component, *every* statement like this throughout the codebase has to be changed. (True of record constructor calls too, but there's not much we could do about that short of optional parameters.)

> However, wither methods have two problems. First, they add boilerplate to
> the record class,

Some boilerplate is relatively innocuous but this is *high-maintenance* boilerplate, which we have to carefully keep in sync with the record's component declarations. Boo.

> Record values can be nested, where components are themselves record
> values. Derived instance creation expressions can be nested in order to
> transform nested record values. For example:
>
> record Marker(Point loc, String label, Icon icon) { }
>
> Marker m = new Marker(new Point(...), ..., ...);
> Marker scaled = m with { loc = loc with { x *= 2; y *= 2; z *= 2; }};

In fact, this is such a common need (in my experience), and what you have to do today is such a horror show, that you might want to illustrate it as part of the value proposition of the feature.

> Derived instance creation expressions can be used in record classes to
> simplify the implementation of basic operations. For example:
>
> record Complex(double re, double im) {
>     Complex conjugate() { return this with { im = -im; }; }
>     Complex realOnly() { return this with { im = 0; }; }
>     Complex imOnly() { return this with { re = 0; }; }
> }

This is very nice because now `conjugate()` has no relationship with `re` at all, just as it should be. And it makes the essence of what each method is for crystal-clear.

> Any assignment statements that occur within the transformation block have
> the following constraint: If the left-hand side of the assignment is an
> unqualified name, that name must be either (i) the name of a local
> component variable, or (ii) the name of a local variable that is declared
> explicitly in the transformation block.
And because there's no way to qualify a local variable from the surrounding scope, reassigning such variables is simply impossible within this block. Right?

That's "no great loss" of course, although I'm missing why the restriction is necessary. The notion of variables that (at least in userspeak) are "in scope for reading but not for writing" seems weird; does it have precedent?

> The transformation block need only express the parts of the state being
> modified. If the transformation block is empty then the result of the
> derived instance creation expression is a copy of the value of the origin
> expression (the expression on the left-hand side).

This could be interpreted as saying that in this case the record's constructor isn't even run, which I suspect isn't what you mean, and which could make a difference (if best practices aren't being followed). Do you need to say anything at all about this case?

> If the origin value is null then evaluation of the derived instance
> creation expression completes abruptly with a NullPointerException.

... and if it isn't, then we can talk about the "origin instance" it refers to. I'd suggest avoiding the term "origin value" completely except for the above, preferring to talk about the origin instance instead. I think that's the way to be as clear as possible that none of what we're talking about here cares whether the record class is a value class or not. But we can dissect this further if need be.

> Before executing the contents of the transformation block, a number of
> implicit local variable declaration statements are executed. These local
> variable declaration statements are derived from each record component in
> the header of the record class R, in order, as follows:
>
> The local variable declaration has the same name and declared type as the
> record component.

Overall, there have been several references here to the record class R, but I would think it's the record *type* we really need to talk about.
That type post-substitution is what determines these variable types, no? That also suggests we need to discuss wildcard capture here - or is that addressed elsewhere?

> A new instance of record class R is created as if by evaluating a new class
> instance creation expression (new) with the compile-time type of the origin
> expression and an argument list containing the local component variables,
> if any, in the order that they appear in the header of record class R.

Likewise, should this talk about type arguments too? This would I think mean duplicating them from the record type but doing whatever fancy footwork is required to deal with wildcards? (I assume that record constructors themselves can't be generic.)

Implied in all this: I would think a record type like `MyRecord` *should* be usable with `with` (of course, trying to assign to some variables inside the transformation block isn't going to go well, but likely the user just isn't referring to those variables at all in this case).

> The use of a derived instance creation expression:
> can be thought of a switch expression:

imho this would be useful to state earlier!

What goes wrong if we think of this feature as *exactly* desugaring to that switch code?

> The structure and behavior of the transformation block in a derived
> instance creation expression is similar to the body of a compact
> constructor in a record class. Both have the same control flow restrictions
> (must complete normally or throw an exception); both have a set of
> pre-initialized variables in scope, which are expected to be mutated by the
> block; and both take the final values of those variables and pass them as
> arguments to a constructor invocation.

This would've been useful to state earlier too, to me anyway. The only difference I thought of is that one can refer to `this` inside the constructor (uh, right?) but there is no syntax to access the origin expression in the transformation block. And that seems as it should be.
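
For concreteness, the "switch expression" reading discussed above could be written by hand today roughly as follows. This is only a sketch, assuming Java 21 record patterns; the class name `WithAsSwitchSketch` and the helper `doubleX` are made up for illustration, and nothing here claims to be the exact expansion the JEP specifies.

```java
// Sketch only: a hand-written approximation of "oldLoc with { x *= 2; }".
// Assumes Java 21+ (record patterns in switch). Names are illustrative.
final class WithAsSwitchSketch {
    record Point(int x, int y, int z) { }

    static Point doubleX(Point oldLoc) {
        // A null oldLoc makes the switch throw NullPointerException.
        return switch (oldLoc) {
            // Deconstruct: one fresh local per component, in header order.
            case Point(var x, var y, var z) -> {
                x *= 2;                     // the part a transformation block would hold
                yield new Point(x, y, z);   // reconstruct via the canonical constructor
            }
        };
    }

    public static void main(String[] args) {
        Point oldLoc = new Point(1, 2, 3);
        System.out.println(doubleX(oldLoc)); // prints "Point[x=2, y=2, z=3]"
    }
}
```

Note that the constructor runs on every path through the switch, even if the block assigns nothing, and that only the component locals are touched; the origin expression itself is not reachable by name inside the block.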

> Alternatives
>
> Instead of supporting an expression form for use-site creation of new
> record values, we could support it at the declaration site with some form
> of special support for wither methods. We prefer the flexibility of
> use-site creation, whereas declaring wither methods would add bloat to
> record class declarations, which currently enjoy a high degree of
> succinctness.

This would also introduce a lot of potential for unpredictability. The whole deal with records is that they act in highly predictable ways.

~~

Nano-scale details aside... this will be a very helpful feature for working with records and I hope it happens!

From brian.goetz at oracle.com Sat Mar 2 16:08:37 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 2 Mar 2024 11:08:37 -0500
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To:
References:
Message-ID: <1386995c-0735-41fa-9aae-b74473643477@oracle.com>

>     Any assignment statements that occur within the transformation
>     block have the following constraint: If the left-hand side of the
>     assignment is an unqualified name, that name must be either (i)
>     the name of a local component variable, or (ii) the name of a
>     local variable that is declared explicitly in the transformation
>     block.
>
> And because there's no way to qualify a local variable from the
> surrounding scope, reassigning such variables is simply impossible
> within this block. Right?
>
> That's "no great loss" of course, although I'm missing why the
> restriction is necessary. The notion of variables that (at least in
> userspeak) are "in scope for reading but not for writing" seems weird;
> does it have precedent?

There is some precedent with lambdas/inner classes, where you can only access effectively final locals, though that wasn't really in our mind when we crafted this restriction.
The motivation for the restriction is twofold:

 - This is a functional idiom (think "state monad"), side-effecting the environment would be weird. (Of course, you could launder side-effects through any of the usual means, including probably using a qualified access (Foo.x = 3; this.y = 4), but you shouldn't.)

 - We intend to extend this to classes in the future. This idiom is basically "take apart with deconstructor + transform state + reconstruct with constructor". There's an overload selection problem buried in there, and the names of variables involved in the transform may be important inputs to that selection decision. We're not sure that we'll want to do overload selection nominally in this manner, but we're not ready to say "we will never be able to"; having this restriction in place keeps the flexibility to do so.

>     The transformation block need only express the parts of the state
>     being modified. If the transformation block is empty then the
>     result of the derived instance creation expression is a copy of
>     the value of the origin expression (the expression on the
>     left-hand side).
>
> This could be interpreted as saying that in this case the record's
> constructor isn't even run, which I suspect isn't what you mean, and
> which could make a difference (if best practices aren't being
> followed). Do you need to say anything at all about this case?

I interpret this question as "is the result guaranteed to have a distinct identity from the origin expression." (Obviously, for value types, the answer is "huh, what's identity?") But we probably do want to say that the constructor is always invoked to produce the result, even if the block is empty; that "copy" is more of an analogy.

> Overall, there have been several references here to the record class
> R, but I would think it's the record /type/ we really need to talk
> about. That type post-substitution is what determines these variable
> types, no?

Yes.
It is probably a little more complicated than "the static type of the origin expression is the static type of the with expression", because of, as you say, wildcards (and other weirdo types). You probably have to do an upward projection on the type of the origin expression, or something like that.

>     The use of a derived instance creation expression:
>     can be thought of a switch expression:
>
> imho this would be useful to state earlier!
>
> What goes wrong if we think of this feature as /exactly/ desugaring to
> that switch code?

The set of statements permissible in the two contexts is probably slightly different; you can do a `yield ` in the RHS of a switch case, but not in a reconstruction block. Probably other subtle reasons too. Our experience with "specify by syntactic expansion" frequently runs into annoying roadblocks because of things that are expressible in one context but not in the desugared context, or vice versa.

From ccherlin at gmail.com Thu Mar 7 17:31:02 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Thu, 7 Mar 2024 11:31:02 -0600
Subject: Generic StringTemplates?
Message-ID:

I can envision some use cases for StringTemplate where I would want to restrict the type of the interpolated expressions, either for static type safety, or for the convenience of providing an implicit static type to interpolated lambda expressions, or both.

For example, consider a Processor that takes Suppliers as interpolated values and returns a Supplier, allowing for lazy evaluation:

    public static final StringTemplate.Processor<Supplier<String>, RuntimeException> LAZY =
        stringTemplate -> () -> StringTemplate.interpolate(
            stringTemplate.fragments(),
            stringTemplate.values().stream()
                .map(o -> ((Supplier)o).get()).toList());

Using such a processor is awkward and not type-safe, requiring both an unchecked cast in the processor and an explicit cast in every value expression:

    final Supplier lazy = LAZY."Now: \{(Supplier) Instant::now}";

You can use various workarounds, like defining a static method that coerces its argument to Supplier, but String Templates are supposed to reduce existing boilerplate, not create new boilerplate.

Could StringTemplate have a type argument? I envision something like

    public interface StringTemplate {
        List values();
        public interface Processor {
            R process(StringTemplate stringTemplate) throws E;
            ...
        }
        ...
    }

To support creating generic StringTemplates, the following syntax could be legal, returning a StringTemplate with static type StringTemplate.

    RAW."template with \{value}s of type..."

Without a type argument, the return value would have static type StringTemplate.

Cheers,
Clement Cherlin

From brian.goetz at oracle.com Fri Mar 8 18:35:03 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 8 Mar 2024 18:35:03 +0000
Subject: Update on String Templates (JEP 459)
Message-ID: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>

Time to check in with where we are with String Templates. We've gone through two rounds of preview, and have received some feedback.

As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know. This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc. And the best feedback usually comes from using the feature "in anger" - trying to actually write code with it.
("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)

In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature. Specifically, the role of processors is "outsized" to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all! This is a very positive development.

First, I want to affirm that the goals of the project have not changed. From JEP 459:

Goals

- Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
- Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
- Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
- Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
- Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
- Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.

Non-Goals

- It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
- It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.

Another thing that has not changed is our view on the syntax for embedding expressions. While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round. (In fact, while syntax disagreements are often purely subjective, this one was far more clear - the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild. This has all been more than adequately covered elsewhere, so I won't rehash it here.)

Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.

Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.) However, Java already has a well established means of abstracting behavior: methods. (In fact, a processor application can be viewed as merely a new syntax for a method call.) Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:

    DB."... template ..."

When we could use an ordinary Java library:

    Query q = Query.of("...template...")

Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood. (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.
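
As a rough illustration of capture and processing as separate steps, here is a minimal sketch under loud assumptions. `Template` is a hypothetical stand-in record (fragments plus values), not the real `StringTemplate` API, and this `Query.of` is an invented factory that turns embedded values into bind parameters rather than splicing them into the SQL text. It is not the design described in this message, only one way an ordinary library method could do a processor's job.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only. Template and Query are hypothetical stand-ins for illustration.
final class QuerySketch {
    // Assumed shape of a captured template: n + 1 text fragments around n values.
    record Template(List<String> fragments, List<Object> values) { }

    // Assumed result type: SQL text with '?' placeholders plus the bind parameters.
    record Query(String sql, List<Object> params) {
        // Ordinary static factory: processing happens here, after capture.
        static Query of(Template t) {
            StringBuilder sql = new StringBuilder();
            List<Object> params = new ArrayList<>();
            for (int i = 0; i < t.fragments().size(); i++) {
                sql.append(t.fragments().get(i));
                if (i < t.values().size()) {
                    sql.append('?');               // value never enters the SQL text
                    params.add(t.values().get(i));
                }
            }
            return new Query(sql.toString(), List.copyOf(params));
        }
    }

    public static void main(String[] args) {
        String name = "Robert'); DROP TABLE students;--";
        // Built by hand here; a template literal would capture this shape.
        Template t = new Template(
                List.of("SELECT * FROM students WHERE name = ", ""),
                List.of(name));
        Query q = Query.of(t);
        System.out.println(q.sql());     // prints "SELECT * FROM students WHERE name = ?"
        System.out.println(q.params());
    }
}
```

Because the template is an ordinary value, it could also be stored, combined, or inspected before any library decides what processing means.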

This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place. And it turned out, this choice had been influenced - perhaps overly so - by early implementation experiments. (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper. Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting. The FMT processor brought that cost back in line with the equivalent concatenation.) These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time. This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.

At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:

    StringTemplate st = "Hello \{name}";

String and StringTemplate remain unrelated types. (We explored a number of ways to interconvert them, but they caused more trouble than they solved.) Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.

For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:

    void println(String) { ... }
    void println(StringTemplate) { ... interpolate and delegate to println(String) ... }

The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:

    System.out.println("Hello \{name}");

In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template. This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code - but it allows that choice to be made by libraries when the library is comfortable doing so.

Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):

    String format(String formatString, Object... parameters) { ... same as today ... }
    String format(StringTemplate template) {... equivalent of FMT ...}

And users can call this as:

    String s = String.format("Hello %12s\{name}");

Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required. The user already chose to pass it to String::format; that's all the processing selection that is needed.

Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation.).

The result is:

- The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
- Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
- StringTemplate is just another type that APIs can support if they want. The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
- APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
- It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
- Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
- The language feature and API surface get considerably smaller, which is good. Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.

The remaining question that everyone is probably asking is: "so how do we do interpolation." The answer there is "ordinary library methods". This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now.).

This is a sketch of direction, so feel free to pose questions/comments on the direction. We'll discuss the details as we go.

From ccherlin at gmail.com Fri Mar 8 21:22:41 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Fri, 8 Mar 2024 15:22:41 -0600
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID:

On Fri, Mar 8, 2024 at 2:54 PM Brian Goetz wrote:
>
> Time to check in with where we are with String Templates. We've gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know.
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
> - StringTemplate is just another type that APIs can support if they want. The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
> - It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing).
> - Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which is good. Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do we do interpolation." The answer there is "ordinary library methods". This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).
>
> This is a sketch of direction, so feel free to pose questions/comments on the direction. We'll discuss the details as we go.

So this new approach is to make all template expressions return the same unprocessed value that RAW."..." did previously? Excellent news!

I would still like a way to apply a static type to the interpolated values at the point the StringTemplate is constructed, for reasons such as constructing ergonomic DSLs using StringTemplates, and implicitly typing lambdas.
My previous LAZY example simplifies to (presuming generic StringTemplates are supported):

    public static Supplier<String> lazy(StringTemplate<Supplier<?>> stringTemplate) {
        return () -> Bikeshed.paint(
            stringTemplate.fragments(),
            stringTemplate.values().stream()
                .map(Supplier::get)
                .toList());
    }

Usage:

    Supplier<String> now = lazy("Now: \{ Instant::now }");
    ... some time later
    System.out.println(now.get());

I imagine the existing type inference infrastructure is sufficient to automatically derive the correct generic type of the template expression (which will usually be StringTemplate) in almost all cases.

Cheers,
Clement Cherlin

From amaembo at gmail.com Sat Mar 9 11:48:26 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Sat, 9 Mar 2024 12:48:26 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID:

The idea is interesting. There's a thing that disturbs me though. Currently, proc."string" and proc."string \{template}" are uniformly processed, and the processor may not care much about whether it's a string or a template: both can be processed uniformly. After this change, removing the last embedded expression from the template (e.g., after inlining a constant) will implicitly change the type of the literal from StringTemplate to String. This may either cause a compilation error, or silently bind to another overload which may or may not behave like a template overload with a single-fragment template.

For API authors, this means that every method accepting StringTemplate should have a counterpart accepting String. The logic inside both methods would likely be very similar, so probably both will eventually call a third private method. For API users, it could be unclear how to call a method accepting StringTemplate if I have a simple string in hand but there's no String method (or it does a slightly different thing due to poor API design).
Should I use some ugly construct like "This is a string but the API wants a template, so I append an empty embedded expression\{""}"? Note that we already have an inspection that warns about kinda useless templates like STR."Hello \{"world"}", suggesting to replace them with STR."Hello world". Such an inspection would not work after the proposed change, as the expression type will differ.

I can still imagine that StringTemplate could be an interface providing methods like fragments() and values() (like now), but String may implement it, returning an empty list from values() and List.of(this) from fragments(). Of course, method names could differ, to fit the String class better. It would not be a problem if they are more verbose, like stringTemplateValues() and stringTemplateFragments(), because they are not used very often. Anyway, not bikeshedding now. This would allow API designers to provide only a StringTemplate-accepting method, and API users would not have to think much when a string template suddenly becomes a string. Also, automatic refactorings from StringTemplate to String, like the one shown above, will still work.

Another advantage is that IDEs may suggest converting string concatenation into a template if the expected type is StringTemplate. With the current proposal, we may suggest converting "Hello "+name into "Hello \{name}".join(), which is a questionable improvement. However, if we are at a method call argument position, where the expected type is StringTemplate, then we can suggest simply "Hello \{name}", which is much better. Otherwise, we would need to check whether there's a StringTemplate-accepting overload and hope that it does the same thing.

By the way, I assume that we agree on the toString() implementation of non-String-StringTemplate: it should be a technical debug string, like now, not doing the automatic interpolation. There would be a discrepancy with String-StringTemplate if my suggestion is accepted, but I think it's not a big problem.
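
The shape of this suggestion can be sketched without touching java.lang.String. Everything below is an assumption made for illustration: `TemplateLike` plays the role StringTemplate would have as an interface, `Plain` stands in for "String implements it", and `Holes` and `Query` are hypothetical names.

```java
import java.util.List;

// Hypothetical common supertype; in the suggestion above this role would be
// played by StringTemplate itself, with String implementing it.
interface TemplateLike {
    List<String> fragments();
    List<Object> values();
}

// A template with embedded expressions; the record accessors implement the interface.
record Holes(List<String> fragments, List<Object> values) implements TemplateLike { }

// Stand-in for a plain String acting as a degenerate template:
// exactly one fragment and no values.
record Plain(String text) implements TemplateLike {
    public List<String> fragments() { return List.of(text); }
    public List<Object> values() { return List.of(); }
}

class Query {
    // A single entry point; no separate String overload is needed.
    static String toSql(TemplateLike t) {
        // Each embedded value becomes a "?" placeholder between fragments.
        return String.join("?", t.fragments());
    }

    static String demo(String name) {
        // Stands in for toSql("SELECT * FROM t WHERE name = \{name}").
        return toSql(new Holes(List.of("SELECT * FROM t WHERE name = ", ""), List.of(name)));
    }
}
```

With this shape, removing the last embedded expression changes which implementation is passed but not which method is called, which is the property the suggestion is after.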
With best regards,
Tagir Valeev.

On Fri, Mar 8, 2024 at 7:35 PM Brian Goetz wrote:

> Time to check in with where we are with String Templates. We've gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know. This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc. And the best feedback usually comes from using the feature "in anger": trying to actually write code with it. ("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature. Specifically, the role of processors is "outsized" to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all! This is a very positive development.
>
> First, I want to affirm that the goals of the project have not changed. From JEP 459:
>
> Goals
>
> - Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
> - Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
> - Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
> - Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
> - Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
> - Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.
>
> Non-Goals
>
> - It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
> - It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for embedding expressions. While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round. (In fact, while syntax disagreements are often purely subjective, this one was far more clear: the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild. This has all been more than adequately covered elsewhere, so I won't rehash it here.)
>
> Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.) However, Java already has a well established means of abstracting behavior: methods. (In fact, a processor application can be viewed as merely a new syntax for a method call.) Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:
>
>     DB."... template ..."
>
> When we could use an ordinary Java library:
>
>     Query q = Query.of("...template...")
>
> Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood. (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place. And it turned out, this choice had been influenced, perhaps overly so, by early implementation experiments. (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper. Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting. The FMT processor brought that cost back in line with the equivalent concatenation.) These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time. This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}")
> are simply the literal form of StringTemplate:
>
>     StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types. (We explored a number of ways to interconvert them, but they caused more trouble than they solved.) Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:
>
>     void println(String) { ... }
>     void println(StringTemplate) { ... interpolate and delegate to println(String) ... }
>
> The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:
>
>     System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template. This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code, but it allows that choice to be made by libraries when the library is comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):
>
>     String format(String formatString, Object... parameters) { ... same as today ... }
>     String format(StringTemplate template) { ... equivalent of FMT ... }
>
> And users can call this as:
>
>     String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required. The user already chose to pass it to String::format; that's all the processing selection that is needed.
>
> Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
> - StringTemplate is just another type that APIs can support if they want. The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
> - It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing).
> - Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which is good. Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do we do interpolation."
> The answer there is "ordinary library methods". This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).
>
> This is a sketch of direction, so feel free to pose questions/comments on the direction. We'll discuss the details as we go.

From brian.goetz at oracle.com Sat Mar 9 17:03:32 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2024 17:03:32 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <636B984E-A544-4155-81D1-8752037A973B@oracle.com>

> The idea is interesting. There's a thing that disturbs me though. Currently, proc."string" and proc."string \{template}" are uniformly processed, and the processor may not care much about whether it's a string or a template: both can be processed uniformly.

Yes, this is one of the tradeoffs of this evolution (and was one of the advantages of the processor-is-required version.) The PROC. is a strong syntactic hint that whatever comes next is a template, even if it has zero holes. In the current proposal, we have string literals and string template literals, and there are cases where we would like to use a String as the degenerate form of a template.

As mentioned, we transited through "processors are optional, if no processor, it's a string template" earlier in the design, and this was one of the reasons we thought that making the processor required all the time was preferable. But now that processors are _gone_, the calculus shifts.

We experimented with various ways to address this, including "String extends StringTemplate", a boxing conversion from String to StringTemplate, and alternate literal forms like t"..." that say "it's a template, dammit". But in the end, these either create new problems, or just don't carry their weight.
So instead, we'll just make sure there are conversion methods (StringTemplate::of, String::asTemplate) that users can insert to say what they mean.

> For API authors, this means that every method accepting StringTemplate should have a counterpart accepting String.

I think this is overstated. If you have a ST-accepting method only and pass a string, compiler diagnostics will remind you to convert it. (And all of this discussion is about string *literals*; ordinary string expressions will still require explicit conversion, and should.) Many API points may choose to have both, but I don't think this rises nearly to the level of a requirement.

> I can still imagine that StringTemplate could be an interface providing methods like fragments() and values() (like now), but String may implement it, returning an empty list from values() and List.of(this) from fragments().

As mentioned, we explored this, but I think this cure is worse than the disease. At root, this is a workaround for "a string *literal* with no holes might want to be a template." I don't think it makes sense to interpret *all* strings as templates. And this led to some annoying overload selection / inference decisions (tinkering with the supertypes of String ripples throughout the JDK).

> By the way, I assume that we agree on the toString() implementation of non-String-StringTemplate: it should be a technical debug string, like now, not doing the automatic interpolation. There would be a discrepancy with String-StringTemplate if my suggestion is accepted, but I think it's not a big problem.

100%. Interpolation is always an explicit choice of how to convert a ST to a String.

> With best regards,
> Tagir Valeev.
>
> On Fri, Mar 8, 2024 at 7:35 PM Brian Goetz wrote:
>
> Time to check in with where we are with String Templates. We've gone through two rounds of preview, and have received some feedback.
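
The explicit-conversion route mentioned in the reply (StringTemplate::of, String::asTemplate) can be illustrated with a stand-in type. This is a sketch only: `SimpleTemplate`, its `of` method and `Banner` are hypothetical names, and the real conversion methods were only named in the message above, not specified.

```java
import java.util.List;

// Hypothetical stand-in for the proposed StringTemplate; not the JDK type.
record SimpleTemplate(List<String> fragments, List<Object> values) {
    // Plays the role of the conversion methods named in the message:
    // a string becomes a single-fragment template with no embedded values.
    static SimpleTemplate of(String s) {
        return new SimpleTemplate(List.of(s), List.of());
    }
}

class Banner {
    // Template-only API: deliberately no String overload.
    static String render(SimpleTemplate t) {
        StringBuilder sb = new StringBuilder("[").append(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            sb.append(t.values().get(i)).append(t.fragments().get(i + 1));
        }
        return sb.append("]").toString();
    }

    static String demo() {
        // Banner.render("hello");                  // does not compile: String is not a SimpleTemplate
        return render(SimpleTemplate.of("hello"));  // explicit, visible conversion
    }
}
```

The commented-out call is the case where, as the reply puts it, compiler diagnostics remind the caller to convert; the conversion then appears in the source as a deliberate choice.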
From guy.steele at oracle.com Sat Mar 9 20:45:57 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Sat, 9 Mar 2024 20:45:57 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
Message-ID: <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>

Sent from my iPhone

> On Mar 9, 2024, at 12:03 PM, Brian Goetz wrote:
> . . .
> At root, this is a workaround for "a string *literal* with no holes might want to be a template." I don't think it makes sense to interpret *all* strings as templates. And this led to some annoying overload selection / inference decisions (tinkering with the supertypes of String ripples throughout the JDK).

Right, we don't want to interpret all strings as templates, and we don't want to have a conversion that lets any String expression be converted to a template.

But what about a more targeted conversion?

Recall that assignment conversion has a special case that allows a narrowing primitive conversion under certain circumstances where the right-hand side is a constant expression, thus allowing assignment of, for example, the constant literal 1 (nominally of type int) to a variable of type byte, short, or char.

Have you considered allowing conversion of _constant_ expressions of type String to templates in assignment contexts and invocation contexts? (Presumably this could be implemented by having the compiler automatically wrap the constant expression within an invocation of something like StringTemplate.of(...).) This would of course kick in for a method invocation only if there is no applicable overloading that does not need the conversion.
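
The existing precedent cited above, implicit narrowing of constant expressions in assignment contexts, is ordinary Java today and can be seen in a few lines. The class and method names below are invented for illustration; the language behaviour shown is standard.

```java
class ConstantNarrowing {
    static final int SMALL = 100;   // compile-time constant whose value fits in a byte

    static byte fromLiteral() {
        byte b = 1;        // the literal 1 has type int, yet the assignment compiles
        return b;
    }

    static byte fromConstant() {
        byte b = SMALL;    // allowed: constant expression with a value representable in byte
        return b;
    }

    static char charFromConstant() {
        char c = 65;       // int constant 65 is narrowed to the char 'A'
        return c;
    }

    static byte fromVariable(int i) {
        // byte b = i;     // would not compile: i is not a constant expression
        return (byte) i;   // a non-constant int needs an explicit cast
    }

    static byte identity(byte b) {
        return b;
    }

    static byte viaInvocation() {
        // identity(1);    // would not compile: the special case covers assignment
        //                 // contexts only, not method invocation contexts
        return identity((byte) 1);
    }
}
```

Note that today's rule stops at assignment contexts, so extending a similar conversion for constant strings to invocation contexts, as the message asks about, would go beyond the primitive precedent.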

As a rule I don't like enlarging the can of worms known as "special-case conversions", but I think this would have sufficient utility that it would be worth doing, especially given the precedent that the compiler already knows a great deal of special information about class java.lang.String.

From brian.goetz at oracle.com Sat Mar 9 23:52:19 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2024 23:52:19 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
Message-ID:

Maurizio did prototype almost exactly this: treat compile-time constant string expressions as poly expressions whose standalone type is String but which could be treated as ST as well. I'll let him recap the details, but I think the upshot was that we had overload selection problems with m(String) vs m(StringTemplate), as both were applicable, unless we wanted to treat this as akin to a boxing conversion where we preferred "unboxed" overloads as we do with loose vs strict method invocation contexts.

There were other things we tried too for the special case of string literals as degenerate templates; I'll let Maurizio give the details, because I'm sure I will have forgotten one or two.

> On Mar 9, 2024, at 12:45 PM, Guy Steele wrote:
>
> Sent from my iPhone
>
>> On Mar 9, 2024, at 12:03 PM, Brian Goetz wrote:
>> . . .
>> At root, this is a workaround for "a string *literal* with no holes might want to be a template." I don't think it makes sense to interpret *all* strings as templates. And this led to some annoying overload selection / inference decisions (tinkering with the supertypes of String ripples throughout the JDK).
>
> Right, we don't want to interpret all strings as templates, and we don't want to have a conversion that lets any String expression be converted to a template.
>
> But what about a more targeted conversion?
>
> Recall that assignment conversion has a special case that allows a narrowing primitive conversion under certain circumstances where the right-hand side is a constant expression, thus allowing assignment of, for example, the constant literal 1 (nominally of type int) to a variable of type byte, short, or char.
>
> Have you considered allowing conversion of _constant_ expressions of type String to templates in assignment contexts and invocation contexts? (Presumably this could be implemented by having the compiler automatically wrap the constant expression within an invocation of something like StringTemplate.of(...).) This would of course kick in for a method invocation only if there is no applicable overloading that does not need the conversion.
>
> As a rule I don't like enlarging the can of worms known as "special-case conversions", but I think this would have sufficient utility that it would be worth doing, especially given the precedent that the compiler already knows a great deal of special information about class java.lang.String.
>

From guy.steele at oracle.com Sun Mar 10 01:38:34 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Sun, 10 Mar 2024 01:38:34 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
Message-ID:

> On Mar 9, 2024, at 6:52 PM, Brian Goetz wrote:
>
> Maurizio did prototype almost exactly this: treat compile-time constant string expressions as poly expressions whose standalone type is String but which could be treated as ST as well.
> I'll let him recap the details, but I think the upshot was that we had overload selection problems with m(String) vs m(StringTemplate), as both were applicable, unless we wanted to treat this as akin to a boxing conversion where we preferred "unboxed" overloads as we do with loose vs strict method invocation contexts.

Yep, that is exactly what you would have to do: give preference to overloads that do not require the conversion. I don't doubt that this would require special finagling in the compiler's overload resolution code, since it is not perfectly analogous to anything already in the language.

>
> There were other things we tried too for the special case of string literals as degenerate templates - I'll let Maurizio give the details, because I'm sure I will have forgotten one or two.

I'm all ears!

>> On Mar 9, 2024, at 12:45 PM, Guy Steele wrote:
>>
>> Sent from my iPhone
>>
>>> On Mar 9, 2024, at 12:03 PM, Brian Goetz wrote:
>>> . . .
>>> At root, this is a workaround for "a string *literal* with no holes might want to be a template." I don't think it makes sense to interpret *all* strings as templates. And this led to some annoying overload selection / inference decisions (tinkering with the super types of String ripples throughout the JDK.).
>>
>> Right, we don't want to interpret all strings as templates, and we don't want to have a conversion that lets any String expression be converted to a template.
>>
>> But what about a more targeted conversion?
>>
>> Recall that assignment conversion has a special case that allows a narrowing primitive conversion under certain circumstances where the right-hand side is a constant expression, thus allowing assignment of, for example, the constant literal 1 (nominally of type int) to a variable of type byte, short, or char.
>>
>> Have you considered allowing conversion of _constant_ expressions of type String to templates in assignment contexts and invocation contexts?
>> (Presumably this could be implemented by having the compiler automatically wrap the constant expression within an invocation of something like StringTemplate.of(...).) This would of course kick in for a method invocation only if there is no applicable overloading that does not need the conversion.
>>
>> As a rule I don't like enlarging the can of worms known as "special-case conversions", but I think this would have sufficient utility that it would be worth doing, especially given the precedent that the compiler already knows a great deal of special information about class java.lang.String.
>>
>

From attila.kelemen85 at gmail.com Sun Mar 10 20:41:43 2024
From: attila.kelemen85 at gmail.com (Attila Kelemen)
Date: Sun, 10 Mar 2024 21:41:43 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID:

If the string processing burden is now pushed to the consumer API side, then wouldn't it be worthwhile to make `StringTemplate` simpler, given that this means a lot more people are forced to implement processors? I mean that having two lists where you have to alternate between the two is rather unintuitive, which is proven by the fact that it forces `StringTemplate` to do the empty string hacks to support alternating between the two lists.

Given that we have these nice pattern matching syntaxes, wouldn't it be much nicer to make `StringTemplate` a simple wrapper for a `List<StringTemplate.Part>`, where `StringTemplate.Part` is a sealed interface implemented by `String` and `StringTemplate.ValueRef` (or whatever equivalent).
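
The shape described above can be tried out today with stand-in types. This is only a sketch under stated assumptions: `Part`, `Text` and `ValueRef` are hypothetical names declared locally, and because `java.lang.String` is final and cannot be retrofitted by user code, a `Text` record plays the role of the `String` case. It needs Java 21 or later for pattern matching in `switch`.

```java
import java.util.List;

public class PartsSketch {
    // Hypothetical stand-ins for the proposed StringTemplate.Part hierarchy.
    sealed interface Part permits Text, ValueRef {}
    record Text(String text) implements Part {}
    record ValueRef(Object value) implements Part {}

    // Renders the parts in order; the switch is exhaustive because Part is sealed.
    static String render(List<Part> parts) {
        var sb = new StringBuilder();
        for (Part part : parts) {
            switch (part) {
                case Text t -> sb.append(t.text());
                case ValueRef v -> sb.append(String.valueOf(v.value()));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two adjacent values need no empty fragment between them.
        List<Part> parts = List.of(new Text("x="), new ValueRef(1), new ValueRef(2));
        System.out.println(render(parts)); // prints "x=12"
    }
}
```

With a single list there is no invariant tying a fragment count to a value count, which is why the adjacent values in `main` need no empty-string placeholder.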
In this case, you could just write a processor with a simple loop like this:

```
var sb = new StringBuilder();
st.parts().forEach(part -> {
  switch (part) {
    case String s -> sb.append(s);
    case StringTemplate.ValueRef valueRef -> sb.append(formatValue(valueRef.value()));
  }
});
```

The processor logic would be much easier to read than the double iterator counterpart (and in my opinion even easier than trying to use the stencil). An added benefit is that there would be little need to ban a character from ST in this case. Of course, the flip side is that we would need all values to be wrapped, but that doesn't seem like a high cost to me (especially if `ValueRef` would eventually be a value type, then I'm guessing this extra cost could be mostly optimized away), because it is unlikely to have so many values in an ST for this to matter. Not to mention that having double iterators would have additional cost as well.

Attila

Brian Goetz wrote (on Fri, Mar 8, 2024, 21:54):

> Time to check in with where we are with String Templates. We've gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know. This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc. And the best feedback usually comes from using the feature "in anger" - trying to actually write code with it. ("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature. Specifically, the role of processors is "outsized"
> to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all! This is a very positive development.
>
> First, I want to affirm that the goals of the project have not changed. From JEP 459:
>
> Goals
>
> - Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
> - Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
> - Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
> - Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
> - Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
> - Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.
>
> Non-Goals
>
> - It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
> - It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for embedding expressions. While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round.
> (In fact, while syntax disagreements are often purely subjective, this one was far more clear - the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild. This has all been more than adequately covered elsewhere, so I won't rehash it here.)
>
> Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.) However, Java already has a well established means of abstracting behavior: methods. (In fact, a processor application can be viewed as merely a new syntax for a method call.) Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:
>
>     DB."... template ..."
>
> When we could use an ordinary Java library:
>
>     Query q = Query.of("...template...")
>
> Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood. (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place. And it turned out, this choice had been influenced - perhaps overly so - by early implementation experiments. (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper.
> Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting. The FMT processor brought that cost back in line with the equivalent concatenation.) These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time. This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:
>
>     StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types. (We explored a number of ways to interconvert them, but they caused more trouble than they solved.) Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:
>
>     void println(String) { ... }
>     void println(StringTemplate) { ... interpolate and delegate to println(String) ...
>     }
>
> The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:
>
>     System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template. This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code - but it allows that choice to be made by libraries when the library is comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):
>
>     String format(String formatString, Object... parameters) { ... same as today ... }
>     String format(StringTemplate template) { ... equivalent of FMT ... }
>
> And users can call this as:
>
>     String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required. The user already chose to pass it to String::format; that's all the processing selection that is needed.
>
> Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation.).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
> - StringTemplate is just another type that APIs can support if they want. The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
> - It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
> - Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which is good. Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do we do interpolation." The answer there is "ordinary library methods". This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now.).
>
> This is a sketch of direction, so feel free to pose questions/comments on the direction. We'll discuss the details as we go.
>

-------------- next part --------------
An HTML attachment was scrubbed...
From maurizio.cimadamore at oracle.com Mon Mar 11 12:15:51 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 11 Mar 2024 12:15:51 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
Message-ID: <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>

Hi all,
we tried mainly three approaches to allow smoother interop between strings and string templates: (a) make String a subclass of StringTemplate. Or (b) make constant strings be /convertible/ to string templates. Or, (c) use target-typing. All these approaches have some issues, discussed below.

The first approach is slightly simpler, because it can be achieved entirely outside of the Java language. Unfortunately, adding "String implements StringTemplate" adds overload ambiguities in cases such as this:

    format(StringTemplate)     // 1
    format(String, Object...)  // 2

This is actually a very important case, as we predict that StringTemplate will serve as a great replacement for methods out there accepting a string/Object... pack.

Unfortunately, if String <: StringTemplate, this means that calling format with a string literal will resolve to (1), not (2) as before. The problem here is that (2) is not even applicable during the first two overload resolution phases (which are only allowed to use subtyping and conversions, respectively), as it is a varargs method. Because of this, (1) will now take precedence, as it is not varargs. While for String::format this is probably harmless, changing results of overload selection is something that should be done with care (esp. if different overloads have different return types), as it could lead to source compatibility issues.

On top of these issues, making all strings be string templates has the disadvantage of also considering "messy"
strings obtained via concatenation of non-constant values string templates too, which seems bad.

To overcome these issues, we attempted to add an implicit conversion from /constant/ strings to StringTemplate. As it was observed by Guy, in case of ambiguities, the non-converting variants (e.g. m(String)) would be preferred. That said, in the above example (with varargs) we would still get a potentially incompatible change - as a string literal would be applicable in (1) before (2) is even considered, so the same concerns surrounding overload resolution changes would remain.

Another thing that came up is that conversions automatically bring in casting conversions. E.g. if you can go from A to B using assignment conversion, you can typically go the same direction using casting conversion. This raises two issues. The first is that casting conversion is generally a symmetric type relationship (e.g. if you can cast from A to B, then you can cast from B to A), while here we're mostly discussing one direction. But this is, perhaps, not a big deal - after all, "constant strings" don't have a denotable type, so perhaps it should come as no surprise that you can't use them as a /target/ type for a cast.

The second "issue" is that casting conversion brings about patterns, as that's how pattern applicability is defined. For instance:

    switch("Hello") {
        case StringTemplate st ...
    }

To make this work we would need at least to tweak exhaustiveness (otherwise javac would think the above switch is not exhaustive, and ask you to add a default). Secondly, some tweaks to the runtime tests would be required also. Not impossible, but would require some more work to make sure we're ok with this direction.

Another issue with the conversion is that it would expose a sharp edge in the current overload resolution and inference machinery.
For instance, this program doesn't compile:

    List<Long> li = List.of(1, 1L)

Similarly, this program would also not compile:

    List<StringTemplate> li = List.of("Hello", "Hello \{world}");

The last possibility would be to say that a string literal is a /poly expression/. As such, a string literal can be typed to either String or StringTemplate depending on the target type (for instance, this is close to how int literals also work).

This approach would still suffer from the same incompatible overload changes with varargs methods as the other approaches. But, by avoiding adding a conversion, it makes things a little easier: for instance, in the case of pattern matching, nothing needs to be done, as the string literal will be turned into a string template /before/ the switch even takes place (meaning that existing exhaustiveness and runtime checks would still work). But, there are still dragons and irregularities when it comes to inference - for instance:

    List<StringTemplate> lst = List.of("hello", "world");

This would not type-check: we need a target-type to know which way the literal is going (List::of just accepts a type-variable X). Note that overload resolution happens at a time where the target-type is not known, so here we'd probably pick X = String, which will then fail to type-check against the target.

Another issue with target-typing is that if you have two overloads:

    m(String)
    m(StringTemplate)

And you call this with a string literal, you get an ambiguity: you can go both ways, but String and StringTemplate are unrelated types, so we can't pick one as "most specific". This issue could be addressed, in principle, by adding an ad-hoc most specific rule that, in case of an ambiguity, always gave precedence to String over StringTemplate. We do a similar trick for lambda expressions, where if two methods accept similar-looking functional interfaces, we give precedence to the non-boxing one.

Anyway, the general message here is that it's a bit of a "pick your poison"
situation. Adding a more fluid relationship between strings and templates is definitely possible, but there are risks that this will negatively impact other areas of the language, risks that would need to be assessed very carefully.

Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only "one string literal", which is something we have worked quite hard to achieve.

Cheers
Maurizio

On 09/03/2024 23:52, Brian Goetz wrote:
> I'll let Maurizio give the details, because I'm sure I will have forgotten one or two.

From ccherlin at gmail.com Mon Mar 11 13:54:46 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Mon, 11 Mar 2024 08:54:46 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID:

On Sun, Mar 10, 2024 at 3:42 PM Attila Kelemen wrote:
>
> If the string processing burden is now pushed to the consumer API side, then wouldn't it be worthwhile to make `StringTemplate` simpler, given that this means a lot more people are forced to implement processors? I mean that having two lists where you have to alternate between the two is rather unintuitive, which is proven by the fact that it forces `StringTemplate` to do the empty string hacks to support alternating between the two lists.
>
> Given that we have these nice pattern matching syntaxes, wouldn't it be much nicer to make `StringTemplate` a simple wrapper for a `List<StringTemplate.Part>`, where `StringTemplate.Part` is a sealed interface implemented by `String` and `StringTemplate.ValueRef` (or whatever equivalent).
> In this case, you could just write a processor with a simple loop like this:
>
> ```
> var sb = new StringBuilder();
> st.parts().forEach(part -> {
>   switch (part) {
>     case String s -> sb.append(s);
>     case StringTemplate.ValueRef valueRef -> sb.append(formatValue(valueRef.value()));
>   }
> });
> ```
>
> The processor logic would be much easier to read than the double iterator counterpart (and in my opinion even easier than trying to use the stencil). An added benefit is that there would be little need to ban a character from ST in this case. Of course, the flip side is that we would need all values to be wrapped, but that doesn't seem like a high cost to me (especially if `ValueRef` would eventually be a value type, then I'm guessing this extra cost could be mostly optimized away), because it is unlikely to have so many values in an ST for this to matter. Not to mention that having double iterators would have additional cost as well.
>
> Attila

I like where you're going, but I think it can be done in a more straightforward and simple way by moving the process() method to StringTemplate and having it take a pair of Consumers:

    public interface StringTemplate {
        default void process(Consumer<String> fragmentConsumer, Consumer<Object>[1] valueConsumer) {
            // iterate through both lists, alternately calling fragmentConsumer and valueConsumer
        }
    }

[1] or Consumer<T>, see my previous posts about generic string templates.
Using that method would look like:

    var sb = new StringBuilder();
    st.process(
        sb::append,
        value -> sb.append(formatValue(value))
    );

Cheers,
Clement Cherlin

From maurizio.cimadamore at oracle.com Mon Mar 11 14:28:28 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 11 Mar 2024 14:28:28 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

On 11/03/2024 13:01, Remi Forax wrote:
> It's not a real boxing conversion, because it's a one way conversion, i.e. there is a boxing conversion between StringTemplate to String but no boxing conversion from String to StringTemplate. We can add it, but I do not think it's necessary given that with a String s, it can always be converted to a StringTemplate using t"\{s}".

This approach goes against the goal of making template -> string conversion explicit. While turning a string into a template is totally safe (after all, a string is a degenerate case of a template with no values), the reverse is not true: there are many ways to go from a template to a string and either the user (at the use site) or the library (at the decl site) will have to "say what they mean".

Now, you might disagree with this, but, as stated by Brian, this change is not about relaxing the design goals in JEP 465. This is why, in my email, I'm specifically only speaking about the String -> StringTemplate direction.
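
To make the asymmetry concrete, here is a small sketch using a hypothetical stand-in `Template` record (fragments plus values); none of these names are the real `StringTemplate` API. It shows one template rendered two different ways, while wrapping a plain string as a template has exactly one sensible meaning. The quoting routine is illustrative only and is not a substitute for parameterized queries.

```java
import java.util.List;

public class RenderChoices {
    // Hypothetical stand-in for a captured template:
    // fragments.size() == values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {}

    // String -> template: always safe, a template with no values.
    static Template fromString(String s) {
        return new Template(List.of(s), List.of());
    }

    // One way to go template -> string: plain interpolation.
    static String interpolate(Template t) {
        var sb = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            sb.append(t.values().get(i)).append(t.fragments().get(i + 1));
        }
        return sb.toString();
    }

    // Another way: render each value as a quoted SQL-style literal.
    static String quoteValues(Template t) {
        var sb = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            String raw = String.valueOf(t.values().get(i));
            sb.append('\'').append(raw.replace("'", "''")).append('\'');
            sb.append(t.fragments().get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        var t = new Template(List.of("name = ", ""), List.<Object>of("O'Brien"));
        System.out.println(interpolate(t));               // prints "name = O'Brien"
        System.out.println(quoteValues(t));               // prints "name = 'O''Brien'"
        System.out.println(interpolate(fromString("hi"))); // prints "hi"
    }
}
```

Because the two renderings disagree, an implicit template -> string conversion would have to pick one silently, which is the choice the design wants the user or the library to make explicitly.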
> Apart from the fact that adding overloads in a lot of existing projects looks like a Sisyphean task, doing the conversion at use site also has the advantage of allowing the compiler to generate an invokedynamic at use site so the boxing from a StringTemplate to a String will be as fast as the string concatenation using '+' (see Duncan email on amber-dev).

We can make things fast in other ways. For instance, given that string interpolation will be rather common, we might cache the string interpolation MH in the literal directly (after all, such a literal is associated with an indy callsite). Other, more dynamic, approaches are possible too. I believe Jim might provide more details on how exactly this can be achieved, but I think that for now it would be better not to let the performance considerations drive the discussion.

Maurizio

From forax at univ-mlv.fr Mon Mar 11 13:01:29 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 11 Mar 2024 14:01:29 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To: <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
Message-ID: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>

Hello,

> Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only "one string literal", which is something we have worked quite hard to achieve.

I vote for making string templates explicit. Yes, it can be seen as complex at first because not everything is a String, but at the same time, I believe it makes the conversion rules far easier to understand.
For me, we already have a lot of methods that take a String as parameter, so seeing String as a StringTemplate is not really a solution because it means adding a lot of overloads with all of the incompatibilities you describe.

I see the conversion of a StringTemplate to a String as a boxing conversion: if the current type is a StringTemplate and the target type is a String, the compiler will generate code that is equivalent to calling ".interpolate()" implicitly.

It's not a real boxing conversion, because it's a one way conversion, i.e. there is a boxing conversion between StringTemplate to String but no boxing conversion from String to StringTemplate. We can add it, but I do not think it's necessary given that with a String s, it can always be converted to a StringTemplate using t"\{s}".

One question can be if we prefer a callsite conversion like the boxing conversion described above or a declaration site conversion, i.e. ask all developers if they want to support StringTemplate to add a new overload.

Apart from the fact that adding overloads in a lot of existing projects looks like a Sisyphean task, doing the conversion at use site also has the advantage of allowing the compiler to generate an invokedynamic at use site so the boxing from a StringTemplate to a String will be as fast as the string concatenation using '+' (see Duncan email on amber-dev).

regards,
Rémi

> From: "Maurizio Cimadamore"
> To: "Brian Goetz" , "Guy Steele"
> Cc: "Tagir Valeev" , "amber-spec-experts"
> Sent: Monday, March 11, 2024 1:15:51 PM
> Subject: Re: Update on String Templates (JEP 459)

> Hi all,
> we tried mainly three approaches to allow smoother interop between strings and string templates: (a) make String a subclass of StringTemplate. Or (b) make constant strings be convertible to string templates. Or, (c) use target-typing. All these approaches have some issues, discussed below.
> The first approach is slightly simpler, because it can be achieved entirely outside of the Java language. Unfortunately, adding "String implements StringTemplate" adds overload ambiguities in cases such as this:
>
>     format(StringTemplate)     // 1
>     format(String, Object...)  // 2
>
> This is actually a very important case, as we predict that StringTemplate will serve as a great replacement for methods out there accepting a string/Object... pack.
>
> Unfortunately, if String <: StringTemplate, this means that calling format with a string literal will resolve to (1), not (2) as before. The problem here is that (2) is not even applicable during the first two overload resolution phases (which are only allowed to use subtyping and conversions, respectively), as it is a varargs method. Because of this, (1) will now take precedence, as it is not varargs. While for String::format this is probably harmless, changing results of overload selection is something that should be done with care (esp. if different overloads have different return types), as it could lead to source compatibility issues.
>
> On top of these issues, making all strings be string templates has the disadvantage of also considering "messy" strings obtained via concatenation of non-constant values string templates too, which seems bad.
>
> To overcome these issues, we attempted to add an implicit conversion from constant strings to StringTemplate. As it was observed by Guy, in case of ambiguities, the non-converting variants (e.g. m(String)) would be preferred. That said, in the above example (with varargs) we would still get a potentially incompatible change - as a string literal would be applicable in (1) before (2) is even considered, so the same concerns surrounding overload resolution changes would remain.
>
> Another thing that came up is that conversions automatically bring in casting conversions. E.g.
if you can go from A to B using assignment conversion, you > can typically go the same direction using casting conversion. This raises two > issues. The first is that casting conversion is generally a symmetric type > relationship (e.g. if you can cast from A to B, then you can cast from B to A), > while here we?re mostly discussing about one direction. But this is, perhaps, > not a big deal - after all, ?constant strings? don?t have a denotable type, so > perhaps it should come to no surprise that you can?t use them as a target type > for a cast. > The second ?issue? is that casting conversion brings about patterns, as that?s > how pattern applicability is defined. For instance: > switch("Hello") { > case StringTemplate st ... > } > To make this work we would need at least to tweak exhaustiveness (otherwise > javac would think the above switch is not exhaustive, and ask you to add a > default). Secondly, some tweaks to the runtime tests would be required also. > Not impossible, but would require some more work to make sure we?re ok with > this direction. > Another issue with the conversion is that it would expose a sharp edge in the > current overload resolution and inference machinery. For instance, this program > doesn?t compile correctly: > List li = List.of(1, 1L) > Similarly, this program would also not compile correctly: > List li = List.of("Hello", "Hello \{world}"); > The last possibility would be to say that a string literal is a poly expression > . As such, a string literal can be typed to either String or StringTemplate > depending on the target type (for instance, this is close to how int literals > also work). > This approach would still suffer from the same incompatible overload changes > with varargs method as the other approaches. 
But, by avoiding to add a > conversion, it makes things a little easier: for instance, in the case of > pattern matching, nothing needs to be done, as the string literal will be > turned into a string template before the switch even takes place (meaning that > existing exhaustiveness and runtime checks would still work). But, there?s > still dragons and irregularities when it comes to inference - for instance: > List lst = List.of("hello", "world"); > This would not type-check: we need a target-type to know which way the literal > is going (List::of just accepts a type-variable X). Note that overload > resolution happens at a time where the target-type is not known, so here we?d > probably pick X = String, which will then fail to type-check against the > target. > Another issue with target-typing is that if you have two overloads: > m(String) > m(StringTemplate) > And you call this with a string literal, you get an ambiguity: you can go both > ways, but String and StringTemplate are unrelated types, so we can?t pick one > as ?most specific?. This issue could be addressed, in principle, by adding an > ad-hoc most specific rule that, in case of an ambiguity, always gave precedence > to String over StringTemplate. We do a similar trick for lambda expressions, > where if two method accepts similarly looking functional interface, we give > precedence to the non-boxing one. > Anyway, the general message here is that it?s a bit of a ?pick your posion? > situation. Adding a more fluid relationship between string and templates is > definitively possible, but there are risks that this will impact negatively > other areas of the language, risks that would need to be assessed very > carefully. > Another, simpler, option we consider was to use some kind of prefix to mark a > string template literal (e.g. make that explicit, instead of resorting to > language wizardry). 
That works, but has the disadvantage of breaking the spell > that there is only ?one string literal?, which is something we have worked > quite hard to achieve. > Cheers > Maurizio > On 09/03/2024 23:52, Brian Goetz wrote: >> I?ll let Maurizio give the details, because I?m sure I will have forgotten one >> or two. > ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ccherlin at gmail.com Mon Mar 11 15:50:52 2024 From: ccherlin at gmail.com (Clement Cherlin) Date: Mon, 11 Mar 2024 10:50:52 -0500 Subject: Update on String Templates (JEP 459) In-Reply-To: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr> References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr> Message-ID: On Mon, Mar 11, 2024 at 9:37?AM Remi Forax wrote: > > Hello, > > > Another, simpler, option we consider was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only ?one string literal?, which is something we have worked quite hard to achieve. > > I vote for making string templates explicit. > Yes, it can be seen as complex as first because not everything is a String, but at the same time, I believe it makes the conversion rules far easier to understand. I agree, and suggest `backquotes` (and ```triple backquotes``` for Template Blocks) to denote template strings. They were already considered for Raw String Literals, which implies they're a viable option. Raw String Literals didn't get adopted, so the character remains available for use. 
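The "similar trick for lambda expressions" mentioned in Maurizio's quoted message can be observed in today's Java. The following is a minimal sketch, not anything from the JEP: the class and method names are made up, and `Supplier<Integer>` / `IntSupplier` are just a convenient pair of unrelated functional interfaces. It shows the existing most-specific rule (JLS 15.12.2.5) preferring the overload that avoids boxing, which is the kind of ad-hoc tie-break the message suggests could favor String over StringTemplate.

```java
import java.util.function.IntSupplier;
import java.util.function.Supplier;

// Sketch of the existing "most specific" tie-break for lambdas.
// All names here are hypothetical and chosen only for illustration.
class LambdaMostSpecificDemo {
    // Overload whose functional interface returns a boxed Integer.
    static String m(Supplier<Integer> s) { return "Supplier<Integer>"; }

    // Overload whose functional interface returns a primitive int.
    static String m(IntSupplier s) { return "IntSupplier"; }

    // The lambda body is a primitive int expression, so the compiler
    // prefers the overload that needs no boxing.
    static String primitiveBody() { return m(() -> 1); }

    // The lambda body is a reference-typed expression, so the compiler
    // prefers the overload that needs no unboxing.
    static String referenceBody() { return m(() -> Integer.valueOf(1)); }

    public static void main(String[] args) {
        System.out.println(primitiveBody());
        System.out.println(referenceBody());
    }
}
```

Both overloads are applicable to both calls; the choice is made purely by the tie-break, which is why a comparable rule for string literals would be an addition to the specification rather than something that falls out of subtyping.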
>
> For me, we already have a lot of methods that take a String as a parameter, so seeing String as a StringTemplate is not really a solution because it means adding a lot of overloads with all of the incompatibilities you describe.
> I see the conversion of a StringTemplate to a String as a boxing conversion: if the current type is a StringTemplate and the target type is a String, the compiler will generate code that is equivalent to calling ".interpolate()" implicitly.
> It's not a real boxing conversion, because it's a one-way conversion, i.e. there is a boxing conversion from StringTemplate to String but no boxing conversion from String to StringTemplate. We can add it, but I do not think it's necessary given that a String s can always be converted to a StringTemplate using t"\{s}".
>
> One question is whether we prefer a call-site conversion like the boxing conversion described above or a declaration-site conversion, i.e. asking all developers who want to support StringTemplate to add a new overload.
> Apart from the fact that adding overloads in a lot of existing projects looks like a Sisyphean task, doing the conversion at the use site also has the advantage of allowing the compiler to generate an invokedynamic at the use site, so the boxing from a StringTemplate to a String will be as fast as string concatenation using '+' (see Duncan's email on amber-dev).
>
> regards,
> Rémi

I do not think implicit conversion, either from String to String Template or from String Template to String, is wise, given the complications with overload resolution and the potential for surprising undesired effects cited below.

When a method doesn't accept a String Template, you simply process it to a String first (or to another type; remember that a StringTemplate doesn't have to be converted to a String!). This is no different from what you would do with a StringBuilder or other String-like-but-not-actually-String object.
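The overload-resolution complication referred to here (the varargs case from Maurizio's message) can be reproduced without any template types. The sketch below is an illustration under a stated assumption: `CharSequence` stands in for the hypothetical "String <: StringTemplate" relationship, simply because String really is a subtype of CharSequence today. The class names are invented for the example.

```java
// Sketch only: CharSequence plays the role that StringTemplate would play
// if String were made a subtype of it. All names are hypothetical.
class VarargsOverloadDemo {
    // The API as it exists "before": only the String/Object... pack.
    static class Before {
        static String format(String fmt, Object... args) { return "varargs"; }
    }

    // The API "after" adding an overload that takes the supertype.
    static class After {
        static String format(CharSequence template) { return "fixed-arity"; } // 1
        static String format(String fmt, Object... args) { return "varargs"; } // 2
    }

    // Same call shape against both APIs.
    static String callBefore() { return Before.format("hello"); }
    static String callAfter() { return After.format("hello"); }

    // With extra arguments only the varargs overload is applicable.
    static String callAfterWithArgs() { return After.format("hello %s", "world"); }

    public static void main(String[] args) {
        System.out.println(callBefore());
        System.out.println(callAfter());
        System.out.println(callAfterWithArgs());
    }
}
```

The single-argument call silently moves from (2) to (1) once the new overload exists, because a variable-arity method is only considered in the third resolution phase, after a fixed-arity match has already been found. Nothing fails to compile, which is what makes the change easy to miss.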
Cheers,
Clement

From archie.cobbs at gmail.com Mon Mar 11 16:07:58 2024
From: archie.cobbs at gmail.com (Archie Cobbs)
Date: Mon, 11 Mar 2024 11:07:58 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

On Mon, Mar 11, 2024 at 9:37 AM Remi Forax wrote:

> I vote for making string templates explicit.

Caveat: I've been following this discussion only loosely so I'm likely to say something stupid/ignorant/redundant; if so please ignore.

But I am tending to agree with Remi.

The recent simplifications Brian described are a definite improvement, but now we're left with a new question: What is the advantage of having the language literals for String and StringTemplate look so confusingly similar?
Reversing that question, I'm not seeing the big downside of having a simple prefix for literals like this:

    var s = "this is a string";
    var st1 = $"this is a (degenerate) template";
    var st2 = $"this is also a \{template}";
    var x = "this is a \{lexical_error}";

    myobj.someOverloadedMethod($"this is definitely a template");
    myobj.someOverloadedMethod("this is definitely a string!"); // no need to consult javadoc here

Seems like the trade-off is straightforward:

    Cost: one character
    Benefit: instant disambiguation clarity in the developer's mind

At least, it makes the whole API design/overload question straightforward.

Put another way, StringTemplates are a cool new language feature, and as such it seems like they deserve a "first-class" allotment in the syntax of the language.

-Archie

From brian.goetz at oracle.com Mon Mar 11 17:36:49 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 11 Mar 2024 17:36:49 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

The overlap between string literals and string template literals is indeed a tricky one, and bears some review of the options.

Obviously string templates and strings have some things in common (it's in the name!), but they are also different and evaluate to different types. So how "same" or "different" should they look?

Simplistic arguments in favor of "different":

 - Ambiguity is bad, clarity is good
 - String / string template literals can be both wide and tall; having to examine the entirety to know which it is could be confusing
 - Simpler for compiler and specification writers

Simplistic arguments in favor of "same":

 - Will be perceived as "fussy" or distracting
 - Users are already grumpy that we're not doing "string interpolation" and calling it a day
 - Most of the time, it is perfectly obvious which one it is
 - Have to make up yet another new and unfamiliar syntax to disambiguate, think of the bike shedding

There are probably others, but none of these seem like slam-dunks one way or the other.

There are a few choices here:

 - Keep the current syntax approach
 - Give STs a new syntax
 - Give both STs and string literals an _optional_ new syntax, such as I_IZ_STRING"..." and TEMPLATZ"...", but allow the current approach when disambiguation is not needed

The last seeks a compromise between the current path and the desire for explicitness. Suppose we allowed s"..." and t"..." literals, where the sigils were optional. What then?

Obviously, in the cases which are currently ambiguous-seeming, users could disambiguate explicitly. The prefix sigil means no one has to "buffer" when interpreting the code. That's nice.

Having two ways to write classical string literals might confuse people who haven't seen them before, or stimulate unproductive "style wars". That's probably not too big a problem here.

Overall, though, I am not so enthused about creating yet another new lexical mechanism for having different kinds of stringy things. The value is ... meh, and it seems an attractive nuisance. In other languages with multiple "flavors" of string, there is a tendency to proliferate more flavors. (Raw strings, anyone?)

My take is that this is something that is bothering us a lot because it is new, but I'm skeptical that it carries its weight.

From guy.steele at oracle.com Mon Mar 11 18:24:31 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 11 Mar 2024 18:24:31 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <5922B9E8-BB56-4212-8F43-48A676AB0E25@oracle.com>

My thinking pretty much matched Brian's analysis below, until I saw Archie's examples and thought about them. Four points:

(1) I would add one more simplistic argument in favor of "different": I see some value in a reader of code having fair warning, quite visible and up front, that what looks like a string may actually contain executable code (possibly having side effects). (Maybe this is related to what Brian meant by "The prefix sigil means no one has to 'buffer' when interpreting the code.".) I think we do want such a warning, if present, to be concise but hard to overlook, and I think the choice of "$" fits that bill admirably. (Pro: The character "$" is associated with string interpolation in a number of other languages, including C#, Dart, Groovy, JavaScript, Julia, Kotlin, PHP, TCL, TypeScript, and Visual Basic. Con: Of the languages just listed, those that use "$" before the opening double quote are C# and Visual Basic, and the proposed Java syntax is not otherwise identical to the syntax of C# and Visual Basic, which enclose expressions in _unescaped_ braces.)

(2) Because "$" is an identifier in Java, it suggests that we can hold open a possible future where we allow other string-prefix sigils having the syntax of an identifier, but without really committing to that generality at this time.

(3) Because "$"
is a _discouraged_ identifier in Java (see JLS §3.8: "The dollar sign should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems."), in practice all occurrences of dollar signs would in fact flag string templates.

(4) Archie's suggestion does not create an alternate syntax for the Plain Old String Literals we have had in Java since its inception.

For these reasons, I recommend that Archie's suggestion (and perhaps also the C#/Visual Basic variation) be given careful (re-)consideration at this time.

From alex.buckley at oracle.com Mon Mar 11 20:24:51 2024
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 11 Mar 2024 13:24:51 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com>

On 3/11/2024 10:36 AM, Brian Goetz wrote:
> Overall, though, I am not so enthused about creating yet another new
> lexical mechanism for having different kinds of stringy things.

All strings -- not just string literals, and not just constant expressions of type String -- can be composed with +. Is there an equivalent composition operator for string templates? (That is, all values of type StringTemplate, not just template literals.)
I ask because the more lexical similarity between a template literal and a string literal, the more I think people will try to use + with two template literals, or with one template literal and one string literal. AIUI the result will be a surprise:

    String s = "Hello" + "\{x}";
    // Second operand to + undergoes string conversion a.k.a. toString()
    print(s); // Hello0x12345678

Alex

From james.laskey at oracle.com Mon Mar 11 20:47:10 2024
From: james.laskey at oracle.com (Jim Laskey)
Date: Mon, 11 Mar 2024 20:47:10 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr> <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com>
Message-ID: <4F9D3A91-FD9E-41C6-99DB-59E1C8587FA0@oracle.com>

String plus isn't needed. Just as templates remove the need for string plus, the combining of string templates and strings can be done with nested embedded expressions.

From alex.buckley at oracle.com Mon Mar 11 21:40:07 2024
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 11 Mar 2024 14:40:07 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4F9D3A91-FD9E-41C6-99DB-59E1C8587FA0@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr> <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com> <4F9D3A91-FD9E-41C6-99DB-59E1C8587FA0@oracle.com>
Message-ID: <2cfe511b-b729-43c8-8348-479b9d6efbbf@oracle.com>

Given that few APIs will take StringTemplate on Day 1, I've been wondering how people will approach making string templates interoperate with APIs that only take String. They'll try the converts-to-String functionality of + -- "" + <> -- and find it's a dead end.

BTW, it's always been true that there's no empty template literal, but with the removal of template processors there will be more "standalone" StringTemplate variables, and this will be a common error:

    StringTemplate st = ""; // Error, can't assign String to StringTemplate

I also wondered if it could ever mean anything to compose two string templates with +. It feels similar to embedding a string template in a template literal. If the two templates being composed are in the same "language" then perhaps the Java language could help combine them.

Alex

On 3/11/2024 1:47 PM, Jim Laskey wrote:
> String plus isn't needed. Just as templates remove the need for string plus, the combining of string templates and strings can be done with nested embedded expressions.
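The "dead end" Alex describes comes from ordinary string conversion: + calls toString() on any non-String operand. The following is a minimal sketch using a plain class as a stand-in; `FakeTemplate` and its `interpolate()` method are hypothetical, are not the StringTemplate API, and the sketch makes no claim about what a real StringTemplate's toString() returns.

```java
// Sketch only: FakeTemplate is a hypothetical stand-in for a template value.
// It deliberately does not override toString(), so string conversion falls
// back to Object.toString().
class ConcatSurpriseDemo {
    static final class FakeTemplate {
        private final String fragment;
        private final Object value;

        FakeTemplate(String fragment, Object value) {
            this.fragment = fragment;
            this.value = value;
        }

        // Explicit processing step that produces the text the author meant.
        String interpolate() { return fragment + value; }
    }

    // What a user reaching for + gets: string conversion of the object itself.
    static String viaPlus(FakeTemplate t) { return "Hello " + t; }

    // What the user presumably wanted: process to String first, then concatenate.
    static String viaExplicit(FakeTemplate t) { return "Hello " + t.interpolate(); }

    public static void main(String[] args) {
        FakeTemplate t = new FakeTemplate("x = ", 42);
        System.out.println(viaPlus(t));     // class name, '@', identity hash in hex
        System.out.println(viaExplicit(t));
    }
}
```

The concatenation compiles without complaint either way, which is the heart of the concern: the more a template literal looks like a string literal, the less reason a reader has to suspect that + is doing the wrong thing.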

From brian.goetz at oracle.com Tue Mar 12 17:08:47 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 12 Mar 2024 13:08:47 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>

OK, so let's summarize the EG discussion so far. (As a reminder, syntax-heavy features like this are even more subject to "armchair theorization" than most, so please, take that into account when commenting. As a further reminder, the best thing we could do right now is write more API code that manipulates string templates.)

Overall, I think everyone agrees that the "make string templates the star of the show" approach is a winning direction. No one seems too busted up at the loss of processors.

I'm going to try and focus for now on "potential problems that might prompt further adjustment", rather than specific solutions.

There is some ambient discomfort that the "sublanguage" of a template becomes a dynamic property of a template, introducing new opportunities for users to make mistakes with unprocessed templates. (This was present before as well using the RAW processor, but much less prominent.) But I don't think this is a significant issue; it's just something new to get used to.

Most of the concerns have to do with the visual similarity between string literals and template literals. While this is of course intended, there are some concerns that they may be "too similar". Concerns raised include:

 - In a code-generation scenario that leans on templates, sometimes we want to use a string literal as a degenerate form of template. It may be surprising that this doesn't "just work", and alternatives (e.g., conversion functions, casting, etc) may have varying degrees of discoverability and yuck-factor.

 - Given (a) the visual similarity of string and template literals and (b) the lenient treatment of concatenation between strings and everything else, users may well be tempted to concatenate string literals with template literals, and may be surprised at the outcome.

 - Because template literals may be broad and wide, and their evaluation may involve side effects, we may want to give a lexical heads-up of "weird thing coming", rather than having template literals be framed more like "strings with benefits."

Have I covered the concerns raised so far? Before we get too caught up in solutions, let's try to get on the same page about which of these are problems that need to be solved right now.

(As a small matter of housekeeping, given that the preview train is already rolling, we will soon have to make a decision to (a) withdraw the current preview entirely, (b) re-preview the current design even though we know it will change, or (c) gain the requisite confidence in a new design in time to preview that.
From my vantage point, (c) is starting to look increasingly unlikely, and I suspect (a) is a better choice than (b). But I bring this up not to start a project management discussion, as much as to raise awareness that there are project management constraints.)

On 3/8/2024 1:35 PM, Brian Goetz wrote:
>
> Time to check in with where we are with String Templates. We've
> gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn
> things about the design or implementation that we don't already know.
> This could be bug reports, experience reports, code review, careful
> analysis, novel alternatives, etc. And the best feedback usually
> comes from using the feature "in anger" -- trying to actually write
> code with it. ("Some people would prefer a different syntax" or "some
> people would prefer we focused on string interpolation only" fall
> squarely in the "things we already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did
> learn quite a few things we didn't already know, and this was
> conclusive enough that it has motivated us to adjust our approach in
> this feature. Specifically, the role of processors is "outsized" to
> the value they offer, and, after further exploration, we now believe
> it is possible to achieve the goals of the feature without an explicit
> "processor" abstraction at all! This is a very positive development.
>
> First, I want to affirm that the goals of the project have not
> changed. From JEP 459:
>
> Goals
>
> • Simplify the writing of Java programs by making it easy to express
> strings that include values computed at run time.
> • Enhance the readability of expressions that mix text and
> expressions, whether the text fits on a single source line (as with
> string literals) or spans several source lines (as with text blocks).
> • Improve the security of Java programs that compose strings from
> user-provided values and pass them to other systems (e.g., building
> queries for databases) by supporting validation and transformation of
> both the template and the values of its embedded expressions.
> • Retain flexibility by allowing Java libraries to define the
> formatting syntax used in string templates.
> • Simplify the use of APIs that accept strings written in non-Java
> languages (e.g., SQL, XML, and JSON).
> • Enable the creation of non-string values computed from literal text
> and embedded expressions without having to transit through an
> intermediate string representation.
>
> Non-Goals
> • It is not a goal to introduce syntactic sugar for Java's string
> concatenation operator (+), since that would circumvent the goal of
> validation.
> • It is not a goal to deprecate or remove the StringBuilder and
> StringBuffer classes, which have traditionally been used for complex
> or programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for
> embedding expressions. While many people did express the opinion of
> "why not 'just' do what Kotlin/Scala does", this issue was more than
> fully explored during the initial design round. (In fact, while
> syntax disagreements are often purely subjective, this one was far
> more clear -- the $-syntax is objectively worse, and would be doubly so
> if injected into an existing language where there were already string
> literals in the wild. This has all been more than adequately covered
> elsewhere, so I won't rehash it here.)
>
>
> Now, let's talk about what we do think should change: the role of
> processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation
> of templates to their final form (whether string, or something else.)
> However, Java already has a well established means of abstracting
> behavior: methods.
> (In fact, a processor application can be viewed
> as merely a new syntax for a method call.) Our experience using the
> feature highlighted the question: When converting a SQL query
> expressed as a template to the form required by the database (such as
> PreparedStatement), why do we need to say:
>
>   DB."... template ..."
>
> When we could use an ordinary Java library:
>
>   Query q = Query.of("...template...")
>
> Indeed, one of the worst things about having processors in the
> language is that API designers are put in the difficult situation of
> not knowing whether to write a processor or an ordinary API, and often
> have to make that choice before the consequences are fully understood.
> (To add to this, processors raise similar questions at the use site.)
> But the real criticism here is that template capture and processing
> are complected, when they should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were
> so central to the initial design in the first place. And it turned
> out, this choice had been influenced -- perhaps overly so -- by early
> implementation experiments. (One of the background design goals was
> to enable expensive operations like `String::format` to be (much)
> cheaper. Without digressing too deeply on performance, String::format
> can be more than an order of magnitude worse than the equivalent
> concatenation operation, and this in turn sometimes motivates
> developers to use worse idioms for formatting. The FMT processor
> brought that cost back in line with the equivalent concatenation.)
> These early experiments biased the design towards needing to know the
> processor at the point of template capture, but upon reexamination we
> realized that there are other ways to achieve the desired performance
> goals without requiring processors to be known at capture time.
> This, in turn, enabled us to revisit a point in the design space we had
> transited through earlier, where string templates were "just a new
> kind of literal" and the job performed by processors could instead be
> performed by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met
> the semantic, correctness, and performance goals: template literals
> ("Hello \{name}") are simply the literal form of StringTemplate:
>
>   StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types. (We explored a
> number of ways to interconvert them, but they caused more trouble than
> they solved.) Processing of string templates, including
> interpolation, is done by ordinary APIs that deal in StringTemplate,
> aided by some clever implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such
> as PrintWriter, APIs can make that choice on behalf of the domain, by
> providing overloads to embody this design choice:
>
>    void println(String) { ... }
>    void println(StringTemplate) { ... interpolate and delegate to
> println(String) ... }
>
> The upshot is that for interpolation-safe APIs like println, we can
> use a template directly without giving up any safety:
>
>    System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not
> String (no implicit interpolation), and chooses the StringTemplate
> overload of println, which in turn chooses how to process the
> template. This stays true to the design principle that interpolation
> is dangerous enough that it should be an explicit choice in the code --
> but it allows that choice to be made by libraries when the library is
> comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of
> String::format that interprets templates with embedded format
> specifiers (e.g., "%d"):
>
>   String format(String formatString, Object... parameters) { ... same as
> today ... }
>   String format(StringTemplate template) {... equivalent of FMT ...}
>
> And users can call this as:
>
>   String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates
> according to the rules previously specified in the FMT processor (not
> ordinary interpolation), but that choice is embedded in the library
> semantics so no further explicit choice at the use site is required.
> The user already chose to pass it to String::format; that's all the
> processing selection that is needed.
>
> Where APIs do not express a choice of what template expansion means,
> users continue to be free to process them explicitly before passing
> them, using APIs that do (such as String::format or ordinary
> interpolation).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now,
> static or instance methods to process a template) goes away entirely
> when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more
> intrusive than before, and can be reduced over time as APIs get with
> the program.
> - StringTemplate is just another type that APIs can support if they
> want. The "DB" processor becomes an ordinary factory method that
> accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of
> template processing, because we are not biasing so strongly towards
> early processing.
> - It becomes easier to abstract over template processing (i.e.,
> combine or manipulate templates as templates before processing)
> - Interpolation remains an explicit choice, but ST-aware libraries can
> make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which
> is good. Core JDK APIs (e.g., println, format, exception
> constructors) get upgraded to work with string templates.
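
The "ordinary APIs instead of processors" direction in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: `Template` is a hypothetical stand-in for StringTemplate (so the sketch compiles without preview features), and `line` and `Query.of` are made-up names playing the roles of the println overload pair and the "DB" factory method.

```java
import java.util.List;

public class TemplateApiSketch {
    // Hypothetical stand-in for StringTemplate: n + 1 text fragments around n embedded values.
    public record Template(List<String> fragments, List<Object> values) {
        // Plain interpolation: fragment, value, fragment, ..., last fragment.
        public String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.size(); i++) {
                sb.append(fragments.get(i)).append(values.get(i));
            }
            return sb.append(fragments.get(fragments.size() - 1)).toString();
        }
    }

    // An interpolation-safe API (println-like) offers both overloads and
    // makes the "interpolate" choice on the caller's behalf.
    public static String line(String s) { return s; }
    public static String line(Template t) { return line(t.interpolate()); }

    // A SQL-like API never interpolates: fragments become the statement text
    // and embedded values become bind parameters.
    public record Query(String sql, List<Object> params) {
        public static Query of(Template t) {
            return new Query(String.join("?", t.fragments()), t.values());
        }
    }

    public static void main(String[] args) {
        String name = "Duke";
        Template greeting = new Template(List.of("Hello ", "!"), List.of(name));
        System.out.println(line(greeting)); // Hello Duke!

        Query q = Query.of(new Template(
                List.of("SELECT * FROM users WHERE name = ", ""), List.of(name)));
        System.out.println(q.sql());    // SELECT * FROM users WHERE name = ?
        System.out.println(q.params()); // [Duke]
    }
}
```

The point of the sketch is that the same template value can be interpolated by one library and turned into a parameterized query by another, with the choice living in the API rather than at the use site.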
>
> The remaining question that everyone is probably asking is: "so how do
> we do interpolation." The answer there is "ordinary library methods".
> This might be a static method (String.join(StringTemplate)) or an
> instance method (template.join()), shed to be painted (but please, not
> right now.).
>
> This is a sketch of direction, so feel free to pose questions/comments
> on the direction. We'll discuss the details as we go.
>

From amaembo at gmail.com Tue Mar 12 17:24:10 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Tue, 12 Mar 2024 18:24:10 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
Message-ID:

Hello, Maurizio!

Thank you for the detailed explanation!

On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> Hi all,
> we tried mainly three approaches to allow smoother interop between strings
> and string templates: (a) make String a subclass of StringTemplate. Or (b)
> make constant strings be *convertible* to string templates. Or, (c) use
> target-typing. All these approaches have some issues, discussed below.
>
> The first approach is slightly simpler, because it can be achieved
> entirely outside of the Java language. Unfortunately, adding "String
> implements StringTemplate" adds overload ambiguities in cases such as this:
>
> format(StringTemplate) // 1
> format(String, Object...) // 2
>
> This is actually a very important case, as we predict that StringTemplate
> will serve as a great replacement for methods out there accepting a
> string/Object... pack.
>
> Unfortunately, if String <: StringTemplate, this means that calling format
> with a string literal will resolve to (1), not (2) as before. The problem
> here is that (2) is not even applicable during the first two overload
> resolution phases (which are only allowed to use subtyping and conversions,
> respectively), as it is a varargs method. Because of this, (1) will now
> take precedence, as that's not varargs. While for String::format this
> is probably harmless, changing results of overload selection is something
> that should be done with care (esp. if different overloads have different
> return types), as it could lead to source compatibility issues.
>

I would still like to advocate for the String <: StringTemplate solution. I think that the overloading is not a big problem. Simply making String implement StringTemplate will not break any existing code, because there are no APIs yet that accept a StringTemplate instance. The problem may appear only when an API author actually adds such an overload and does this in a way that is incompatible with an existing String overload. This would be an extremely bad design choice, and the blame goes to the API author. You've correctly mentioned that for String::format this is harmless because the API is well-designed. We may suggest in the StringTemplate documentation that API designers should provide the same behavior for foo(String) and foo(StringTemplate) when they add an overload.

I must say that we already have experience of introducing new interfaces in the hierarchy of widely-used library classes. Closeable got an AutoCloseable parent, StringBuilder became Comparable, and so on. So far, the compatibility issues introduced were tolerable. Well, probably I'm missing something, but we have preview rounds just for this purpose: to find out the disadvantages of the approach.

> On top of these issues, making all strings be string templates has the
> disadvantage of also considering "messy" strings obtained via concatenation
> of non-constant values string templates too, which seems bad.
>

I think that most of the APIs will still provide a String overload. E.g., for preparing an SQL statement, it's a perfectly reasonable scenario to have a constant string as the input. So prepareStatement(String) will stay alongside prepareStatement(StringTemplate). And people will still be able to use concatenation. I don't think that the absence of the String <: StringTemplate relation will protect anybody from using concatenation. On the other hand, if String actually implements StringTemplate, it will be a very simple static analysis rule to warn if concatenation occurs in this context. If the expected type for a concatenation is StringTemplate, then something is definitely wrong. Without 'String implements StringTemplate', one will not be able to write a concatenation directly in a StringTemplate context. Instead, the String-accepting overload will be used, and the expected type will be String, so a static analyzer will have to guess whether it's dangerous to use concatenation here. In short, I think that it's actually an advantage: we have an additional hint here that concatenation is undesired. Even a compilation warning could be possible to implement.

So, I don't see these points as real disadvantages. I definitely like this approach much more than adding any kind of implicit conversion or another literal syntax, which would complicate the specification much more.

With best regards,
Tagir Valeev.

From brian.goetz at oracle.com Tue Mar 12 17:32:20 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 12 Mar 2024 13:32:20 -0400
Subject: Does String extend StringTemplate?
 (Was: Update on String Templates (JEP 459))
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
Message-ID: <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com>

Splitting off into a separate thread.

I would like to redirect this discussion from the mechanical challenges and consequences to the goals and semantics.

If we are considering "String extends StringTemplate", we are making a semantic statement that a String *is-a* StringTemplate. While I can imagine convincing oneself that this is true "if you look at it right", this sets off all my "self-justification" detectors.

So, I recommend we step back and examine why we think this is a good idea before we descend into the mechanics. My suspicion is that this is motivated by "I want to be able to automatically use String where a StringTemplate is desired", and that this seems a clever-enough hack to get there. (I think we probably also need to drill further, into "why do we think it is important to be able to use String where StringTemplate is desired", and I suspect further that part of it will be "but the APIs are not yet fully equilibrated" (which would be a truly bad reason to give String a new supertype.))

On 3/12/2024 1:24 PM, Tagir Valeev wrote:
> Hello, Maurizio!
>
> Thank you for the detailed explanation!
>
> On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore
> wrote:
>
> Hi all,
> we tried mainly three approaches to allow smoother interop between
> strings and string templates: (a) make String a subclass of
> StringTemplate. Or (b) make constant strings be /convertible/ to
> string templates. Or, (c) use target-typing. All these approaches
> have some issues, discussed below.
>
> The first approach is slightly simpler, because it can be achieved
> entirely outside of the Java language.
Unfortunately, adding > ?String implements StringTemplate? adds overload ambiguities in > cases such as this: > > |format(StringTemplate) // 1 format(String, Object...) // 2 | > > This is actually a very important case, as we predice that > StringTemplate will serve as a great replacement for methods out > there accepting a string/Object? pack. > > Unfortunatly, if String <: StringTemplate, this means that calling > format with a string literal will resolve to (1), not (2) as > before. The problem here is that (2) is not even applicable during > the two overload resolution phases (which is only allowed to use > subtyping and conversions, respectively), as it is a varargs > method. Because of this, (1) will now take the precedence, as > that?s not varargs. While for String::format this is probably > harmless, changing results of overload selection is something that > should be done with care (esp. if different overloads have > different return types), as it could lead to source compatibility > issues. > > I would still like to advocate for String <: StringTemplate solution. > I think that the overloading is not a big problem. Simply making > String implements StringTemplate will not break any of existing code > because there are no APIs yet that accept the StringTemplate instance. > The problem may appear only when an API author actually adds such an > overload and does this in an incompatible way with an existing String > overload. This would be an extremely bad design choice, and the blame > goes to the API author. You've correctly mentioned that for > String::format this is harmless because the API is well-designed. We > may suggest in StringTemplate documentation that the API designers > should provide the same behavior for foo(String) and > foo(StringTemplate) when they add an overload. > > I must say that we already had an experience of introducing new > interfaces in the hierarchy of widely-used library classes. 
Closable > got AutoClosable parent, StringBuilder became comparable, and so on. > So far, the compatibility issues introduced were tolerable. Well, > probably I'm missing something but we have preview rounds just for > this purpose: to find out the disadvantages of the approach. > > On top of these issues, making all strings be string templates has > the disadvantage of also considering ?messy? strings obtained via > concatenation of non-constant values string templates too, which > seems bad. > > I think that most of the APIs will still provide String overload. > E.g., for preparing an SQL statement, it's a perfectly reasonable > scenario?to have a constant string as the input. So > prepareStatement(String) will stay along with > prepareStatement(StringTemplate). And people will still be able to use > concatenation. I don't think that the absence of String <: > StringTemplate relation will protect anybody from using the > concatenation. On the other hand, if String actually implements > StringTemplate, it will be a very simple static analysis rule to warn > if the concatenation occurs in this context. If the expected type for > concatenation is StringTemplate, then something is definitely wrong. > Without 'String implements StringTemplate', one will not be able to > write a concatenation directly in StringTemplate context. Instead, > String-accepting overload will be used, and the expected type will be > String, so static analyzer will have to guess whether it's dangerous > to use the concatenation here. In short, I think that it's actually an > advantage: we have an additional hint here that concatenation is > undesired. Even compilation warning could be possible to implement. > > So, I don't see these points as real disadvantages. I definitely like > this approach much more than adding any kind of implicit conversion or > another literal syntax, which would complicate the specification much > more. > > With best regards, > Tagir Valeev. 
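
The overload-selection concern quoted above can be reproduced today without StringTemplate. In the sketch below, CharSequence (an existing supertype of String) stands in for StringTemplate under the hypothetical "String <: StringTemplate" relation; the class and method names are made up for illustration. Because variable-arity methods are only considered in the third phase of overload resolution, a one-argument call with a String resolves to the fixed-arity supertype overload.

```java
public class OverloadDemo {
    // (1) Fixed-arity overload taking a supertype of String.
    // CharSequence plays the role StringTemplate would have if String were its subtype.
    static String format(CharSequence template) {
        return "(1) format(CharSequence)";
    }

    // (2) Variable-arity overload, shaped like String.format(String, Object...).
    static String format(String fmt, Object... args) {
        return "(2) format(String, Object...)";
    }

    public static void main(String[] args) {
        // Only (1) is applicable in the subtyping phase, so it wins
        // before the varargs overload is even considered.
        System.out.println(format("hello"));       // (1) format(CharSequence)

        // With an extra argument only (2) is applicable.
        System.out.println(format("hello %s", 1)); // (2) format(String, Object...)
    }
}
```

Removing overload (1) makes the first call resolve to (2), which is the "as before" behavior the quoted text refers to.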

From guy.steele at oracle.com Tue Mar 12 17:41:56 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 12 Mar 2024 17:41:56 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
Message-ID: <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>

Now that Maurizio has provided a detailed explanation of prior investigations and some good examples, I am now convinced that the approach of providing a special-case conversion from String to StringTemplate is probably not a good idea.

Then here is the decision tree that I would suggest:

(1) If we decide that we do want, on its own merits, some up-front visual indication that distinguishes string literals from string templates, then it becomes easier to just say that strings and string templates are different beasts, neither a subtype of the other, and in particular (a) $"foo" (for example) is a degenerate string template, which is not the same as the string "foo"; (b) $"" is a simple way to write an empty string template, in case you need to initialize a variable of type StringTemplate to something non-null; and (c) APIs should consider providing pairs of methods, where in each pair one takes a String argument and the other takes a StringTemplate argument.

(2) If we decide we do not want that visual distinction, then we have the problem of whether "foo" can be used as both a string and a string template.

(2a) "foo" is only a string, not a string template. This leads to some of the overloading problems that Maurizio has described, though I note that instead of

  StringTemplate x = "";

we could recommend

  StringTemplate x = StringTemplate.EMPTY;

where StringTemplate provides a public static member named EMPTY.

(2b) "foo" can be used as both a string and a string template.
In the absence of a special conversion, this would seem to require that String <: StringTemplate as Tagir suggests.

On Mar 12, 2024, at 1:08 PM, Brian Goetz wrote:

OK, so let's summarize the EG discussion so far. (As a reminder, syntax-heavy features like this are even more subject to "armchair theorization" than most, so please, take that into account when commenting. As a further reminder, the best thing we could do right now is write more API code that manipulates string templates.)

Overall, I think everyone agrees that the "make string templates the star of the show" approach is a winning direction. No one seems too busted up at the loss of processors.

I'm going to try and focus for now on "potential problems that might prompt further adjustment", rather than specific solutions.

There is some ambient discomfort that the "sublanguage" of a template becomes a dynamic property of a template, introducing new opportunities for users to make mistakes with unprocessed templates. (This was present before as well using the RAW processor, but much less prominent.) But I don't think this is a significant issue; it's just something new to get used to.

Most of the concerns have to do with the visual similarity between string literals and template literals. While this is of course intended, there are some concerns that they may be "too similar". Concerns raised include:

- In a code-generation scenario that leans on templates, sometimes we want to use a string literal as a degenerate form of template. It may be surprising that this doesn't "just work", and alternatives (e.g., conversion functions, casting, etc) may have varying degrees of discoverability and yuck-factor.

- Given (a) the visual similarity of string and template literals and (b) the lenient treatment of concatenation between strings and everything else, users may well be tempted to concatenate string literals with template literals, and may be surprised at the outcome.

- Because template literals may be broad and wide, and their evaluation may involve side effects, we may want to give a lexical heads-up of "weird thing coming", rather than having template literals be framed more like "strings with benefits." Have I covered the concerns raised so far? Before we get too caught up in solutions, let's try to get on the same page about which of these are problems that need to be solved right now. (As a small matter of housekeeping, given that the preview train is already rolling, we will soon have to make a decision to (a) withdraw the current preview entirely, (b) re-preview the current design even though we know it will change, or (c) gain the requisite confidence in a new design in time to preview that. From my vantage point, (c) is starting to look increasingly unlikely, and I suspect (a) is a better choice than (b). But I bring this up not to start a project management discussions, as much as to raise awareness that there are project management constraints.) On 3/8/2024 1:35 PM, Brian Goetz wrote: Time to check in with where were are with String Templates. We?ve gone through two rounds of preview, and have received some feedback. As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don?t already know. This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc. And the best feedback usually comes from using the feature ?in anger? ? trying to actually write code with it. (?Some people would prefer a different syntax? or ?some people would prefer we focused on string interpolation only? fall squarely in the ?things we already knew? camp.) In the course of using this feature in the `jextract` project, we did learn quite a few things we didn?t already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature. Specifically, the role of processors is ?outsized? 
to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit ?processor? abstraction at all! This is a very positive development. First, I want to affirm that that the goals of the project have not changed. From JEP 459: Goals ? Simplify the writing of Java programs by making it easy to express strings that include values computed at run time. ? Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks). ? Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions. ? Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates. ? Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON). ? Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation. Non-Goals ? It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation. ? It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition. Another thing that has not changed is our view on the syntax for embedding expressions. While many people did express the opinion of ?why not ?just' do what Kotlin/Scala does?, this issue was more than fully explored during the initial design round. (In fact, while syntax disagreements are often purely subjective, this one was far more clear ? 
the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild. This has all been more than adequately covered elsewhere, so I won?t rehash it here.) Now, let?s talk about what we do think should change: the role of processors and the StringTemplate type. Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.) However, Java already has a well established means of abstracting behavior: methods. (In fact, a processor application can be viewed as merely a new syntax for a method call.) Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say: DB.?? template ?? When we could use an ordinary Java library: Query q = Query.of(??template??) Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood. (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features. This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place. And it turned out, this choice had been influenced ? perhaps overly so ? by early implementation experiments. (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper. 
Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting. The FMT processor brought that cost back in line with the equivalent concatenation.) These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time. This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.

At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:

    StringTemplate st = "Hello \{name}";

String and StringTemplate remain unrelated types. (We explored a number of ways to interconvert them, but they caused more trouble than they solved.) Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.

For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:

    void println(String) { ... }
    void println(StringTemplate) { ... interpolate and delegate to println(String) ...
}

The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:

    System.out.println("Hello \{name}");

In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template. This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code, but it allows that choice to be made by libraries when the library is comfortable doing so.

Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):

    String format(String formatString, Object... parameters) { ... same as today ... }
    String format(StringTemplate template) { ... equivalent of FMT ... }

And users can call this as:

    String s = String.format("Hello %12s\{name}");

Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required. The user already chose to pass it to String::format; that's all the processing selection that is needed.

Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation).

The result is:

- The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
- Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
- StringTemplate is just another type that APIs can support if they want.
The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
- APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
- It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing).
- Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
- The language feature and API surface get considerably smaller, which is good. Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.

The remaining question that everyone is probably asking is: "so how do we do interpolation?" The answer there is "ordinary library methods". This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).

This is a sketch of direction, so feel free to pose questions/comments on the direction. We'll discuss the details as we go.

From guy.steele at oracle.com Tue Mar 12 17:54:03 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 12 Mar 2024 17:54:03 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
Message-ID:

I think I got my description of (2a) slightly wrong. Let me try again:

-----
(2a) "foo" is only a string, not a string template. In the absence of a special conversion, once again we are led to recommend that APIs provide pairs of methods, and I think we avoid most of the overloading problems that Maurizio has described.
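To illustrate the "pairs of methods" recommendation, here is a minimal sketch. It assumes a hypothetical `Template` record standing in for StringTemplate (the template-literal syntax discussed in this thread is not something this sketch relies on), and a hypothetical `Printer` API; neither name comes from the JDK or from the proposal.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate: n+1 literal fragments
// with n embedded values between them. Not the JDK type.
record Template(List<String> fragments, List<Object> values) {
    // Plain interpolation: fragment, value, fragment, ..., fragment.
    String interpolate() {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }
}

// Hypothetical API offering a pair of methods, one per argument type.
final class Printer {
    // Overload for plain strings: nothing to process.
    static String render(String s) {
        return "string: " + s;
    }

    // Overload for templates: the library, not the caller, decides
    // that interpolation is an acceptable way to process the template.
    static String render(Template t) {
        return "template: " + t.interpolate();
    }
}

class PairOfMethodsDemo {
    public static void main(String[] args) {
        String name = "Duke";
        // A plain string selects the String overload.
        System.out.println(Printer.render("Hello Duke"));
        // A template value selects the Template overload.
        System.out.println(Printer.render(
                new Template(List.of("Hello ", "!"), List.of(name))));
    }
}
```

Because String and the template type are unrelated here, overload selection is decided purely by the static type of the argument, with no conversion between the two.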
I note that instead of StringTemplate x = ""; we could recommend StringTemplate x = StringTemplate.EMPTY; where StringTemplate provides a public static member named EMPTY. We do have the burden of explaining to users that "foo" is not a string template.
-----

Now, all that said, I will now provide my best attempt to support the idea that (2b) is better than (2a): It will be difficult to explain to the user a design such that the syntax of strings appears to be an obvious edge case of the syntax of string templates (the case where the number of interpolated expressions is zero) but the semantics of strings are not the obvious and analogous edge case of the semantics of string templates. (1) avoids this problem by making the syntaxes different. (2b) avoids the problem by making the semantics match. But (2a) totally has this problem.

On Mar 12, 2024, at 1:41 PM, Guy Steele wrote:

Now that Maurizio has provided a detailed explanation of prior investigations and some good examples, I am now convinced that the approach of providing a special-case conversion from String to StringTemplate is probably not a good idea. Then here is the decision tree that I would suggest:

(1) If we decide that we do want, on its own merits, some up-front visual indication that distinguishes string literals from string templates, then it becomes easier to just say that strings and string templates are different beasts, neither a subtype of the other, and in particular (a) $"foo" (for example) is a degenerate string template, which is not the same as the string "foo"; (b) $"" is a simple way to write an empty string template, in case you need to initialize a variable of type StringTemplate to something non-null; and (c) APIs should consider providing pairs of methods, where in each pair one takes a String argument and the other takes a StringTemplate argument.

(2) If we decide we do not want that visual distinction, then we have the problem of whether "foo"
can be used as both a string and a string template.

(2a) "foo" is only a string, not a string template. This leads to some of the overloading problems that Maurizio has described, though I note that instead of StringTemplate x = ""; we could recommend StringTemplate x = StringTemplate.EMPTY; where StringTemplate provides a public static member named EMPTY.

(2b) "foo" can be used as both a string and a string template. In the absence of a special conversion, this would seem to require that String <: StringTemplate as Tagir suggests.

On Mar 12, 2024, at 1:08 PM, Brian Goetz wrote:

OK, so let's summarize the EG discussion so far. (As a reminder, syntax-heavy features like this are even more subject to "armchair theorization" than most, so please, take that into account when commenting. As a further reminder, the best thing we could do right now is write more API code that manipulates string templates.)

Overall, I think everyone agrees that the "make string templates the star of the show" approach is a winning direction. No one seems too busted up at the loss of processors.

I'm going to try and focus for now on "potential problems that might prompt further adjustment", rather than specific solutions.

There is some ambient discomfort that the "sublanguage" of a template becomes a dynamic property of a template, introducing new opportunities for users to make mistakes with unprocessed templates. (This was present before as well using the RAW processor, but much less prominent.) But, I don't think this is a significant issue, it's just something new to get used to.

Most of the concerns have to do with the visual similarity between string literals and template literals. While this is of course intended, there are some concerns that they may be "too similar". Concerns raised include:

- In a code-generation scenario that leans on templates, sometimes we want to use a string literal as a degenerate form of template.
It may be surprising that this doesn't "just work", and alternatives (e.g., conversion functions, casting, etc) may have varying degrees of discoverability and yuck-factor.
- Given (a) the visual similarity of string and template literals and (b) the lenient treatment of concatenation between strings and everything else, users may well be tempted to concatenate string literals with template literals, and may be surprised at the outcome.
- Because template literals may be broad and wide, and their evaluation may involve side effects, we may want to give a lexical heads-up of "weird thing coming", rather than having template literals be framed more like "strings with benefits."

Have I covered the concerns raised so far?

Before we get too caught up in solutions, let's try to get on the same page about which of these are problems that need to be solved right now.

(As a small matter of housekeeping, given that the preview train is already rolling, we will soon have to make a decision to (a) withdraw the current preview entirely, (b) re-preview the current design even though we know it will change, or (c) gain the requisite confidence in a new design in time to preview that. From my vantage point, (c) is starting to look increasingly unlikely, and I suspect (a) is a better choice than (b). But I bring this up not to start a project management discussion, as much as to raise awareness that there are project management constraints.)

From ccherlin at gmail.com Tue Mar 12 22:08:07 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Tue, 12 Mar 2024 17:08:07 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
Message-ID:

I agree overall with the problem statement, with the following specific concerns:

1. Template literals (even ones with no embedded expressions) should be visually and syntactically distinct from string literals, because they are different types with different semantics. This visual distinction should be immediately and obviously apparent when reading code.
2.
There should be an easy way to write a template literal with no embedded expressions, including an empty one. Motivation: A constant StringTemplate could, for example, be concatenated (via method call, not '+') with non-constant String Template(s) without having to mix String and StringTemplate types. 3. It should not be too easy to accidentally write a template literal when you mean to write a string literal, or vice-versa. 4. Changing a string literal to a template literal or vice-versa should be an explicit decision, not an implicit one. Conclusions: - There should be no implicit conversions, and context should not be required to determine whether a literal creates a String or a StringTemplate value. - Template literals should not differ from string literals solely by their contents. A template literal should either have a different quote character than a string literal, or have a mandatory prefix. As a consequence, inserting an embedded expression into a string literal without changing the quote character or adding the prefix should produce a compile-time error. Cheers, Clement Cherlin On Tue, Mar 12, 2024 at 2:20?PM Brian Goetz wrote: > > OK, so let's summarize the EG discussion so far. (As a reminder, syntax-heavy features like this are even more subject to "armchair theorization" than most, so please, take that into account when commenting. As a further reminder, the best thing we could do right now is write more API code that manipulates string templates.) > > Overall, I think everyone agrees that the "make string templates the star of the show" approach is a winning direction. No one seems too busted up at the loss of processors. > > I'm going to try and focus for now on "potential problems that might prompt further adjustment", rather than specific solutions. 
> > There is some ambient discomfort that the "sublanguage" of a template becomes a dynamic property of a template, introducing new opportunities for users to make mistakes with unprocessed templates. (This was present before as well using the RAW processor, but much less prominent.) But, I don't think this is a significant issue, its just something new to get used to. > > Most of the concerns have to do with the visual similarity between string literals and template literals. While this is of course intended, there are some concerns that they may be "too similar". Concerns raised include: > > - In a code-generation scenario that leans on templates, sometimes we want to use a string literal as a degenerate form of template. It may be surprising that this doesn't "just work", and alternatives (e.g., conversion functions, casting, etc) may have varying degrees of discoverability and yuck-factor. > > - Given (a) the visual similarity of string and template literals and (b) the lenient treatment of concatenation between strings and everything else, users may well be tempted to concatenate string literals with template literals, and may be surprised at the outcome. > > - Because template literals may be broad and wide, and their evaluation may involve side effects, we may want to give a lexical heads-up of "weird thing coming", rather than having template literals be framed more like "strings with benefits." > > Have I covered the concerns raised so far? > > Before we get too caught up in solutions, let's try to get on the same page about which of these are problems that need to be solved right now. > > > (As a small matter of housekeeping, given that the preview train is already rolling, we will soon have to make a decision to (a) withdraw the current preview entirely, (b) re-preview the current design even though we know it will change, or (c) gain the requisite confidence in a new design in time to preview that. 
From my vantage point, (c) is starting to look increasingly unlikely, and I suspect (a) is a better choice than (b). But I bring this up not to start a project management discussions, as much as to raise awareness that there are project management constraints.) > > > > > On 3/8/2024 1:35 PM, Brian Goetz wrote: > > > Time to check in with where were are with String Templates. We?ve gone through two rounds of preview, and have received some feedback. > > As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don?t already know. This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc. And the best feedback usually comes from using the feature ?in anger? ? trying to actually write code with it. (?Some people would prefer a different syntax? or ?some people would prefer we focused on string interpolation only? fall squarely in the ?things we already knew? camp.) > > In the course of using this feature in the `jextract` project, we did learn quite a few things we didn?t already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature. Specifically, the role of processors is ?outsized? to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit ?processor? abstraction at all! This is a very positive development. > > First, I want to affirm that that the goals of the project have not changed. From JEP 459: > > Goals > > ? Simplify the writing of Java programs by making it easy to express strings that include values computed at run time. > ? Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks). > ? 
Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions. > ? Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates. > ? Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON). > ? Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation. > > Non-Goals > ? It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation. > ? It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition. > > Another thing that has not changed is our view on the syntax for embedding expressions. While many people did express the opinion of ?why not ?just' do what Kotlin/Scala does?, this issue was more than fully explored during the initial design round. (In fact, while syntax disagreements are often purely subjective, this one was far more clear ? the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild. This has all been more than adequately covered elsewhere, so I won?t rehash it here.) > > > Now, let?s talk about what we do think should change: the role of processors and the StringTemplate type. > > Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.) However, Java already has a well established means of abstracting behavior: methods. (In fact, a processor application can be viewed as merely a new syntax for a method call.) 
Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say: > > DB.?? template ?? > > When we could use an ordinary Java library: > > Query q = Query.of(??template??) > > Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood. (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features. > > This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place. And it turned out, this choice had been influenced ? perhaps overly so ? by early implementation experiments. (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper. Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting. The FMT processor brough that cost back in line with the equivalent concatenation.) These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time. This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were ?just a new kind of literal? and the job performed by processors could instead be performed by ordinary APIs. 
> > At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals (?Hello \{name}?) are simply the literal form of StringTemplate: > > StringTemplate st = ?Hello \{name}?; > > String and StringTemplate remain unrelated types. (We explored a number of ways to interconvert them, but they caused more trouble than they solved.) Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance. > > For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice: > > void println(String) { ? } > void println(StringTemplate) { ? interpolate and delegate to println(String) ?. } > > The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety: > > System.out.println(?Hello \{name}?); > > In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template. This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code ? but it allows that choice to be made by libraries when the library is comfortable doing so. > > Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., ?%d?): > > String format(String formatString, Object? parameters) { ? same as today ? } > String format(StringTemplate template) {... 
equivalent of FMT ...} > > And users can call this as: > > String s = String.format("Hello %12s\{name}"); > > Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required. The user already chose to pass it to String::format; that's all the processing selection that is needed. > > Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation). > > The result is: > > - The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly. > - Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program. > - StringTemplate is just another type that APIs can support if they want. The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API. > - APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing. > - It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing). > - Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user. > - The language feature and API surface get considerably smaller, which is good. Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates. > > The remaining question that everyone is probably asking is: "so how do we do interpolation?"
The answer there is "ordinary library methods". This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now). > > This is a sketch of direction, so feel free to pose questions/comments on the direction. We'll discuss the details as we go. > > > From maurizio.cimadamore at oracle.com Wed Mar 13 10:29:21 2024 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 13 Mar 2024 10:29:21 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> Message-ID: Hi Tagir, while subclassing is handy, I think it actively works against the goal of trying to make string handling any safer. Let's consider the case of a /new/ API, that wants to do things the /right/ way. This API will provide a StringTemplate-accepting factory. But if clients can supply a value using |"foo" + bar|, then we're back to where we started: the new API is no safer than a String-accepting factory. Note: there is a big difference between passing |"foo" + bar| and |"foo\{bar}"|. In the former, the library only gets a string. It has no way to distinguish between which values were user-provided, and which ones were constant. In the latter, the string template has a value. The library might need to analyze that value more carefully as it might come from outside. The main value of string templates is to allow clients to capture what /can change/ and separate it from what /cannot/ change. Attacks typically lurk in the variable part. But eager interpolation (e.g. string +) destroys this separation. > I think that most of the APIs will still provide String overload. > E.g., for preparing an SQL statement, it's a perfectly reasonable > scenario to have a constant string as the input. 
So > prepareStatement(String) will stay along with > prepareStatement(StringTemplate). And people will still be able to use > concatenation. I don't think that the absence of String <: > StringTemplate relation will protect anybody from using the > concatenation. On the other hand, if String actually implements > StringTemplate, it will be a very simple static analysis rule to warn > if the concatenation occurs in this context. If the expected type for > concatenation is StringTemplate, then something is definitely wrong. > Without 'String implements StringTemplate', one will not be able to > write a concatenation directly in StringTemplate context. Instead, > String-accepting overload will be used, and the expected type will be > String, so static analyzer will have to guess whether it's dangerous > to use the concatenation here. In short, I think that it's actually an > advantage: we have an additional hint here that concatenation is > undesired. Even compilation warning could be possible to implement. > > So, I don't see these points as real disadvantages. I definitely like > this approach much more than adding any kind of implicit conversion or > another literal syntax, which would complicate the specification much > more. I don't buy the argument that, since there are already String-accepting APIs in the wild, we can never be safer than that. The String-accepting variant can be deprecated, if need be. Maurizio From maurizio.cimadamore at oracle.com Wed Mar 13 14:36:28 2024 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 13 Mar 2024 14:36:28 +0000 Subject: Does String extend StringTemplate? 
(Was: Update on String Templates (JEP 459)) In-Reply-To: <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com> References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <636B984E-A544-4155-81D1-8752037A973B@oracle.com> <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com> <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com> <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com> Message-ID: <82613b9b-0c51-4e30-a3c8-30843f1691f1@oracle.com> Hi Brian. I believe this is ultimately a bad idea. Note that I've been a strong supporter of this position in the past. Now, onto the reason I think it's a bad idea. Let's ignore legacy API for now. Let's assume the world has already moved on and adopted StringTemplates. A new library that needs to parse strings which contain sensitive user-defined values, already got the memo, and will provide a StringTemplate-accepting factory (not merely a String-accepting one). This world is inherently safer than the world we have today - because a string template is typically composed of two facets: * a "variable part" (the template arguments) * a "constant part" (the template fragments) Libraries should focus their validation/escaping efforts on the variable part of a string template. But, for this assumption to hold water, we need to be able to guarantee that the user cannot accidentally sneak in some "variable parts" into the "constant part" of a string template. Unfortunately, the String <: StringTemplate approach seems to allow exactly that: |Foo of(StringTemplate) { ... } Foo.of("Hello!") // ok, string is a constant Foo.of("Hello" + world) // what? | If messy, concatenated strings can be treated as degenerate templates, I believe we'd be no better off than we are today - even in the case of brand new API that fully bought into the idea of StringTemplate. Maurizio On 12/03/2024 17:32, Brian Goetz wrote: > Splitting off into a separate thread. 
> > I would like to redirect this discussion from the mechanical > challenges and consequences to the goals and semantics. > > If we are considering "String extends StringTemplate", we are making a > semantic statement that a String *is-a* StringTemplate. While I can > imagine convincing oneself that this is true "if you look at it > right", this sets off all my "self-justification" detectors. > > So, I recommend we step back and examine why we think this is a good > idea before we descend into the mechanics. My suspicion is that this > is motivated by "I want to be able to automatically use String where a > StringTemplate is desired", and that this seems a clever-enough hack > to get there. (I think we probably also need to drill further, into > "why do we think it is important to be able to use String where > StringTemplate is desired", and I suspect further that part of it will > be "but the APIs are not yet fully equilibrated" (which would be a > truly bad reason to give String a new supertype.)) > > > > > On 3/12/2024 1:24 PM, Tagir Valeev wrote: >> Hello, Maurizio! >> >> Thank you for the detailed explanation! >> >> On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore >> wrote: >> >> Hi all, >> we tried mainly three approaches to allow smoother interop >> between strings and string templates: (a) make String a subclass >> of StringTemplate. Or (b) make constant strings be /convertible/ >> to string templates. Or, (c) use target-typing. All these >> approaches have some issues, discussed below. >> >> The first approach is slightly simpler, because it can be >> achieved entirely outside of the Java language. Unfortunately, >> adding "String implements StringTemplate" adds overload >> ambiguities in cases such as this: >> >> |format(StringTemplate) // 1 format(String, Object...) // 2 | >> >> This is actually a very important case, as we predict that >> StringTemplate will serve as a great replacement for methods out >> there accepting a string/Object... pack. 
>> >> Unfortunately, if String <: StringTemplate, this means that >> calling format with a string literal will resolve to (1), not (2) >> as before. The problem here is that (2) is not even applicable >> during the first two overload resolution phases (which are only allowed >> to use subtyping and conversions, respectively), as it is a >> varargs method. Because of this, (1) will now take the >> precedence, as that's not varargs. While for String::format this >> is probably harmless, changing results of overload selection is >> something that should be done with care (esp. if different >> overloads have different return types), as it could lead to >> source compatibility issues. >> >> I would still like to advocate for String <: StringTemplate solution. >> I think that the overloading is not a big problem. Simply making >> String implements StringTemplate will not break any of existing code >> because there are no APIs yet that accept the StringTemplate >> instance. The problem may appear only when an API author actually >> adds such an overload and does this in an incompatible way with an >> existing String overload. This would be an extremely bad design >> choice, and the blame goes to the API author. You've correctly >> mentioned that for String::format this is harmless because the API is >> well-designed. We may suggest in StringTemplate documentation that >> the API designers should provide the same behavior for foo(String) >> and foo(StringTemplate) when they add an overload. >> >> I must say that we already had an experience of introducing new >> interfaces in the hierarchy of widely-used library classes. Closeable >> got AutoCloseable parent, StringBuilder became Comparable, and so on. >> So far, the compatibility issues introduced were tolerable. Well, >> probably I'm missing something but we have preview rounds just for >> this purpose: to find out the disadvantages of the approach. 
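The overload-selection behaviour discussed above can be reproduced without any string template support. The following is a minimal sketch in which `Tmpl` and `Str` are hypothetical stand-ins for StringTemplate and for a String that implements it; only the shape of the two `format` overloads is taken from the thread.

```java
// Hypothetical stand-ins: Tmpl plays the role of StringTemplate,
// Str plays the role of a String that is also a subtype of it.
interface Tmpl { }

final class Str implements Tmpl { }

final class OverloadDemo {
    // (1) fixed-arity overload taking the supertype
    static String format(Tmpl template) {
        return "format(Tmpl)";
    }

    // (2) variable-arity overload taking the subtype
    static String format(Str format, Object... args) {
        return "format(Str, Object...)";
    }

    public static void main(String[] args) {
        Str s = new Str();
        // With a single argument, (1) is applicable by subtyping in the first
        // resolution phase; (2) is only considered in the variable-arity phase,
        // which is never reached. So (1) is selected.
        System.out.println(format(s));      // format(Tmpl)
        // With an extra argument only (2) is applicable.
        System.out.println(format(s, 42));  // format(Str, Object...)
    }
}
```

If `Str` did not implement `Tmpl`, the first call would select (2) instead, which is the change in overload selection the message above warns about.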
>> >> On top of these issues, making all strings be string templates >> has the disadvantage of also considering "messy" strings obtained >> via concatenation of non-constant values string templates too, >> which seems bad. >> >> I think that most of the APIs will still provide String overload. >> E.g., for preparing an SQL statement, it's a perfectly reasonable >> scenario to have a constant string as the input. So >> prepareStatement(String) will stay along with >> prepareStatement(StringTemplate). And people will still be able to >> use concatenation. I don't think that the absence of String <: >> StringTemplate relation will protect anybody from using the >> concatenation. On the other hand, if String actually implements >> StringTemplate, it will be a very simple static analysis rule to warn >> if the concatenation occurs in this context. If the expected type for >> concatenation is StringTemplate, then something is definitely wrong. >> Without 'String implements StringTemplate', one will not be able to >> write a concatenation directly in StringTemplate context. Instead, >> String-accepting overload will be used, and the expected type will be >> String, so static analyzer will have to guess whether it's dangerous >> to use the concatenation here. In short, I think that it's actually >> an advantage: we have an additional hint here that concatenation is >> undesired. Even compilation warning could be possible to implement. >> >> So, I don't see these points as real disadvantages. I definitely like >> this approach much more than adding any kind of implicit conversion >> or another literal syntax, which would complicate the specification >> much more. >> >> With best regards, >> Tagir Valeev. >> >
From archie.cobbs at gmail.com Wed Mar 13 14:48:45 2024 From: archie.cobbs at gmail.com (Archie Cobbs) Date: Wed, 13 Mar 2024 09:48:45 -0500 Subject: Update on String Templates (JEP 459) In-Reply-To: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> Message-ID: On Tue, Mar 12, 2024 at 12:08 PM Brian Goetz wrote: > Have I covered the concerns raised so far? > Thanks for the helpful discussion check-point. This thread has touched on a lot of different bits so for a moment I want to focus on one narrow question. Forget for a moment all the stuff about method resolution, varargs, and whether String <: StringTemplate. I was intrigued by this comment (Maurizio): > Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only "one string literal", which is something we have worked quite hard to achieve. What exactly is the advantage, in terms of the mental model of the programmer, of having "one string literal"? Maybe I'm just not seeing it. I can understand the advantage of having String <: StringTemplate - that gives me more flexibility when passing around objects - great! But do I need that same flexibility with *literals*? Consider how we handle float vs. double literals. They overlap for 32-bit values, which is very convenient, but you can also "force" a narrower interpretation by adding an "f" suffix. That seems like pretty much the best of both worlds to me. So is this an analogous situation? 
Then we'd allow a StringTemplate literal to have an *optional* "$" prefix: obj.takingString("abcd"); // ok - string obj.takingTemplate("abcd"); // ok - template obj.takingStringOrTemplate($"abcd"); // ok - template obj.takingStringOrTemplate("abcd"); // ok - string or template (personally I don't care) obj.takingString($"abcd"); // fail obj.takingTemplate($"abcd"); // ok - template obj.takingString("x = \{var}"); // fail obj.takingTemplate("x = \{var}"); // ok - template Thanks, -Archie -- Archie L. Cobbs From maurizio.cimadamore at oracle.com Wed Mar 13 15:40:12 2024 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 13 Mar 2024 15:40:12 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> Message-ID: <1d408e14-c41e-454f-94f1-3e0b597520e7@oracle.com> I don't disagree. After some more pondering, while text blocks and string literals clearly have a lot of overlap - e.g. they both end up being String objects, that's not the case with string templates. So, if the following assignment fails: |String s = "foo \{bar}"; | One might argue that perhaps the syntax should be "more obviously different". Maurizio On 13/03/2024 14:48, Archie Cobbs wrote: > Consider how we handle float vs. double literals. They overlap for > 32-bit values, which is very convenient, but you can also "force" a > narrower interpretation by adding an "f" suffix. That seems like > pretty much the best of both worlds to me. > > So is this an analogous situation? Then we'd allow a StringTemplate > literal to have an /optional/ "$" prefix: > > obj.takingString("abcd"); 
// ok - string > obj.takingTemplate("abcd"); // ok - template > obj.takingStringOrTemplate($"abcd"); // ok - template > obj.takingStringOrTemplate("abcd"); // ok - string or template > (personally I don't care) > obj.takingString($"abcd"); // fail > obj.takingTemplate($"abcd"); // ok - template > obj.takingString("x = \{var}"); // fail > obj.takingTemplate("x = \{var}"); // ok - template From maurizio.cimadamore at oracle.com Wed Mar 13 15:45:49 2024 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 13 Mar 2024 15:45:49 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> Message-ID: Hi Guy, On 12/03/2024 17:54, Guy Steele wrote: > (1) avoids this problem by making the syntaxes different. (2b) avoids > the problem by making the semantics match. But (2a) totally has this > problem. I agree that 2a leaves us in a place that is suboptimal. I think 2b is also undesirable (as I explained elsewhere), as it would compromise the design goals of the feature too much IMHO. So, the choice is (also IMHO) between an ad-hoc conversion (with the problems that I described in my previous email) and a different literal syntax (your (1)). 
Maurizio From guy.steele at oracle.com Wed Mar 13 18:03:02 2024 From: guy.steele at oracle.com (Guy Steele) Date: Wed, 13 Mar 2024 18:03:02 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> Message-ID: Now that Maurizio has made quite clear the need for string templates to be understood as something distinct from strings, when used for security-related purposes (the need to ensure that material from interpolated expressions is vetted by the template processor), I agree that 2b is undesirable. We do not want to tempt users to think that "Hello, \{x}" and "Hello " + x are completely interchangeable. On the other hand, there are applications where vetting is not important, and while the rest of this sentence is explicitly stated as a non-goal of JEP 459, I suspect we do, secretly, actually want users to feel free to use templates rather than "+" concatenation to construct "plain old unvetted strings". In the current state of JEP 459, this can be indicated in a clear way: STR."Hello, \{x}" And of course STR can be replaced by the name of some other template processor. Brian has now proposed that the template processor mechanism is clunky and redundant, and would be better handled by just providing methods that take arguments of type StringTemplate. Sounds good to me. In that world, we would probably want a template processor method that takes a StringTemplate and just does obvious, unvetted string concatenation after doing `toString` on each of the expression values. An obvious name for this method is `String.of`. So we would write String.of("Hello, \{x}") But this is unsatisfying because it is verbose. 
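For readers who want to see what "obvious, unvetted string concatenation" amounts to, here is a small sketch. It assumes a hypothetical `Template` record (a stand-in for StringTemplate, not the JDK type) holding the constant fragments and the embedded values; the `interpolate` method is an invented name for the behaviour that `String.of` is imagined to have above.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate: n values sit between n + 1 fragments.
record Template(List<String> fragments, List<Object> values) {

    // Plain interpolation: every value is converted with String.valueOf and
    // spliced between the fragments, with no escaping or validation at all.
    String interpolate() {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(String.valueOf(values.get(i)))
              .append(fragments.get(i + 1));
        }
        return sb.toString();
    }
}

final class InterpolateDemo {
    public static void main(String[] args) {
        String x = "world";
        // Hand-built equivalent of the template "Hello, \{x}!"
        Template t = new Template(List.of("Hello, ", "!"), List.of(x));
        System.out.println(t.interpolate()); // Hello, world!
    }
}
```

Because nothing is vetted, this is only appropriate where the result really is a "plain old" string, which is the distinction the message above is drawing.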
I suggest that, rather than having a bit of prefix syntax that allows specification of any template processor, all we really need is a very concise prefix syntax that distinguishes the STR case from all other cases, the assumption being that all other cases do vetting of some sort (else they would just accept strings rather than string templates). That, plus Archie's recent suggestion that "$" be optional, leads me to suggest the following approach (which I suspect might be a good compromise because I expect that nearly everyone in this discussion will dislike some aspect of it :-) : --------- String is not a subtype of StringTemplate; they are disjoint types. $"foo" is a (trivial) string template literal "foo" is a string literal $"Hello, \{x}" is a (nontrivial) string template literal "Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")` --------- Thus you always need "$" to be present before the leading double quote to get a template value. If there is no "$" before the leading double quote, you get a string value. String literals are constant expressions, but if what otherwise looks like a string literal (no leading "$") contains "\{", then it is not a constant expression (and if having a constant expression is important to some user, that user should use "+" concatenation instead). We would need to think of a good name for "what otherwise looks like a string literal (no leading "$") but contains "\{""; right now, the best I can think of is "string interpolation literal", but it isn't really a literal, it's an expression. Maybe the right terms are: $"foo" trivial string template expression "foo" string literal $"Hello, \{x}" nontrivial string template expression "Hello, \{x}" 
string interpolation expression APIs that need to vet things can provide methods that accept string templates but no methods that accept strings; type checking will then prevent the accident of writing SQL.process("INSERT INTO Students (name) VALUES (\{new name});"); when it should have been SQL.process($"INSERT INTO Students (name) VALUES (\{new name});"); (This example is of course borrowed from the explanation of "Little Bobby Tables" over at the Explain XKCD wiki https://www.explainxkcd.com/wiki/index.php/Robert%27);_DROP_TABLE_Students;-- .) On Mar 13, 2024, at 11:45 AM, Maurizio Cimadamore wrote: Hi Guy, On 12/03/2024 17:54, Guy Steele wrote: (1) avoids this problem by making the syntaxes different. (2b) avoids the problem by making the semantics match. But (2a) totally has this problem. I agree that 2a leaves us in a place that is suboptimal. I think 2b is also undesirable (as I explained elsewhere), as it would compromise the design goals of the feature too much IMHO. So, the choice is (also IMHO) between an ad-hoc conversion (with the problems that I described in my previous email) and a different literal syntax (your (1)). Maurizio From brian.goetz at oracle.com Wed Mar 13 19:05:59 2024 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 13 Mar 2024 15:05:59 -0400 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> Message-ID: On 3/13/2024 10:48 AM, Archie Cobbs wrote: > > I was intrigued by this comment (Maurizio): > > > Another, simpler, option we considered was to use some kind of > prefix to mark a string template literal (e.g. make that explicit, > instead of resorting to language wizardry). That works, but has the > disadvantage of breaking the spell that there is only "one string > literal", which is something we have worked quite hard to achieve. 
> > What exactly is the advantage, in terms of the mental model of the > programmer, of having "one string literal"? When we started doing text blocks, we did a survey of string literal-like features in other languages, and were a little concerned that a lot of languages had a proliferation of different kinds of strings with different rules. (An example of "different rules for different kinds of strings" would be that $ is a regular character in a string literal, but an escape character in an interpolated string.) Before we figured out the design center of text blocks (think "two-dimensional string literals"), there were a number of envisioned extension directions for string literals -- multi-line, raw, embedded expressions, etc. And because these extension directions are orthogonal, there could easily be 2^n kinds of string literal. We didn't want to put users in a position of having to choose between e.g., "raw" and "multi-line", nor did we want to risk there being interactions between the rules for these different sub-kinds. One technique we use to tie together these various forms is by having a common sub-language within the quotes; each of the forms uses the same set of escape sequences (though this set is extended with context-specific options, such as \{ for templates.) Another is the delimiters; they are all "double-quote flavored", again to provide a sense that these are all projections of the same core literal feature. The more we wander from this center, the more we risk ending up with locally-sane but globally-inconsistent sub-features. > Maybe I'm just not seeing it. > > I can understand the advantage of having String <: StringTemplate - > that gives me more flexibility when passing around objects - great! > But do I need that same flexibility with /literals/? > > Consider how we handle float vs. double literals. 
They overlap for > 32-bit values, which is very convenient, but you can also "force" a > narrower interpretation by adding an "f" suffix. That seems like > pretty much the best of both worlds to me. > > So is this an analogous situation? Then we'd allow a StringTemplate > literal to have an /optional/ "$" prefix: > > obj.takingString("abcd"); // ok - string > obj.takingTemplate("abcd"); // ok - template > obj.takingStringOrTemplate($"abcd"); // ok - template > obj.takingStringOrTemplate("abcd"); // ok - string or template > (personally I don't care) > obj.takingString($"abcd"); // fail > obj.takingTemplate($"abcd"); // ok - template > obj.takingString("x = \{var}"); // fail > obj.takingTemplate("x = \{var}"); // ok - template > > Thanks, > -Archie > > -- > Archie L. Cobbs From john.r.rose at oracle.com Wed Mar 13 19:33:19 2024 From: john.r.rose at oracle.com (John Rose) Date: Wed, 13 Mar 2024 12:33:19 -0700 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> Message-ID: On 9 Mar 2024, at 3:48, Tagir Valeev wrote: > The idea is interesting. There's a thing that disturbs me though. > Currently, proc."string" and proc."string \{template}" are uniformly > processed, and the processor may not care much about whether it's a string > or a template: both can be processed uniformly. After this change, removing > the last embedded expression from the template (e.g., after inlining a > constant) will implicitly change the type of the literal from > StringTemplate to String. This may either cause a compilation error, or > silently bind to another overload which may or may not behave like a > template overload with a single-fragment-template. For API authors, this > means that every method accepting StringTemplate should have a counterpart > accepting String. 
The logic inside both methods would likely be very > similar, so probably both will eventually call a third private method. For > API user, it could be unclear how to call a method accepting StringTemplate > if I have simple string in hands but there's no String method (or it does > slightly different thing due to poor API design). Should I use some ugly > construct like "This is a string but the API wants a template, so I append > an empty embedded expression\{""}"? This is a huge thread that I hesitate to dive into, but here's me putting in one toe: Why do we care so much about no-arg string templates? It's a small corner case! The workarounds (for the no-arg case) are totally straightforward even if the string template literals (as a syntax) are required to have at least one argument. Can we have a plausible use case, please, for why a ST with no arguments would be important, so important that we are motivated to invent a sigil syntax or special type system rules, to avoid requiring the user to invoke a static factory? Also, Tagir's workaround of adding a fake argument looks like it would work just fine, of course depending on which processor was eventually used. And in that vein let me add one new (very bike-sheddy) suggestion before I beat a hasty retreat: Instead of in (1) a sigil before the quote like Guy's $"hello", put it (1b) after the quote, and in the ST case only. The ST syntax could explicitly allow that a no-arg string template would be spelled with a leading sequence "\{}... which looks like the coder started writing a ST argument, but in fact dropped it. So "hello" is a 5-char string, in any context. And "\{}hello" is a 5-char no-arg string template, in any context. That's Tagir's workaround, elevated a bit into a new corner case of (existing) syntax. But even that teeny bit of syntax strikes me as overkill, because I don't see the importance of the use cases (no-arg STs) it helps. Just call ST.of("hello") and call it a day. 
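The static-factory workaround mentioned above might look something like the following sketch. Everything here is hypothetical: `Template` is a hand-written stand-in for StringTemplate, `Template.of` mimics the `ST.of("hello")` spelling, and `Sql.process` is an invented template-only API used purely to show the call site.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate (not the JDK type).
record Template(List<String> fragments, List<Object> values) {

    // Analogue of ST.of("hello"): a single constant fragment and no embedded values.
    // Note that the factory only makes the conversion explicit; a caller could
    // still hand it a concatenated string.
    static Template of(String constantText) {
        return new Template(List.of(constantText), List.of());
    }
}

final class Sql {
    // Accepts templates only: a bare String argument does not compile,
    // so going from String to Template has to be spelled out by the caller.
    static String process(Template t) {
        return String.join("?", t.fragments())
                + " /* " + t.values().size() + " parameter(s) */";
    }

    public static void main(String[] args) {
        System.out.println(process(Template.of("CREATE TABLE foo;")));
        // Hand-built one-argument template, for comparison.
        Template insert = new Template(
                List.of("INSERT INTO foo (name) VALUES (", ");"),
                List.of("Guy"));
        System.out.println(process(insert));
    }
}
```

The cost being debated in the thread is visible at the call sites: the no-arg case needs the explicit factory call, while the one-argument case would be a literal.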
In any case, it seems fine to let the IDE take the lead with no-arg STs, helping the user decide when and how to disambiguate strings from no-arg STs. Putting in syntax or type system help for this is surely more expensive than punting to the IDE, unless there is going to be heavy use of no-arg STs for some use cases I am not seeing. From guy.steele at oracle.com Wed Mar 13 20:13:30 2024 From: guy.steele at oracle.com (Guy Steele) Date: Wed, 13 Mar 2024 20:13:30 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> Message-ID: > On Mar 13, 2024, at 3:33 PM, John Rose wrote: > > On 9 Mar 2024, at 3:48, Tagir Valeev wrote: > >> The idea is interesting. There's a thing that disturbs me though. >> Currently, proc."string" and proc."string \{template}" are uniformly >> processed, and the processor may not care much about whether it's a string >> or a template: both can be processed uniformly. After this change, removing >> the last embedded expression from the template (e.g., after inlining a >> constant) will implicitly change the type of the literal from >> StringTemplate to String. This may either cause a compilation error, or >> silently bind to another overload which may or may not behave like a >> template overload with a single-fragment-template. For API authors, this >> means that every method accepting StringTemplate should have a counterpart >> accepting String. The logic inside both methods would likely be very >> similar, so probably both will eventually call a third private method. For >> API user, it could be unclear how to call a method accepting StringTemplate >> if I have simple string in hands but there's no String method (or it does >> slightly different thing due to poor API design). Should I use some ugly >> construct like "This is a string but the API wants a template, so I append >> an empty embedded expression\{""}"? 
> > This is a huge thread that I hesitate to dive into, but here's me putting in one toe: Why do we care so much about no-arg string templates? It's a small corner case! The workarounds (for the no-arg case) are totally straightforward even if the string template literals (as a syntax) are required to have at least one argument. > > Can we have a plausible use case, please, for why a ST with no arguments would be important, so important that we are motivated to invent a sigil syntax or special type system rules, to avoid requiring the user to invoke a static factory? > > Also, Tagir's workaround of adding a fake argument looks like it would work just fine, of course depending on which processor was eventually used. > > And in that vein let me add one new (very bike-sheddy) suggestion before I beat a hasty retreat: Instead of in (1) a sigil before the quote like Guy's $"hello", put it (1b) after the quote, and in the ST case only. The ST syntax could explicitly allow that a no-arg string template would be spelled with a leading sequence "\{}... which looks like the coder started writing a ST argument, but in fact dropped it. So "hello" is a 5-char string, in any context. And "\{}hello" is a 5-char no-arg string template, in any context. That's Tagir's workaround, elevated a bit into a new corner case of (existing) syntax. > > But even that teeny bit of syntax strikes me as overkill, because I don't see the importance of the use cases (no-arg STs) it helps. Just call ST.of("hello") and call it a day. > > In any case, it seems fine to let the IDE take the lead with no-arg STs, helping the user decide when and how to disambiguate strings from no-arg STs. Putting in syntax or type system help for this is surely more expensive than punting to the IDE, unless there is going to be heavy use of no-arg STs for some use cases I am not seeing. 

Well, just off the top of my head as a thought experiment, if I had a series of SQL commands to process, some with arguments and some not, I would rather write

    SQL.process($"CREATE TABLE foo;");
    SQL.process($"ALTER TABLE foo ADD name varchar(40);");
    SQL.process($"ALTER TABLE foo ADD title varchar(30);");
    SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
    SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

than

    SQL.process(ST.of("CREATE TABLE foo;"));
    SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
    SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
    SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
    SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

especially if I thought that maybe down the road I might want to change the constants 30 and 40 and 'Hacker' to variables. I don't want to have to keep adding and deleting calls to ST.of as I edit the template strings during program development to have different numbers of interpolated expressions.

--Guy

From forax at univ-mlv.fr Wed Mar 13 20:34:37 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 13 Mar 2024 21:34:37 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "Guy Steele"
> To: "John Rose"
> Cc: "Tagir Valeev", "Brian Goetz", "amber-spec-experts"
> Sent: Wednesday, March 13, 2024 9:13:30 PM
> Subject: Re: Update on String Templates (JEP 459)

>> On Mar 13, 2024, at 3:33 PM, John Rose wrote:
>>
>> On 9 Mar 2024, at 3:48, Tagir Valeev wrote:
>>
>>> The idea is interesting. There's a thing that disturbs me though.
>
> Well, just off the top of my head as a thought experiment, if I had a series of SQL commands to process, some with arguments and some not, I would rather write
>
> SQL.process($"CREATE TABLE foo;");
> SQL.process($"ALTER TABLE foo ADD name varchar(40);");
> SQL.process($"ALTER TABLE foo ADD title varchar(30);");
> SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>
> than
>
> SQL.process(ST.of("CREATE TABLE foo;"));
> SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
> SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
> SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
> SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>
> especially if I thought that maybe down the road I might want to change the constants 30 and 40 and 'Hacker' to variables. I don't want to have to keep adding and deleting calls to ST.of as I edit the template strings during program development to have different numbers of interpolated expressions.

Given what Maurizio said and this, I think the only missing piece in the puzzle is what about existing methods taking a String as parameter.

We know that for SQL.process(), we do not want process() to take a String but only a StringTemplate. But what about the existing methods that take a String?

Given a method Logger.warning(String), should

    LOG.warning($"CREATE TABLE foo;");
    LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

be legal? Is there an auto-conversion (a kind of boxing conversion) from StringTemplate to String?
>
> --Guy

Rémi

From guy.steele at oracle.com Wed Mar 13 21:04:46 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2024 21:04:46 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>

> On Mar 13, 2024, at 4:34 PM, Remi Forax wrote:
>
> Given what Maurizio said and this, I think the only missing piece in the puzzle is what about existing methods taking a String as parameter.
>
> We know that for SQL.process(), we do not want process() to take a String but only a StringTemplate.
> But what about the existing methods that take a String?
>
> Given a method Logger.warning(String), should
> LOG.warning($"CREATE TABLE foo;");
> LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>
> be legal? Is there an auto-conversion (a kind of boxing conversion) from StringTemplate to String?

In my proposal, the answer would be "no".
Instead you would have two choices:

(1) Instead of string template expressions as in the example just given, you could use string literals or string interpolation expressions (omit the "$" characters):

    LOG.warning("CREATE TABLE foo;");
    LOG.warning("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

(2) If instead you have some other sort of expression (such as a variable) whose type is StringTemplate, you can write

    LOG.warning(String.of(myStringTemplate));

This makes quite explicit that a conversion is happening from StringTemplate to String.

From guy.steele at oracle.com Wed Mar 13 21:12:15 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2024 21:12:15 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr> <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
Message-ID:

> On Mar 13, 2024, at 5:04 PM, Guy Steele wrote:
>
>> On Mar 13, 2024, at 4:34 PM, Remi Forax wrote:
>>
>> Given what Maurizio said and this, I think the only missing piece in the puzzle is what about existing methods taking a String as parameter.
>>
>> We know that for SQL.process(), we do not want process() to take a String but only a StringTemplate.
>> But what about the existing methods that take a String?
>>
>> Given a method Logger.warning(String), should
>> LOG.warning($"CREATE TABLE foo;");
>> LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>>
>> be legal? Is there an auto-conversion (a kind of boxing conversion) from StringTemplate to String?
>
> In my proposal, the answer would be "no". Instead you would have two choices:
>
> (1) Instead of string template expressions as in the example just given, you could use string literals or string interpolation expressions (omit the "$" characters):
>
> LOG.warning("CREATE TABLE foo;");
> LOG.warning("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>
> (2) If instead you have some other sort of expression (such as a variable) whose type is StringTemplate, you can write
>
> LOG.warning(String.of(myStringTemplate));
>
> This makes quite explicit that a conversion is happening from StringTemplate to String.

That reminds me: I would recommend that the instance method `toString` for class StringTemplate _not_ be the same as `String.of(Template)`; rather, it should print in some form that shows the internal structure of the StringTemplate.

--Guy

From brian.goetz at oracle.com Wed Mar 13 21:25:21 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 13 Mar 2024 17:25:21 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr> <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
Message-ID: <280a40e9-2c80-4699-8464-22fca2944b4b@oracle.com>

That is how it works in the current version, and this behavior would be carried forward. Otherwise, it is a form of implicit interpolation, which goes against the goals of the project.

On 3/13/2024 5:12 PM, Guy Steele wrote:
> That reminds me: I would recommend that the instance method `toString` for class StringTemplate _not_ be the same as `String.of(Template)`; rather, it should print in some form that shows the internal structure of the StringTemplate.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Wed Mar 13 22:00:57 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 13 Mar 2024 23:00:57 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To: <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr> <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
Message-ID: <665665054.28373576.1710367257784.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Wednesday, March 13, 2024 10:04:46 PM
> Subject: Re: Update on String Templates (JEP 459)

>> On Mar 13, 2024, at 4:34 PM, Remi Forax wrote:
>>
>> Given what Maurizio said and this, I think the only missing piece in the puzzle is what about existing methods taking a String as parameter.
>>
>> We know that for SQL.process(), we do not want process() to take a String but only a StringTemplate.
>> But what about the existing methods that take a String?
>>
>> Given a method Logger.warning(String), should
>> LOG.warning($"CREATE TABLE foo;");
>> LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>>
>> be legal? Is there an auto-conversion (a kind of boxing conversion) from StringTemplate to String?
>
> In my proposal, the answer would be "no". Instead you would have two choices:
>
> (1) Instead of string template expressions as in the example just given, you could use string literals or string interpolation expressions (omit the "$" characters):
>
> LOG.warning("CREATE TABLE foo;");
> LOG.warning("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>
> (2) If instead you have some other sort of expression (such as a variable) whose type is StringTemplate, you can write
>
> LOG.warning(String.of(myStringTemplate));
>
> This makes quite explicit that a conversion is happening from StringTemplate to String.

Makes sense, I like it.

(1) makes the string interpolation explicit, and it can be fully optimized using an invokedynamic (same trick as STR."...")
(2) The current method is interpolate(), so LOG.warning(myStringTemplate.interpolate());

Compared to the previous iteration, no Processor interface, no weird calling syntax but instead two new literals, string interpolation and string template.

I think the only missing optimization was FMT."..." but it can be done if necessary by specializing String.format(StringTemplate) at the compiler level.

Rémi

From john.r.rose at oracle.com Wed Mar 13 22:22:20 2024
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 13 Mar 2024 15:22:20 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID:

On 13 Mar 2024, at 13:13, Guy Steele wrote:

> ...
> Well, just off the top of my head as a thought experiment, if I had a series of SQL commands to process, some with arguments and some not, I would rather write
>
> SQL.process($"CREATE TABLE foo;");
> SQL.process($"ALTER TABLE foo ADD name varchar(40);");
> SQL.process($"ALTER TABLE foo ADD title varchar(30);");
> SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>
> than
>
> SQL.process(ST.of("CREATE TABLE foo;"));
> SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
> SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
> SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
> SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

OK, yes. I think a simpler example is needed to answer my question more fully.

In this example, the name "foo" is given as a literal. But, even if only as a workaround, it would probably not hurt code like that to quote such a name as an argument. So:

> var foo = "foo"; // or static final String FOO = "foo";
> SQL.process("CREATE TABLE \{foo};");
> SQL.process("ALTER TABLE \{foo} ADD name varchar(40);");
> SQL.process("ALTER TABLE \{foo} ADD title varchar(30);");
> SQL.process("INSERT INTO \{foo} (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process("INSERT INTO \{foo} (name, title) VALUES (\{other name}, \{other job});");

And it's not just a workaround here, it's arguably better style (D.R.Y.) to factor out the name foo that links everything together. Non-support of no-arg STs would possibly push users towards a more D.R.Y. style, possibly a good thing.

I think such multi-command examples, in many little languages, will tend to have some term like foo shared across phrases. The very small example I'm looking for would ideally be non-factorable, just a little string with not much substructure.
Because if it's factorable, then maybe the user should just factor it, and then it's a ST with arguments. And if it's not factorable, then maybe it is some stand-alone thing that won't be harmed by making it a canned constant, or making it via a factory method, or making it a true string which is introduced into the processor by other means.

Not all languages offend against D.R.Y. as much as SQL. A more contextual stateful little language, like Forth or Turtle graphics or Postscript, might have lots of little fixed commands (like "left" for a turtle). When we work with such little languages we sometimes have lots of static final strings to help us find the commands and spell them correctly. (Like static final String LEFT = "left" in class TurtleGraphics.) That would maybe morph into lots of static final STs?

OVERLOADS

Another possible answer, in the use case with SQL above, is that if the language processor expects lots of ad hoc non-factorable (or non-factored) strings, it should cater to that expectation by taking String as an overload option. That places pressure on the API designer to perform the conversion (ST.of) on the fly. And overloads can expand non-linearly when there are several arguments in play. And there are sometimes ambiguity risks in some corner cases, as Maurizio has shown. Still sometimes it's a good tradeoff to add an overload, if the problems are in truly minor corner cases. Or maybe allowing strings instead of STs gives up some optimizations? But often it's better to let the chips fall with API design and do the work to optimize whichever API turns out to be most user-friendly.

I don't see (maybe I missed it) a decisive objection to overloading across ST and String, at least for some processing APIs.

ALGEBRA

These examples also lead me to a different source of questions, which is whether or how the existing practice of string constant expressions (like static final FOO above) can or should connect to STs as well.
It's an interesting line of thought, so I'll write something here, but (bottom line) I don't think we want to act on it, at least at first.

String constants have a privileged role in the JLS, and also in programmer practice (as with FOO above). Can/should STs leverage this somehow? Should a "constant ST expression" be an alternative to a ST literal?

I'm thinking of a String or ST constant like MY_FORTH_PROLOGUE which I stick at the front of some ST that I'm building. But that would seem to require some way to concatenate such a string to an ST, an expression like ST "+" String -> ST, which seems disturbing to me, but might actually make sense. Or would nesting be better, something like `(define tp (foo ,@sub-tp bar))? A variation of \{x} like \@{subtp}? (And would there be javac constant folding rules for it, as well as dynamic rules for evaluation?) This is speculative brainstorming; I'm not seriously recommending it for now. Still, continuing...

If MY_FORTH_PROLOGUE should be a static final ST, then I want options for prepending it locally to ad hoc strings. So the question about "what about constants?" turns into a larger question, "what about ST algebra on ST expressions?" If you allow constants to be defined non-locally, you need a way to combine them with "more stuff" locally.

This relates to the issue raised earlier of whether nested STs should be part of the ST API: Whether you concatenate two STs or nest one inside another, it seems you are doing some kind of generic ST algebra, generic across all uses of ST, not just for some processors.

And, circling back, if there were a way to fold ST literals together (with some non-local parts) then that would lead to another alternative to a "sigil" to disambiguate a plain string from a no-arg ST. You'd use the concatenation operation (whatever that is) to combine an empty ST into the string that needs markup. Kind of like when we say ""+x to abbreviate String.valueOf(x).
Given nesting or concatenation syntax, no-arg ST literals could be disambiguated by a prefix like ST.EMPTY+"..." or like "\@{}...", which either prepends or nests a degenerate ST. That could serve a role like $"..." in your examples, Guy, although of course a single-char sigil looks nicer.

Maybe we want some more algebra like that someday, but I am not enthusiastic enough to recommend it now. I guess the most I'd recommend is to somehow leave room for building up nested or concatenated literals, as a future addition. Allowing STs to start like "\{}..." would solve today's disambiguation problem with a kludge like $"...", and also a hint of more "algebra" in the future.

> SQL.process("\{}CREATE TABLE foo;");
> SQL.process("\{}ALTER TABLE foo ADD name varchar(40);");
> SQL.process("\{}ALTER TABLE foo ADD title varchar(30);");
> SQL.process("\{}INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process("\{}INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

HTH

From john.r.rose at oracle.com Wed Mar 13 22:37:57 2024
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 13 Mar 2024 15:37:57 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>

On 13 Mar 2024, at 15:22, John Rose wrote:

> ... OVERLOADS ...
>
> I don't see (maybe I missed it) a decisive objection to overloading across ST and String, at least for some processing APIs.

Perhaps it is this: A language processor API that takes STs and never Strings is making it clear that all inputs should be properly vetted, nothing taken on trust as a bare string. Doing that MIGHT require a performance model which permits expensive vetting operations to be memoized on particular OCCURRENCES of inputs (not just the input strings viewed in and of themselves).
If that's true, then I guess that's support for Guy's proposal: That STs (even trivial ones) should never look identical to strings. Maybe they should always be preceded by a sigil $, or (per my suggestion) they should always have at least one occurrence of \{ inside, even if it's a trivial nop.

I kind of like Guy's offensive-to-everyone suggestion that $ is required to make a true ST. Then it's clear how the vetting APIs mate up with their vetted inputs. And if $ is not placed in front, we surrender to the string-pasters, but at least the resulting true-string expressions won't be accepted by the vetting APIs.

From maurizio.cimadamore at oracle.com Wed Mar 13 23:47:32 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 13 Mar 2024 23:47:32 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
Message-ID: <0003bf81-07cc-46ac-9771-a3d362c06a7a@oracle.com>

There is a problem/slippery slope with overloads, which I think should be discussed (and that discussion seems, at least to me, more important than the discussion on how we spell string literals).

Consider the case of a /new/ API, that perhaps wants to build SQL queries (or any other kind of injection-sensitive factory):

    Query makeQuery(???)

What should be the natural parameter type for this query? Well, we know that String is flawed here. Easy to reach for, but also too easy to abuse. StringTemplate is a much better type because it allows user-injectable values and constant parts to be carried in separate parts of the string template, so that the library has a chance at looking at what's going on.

Ok, so let's say we write the factory as:

    Query makeQuery(StringTemplate)

As that is clearly the safer option. This obviously works well /as long as clients are passing templates with arguments/.
No-argument templates might be a corner case, but, sooner or later somebody might want to do this:

|makeQuery("SELECT foo FROM bar WHERE foo = 42"); |

Only to discover that this doesn't compile. What then? There are a couple of alternatives I can think of. The first is to add a String-accepting overload:

|Query makeQuery(StringTemplate) Query makeQuery(String) |

The second is to use some use-site factory call to turn the string into a degenerate string template:

|makeQuery(StringTemplate.fromString("SELECT foo FROM bar WHERE foo = 42")); |

IMHO, both approaches have problems: they force the user to go from the safer StringTemplate world, to the more unsafe String world. It's sort of like crossing the Rubicon: once you're in String-land, it then becomes easier to introduce potentially very costly mistakes. If we have overloads:

|makeQuery("SELECT " + foo + " FROM " + bar + " WHERE " + condition); |

This would now compile just fine. Effectively, safety-wise we'd be back at square one. The factory case is only marginally better, because using the factory is more convoluted, so it would perhaps be easier to spot that something fishy is going on. That said, as the expression gets more complicated, it's easier for bugs to sneak in:

|makeQuery(StringTemplate.fromString("SELECT " + foo + "FROM bar WHERE foo = 42")); |

So, at least in my opinion, having a string template literal, or some kind of compiler-controlled promotion from string /constants/ to string templates, is not just something we need to type fewer characters (I honestly couldn't care less about that, at least not at this stage). These things are needed to allow developers to remain in StringTemplate-land.
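[Editorial sketch, not part of the original message.] The overload hazard described above can be reproduced without the preview API. The `Template` record below is a hypothetical stand-in for StringTemplate, and both `makeQuery` methods and their output formats are invented for illustration only; the point is just that once a String overload exists, a concatenated expression type-checks and the library can no longer tell constants from injected text.

```java
import java.util.List;

public class OverloadHazard {
    // Hypothetical stand-in for StringTemplate: constant fragments and
    // injected values are kept apart so the library can inspect them.
    record Template(List<String> fragments, List<Object> values) {}

    // Template entry point: each value becomes a bind placeholder ("?"),
    // so injected data never turns into SQL text.
    static String makeQuery(Template t) {
        return String.join("?", t.fragments()) + " /* " + t.values().size() + " bound */";
    }

    // Convenience overload: by the time the String arrives here, constants
    // and pasted input are indistinguishable.
    static String makeQuery(String sql) {
        return sql + " /* unvetted */";
    }

    public static void main(String[] args) {
        String condition = "1 = 1; DROP TABLE bar";
        // Values and fragments stay separate.
        System.out.println(makeQuery(
                new Template(List.of("SELECT foo FROM bar WHERE foo = ", ""), List.of(42))));
        // Compiles just as happily, thanks to the String overload.
        System.out.println(makeQuery("SELECT foo FROM bar WHERE " + condition));
    }
}
```

Removing the String overload turns the second call into a compile-time error, which is the property the message argues for.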
That is, the best /overall/ outcome is for the library /not/ to have an overload, /and/ for the client to either say this:

|makeQuery("SELECT foo FROM bar WHERE foo = 42"); // works because of implicit promotion of constant String -> StringTemplate |

or this:

|makeQuery("SELECT foo FROM bar WHERE foo = 42"); // works because it's a string template all along |

Maurizio

On 13/03/2024 22:37, John Rose wrote: On 13 Mar 2024, at 15:22, John Rose wrote: ? OVERLOADS ? I don't see (maybe I missed it) a decisive objection to overloading across ST and String, at least for some processing APIs. Perhaps it is this: A language processor API that takes STs and never Strings is making it clear that all inputs should be properly vetted, nothing taken on trust as a bare string. Doing that MIGHT require a performance model which permits expensive vetting operations to be memoized on particular OCCURRENCES of inputs (not just the input strings viewed in and of themselves). If that's true, then I guess that's support for Guy's proposal: That STs (even trivial ones) should never look identical to strings. Maybe they should always be preceded by a sigil $, or (per my suggestion) they should always have at least one occurrence of \{ inside, even if it's a trivial nop. I kind of like Guy's offensive-to-everyone suggestion that $ is required to make a true ST. Then it's clear how the vetting APIs mate up with their vetted inputs. And if $ is not placed in front, we surrender to the string-pasters, but at least the resulting true-string expressions won't be accepted by the vetting APIs. ? -------------- next part -------------- An HTML attachment was scrubbed...
URL:

From john.r.rose at oracle.com Thu Mar 14 01:22:53 2024 From: john.r.rose at oracle.com (John Rose) Date: Wed, 13 Mar 2024 18:22:53 -0700 Subject: Update on String Templates (JEP 459) In-Reply-To: <0003bf81-07cc-46ac-9771-a3d362c06a7a@oracle.com> References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com> <0003bf81-07cc-46ac-9771-a3d362c06a7a@oracle.com> Message-ID: <6D8BC382-E9EC-40AB-A6EB-273FD326E2B4@oracle.com>

Thanks, Maurizio. I find your arguments helpful and persuasive. They indicate that "autoboxing" is the wrong model, since it would lift ad hoc strings into places that want only STs. The poly-expression move, applied only to string literals, is not so bad, since the only ad hoc strings liftable to STs are those right next to the API points that demand STs. But, if we are going to make ST-demanding APIs the lock and STs the keys, it might be reasonable to demand that all STs look distinctive (with that extra sigil), which is an argument even against the poly-expression move.

Guy's disruptive suggestion, of having both kinds of interpolation expressions, would play out as two tiers of vetting and security. The lower tier is inhabited by strings. You have to drive carefully on those streets, where dodgy APIs accept all kinds of strings, and there are no $ sigils to indicate vetted inputs. The higher tier would be API points that demand STs (and do not welcome plain strings). To get into that safer tier, you pay a cover charge, the $ sigils (or API points which manufacture STs explicitly). It might seem wrong to ask a cover charge for a tier we want users to prefer, but the IDE will surely help pay it as needed. (The $ is visible in the code, as a reminder that the security is enabled. Like the wrist band you get when you pay the cover?)

On the other hand, if we try to make everything be one tier (everything potentially vettable, but with loopholes for raw strings), the security guarantees get muddier.
If everything is equally secure, and there are loopholes (for string concat and the like) then everything is also equally insecure, in some hand-wavy sense.

More hand-waving: Distinct tiers is a more honest design, allowing for better invariants within the higher tier, and relaxed behavior in the lower tier. Also, maybe, having the distinct tiers be visibly connected by syntax encourages folks muddling around with string-concat to lift their code to work on STs instead of strings. Switch the APIs and add the dollar signs.

OK, I'll stop now. I'm past the point where I need to try the API on some serious project, before I speculate more.

On 13 Mar 2024, at 16:47, Maurizio Cimadamore wrote: > There is a problem/slippery slope with overloads, which I think should > be discussed (and that discussion seems, at least to me, more > important than the discussion on how we spell string literals). > > Consider the case of a /new/ API, that perhaps wants to build SQL > queries (or any other kind of injection-sensitive factory): > > |Query makeQuery(???) | > > What should be the natural parameter type for this query? Well, we > know that String is flawed here. Easy to reach for, but also too easy > to abuse. StringTemplate is a much better type because it allows > user-injectable values and constant parts to be carried in separate parts > of the string template, so that the library has a chance at looking at > what's going on. > > Ok, so let's say we write the factory as: > > |Query makeQuery(StringTemplate) | > > As that is clearly the safer option. This obviously works well /as > long as clients are passing templates with arguments/. > > No-argument templates might be a corner case, but, sooner or later > somebody might want to do this: > > |makeQuery("SELECT foo FROM bar WHERE foo = 42"); | > > Only to discover that this doesn't compile. What then? There are a > couple of alternatives I can think of.
The first is to add a > String-accepting overload: > > |Query makeQuery(StringTemplate) Query makeQuery(String) | > > The second is to use some use-site factory call to turn the string > into a degenerate string template: > > |makeQuery(StringTemplate.fromString("SELECT foo FROM bar WHERE foo = > 42")); | > > IMHO, both approaches have problems: they force the user to go from > the safer StringTemplate world, to the more unsafe String world. > It's sort of like crossing the Rubicon: once you're in > String-land, it then becomes easier to introduce potentially very > costly mistakes. If we have overloads: > > |makeQuery("SELECT " + foo + " FROM " + bar + " WHERE " + condition); > | > > This would now compile just fine. Effectively, safety-wise we'd be > back at square one. The factory case is only marginally better, > because using the factory is more convoluted, so it would perhaps be > easier to spot that something fishy is going on. That said, as the > expression gets more complicated, it's easier for bugs to sneak in: > > |makeQuery(StringTemplate.fromString("SELECT " + foo + "FROM bar WHERE > foo = 42")); | > > So, at least in my opinion, having a string template literal, or some > kind of compiler-controlled promotion from string /constants/ to > string templates, is not just something we need to type fewer > characters (I honestly couldn't care less about that, at least not > at this stage). These things are needed to allow developers to remain > in StringTemplate-land. > > That is, the best /overall/ outcome is for the library /not/ to have > an overload, /and/ for the client to either say this: > > |makeQuery("SELECT foo FROM bar WHERE foo = 42"); // works because of > implicit promotion of constant String -> StringTemplate | > > or this: > > |makeQuery("SELECT > foo FROM bar WHERE foo = 42"); // works because it's a string template > all along | > > Maurizio > > On 13/03/2024 22:37, John Rose wrote: > > On 13 Mar 2024, at 15:22, John Rose wrote: > > ?
OVERLOADS ? > > I don't see (maybe I missed it) a decisive objection to > overloading > across ST and String, at least for some processing APIs. > Perhaps it is this: A language processor API that takes STs and > never Strings is making it clear that all inputs should be > properly > vetted, nothing taken on trust as a bare string. > > Doing that MIGHT require a performance model which permits > expensive > vetting operations to be memoized on particular OCCURRENCES of > inputs > (not just the input strings viewed in and of themselves). > > If that's true, then I guess that's support for Guy's > proposal: That > STs (even trivial ones) should never look identical to strings. > Maybe they should always be preceded by a sigil $, or (per my > suggestion) they should always have at least one occurrence of \{ > inside, even if it's a trivial nop. > > I kind of like Guy's offensive-to-everyone suggestion that $ is > required to make a true ST. Then it's clear how the vetting APIs > mate up with their vetted inputs. And if $ is not placed in front, > we surrender to the string-pasters, but at least the resulting > true-string expressions won't be accepted by the vetting APIs. > > ? -------------- next part -------------- An HTML attachment was scrubbed... URL:

From guy.steele at oracle.com Thu Mar 14 15:08:15 2024 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 14 Mar 2024 15:08:15 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> Message-ID: <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>

Second thoughts about how to explain a string interpolation literal:

> On Mar 13, 2024, at 2:02 PM, Guy Steele wrote: > . . . > > --------- > String is not a subtype of StringTemplate; they are disjoint types. > > $"foo" is a (trivial) string template literal > "foo" is a string literal > $"Hello, \{x}"
is a (nontrivial) string template literal > "Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")` > ---------

Given that the intent is that String.of (or whatever we want to call it, possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,

"Hello, \{x}."

(I have added a period to the example to make the point clearer) is expanded into

"Hello, " + x + "."

and in general

"c0\{e1}c1\{e2}c2...\{en}cn"

(where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into

"c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"

The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.

--Guy

From ccherlin at gmail.com Thu Mar 14 16:24:57 2024 From: ccherlin at gmail.com (Clement Cherlin) Date: Thu, 14 Mar 2024 11:24:57 -0500 Subject: Update on String Templates (JEP 459) In-Reply-To: <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com> References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com> Message-ID:

On Wed, Mar 13, 2024 at 6:45 PM John Rose wrote: > > On 13 Mar 2024, at 15:22, John Rose wrote: > > > ? OVERLOADS ? > > > > I don't see (maybe I missed it) a decisive objection to overloading across ST > > and String, at least for some processing APIs. > > Perhaps it is this: A language processor API that takes STs and never Strings is making it clear that all inputs should be properly vetted, nothing taken on trust as a bare string.
> > Doing that MIGHT require a performance model which permits expensive vetting operations to be memoized on particular OCCURRENCES of inputs (not just the input strings viewed in and of themselves). > > If that's true, then I guess that's support for Guy's proposal: That STs (even trivial ones) should never look identical to strings. Maybe they should always be preceded by a sigil $, or (per my suggestion) they should always have at least one occurrence of \{ inside, even if it's a trivial nop. > > I kind of like Guy's offensive-to-everyone suggestion that $ is required to make a true ST. Then it's clear how the vetting APIs mate up with their vetted inputs. And if $ is not placed in front, we surrender to the string-pasters, but at least the resulting true-string expressions won't be accepted by the vetting APIs.

Adding an empty interpolated value to signal a template is not a viable solution, because "\{}abc" is not equivalent to ST.of("abc"). Running the current preview,

RAW."\{}abc" produces StringTemplate{ fragments = [ "", "abc" ], values = [null] } which interpolates to "nullabc".

RAW."abc" produces StringTemplate{ fragments = [ "abc" ], values = [] } which interpolates to "abc".

I strongly support using different quotes or a prefixed sigil over any form of linguistic magic like "if an interpolated value is empty we pretend it's not there but still treat the literal as a template" or "a string literal can be implicitly converted to a template literal in {context}".
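[Editorial sketch, not part of the original message.] The two results reported above follow directly from the fragments/values shape. The `Template` record here is a hypothetical model, not the preview StringTemplate class; it only assumes that n values sit between n + 1 fragments and that interpolation appends them in order.

```java
import java.util.Collections;
import java.util.List;

public class FragmentsSketch {
    // Hypothetical model of a template: n values sit between n + 1 fragments.
    record Template(List<String> fragments, List<Object> values) {
        String interpolate() {
            StringBuilder sb = new StringBuilder(fragments.get(0));
            for (int i = 0; i < values.size(); i++) {
                // Appending a null Object yields the text "null".
                sb.append(values.get(i)).append(fragments.get(i + 1));
            }
            return sb.toString();
        }
    }

    // Shape reported for RAW."\{}abc": one (null) value, two fragments.
    static Template emptyHole() {
        return new Template(List.of("", "abc"), Collections.singletonList(null));
    }

    // Shape reported for RAW."abc": no values, one fragment.
    static Template plain() {
        return new Template(List.of("abc"), List.of());
    }

    public static void main(String[] args) {
        System.out.println(emptyHole().interpolate());
        System.out.println(plain().interpolate());
    }
}
```

Under this model the empty hole is a real value slot, so it cannot double as a no-op marker unless the language special-cases it.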
Cheers, Clement Cherlin

From maurizio.cimadamore at oracle.com Thu Mar 14 17:40:55 2024 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 14 Mar 2024 17:40:55 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> Message-ID:

Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here.

First, it was a deliberate design goal of the string template feature to make interpolation an explicit act. Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually /cheaper/ (to type) than the safer template alternative. This is a bit of a red herring, I think.

The second problem is that interpolation literals can sometimes be deceiving. Consider this example:

|String.format("Hello, my name is %s\{name}"); // can you spot the bug? |

Where |String::format| has a new overload which accepts a StringTemplate. Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to:

|String.format("Hello, my name is %s" + name); // whoops! |

This will fail, as |String::format| will be waiting for an argument (a string), but none is provided. So:

| Exception java.util.MissingFormatArgumentException: Format specifier '%s'
| at Formatter.format (Formatter.java:2672)
| at Formatter.format (Formatter.java:2609)
| at String.format (String.java:2897)
| at (#2:1)

This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers.
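[Editorial sketch, not part of the original message.] The failure mode above can be reproduced today with plain string concatenation, since that is what the forgotten-sigil literal would reduce to under the interpolation reading. The class and method names below are made up for the example; `String.format` and `MissingFormatArgumentException` are the standard JDK ones.

```java
import java.util.MissingFormatArgumentException;

public class FormatPitfall {
    // What the literal would mean if it were treated as plain interpolation:
    // the name is pasted into the format string and no argument is supplied.
    static String formatConcatenated(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // The intended call, with the value passed as a format argument.
    static String formatWithArgument(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(formatWithArgument("Duke"));
        try {
            formatConcatenated("Duke");
        } catch (MissingFormatArgumentException e) {
            // %s has no matching argument, so formatting fails at run time.
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Both calls compile; only the run-time behavior distinguishes them, which is why the message calls the failure mode surprising.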
Maurizio

On 14/03/2024 15:08, Guy Steele wrote: > Second thoughts about how to explain a string interpolation literal: > >> On Mar 13, 2024, at 2:02 PM, Guy Steele wrote: >> . . . >> >> --------- >> String is not a subtype of StringTemplate; they are disjoint types. >> >> $"foo" is a (trivial) string template literal >> "foo" is a string literal >> $"Hello, \{x}" is a (nontrivial) string template literal >> "Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")` >> --------- > Given that the intent is that String.of (or whatever we want to call it, possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example, > > "Hello, \{x}." > > (I have added a period to the example to make the point clearer) is expanded into > > "Hello, " + x + "." > > and in general > > "c0\{e1}c1\{e2}c2...\{en}cn" > > (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into > > "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn" > > The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant. > > --Guy > ? -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ccherlin at gmail.com Thu Mar 14 19:04:02 2024 From: ccherlin at gmail.com (Clement Cherlin) Date: Thu, 14 Mar 2024 14:04:02 -0500 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> Message-ID: I think there are a few basic use cases which everyone wants to be safe and ergonomic. 1. New APIs that accept StringTemplate, not String, and do processing with the value above and beyond direct interpolation (SQL queries, HTML/XML escaping, transforming to JSON, etc.). 2. Existing APIs that accept String or (String, Object...) that have StringTemplate support added, such as PrintWriter::println or String::format. 3. Old APIs that have not been (and may never be) updated to accept StringTemplate, but we want to pass interpolated strings to. # Problems Use case #1: Issues passing constant templates if there is no explicit syntactic distinction between string and template literals. Use case #2: Complicated and potentially erroneous overload selection if there is no explicit syntactic distinction between string and template literals. Use case 3: Passing interpolated templates to APIs that only support String, without excess ceremony. # Proposed Solution I believe there is a common solution to these problems that (hopefully) addresses all of these issues. Prefixing a template with an explicit processor was nice in one way, because the processor made the semantics of the interpolation explicit. However, processors were more trouble than they were worth. What if instead of the extremes of a myriad of processors, or a single template prefix, or no prefix and complex/confusing context rules, we have exactly two prefixes? To avoid bikeshedding (obviously, the final names would be much shorter), I will call them TEMPLATE and INTERPOLATE. 
These are semantically identical to the old RAW and STR processors respectively, but syntactically have no "." between them and the leading quote.

TEMPLATE"hey \{name}" -> StringTemplate
INTERPOLATE"hey \{name}" -> String

Unlike processors, these two are the *only* valid prefixes. This brings back the clarity of RAW and STR without the complexity of processor classes. Processing of TEMPLATE literals is done by normal methods that take StringTemplate. INTERPOLATE literals evaluate directly to regular Strings. The two kinds of expressions can have different translation strategies, like constant-ification of INTERPOLATE expressions with constant values, as Guy suggests.

# Examples

Use case #1

generateQuery(TEMPLATE"update table \{tableName} set \{column} = \{value} where \{whereExp}"); // OK
generateQuery(INTERPOLATE"update table \{tableName} set \{column} = \{value} where \{whereExp}"); // incompatible type error

Use case #2

System.out.println(TEMPLATE"Hello, \{world}!"); // OK
System.out.println(INTERPOLATE"Hello, \{world}!"); // OK, and if 'world' is constant, it may be folded
String.format(TEMPLATE"I am %d\{age} years old"); // OK
String.format(INTERPOLATE"I am %d\{age} years old"); // IDE warning and runtime exception because format string doesn't match number of parameters.

Use case #3

someOldStringMethod(TEMPLATE"some runtime values go here: \{value1} and here: \{value2}"); // incompatible type error
someOldStringMethod(INTERPOLATE"some runtime values go here: \{value1} and here: \{value2}"); // OK

What do you think?

Cheers, Clement Cherlin

On Thu, Mar 14, 2024 at 12:44 PM Maurizio Cimadamore wrote: > > Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here. > > First, it was a deliberate design goal of the string template feature to make interpolation an explicit act.
Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually cheaper (to type) than the safer template alternative. This is a bit of a red herring, I think. > > The second problem is that interpolation literals can sometimes be deceiving. Consider this example: > > String.format("Hello, my name is %s\{name}"); // can you spot the bug? > > Where String::format has a new overload which accepts a StringTemplate. > > Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to: > > String.format("Hello, my name is %s" + name); // whoops! > > This will fail, as String::format will be waiting for an argument (a string), but none is provided. So: > > | Exception java.util.MissingFormatArgumentException: Format specifier '%s' > | at Formatter.format (Formatter.java:2672) > | at Formatter.format (Formatter.java:2609) > | at String.format (String.java:2897) > | at (#2:1) > > This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers. > > Maurizio > > On 14/03/2024 15:08, Guy Steele wrote: > > Second thoughts about how to explain a string interpolation literal: > > On Mar 13, 2024, at 2:02 PM, Guy Steele wrote: > . . . > > --------- > String is not a subtype of StringTemplate; they are disjoint types. > > $"foo" is a (trivial) string template literal > "foo" is a string literal > $"Hello, \{x}" is a (nontrivial) string template literal > "Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")` > ---------
> > Given that the intent is that String.of (or whatever we want to call it, possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example, > > "Hello, \{x}." > > (I have added a period to the example to make the point clearer) is expanded into > > "Hello, " + x + "." > > and in general > > "c0\{e1}c1\{e2}c2...\{en}cn" > > (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into > > "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn" > > The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant. > > --Guy >

From maurizio.cimadamore at oracle.com Thu Mar 14 19:24:37 2024 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Thu, 14 Mar 2024 19:24:37 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> Message-ID: <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>

On 14/03/2024 19:04, Clement Cherlin wrote: > What if instead of the extremes of a myriad of processors, or a single > template prefix, or no prefix and complex/confusing context rules, we > have exactly two prefixes? To avoid bikeshedding (obviously, the final > names would be much shorter), I will call them TEMPLATE and > INTERPOLATE. These are semantically identical to the old RAW and STR > processors respectively, but syntactically have no "." between them > and the leading quote.
> > TEMPLATE"hey \{name}" -> StringTemplate > INTERPOLATE"hey \{name}" -> String

See my latest email to Guy.

Having /different/ prefixes for interpolated vs. raw template literals does help a bit with the case I brought up there, as here we're basically in a world where a string literal with embedded arguments /must/ have a suitable prefix.

A possible point which is not too far from where we are today is just reuse STR and RAW as prefixes, but also make RAW /optional/, so that:

* it can be used to disambiguate interpretation of strings w/o embedded expressions;
* it can be the /default/ if you type something that does have embedded expressions (e.g. nudge towards the safer route if there are embedded expressions)

Another idea that came up was, instead of just using prefixes, use /types/:

* the STR prefix is written String
* the RAW prefix is written StringTemplate

This is slightly better (say what you mean!), but a potential problem is that one might wonder why a special syntax is needed given a cast is just a pair of |(| and |)| away?

Maurizio ? -------------- next part -------------- An HTML attachment was scrubbed... URL:

From guy.steele at oracle.com Thu Mar 14 19:39:17 2024 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 14 Mar 2024 19:39:17 +0000 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> Message-ID:

This is a very important example to consider. I observe, however, that there are at least two possible ways to avoid the unpleasant surprise:

(1) Don't have string interpolation literals, because accidentally using a string interpolation literal instead of a string template literal can result in invoking the wrong overload of a method.

(2) Don't overload methods so as to accept either a string or a string template.
If we were to take approach (2), then:

(a) We would keep `println` as is, and not allow it to accept a template, but that's okay: if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.

(b) A SQL processor would accept a template but not a string: if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.

(c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others.

I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing

String.format("Hello, my name is %s\{name}"); // can you spot the bug?

but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.

Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.

--Guy

On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore wrote: Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here. First, it was a deliberate design goal of the string template feature to make interpolation an explicit act.
Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually cheaper (to type) than the safer template alternative. This is a bit of a red herring, I think. The second problem is that interpolation literals can sometimes be deceiving. Consider this example: String.format("Hello, my name is %s\{name}"); // can you spot the bug? Where String::format has a new overload which accepts a StringTemplate. Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to: String.format("Hello, my name is %s" + name); // whoops! This will fail, as String::format will be waiting for an argument (a string), but none is provided. So: | Exception java.util.MissingFormatArgumentException: Format specifier '%s' | at Formatter.format (Formatter.java:2672) | at Formatter.format (Formatter.java:2609) | at String.format (String.java:2897) | at (#2:1) This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers. Maurizio On 14/03/2024 15:08, Guy Steele wrote: Second thoughts about how to explain a string interpolation literal: On Mar 13, 2024, at 2:02 PM, Guy Steele wrote: . . . --------- String is not a subtype of StringTemplate; they are disjoint types. $"foo" is a (trivial) string template literal "foo" is a string literal $"Hello, \{x}" is a (nontrivial) string template literal "Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")` --------- Given that the intent is that String.of (or whatever we want to call it, possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example, "Hello, \{x}."
(I have added a period to the example to make the point clearer) is expanded into "Hello, " + x + "." and in general "c0\{e1}c1\{e2}c2...\{en}cn" (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn" The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant. --Guy ? -------------- next part -------------- An HTML attachment was scrubbed... URL:

From john.r.rose at oracle.com Thu Mar 14 20:21:03 2024 From: john.r.rose at oracle.com (John Rose) Date: Thu, 14 Mar 2024 13:21:03 -0700 Subject: Update on String Templates (JEP 459) In-Reply-To: References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com> Message-ID: <286722DC-1BD8-46E8-9FF5-32D8E6C06624@oracle.com>

On 14 Mar 2024, at 9:24, Clement Cherlin wrote: > ? > Adding an empty interpolated value to signal a template is not a > viable solution, because "\{}abc" is not equivalent to ST.of("abc"). > Running the current preview, > > RAW."\{}abc" produces StringTemplate{ fragments = [ "", "abc" ], > values = [null] } which interpolates to "nullabc".

Surely that is a bug in the preview. In making that suggestion I assumed that omitting the expression altogether was illegal in the current syntax. Having empty brackets be illegal (instead of an obscure way to say "\{null}"), they would have been a way to force a string to be a template, as a compatible extension. But it's not my favorite suggestion; just something I put out there FWIW.
From ccherlin at gmail.com Thu Mar 14 20:53:09 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Thu, 14 Mar 2024 15:53:09 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>
Message-ID:

On Thu, Mar 14, 2024 at 2:24 PM Maurizio Cimadamore wrote:
>
> On 14/03/2024 19:04, Clement Cherlin wrote:
>
> What if instead of the extremes of a myriad of processors, or a single
> template prefix, or no prefix and complex/confusing context rules, we
> have exactly two prefixes? To avoid bikeshedding (obviously, the final
> names would be much shorter), I will call them TEMPLATE and
> INTERPOLATE. These are semantically identical to the old RAW and STR
> processors respectively, but syntactically have no "." between them
> and the leading quote.
>
> TEMPLATE"hey \{name}" -> StringTemplate
> INTERPOLATE"hey \{name}" -> String
>
> See my latest email to Guy.
>
> Having different prefix for interpolated vs. raw template literals does help a bit with the case I brought up there - as here we're basically in a world where a string literal with embedded arguments must have a suitable prefix.

Yes, that's what I had in mind when writing this proposal.

> A possible point which is not too far from where we are today is just reuse STR and RAW as prefixes,

I considered that, but while I have no problem with STR, I think the name RAW is too much a legacy of the Processor API. I think RAW should be replaced by something that signifies "this is a template", not "process this template with the processor that does nothing". I will continue to resist the urge to present alternative prefixes until the appropriate time.

> but also make RAW optional, so that:
>
> it can be used to disambiguate interpretation of strings w/o embedded expressions;
>
> it can be the default if you type something that does have embedded expressions (e.g. nudge towards the safer route if there's embedded expressions)

That's certainly an option. I don't prefer it, but it's not terrible, and would reduce clutter somewhat when you are dealing exclusively with templates and don't need the reminder. Like "final" on an effectively final value, it could be a matter of taste and convention whether to include the prefix for a template with embedded expressions. However, it has the downside that you may accidentally convert a template back to a string literal by removing the last embedded expression.

> Another idea that came up was, instead of just using prefixes, use types:
>
> the STR prefix is written String
> the RAW prefix is written StringTemplate
>
> This is slightly better (say what you mean!) - but a potential problem is that one might wonder why a special syntax is needed given a cast is just a pair of ( and ) away?
>
> Maurizio

I'm going to assume you're joking here, so I don't feel the need to write a thousand words about how terrible Java's casting syntax is.

Cheers,
Clement Cherlin

From maurizio.cimadamore at oracle.com Thu Mar 14 22:00:17 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 22:00:17 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID:

On 14/03/2024 19:39, Guy Steele wrote:
> This is a very important example to consider.
> I observe, however, that there are at least two possible ways to avoid
> the unpleasant surprise:
>
> (1) Don't have string interpolation literals, because accidentally
> using a string interpolation literal instead of a string template
> literal can result in invoking the wrong overload of a method.
>
> (2) Don't overload methods so as to accept either a string or a string
> template.

I agree with your analysis, but note that there is also a third option:

(3) make it so that both string interpolation literal and string template literal have a prefix.

I believe that is enough to solve the issue (because the program I wrote would no longer compile: the compiler would require an explicit prefix).

Maurizio

> If we were to take approach (2), then:
>
> (a) We would keep `println` as is, and not allow it to accept a template, but that's okay - if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.
>
> (b) A SQL processor would accept a template but not a string - if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.
>
> (c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others. I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing
>
> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>
> but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.
>
> Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.
>
> --Guy
>
>> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore wrote:
>>
>> Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here.
>>
>> First, it was a deliberate design goal of the string template feature to make interpolation an explicit act. Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually /cheaper/ (to type) than the safer template alternative. This is a bit of a red herring, I think.
>>
>> The second problem is that interpolation literals can sometimes be deceiving. Consider this example:
>>
>> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>>
>> Where String::format has a new overload which accepts a StringTemplate.
>>
>> Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to:
>>
>> String.format("Hello, my name is %s" + name); // whoops!
>>
>> This will fail, as String::format will be waiting for an argument (a string), but none is provided. So:
>>
>> | Exception java.util.MissingFormatArgumentException: Format specifier '%s'
>> |       at Formatter.format (Formatter.java:2672)
>> |       at Formatter.format (Formatter.java:2609)
>> |       at String.format (String.java:2897)
>> |       at (#2:1)
>>
>> This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers.
>>
>> Maurizio
>>
>> On 14/03/2024 15:08, Guy Steele wrote:
>>
>>> Second thoughts about how to explain a string interpolation literal:
>>>
>>>> On Mar 13, 2024, at 2:02 PM, Guy Steele wrote:
>>>> . . .
>>>>
>>>> ---------
>>>> String is not a subtype of StringTemplate; they are disjoint types.
>>>>
>>>> $"foo" is a (trivial) string template literal
>>>> "foo" is a string literal
>>>> $"Hello, \{x}" is a (nontrivial) string template literal
>>>> "Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
>>>> ---------
>>>
>>> Given that the intent is that String.of (or whatever we want to call it - possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,
>>>
>>> "Hello, \{x}."
>>>
>>> (I have added a period to the example to make the point clearer) is expanded into
>>>
>>> "Hello, " + x + "."
>>>
>>> and in general
>>>
>>> "c0\{e1}c1\{e2}c2...\{en}cn"
>>>
>>> (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
>>>
>>> "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"
>>>
>>> The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
>>>
>>> --Guy
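The failure mode quoted above can be reproduced today without any string template syntax, because the hypothetical un-prefixed literal is said to be equivalent to plain concatenation. The sketch below is only an illustration of that equivalence: the class name `FormatPitfallSketch` and the methods `buggy` and `intended` are invented here, and the code uses only the existing `String.format(String, Object...)` method, not the proposed StringTemplate overload.

```java
import java.util.MissingFormatArgumentException;

class FormatPitfallSketch {
    // What the thread says the un-prefixed literal would amount to:
    // the name is concatenated into the format string, so %s has no argument.
    static String buggy(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // What the programmer presumably meant: name fills the %s specifier.
    static String intended(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(intended("Duke"));
        try {
            System.out.println(buggy("Duke"));
        } catch (MissingFormatArgumentException e) {
            // Same exception type as in the jshell trace quoted in the thread.
            System.out.println("format failed: " + e.getMessage());
        }
    }
}
```

Requiring a prefix on every literal with embedded expressions, as option (3) proposes, would turn this run-time exception into a compile-time error.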
From guy.steele at oracle.com Thu Mar 14 22:05:27 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 14 Mar 2024 22:05:27 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID:

Is your intent that a string interpolation literal would have a type other than String? If so, I agree that this is a third option - with the consequence that each API designer now needs to contemplate three-way overloading.

If that is not your intent, then I am not seeing how the prefix helps - so please explain?

Thanks,
Guy

On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore wrote:

> I agree with your analysis, but note that there is also a third option:
>
> (3) make it so that both string interpolation literal and string template literal have a prefix.
>
> I believe that is enough to solve the issue (because the program I wrote would no longer compile: the compiler would require an explicit prefix).
>
> Maurizio

From maurizio.cimadamore at oracle.com Thu Mar 14 22:07:03 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 22:07:03 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>
Message-ID:

On 14/03/2024 20:53, Clement Cherlin wrote:
> I'm going to assume you're joking here, so I don't feel the need to
> write a thousand words about how terrible Java's casting syntax is.

Honestly no, I wasn't joking, at least not from a semantic perspective. Let me see if I can explain myself.

Let's say a string interpolation literal is spelled like:

String"my name is \{name}"

What is this expression doing? Well, it's taking some literal that looks like a string, but has some embedded expression, and explicitly ask to turn that thing into a String? Now, isn't that what (morally) a cast is for? E.g. the object you are casting has a type (StringTemplate) and you want to turn it into something else (String). Seems quite close.

And, while cast between reference types don't do much (beside changing the type), cast between primitives, or between primitives and references (boxed types) do end up changing the underlying representation. So again, semantically we're not too far.

Where I think cast is a bad fit is that we can't use the cast syntax just as a syntactic device. If cast syntax works, it means there's a casting conversion between String and StringTemplate which means (as I mentioned the other day) that pattern matching will need to come along for the ride too.
In terms of syntax, I might agree with you that it's not a great option, but the "conversion vibe" that a cast gives isn't totally off the mark.

Maurizio

From maurizio.cimadamore at oracle.com Thu Mar 14 22:15:01 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 22:15:01 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID:

On 14/03/2024 22:05, Guy Steele wrote:
> Is your intent that a string interpolation literal would have a type
> other than String? If so, I agree that this is a third option - with the
> consequence that each API designer now needs to contemplate three-way
> overloading.
>
> If that is not your intent, then I am not seeing how the prefix
> helps - so please explain?

Let's go back to the example I mentioned:

String.format("Hello, my name is %s\{name}"); // can you spot the bug?

There's a string with an embedded expression here. The compiler might require a prefix here (e.g. do you want a string, or a string template?). If no prefix is added (as in the above code) it might just be an error, and this won't compile.

This means that if I do:

String.format(INTERPOLATED"Hello, my name is %s\{name}");

I will select String.format(String, Object...) - but I will do so deliberately - it's not just what happens "by default" (as was the case before).

Or, if I want the template version, I do:

String.format(TEMPLATE"Hello, my name is %s\{name}");

Basically, requiring all literals that have embedded expression to have a prefix removes the problem of defaulting on the String side of the fence.
Then, personally I'd also prefer if the default was actually on the StringTemplate side of the fence, so that the above was actually identical to this:

String.format("Hello, my name is %s\{name}"); // ok, this is a template

Note that these two prefixes might also come in handy when disambiguating a literal with no embedded expressions. Only, in that case the default would point the other way.

To summarize:

* template literal with arguments -> defaults to StringTemplate. User can ask interpolation explicitly, by adding a prefix
* template literal w/o arguments -> defaults to String. User can ask a degenerate template explicitly, by adding a prefix

This doesn't sound too bad, and it feels like it has the defaults pointing the right way?

Maurizio

From robbepincket at live.be Thu Mar 14 22:36:55 2024
From: robbepincket at live.be (Robbe Pincket)
Date: Thu, 14 Mar 2024 22:36:55 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID:

Hi experts

I thought I'd give my 2 cents here for a sec. I just looked through this long email chain. I was busy with other things in life so I haven't checked it out earlier.

First of all, I was surprised it took so long for someone to only apply implicit conversion between `String` and `StringTemplate` only for constant `String`s given that there is already a similar case in the compiler.
`int`s can't be implicitly cast to a `byte`, except (some) constant `int` expressions **can** be implicitly converted to `byte`. This is by far my favorite suggestion posted so far (and if weren't suggested yet, I would have). So I'm a bit surprised it seems to have disappeared again.

On another idea going around, using a `$` prefix for string templates and having implicit `String.of(...)` on the "template" if it isn't there is at the bottom of my list. The fact that forgetting the `$` prefix just opens you up to an SQL injection attack, while the feature is being advertised as "safe" is for me unacceptable.

(I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either as it's just so much longer than `"Foo: " + bar`)

I haven't seen anyone suggesting the opposite though? Have a `$` prefix for standard String interpolation (for those apis that don't accept a String) and when it's not there it's a normal `StringTemplate`. Adding an extra char by accident feels much less likely to me than forgetting one. But I wouldn't be against having it be something like `STR"..."` instead of `$"..."`.

Combining these would give the following:

```
String s1 = "test" // still a string literal
StringTemplate st2 = "test" // allowed, constant strings can be implicitly converted to templates
StringTemplate st3 = "Foo: \{bar}" // Simple string template

// either
String s4a = $"Foo: \{bar}" // short for String.of("Foo: \{bar}")
String s4b = STR"Foo: \{bar}" // short for String.of("Foo: \{bar}")
```

Kind regards
Robbe Pincket
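For readers who have not run into the existing rule Robbe refers to, here is a minimal sketch of it in plain Java. It shows only the current `int`-to-`byte` behaviour in assignment contexts; whether an analogous rule would ever apply to `String` and `StringTemplate` is exactly what the thread is debating. The class name `ConstantNarrowingSketch` and its members are invented for illustration.

```java
class ConstantNarrowingSketch {
    // A constant expression of type int whose value fits in a byte.
    static final int SMALL = 100;

    static byte fromConstant() {
        // Allowed without a cast: the initializer is a constant expression
        // and its value is representable as a byte.
        byte b = SMALL;
        return b;
    }

    static byte fromVariable(int i) {
        // byte b = i;        // would not compile: i is not a constant
        // byte big = 200;    // would not compile: 200 does not fit in a byte
        return (byte) i;      // a non-constant int needs an explicit cast
    }

    public static void main(String[] args) {
        System.out.println(fromConstant());
        System.out.println(fromVariable(100));
    }
}
```

The analogy in the message above is that `StringTemplate st2 = "test"` would be accepted for the same reason `byte b = SMALL` is: the right-hand side is a constant, so the conversion cannot hide any run-time data.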
From maurizio.cimadamore at oracle.com Thu Mar 14 23:44:20 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 23:44:20 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID:

On 14/03/2024 22:36, Robbe Pincket wrote:
> (I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either as it's just so
> much longer than `"Foo: " + bar`)

Note that when I suggested TEMPLATE as a prefix I was obviously not being super serious :-)

Let's do a test (bear with me). Let's assume the two prefixes were S and T (not saying I like them, just trying them out for size). Let's also assume there's no conversion. Then your examples become:

```
String s1 = "test" // still a string literal
StringTemplate st2 = T"test" // allowed, constant strings can be implicitly converted to templates
StringTemplate st3 = "Foo: \{bar}" // Simple string template
String s4c = S"Foo: \{bar}" // short for String.of("Foo: \{bar}")
```

I think that's not too bad? (please don't focus too much on the letters).

In the sense: the rare cases (st2) has a prefix. And the operation we want explicit (s4c) also has a prefix. Everything else is fine.

Control question #1: does the conversion here change things much? Or, are we reaching for conversions just to have something "shorter" ?

Control question #2: let's now assume that S and T were spelled (String) and (StringTemplate), respectively. How do we feel about this?

```
String s1 = "test" // still a string literal
StringTemplate st2 = (StringTemplate)"test" // allowed, cast from constant string to template
StringTemplate st3 = "Foo: \{bar}" // Simple string template
String s4c = (String)"Foo: \{bar}" // allowed, cast from template back to String (interpolation)
```

Maurizio

From robbepincket at live.be Fri Mar 15 00:24:11 2024
From: robbepincket at live.be (Robbe Pincket)
Date: Fri, 15 Mar 2024 00:24:11 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID:

On 14/03/2024 23:44 UTC, Maurizio Cimadamore wrote:

> On 14/03/2024 22:36, Robbe Pincket wrote:
>> (I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either as it's just so much longer than `"Foo: " + bar`)
>
> Note that when I suggested TEMPLATE as a prefix I was obviously not being super serious :-)
>
> Let's do a test (bear with me). Let's assume the two prefixes were S and T (not saying I like them, just trying them out for size). Let's also assume there's no conversion. Then your examples become:
>
> ```
> String s1 = "test" // still a string literal
> StringTemplate st2 = T"test" // allowed, constant strings can be implicitly converted to templates
> StringTemplate st3 = "Foo: \{bar}" // Simple string template
> String s4c = S"Foo: \{bar}" // short for String.of("Foo: \{bar}")
> ```
>
> I think that's not too bad? (please don't focus too much on the letters).
>
> In the sense: the rare cases (st2) has a prefix. And the operation we want explicit (s4c) also has a prefix. Everything else is fine.

So the difference is that `T` (or something else) has to be used for templates without any holes?

To me it feels a bit weird to have a prefix for the special case of a hole-less template.
I think I saw an argument passing by, saying something along the lines that `String` and `StringTemplate` are semantically different so implicit conversion in either direction would be bad because it would be ambiguous.

If this `T` idea is based on that, I don't really see why it would be that bad. If an API accepts either, I would intuitively expect that passing a string and passing a hole-less template with the same string would give me the same result.

> Control question #1: does the conversion here change things much? Or, are we reaching for conversions just to have something "shorter" ?
>
> Control question #2: let's now assume that S and T were spelled (String) and (StringTemplate), respectively. How do we feel about this?
>
> ```
> String s1 = "test" // still a string literal
> StringTemplate st2 = (StringTemplate)"test" // allowed, cast from constant string to template
> StringTemplate st3 = "Foo: \{bar}" // Simple string template
> String s4c = (String)"Foo: \{bar}" // allowed, cast from template back to String (interpolation)
> ```

I think I answered #1. Having the 'T' *just* for the "hole-less" template feels a bit odd? I don't think you can sell me on #2.

* Why would I prefer using `(String)"Foo: \{bar}"` over `"Foo: " + bar`. This is not just a "length issue", as templates would win with more holes, but there is also a cost of switching habits.
* (Ignoring primitives), casts have up until now always just returned the input, but now with a different static type. Using the casting syntax to do actual interpolation (or create a template from a string) feels weird to me.

If `(StringTemplate)"test"` and `(String)"Foo: \{bar}"` are valid, will the following things work too? `"test" instanceof StringTemplate template` and `"Foo: \{bar}" instanceof String str`. The second one I'd assume no, the first one is a bit unclear to me.

> Maurizio

Kind regards
Robbe Pincket
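The disagreement about casts rests on how casts behave in Java today, so a small sketch may help. It illustrates the two behaviours mentioned in the thread: a reference cast returns the very same object under a different static type, while a primitive cast (or a cast that boxes) produces a different representation. Nothing here involves `StringTemplate`; the class name `CastSemanticsSketch` and its methods are invented for illustration.

```java
class CastSemanticsSketch {
    // Reference cast: the same object comes back, only the static type changes.
    static boolean referenceCastKeepsIdentity() {
        Object o = "test";
        String s = (String) o;
        return s == o;
    }

    // Primitive narrowing cast: the value is converted, here by truncation.
    static int primitiveCastConvertsValue(double d) {
        return (int) d;
    }

    // Cast from a primitive to a reference type: the int is boxed into an
    // Integer, so the result has a different representation than the input.
    static boolean castCanBox() {
        Object boxed = (Object) 42;
        return boxed instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(referenceCastKeepsIdentity());
        System.out.println(primitiveCastConvertsValue(3.9));
        System.out.println(castCanBox());
    }
}
```

A hypothetical `(String)"Foo: \{bar}"` would have to behave like the second and third methods rather than the first, which is the part Robbe finds surprising for reference types.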
From maurizio.cimadamore at oracle.com  Fri Mar 15 00:49:48 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 15 Mar 2024 00:49:48 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID: <8882415e-aa07-4738-b6ee-1926bfb420b1@oracle.com>

On 15/03/2024 00:24, Robbe Pincket wrote:
> On 14/03/2024 23:44 UTC, Maurizio Cimadamore wrote:
>
>> On 14/03/2024 22:36, Robbe Pincket wrote:
>>> (I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either as it's just so
>>> much longer than `"Foo: " + bar`)
>>
>> Note that when I suggested TEMPLATE as a prefix I was obviously not
>> being super serious :-)
>>
>> Let's do a test (bear with me). Let's assume the two prefixes were S
>> and T (not saying I like them, just trying them out for size). Let's
>> also assume there's no conversion. Then your examples become:
>>
>> ```
>> String s1 = "test"                  // still a string literal
>> StringTemplate st2 = T"test"        // allowed, constant strings can be implicitly converted to templates
>> StringTemplate st3 = "Foo: \{bar}"  // Simple string template
>> String s4c = S"Foo: \{bar}"         // short for String.of("Foo: \{bar}")
>> ```
>>
>> I think that's not too bad? (please don't focus too much on the letters).
>>
>> In the sense: the rare case (st2) has a prefix. And the operation we
>> want explicit (s4c) also has a prefix. Everything else is fine.
>
> So the difference is that `T` (or something else) has to be used for
> templates without any holes?
>
> To me it feels a bit weird to have a prefix for the special case of a
> hole-less template.

Ok. Note that T would be required in that case, but one might also use it as a visual delimiter: if a template is very long, it might not be too readable to leave it implicit as to whether the thing in quotes is a template or not.

> I think I saw an argument passing by, saying something along the lines
> that `String` and `StringTemplate` are semantically different, so
> implicit conversion in either direction would be bad because it would
> be ambiguous.
>
> If this `T` idea is based on that, I don't really see why it would be
> that bad. If an API accepts either, I would intuitively expect that
> passing a string and passing a hole-less template with the same string
> would give me the same result.
>
>> Control question #1: does the conversion here change things much? Or,
>> are we reaching for conversions just to have something "shorter"?
>>
>> Control question #2: let's now assume that S and T were spelled
>> (String) and (StringTemplate), respectively. How do we feel about this?
>>
>> ```
>> String s1 = "test"                            // still a string literal
>> StringTemplate st2 = (StringTemplate)"test"   // allowed, cast from constant string to template
>> StringTemplate st3 = "Foo: \{bar}"            // Simple string template
>> String s4c = (String)"Foo: \{bar}"            // allowed, cast from template back to String (interpolation)
>> ```
>
> I think I answered #1. Having the 'T' *just* for the "hole-less"
> template feels a bit odd? I don't think you can sell me on #2.

Fair enough - I had to ask :-)

> If `(StringTemplate)"test"` and `(String)"Foo: \{bar}"` are valid,
> will the following things work too? `"test" instanceof StringTemplate
> template` and `"Foo: \{bar}" instanceof String str`. The second one
> I'd assume no, the first one is a bit unclear to me.

These are questions I raised even in the context of the implicit conversion you are advocating for: once you add an assignment conversion, cast comes with it, and with cast, patterns and instanceof. In other words, that's the price we have to pay for eliminating the T in the hole-less template in the way you proposed. Casts just make that trade-off more explicit.

Another thing I don't love about implicit conversion is that it doesn't play with inference too well:

```
List<StringTemplate> ls = List.of("Hello");
```

The above would be an error. The type-variable (X) for List::of is seeing two different constraints:

* X = StringTemplate (from the target type)
* String <: X (from the argument)

This fails, because we'd infer StringTemplate, which is not a supertype of String. Even if we could somehow "convince" inference that StringTemplate is a valid "more general" type, I see lots and lots of dragons here:

* inference would have to be careful only to do certain moves if constant strings are involved
* if we pick StringTemplate, we're basically saying that the method is applicable by conversion, so in overload step 2. But is this the overload step we used to pick that candidate in the first place? Probably not, because String <: X requires only subtyping, not conversion.

Ultimately, implicit conversion would only "kind of work" and will lead to issues when interacting with generics. While it's tempting to sweep issues under the rug (after all they do not seem very important for the examples we're discussing), such compromises have a tendency to bite us back when features "grow up" and start playing more with other features: any loss of compositionality there costs quite a bit. Which is why I'm not in love with implicit conversions.

Maurizio

> Maurizio
>
> Kind regards
>
> Robbe Pincket
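
The two inference constraints described above can already be observed with any pair of types that are unrelated by subtyping. The sketch below is only an illustration under stated assumptions: `Tmpl` is a hypothetical stand-in for `StringTemplate` (the real type is not used here), and `InferenceSketch`, `explicit` and `Tmpl.of` are invented names.

```java
import java.util.List;

final class InferenceSketch {
    /** Hypothetical stand-in for StringTemplate; unrelated to String by subtyping. */
    record Tmpl(String text) {
        static Tmpl of(String s) { return new Tmpl(s); }
    }

    /** Wraps the string explicitly so that inference only ever sees Tmpl <: X. */
    static List<Tmpl> explicit(String s) {
        return List.of(Tmpl.of(s));
    }

    public static void main(String[] args) {
        // List<Tmpl> bad = List.of("Hello");
        // The line above does not compile: X = Tmpl (from the target type) and
        // String <: X (from the argument) cannot both hold, because String is
        // not a subtype of Tmpl.
        List<Tmpl> ok = explicit("Hello");
        System.out.println(ok.get(0).text());
    }
}
```

With an explicit wrapping step the argument already has the target element type, so the question of which overload resolution phase applies never arises.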
From guy.steele at oracle.com  Fri Mar 15 01:05:46 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 01:05:46 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID: <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>

Thanks for these derails, but they don't quite answer my question: how does the compiler make the decision to require the prefix? Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job - but maybe I am missing something.

--Guy

On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore wrote:

On 14/03/2024 22:05, Guy Steele wrote:

Is your intent that a string interpolation literal would have a type other than String? If so, I agree that this is a third option - with the consequence that each API designer now needs to contemplate three-way overloading.

If that is not your intent, then I am not seeing how the prefix helps - so please explain?

Let's go back to the example I mentioned:

    String.format("Hello, my name is %s\{name}"); // can you spot the bug?

There's a string with an embedded expression here. The compiler might require a prefix here (e.g. do you want a string, or a string template?). If no prefix is added (as in the above code) it might just be an error, and this won't compile.

This means that if I do:

    String.format(INTERPOLATED"Hello, my name is %s\{name}");

I will select String.format(String, Object...) - but I will do so deliberately - it's not just what happens "by default" (as was the case before).

Or, if I want the template version, I do:

    String.format(TEMPLATE"Hello, my name is %s\{name}");

Basically, requiring all literals that have embedded expression to have a prefix removes the problem of defaulting on the String side of the fence. Then, personally I'd also prefer if the default was actually on the StringTemplate side of the fence, so that the above was actually identical to this:

    String.format("Hello, my name is %s\{name}"); // ok, this is a template

Note that these two prefixes might also come in handy when disambiguating a literal with no embedded expressions. Only, in that case the default would point the other way.

To summarize:

* template literal with arguments -> defaults to StringTemplate. User can ask interpolation explicitly, by adding a prefix
* template literal w/o arguments -> defaults to String. User can ask a degenerate template explicitly, by adding a prefix

This doesn't sound too bad, and it feels like it has the defaults pointing the right way?

Maurizio

Thanks,
Guy

On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore wrote:

On 14/03/2024 19:39, Guy Steele wrote:

This is a very important example to consider. I observe, however, that there are at least two possible ways to avoid the unpleasant surprise:

(1) Don't have string interpolation literals, because accidentally using a string interpolation literal instead of a string template literal can result in invoking the wrong overload of a method.

(2) Don't overload methods so as to accept either a string or a string template.

I agree with your analysis, but note that there is also a third option:

(3) make it so that both string interpolation literal and string template literal have a prefix. I believe that is enough to solve the issue (because the program I wrote would no longer compile: the compiler would require an explicit prefix).

Maurizio

If we were to take approach (2), then:

(a) We would keep `println` as is, and not allow it to accept a template, but that's okay - if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.

(b) A SQL processor would accept a template but not a string - if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.

(c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others.

I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing

    String.format("Hello, my name is %s\{name}"); // can you spot the bug?

but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.

Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.

--Guy

On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore wrote:

Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here.

First, it was a deliberate design goal of the string template feature to make interpolation an explicit act. Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually cheaper (to type) than the safer template alternative. This is a bit of a red herring, I think.

The second problem is that interpolation literals can sometimes be deceiving. Consider this example:

    String.format("Hello, my name is %s\{name}"); // can you spot the bug?

Where String::format has a new overload which accepts a StringTemplate. Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to:

    String.format("Hello, my name is %s" + name); // whoops!

This will fail, as String::format will be waiting for an argument (a string), but none is provided. So:

    |  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
    |        at Formatter.format (Formatter.java:2672)
    |        at Formatter.format (Formatter.java:2609)
    |        at String.format (String.java:2897)
    |        at (#2:1)

This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers.

Maurizio

On 14/03/2024 15:08, Guy Steele wrote:

Second thoughts about how to explain a string interpolation literal:

On Mar 13, 2024, at 2:02 PM, Guy Steele wrote:

. . .

---------
String is not a subtype of StringTemplate; they are disjoint types.

$"foo" is a (trivial) string template literal
"foo" is a string literal
$"Hello, \{x}" is a (nontrivial) string template literal
"Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
---------
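
The `String.format` failure mode quoted above can be reproduced in current Java by writing out the concatenation by hand. This is a minimal sketch only: the class and method names (`FormatPitfall`, `interpolatedThenFormatted`, `formatted`) are invented for illustration, and the `\{name}` syntax itself is not used because it is only a proposal in this thread.

```java
import java.util.MissingFormatArgumentException;

final class FormatPitfall {
    /** Mimics the accidental interpolation: the name is concatenated into the format string. */
    static String interpolatedThenFormatted(String name) {
        // The format string still contains %s, but no argument is supplied for it.
        return String.format("Hello, my name is %s" + name);
    }

    /** The call that was most likely intended: the name is passed as a format argument. */
    static String formatted(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(formatted("Duke"));
        try {
            interpolatedThenFormatted("Duke");
        } catch (MissingFormatArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Both calls compile, and only the run-time exception reveals the mistake, which is why the thread looks for a way to make the compiler reject the ambiguous form.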
Given that the intent is that String.of (or whatever we want to call it - possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,

    "Hello, \{x}."

(I have added a period to the example to make the point clearer) is expanded into

    "Hello, " + x + "."

and in general

    "c0\{e1}c1\{e2}c2...\{en}cn"

(where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into

    "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"

The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.

--Guy

From guy.steele at oracle.com  Fri Mar 15 01:07:16 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 01:07:16 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
Message-ID:

On Mar 14, 2024, at 9:05 PM, Guy Steele wrote:

Thanks for these derails,

Sorry: "details"

but they don't quite answer my question: how does the compiler make the decision to require the prefix? Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job - but maybe I am missing something.

--Guy

From guy.steele at oracle.com  Fri Mar 15 02:10:07 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 02:10:07 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
Message-ID: <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>

Oh, I think I get it now; I misinterpreted "The compiler might require a prefix here" to mean "The compiler might require a prefix on a literal that is a method argument", but I now see, from your later sentence "Basically, requiring all literals that have embedded expression to have a prefix . . ." that maybe you just want to adjust the syntax of literals to be roughly what Clement suggested:

    "..."                plain string literal, cannot contain \{...}, type is String
    INTERPOLATION"..."   string interpolation, may contain \{...}, type is String
    TEMPLATE"..."        string template, may contain \{...}, type is StringTemplate

where the precise syntax for the prefixed INTERPOLATION and TEMPLATE is to be determined. Do I understand your proposal correctly now?

--Guy

On Mar 14, 2024, at 9:05 PM, Guy Steele wrote:

Thanks for these derails, but they don't quite answer my question: how does the compiler make the decision to require the prefix?
Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job - but maybe I am missing something.

--Guy

From maurizio.cimadamore at oracle.com  Fri Mar 15 09:56:42 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 15 Mar 2024 09:56:42 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com> <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
Message-ID: <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>

On 15/03/2024 02:10, Guy Steele wrote:
> Oh, I think I get it now; I misinterpreted "The compiler might require
> a prefix here" to mean "The compiler might require a prefix on a
> literal that is a method argument", but I now see, from your later
> sentence "Basically, requiring all literals that have embedded
> expression to have a prefix . . ." that maybe you just want to adjust
> the syntax of literals to be roughly what Clement suggested:
>
>     "..."                plain string literal, cannot contain \{...}, type is String
>     INTERPOLATION"..."   string interpolation, may contain \{...}, type is String
>     TEMPLATE"..."        string template, may contain \{...}, type is StringTemplate
>
> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE
> is to be determined. Do I understand your proposal correctly now?

Yes, with the further tweak that the prefix (with syntax TBD) might be omitted in the "obvious cases" (or kept for clarity):

* "Hello" w/o prefix is just String
* "Hello \{world}" without prefix is just StringTemplate

Does this help?
(I'm basically trying to get to a world where use of prefix will be relatively rare, as common cases have the right defaults).

Maurizio

> -- Guy
>
>> On Mar 14, 2024, at 9:05 PM, Guy Steele wrote:
>>
>> Thanks for these details, but they don't quite answer my question: how does the compiler make the decision to require the prefix? Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job, but maybe I am missing something.
>>
>> -- Guy
>>
>>> On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore wrote:
>>>
>>> On 14/03/2024 22:05, Guy Steele wrote:
>>>> Is your intent that a string interpolation literal would have a type other than String? If so, I agree that this is a third option, with the consequence that each API designer now needs to contemplate three-way overloading.
>>>>
>>>> If that is not your intent, then I am not seeing how the prefix helps, so please explain?
>>>
>>> Let's go back to the example I mentioned:
>>>
>>> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>>>
>>> There's a string with an embedded expression here. The compiler might require a prefix here (e.g. do you want a string, or a string template?). If no prefix is added (as in the above code) it might just be an error, and this won't compile.
>>>
>>> This means that if I do:
>>>
>>> String.format(INTERPOLATED"Hello, my name is %s\{name}");
>>>
>>> I will select String.format(String, Object...) - but I will do so deliberately - it's not just what happens "by default" (as was the case before).
>>>
>>> Or, if I want the template version, I do:
>>>
>>> String.format(TEMPLATE"Hello, my name is %s\{name}");
>>>
>>> Basically, requiring all literals that have embedded expression to have a prefix removes the problem of defaulting on the String side of the fence. Then, personally I'd also prefer if the default was actually on the StringTemplate side of the fence, so that the above was actually identical to this:
>>>
>>> String.format("Hello, my name is %s\{name}"); // ok, this is a template
>>>
>>> Note that these two prefixes might also come in handy when disambiguating a literal with no embedded expressions. Only, in that case the default would point the other way.
>>>
>>> To summarize:
>>>
>>> * template literal with arguments -> defaults to StringTemplate. User can ask interpolation explicitly, by adding a prefix
>>> * template literal w/o arguments -> defaults to String. User can ask a degenerate template explicitly, by adding a prefix
>>>
>>> This doesn't sound too bad, and it feels like it has the defaults pointing the right way?
>>>
>>> Maurizio
>>>
>>>> Thanks,
>>>> Guy
>>>>
>>>>> On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore wrote:
>>>>>
>>>>> On 14/03/2024 19:39, Guy Steele wrote:
>>>>>> This is a very important example to consider. I observe, however, that there are at least two possible ways to avoid the unpleasant surprise:
>>>>>>
>>>>>> (1) Don't have string interpolation literals, because accidentally using a string interpolation literal instead of a string template literal can result in invoking the wrong overload of a method.
>>>>>>
>>>>>> (2) Don't overload methods so as to accept either a string or a string template.
>>>>>
>>>>> I agree with your analysis, but note that there is also a third option:
>>>>>
>>>>> (3) make it so that both string interpolation literal and string template literal have a prefix.
>>>>>
>>>>> I believe that is enough to solve the issue (because the program I wrote would no longer compile: the compiler would require an explicit prefix).
>>>>>
>>>>> Maurizio
>>>>>
>>>>>> If we were to take approach (2), then:
>>>>>>
>>>>>> (a) We would keep `println` as is, and not allow it to accept a template, but that's okay: if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.
>>>>>>
>>>>>> (b) A SQL processor would accept a template but not a string: if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.
>>>>>>
>>>>>> (c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others. I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing
>>>>>>
>>>>>> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>>>>>>
>>>>>> but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.
>>>>>>
>>>>>> Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.
>>>>>>
>>>>>> -- Guy
>>>>>>
>>>>>>> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore wrote:
>>>>>>>
>>>>>>> Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here.
>>>>>>>
>>>>>>> First, it was a deliberate design goal of the string template feature to make interpolation an explicit act. Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually /cheaper/ (to type) than the safer template alternative. This is a bit of a red herring, I think.
>>>>>>>
>>>>>>> The second problem is that interpolation literals can sometimes be deceiving. Consider this example:
>>>>>>>
>>>>>>> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>>>>>>>
>>>>>>> Where String::format has a new overload which accepts a StringTemplate.
>>>>>>>
>>>>>>> Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to:
>>>>>>>
>>>>>>> String.format("Hello, my name is %s" + name); // whoops!
>>>>>>>
>>>>>>> This will fail, as String::format will be waiting for an argument (a string), but none is provided. So:
>>>>>>>
>>>>>>> | Exception java.util.MissingFormatArgumentException: Format specifier '%s'
>>>>>>> |       at Formatter.format (Formatter.java:2672)
>>>>>>> |       at Formatter.format (Formatter.java:2609)
>>>>>>> |       at String.format (String.java:2897)
>>>>>>> |       at (#2:1)
>>>>>>>
>>>>>>> This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers.
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 14/03/2024 15:08, Guy Steele wrote:
>>>>>>>> Second thoughts about how to explain a string interpolation literal:
>>>>>>>>
>>>>>>>>> On Mar 13, 2024, at 2:02 PM, Guy Steele wrote:
>>>>>>>>> . . .
>>>>>>>>>
>>>>>>>>> ----------
>>>>>>>>> String is not a subtype of StringTemplate; they are disjoint types.
>>>>>>>>>
>>>>>>>>> $"foo" is a (trivial) string template literal
>>>>>>>>> "foo" is a string literal
>>>>>>>>> $"Hello, \{x}" is a (nontrivial) string template literal
>>>>>>>>> "Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
>>>>>>>>> ----------
>>>>>>>>
>>>>>>>> Given that the intent is that String.of (or whatever we want to call it, possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,
>>>>>>>>
>>>>>>>> "Hello, \{x}."
>>>>>>>>
>>>>>>>> (I have added a period to the example to make the point clearer) is expanded into
>>>>>>>>
>>>>>>>> "Hello, " + x + "."
>>>>>>>>
>>>>>>>> and in general
>>>>>>>>
>>>>>>>> "c0\{e1}c1\{e2}c2…\{en}cn"
>>>>>>>>
>>>>>>>> (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
>>>>>>>>
>>>>>>>> "c0" + (e1) + "c1" + (e2) + "c2" + … + (en) + "cn"
>>>>>>>>
>>>>>>>> The point is that, with this definition, "c0\{e1}c1\{e2}c2…\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
>>>>>>>>
>>>>>>>> -- Guy

-------------- next part --------------
An HTML attachment was scrubbed...
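The failure mode quoted above can be reproduced with today's `String.format`, with no template syntax involved. The sketch below is only illustrative: the class and method names (`FormatFailureDemo`, `failsWithMissingArgument`) are made up for this example, and it assumes, as the quoted message does, that an interpolation literal behaves like plain `+` concatenation.

```java
import java.util.MissingFormatArgumentException;

// Hypothetical demo class; not part of any proposal in this thread.
class FormatFailureDemo {

    // Returns true when the call fails because "%s" has no argument left to consume.
    static boolean failsWithMissingArgument(String name) {
        try {
            // The name is concatenated into the format string itself,
            // so String.format receives a "%s" specifier and zero arguments.
            String.format("Hello, my name is %s" + name);
            return false;
        } catch (MissingFormatArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The accidental concatenation form fails at run time.
        System.out.println(failsWithMissingArgument("Duke"));
        // The intended form passes the name as a separate argument.
        System.out.println(String.format("Hello, my name is %s", "Duke"));
    }
}
```

The point of the sketch is that the mistake is invisible to the type system: both calls pass a `String` as the first argument, and only the run-time format parsing notices the difference.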
URL:

From asviraspossible at gmail.com Fri Mar 15 13:48:56 2024
From: asviraspossible at gmail.com (Victor Nazarov)
Date: Fri, 15 Mar 2024 14:48:56 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com> <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com> <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
Message-ID:

Hello experts,

I'm not sure if we need one more voice in this thread, but maybe my summary can be a small contribution.

I've read the whole thread and I saw only two goals that were named for the StringTemplates feature: 1) safety, as explained very thoroughly by Maurizio Cimadamore, and 2) avoiding proliferation of String-literal sublanguages, as advocated by Brian Goetz.

As Maurizio Cimadamore explained in one of his messages, from the safety point of view, the only solutions mentioned in the thread that fit the bill are a) a special syntax for string templates that is distinct from plain strings, or b) automatic promotion from a string *literal* (without any placeholders inside) into StringTemplate.

If we take into account the goal stated by Brian Goetz, then we can see that (b) looks better than (a), because we avoid different-looking language elements. The problem with (b), though, is overload selection and the many other problems that Maurizio Cimadamore already stated in the original message of this thread.

My observation is that if all these problems were completely new, then it would probably be hard to choose the right poison, but my opinion is that the Java language already had these problems before and solved them, so why not solve this problem with StringTemplates the exact same way.
The already-solved problem is numeric types. Let us look at the relationship between int and long:

* a numeric literal that fits in 32 bits can be both int and long
* a numeric literal outside the 32-bit range can only be long
* m(int), m(long) with a numeric literal that can be both int and long selects the int overload
* n = i works when n is long and i is int
* i = n is a compile-time error
* i = (int) n succeeds when i is int and n is long
* n instanceof int is soon to succeed on a long variable n as long as n fits within 32 bits
* i instanceof long succeeds when i is int
* additionally, a numeric literal can use an "l" or "L" suffix to denote that it is really long; this can be used to tweak overload selection

I think the above can be translated almost word for word to the StringTemplates world:

* a stringy literal that doesn't have holes-with-values can be both String and StringTemplate
* a stringy literal that has holes-with-values can only be StringTemplate
* m(String), m(StringTemplate) with a stringy literal that can be both String and StringTemplate selects the String overload
* t = s works when t is StringTemplate and s is String
* s = t is a compile-time error
* s = (String) t succeeds when s is String and t is StringTemplate (and does string concatenation)
* t instanceof String succeeds on a StringTemplate variable t as long as t doesn't have any holes-with-values
* s instanceof StringTemplate succeeds when s is String
* additionally, a stringy literal can use a "t" or "T" *suffix* to denote that it is really a template; this can be used to tweak overload selection and to certify that some processing of values is expected

For me the table for String and StringTemplate satisfies both goals (1) and (2) and feels natural for the Java language, because most of these rules have been present in the language for more than 20 years already.
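The int/long half of this analogy can be checked against the language as it stands today. The following is a minimal sketch, with a made-up class name (`NumericLiteralDemo`) and made-up overloads `m(int)` / `m(long)`; it only covers the existing numeric rules and says nothing about how a String/StringTemplate version would actually behave.

```java
// Hypothetical demo class illustrating the existing int/long rules only.
class NumericLiteralDemo {

    static String m(int i)  { return "int"; }
    static String m(long n) { return "long"; }

    // Assignment from int to long needs no cast (widening conversion).
    static long widen(int i) {
        long n = i;
        return n;
    }

    // Assignment from long to int needs an explicit cast (narrowing conversion).
    static int narrow(long n) {
        // int i = n;          // would not compile: possible lossy conversion
        int i = (int) n;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(m(42));              // the unsuffixed literal selects the int overload
        System.out.println(m(42L));             // the L suffix forces the long overload
        // In Java as it stands, a literal outside the int range needs the L suffix to compile.
        System.out.println(m(3_000_000_000L));
        System.out.println(widen(7));
        System.out.println(narrow(7L));
    }
}
```

The suffix is what lets the caller steer overload selection here, which is the role the proposed "t"/"T" suffix would play in the analogy.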
-- Victor Nazarov

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From maurizio.cimadamore at oracle.com Fri Mar 15 14:54:01 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 15 Mar 2024 14:54:01 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To:
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com> <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com> <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
Message-ID: <40a9ecc1-f9e7-4fc5-a81e-33a468d568d8@oracle.com>

Hi

Is not all that rosy :-) Comments inline

On 15/03/2024 13:48, Victor Nazarov wrote:
> I think the above can be translated almost word for word to StringTemplates world:
>
> * stringy-literal that doesn't have holes-with-values can be both String and StringTemplate
> * stringy-literal that has holes-with-values can only be StringTemplate
> * m(String),m(StringTemplate) with stringy-literal that can be both String and StringTemplate selects String-overload
> * t = s works as when t is StringTemplate and s is String

I assume you mean "s is a /constant/ String" here.

> * s = t is compile-time error
> * s = (String) t succeeds, when s is String and t is StringTemplate (and does string concatenation)

Ok, here I note that you are defining cast conversion from StringTemplate to String as always successful (via interpolation).

> * t instanceof String succeeds on StringTemplate variable t as long as t doesn't have any holes-with-values

This is inconsistent. You now have cases where "t instanceof String" returns false, but where (String)t succeeds.
> * s instanceof StringTemplate succeeds when s is String

Again, probably you mean "constant String" here.

> * additionally stringy-literal can use "t" or "T" *suffix* to denote that it is really a template, this can be used to tweak overload-selection and to certify, that some processing of values is expected

Overall, while I agree this is not completely terrible, we are signing up for a lot of work here. There's a new conversion, a relationship with pattern matching and instanceof to figure out, and possible issues with overload resolution and inference. For instance, yesterday I mentioned this example:

List<StringTemplate> ls = List.of("Hello");

Which won't work. One way to look at it is that it's as broken as:

List<Long> ls = List.of(1);

But another way to look at it is that we're adding more complexity to a part of the language that already is shaky. To me that feels like a big risk, especially given that the "payoff" is to leave an extra "t" out at the beginning of the template. In other words, we should be careful about right-sizing complexity.

Also, regarding:

> 2) avoiding proliferation of String-literal sublanguages as advocated by Brian Goetz

I don't read that in the same way as you do. I think what Brian meant is that anything inside quotes should be uniform. We would not like to have different kinds of rules for escaping etc. depending on what kind of literal you use. In that sense, sticking a "t" in front is no different from using """ to denote that what's coming is a text block.

Maurizio

-------------- next part --------------
An HTML attachment was scrubbed...
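The numeric half of the inference example above can be tried with current Java. This is a small sketch under the assumption that the scrubbed type arguments in the message were `List<Long>`; the class name `InferenceDemo` and its methods are invented for illustration.

```java
import java.util.List;

// Hypothetical demo class; shows only the existing int/long inference behaviour.
class InferenceDemo {

    static List<Long> longs() {
        // List<Long> bad = List.of(1);  // does not compile: the int literal boxes to Integer, not Long
        return List.of(1L);              // compiles: the L suffix makes the element a long, boxed to Long
    }

    static long assigned() {
        // Outside of generic inference, the same literal converts without a suffix.
        long n = 1;
        return n;
    }

    public static void main(String[] args) {
        System.out.println(longs());
        System.out.println(assigned());
    }
}
```

So the int-to-long convenience that works in a plain assignment does not carry over once boxing and inference are involved, which is the kind of gap the message warns would reappear for String and StringTemplate.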
URL:

From guy.steele at oracle.com Fri Mar 15 16:07:35 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 16:07:35 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com> <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com> <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com> <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com> <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com> <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com> <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
Message-ID: <942AC667-A175-4F72-9495-684F2FF9236E@oracle.com>

On Mar 15, 2024, at 5:56 AM, Maurizio Cimadamore wrote:

> On 15/03/2024 02:10, Guy Steele wrote:
>> Oh, I think I get it now; I misinterpreted "The compiler might require a prefix here" to mean "The compiler might require a prefix on a literal that is a method argument", but I now see, from your later sentence "Basically, requiring all literals that have embedded expression to have a prefix . . ." that maybe you just want to adjust the syntax of literals to be roughly what Clement suggested:
>>
>> "…"                 plain string literal, cannot contain \{…}, type is String
>> INTERPOLATION"…"    string interpolation, may contain \{…}, type is String
>> TEMPLATE"…"         string template, may contain \{…}, type is StringTemplate
>>
>> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE is to be determined. Do I understand your proposal correctly now?
>
> Yes, with the further tweak that the prefix (with syntax TBD) might be omitted in the "obvious cases" (but kept for clarity):
>
> * "Hello" w/o prefix is just String
> * "Hello \{world}" without prefix is just StringTemplate
>
> Does this help? (I'm basically trying to get to a world where use of prefix will be relatively rare, as common cases have the right defaults).

Yes, this helps immensely.
So in the model you suggest, string templates would mostly not need prefixes, but in the example I raised where one might foresee editing templates so as to cross into or out of the edge case of zero template expressions, I could choose, if I wish, to write

SQL.process(TEMPLATE"CREATE TABLE foo;");
SQL.process(TEMPLATE"ALTER TABLE foo ADD name varchar(40);");
SQL.process(TEMPLATE"ALTER TABLE foo ADD title varchar(30);");
SQL.process(TEMPLATE"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
SQL.process(TEMPLATE"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

rather than

SQL.process("CREATE TABLE foo;");
SQL.process("ALTER TABLE foo ADD name varchar(40);");
SQL.process("ALTER TABLE foo ADD title varchar(30);");
SQL.process("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
SQL.process(TEMPLATE"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

That makes sense to me in this two-prefix model.

Then again, now that I ponder the space of use cases, it may be that, despite my initial enthusiasm, having a separate string interpolation syntax may not carry its weight if its uses are relatively rare. We always have the option of using a string template and then applying an interpolation processor (which might be spelled `String.of(