From kevinb9n at gmail.com  Fri Mar  1 19:09:08 2024
From: kevinb9n at gmail.com (Kevin Bourrillion)
Date: Fri, 1 Mar 2024 11:09:08 -0800
Subject: Draft JEP: Derived Record Creation (Preview)
Message-ID: <CAFMUOF7m=7enqHJ8ezvGb8VHDr2wGRbr+Os69QY69uZqMVuM6A@mail.gmail.com>

Hi Gavin,

My response is mostly just to add grist to the mill for a feature that
looks great already. You might perhaps feel some of the points are worth
working into the proposal.

I have some angst over the term "derived" for this. It's not wrong, but is
an entirely different meaning from the one I encounter regularly: a
"derived field" being one that caches a value computed deterministically
from the other field values (a feature that record classes notably don't
support... sadly).

I think the more basic term is "modified", and I think it works: "creating
modified records". In the vernacular I think most people do understand that
a genetically "modified" soybean doesn't necessarily mean that any
particular bean was changed, only that it is an altered version of what it
would otherwise have been. "You can't *modify* a record instance, but you
can get a *modified* instance based on it." This feels to me like a good
reuse of existing terminology.


> Suppose we want to evolve the state by doubling the x coordinate of a Point
> oldLoc, resulting in Point newLoc:
>
> Point newLoc = new Point(oldLoc.x()*2, oldLoc.y(), oldLoc.z());
> This code, while straightforward, is laborious. Deriving newLoc from
> oldLoc means extracting every component of oldLoc, whether it changes or
> not, and providing a value for every component of newLoc, even if unchanged
> from oldLoc. It would be a constant tax on productivity if developers had
> to repeatedly deconstruct one record value (extract all its components) in
> order to instantiate a new record value with mostly the same components.
>

It's also bug-prone in multiple ways.

It also is the worst-case maintenance scenario, the "any-every". When
adding, removing, or renaming *any* record component, *every* statement
like this throughout the codebase has to be changed. (True of record
constructor calls too, but there's not much we could do about that short of
optional parameters.)
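
A minimal sketch of one such bug, assuming the draft's three-component
Point (the class and method names below are illustrative, not from the
draft). Because all the components share a type, a transposed argument
compiles without complaint:

```java
// Sketch only: Point mirrors the draft's example; ManualCopyDemo is a hypothetical name.
class ManualCopyDemo {
    record Point(int x, int y, int z) { }

    // Intended behavior: double x, keep y and z.
    static Point doubleX(Point p) {
        return new Point(p.x() * 2, p.y(), p.z());
    }

    // Same intent, but y and z are accidentally swapped. All components are
    // ints, so the compiler accepts this.
    static Point doubleXBuggy(Point p) {
        return new Point(p.x() * 2, p.z(), p.y());
    }

    public static void main(String[] args) {
        Point oldLoc = new Point(1, 2, 3);
        System.out.println(doubleX(oldLoc));      // Point[x=2, y=2, z=3]
        System.out.println(doubleXBuggy(oldLoc)); // Point[x=2, y=3, z=2]
    }
}
```

Adding a fourth component to Point would break both methods, along with
every other statement of this shape.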


> However, wither methods have two problems. First, they add boilerplate to
> the record class,
>

Some boilerplate is relatively innocuous, but this is *high-maintenance*
boilerplate, which we have to carefully keep in sync with the record's
component declarations. Boo.
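
For concreteness, a sketch of hand-written withers as they might look
today (the `withX`-style names are a common convention, assumed here
rather than taken from the draft):

```java
// Sketch only: WitherDemo and the withX/withY/withZ names are illustrative.
class WitherDemo {
    record Point(int x, int y, int z) {
        // One hand-written wither per component. Each of these must be
        // revisited whenever a component is added, removed, or renamed.
        Point withX(int newX) { return new Point(newX, y, z); }
        Point withY(int newY) { return new Point(x, newY, z); }
        Point withZ(int newZ) { return new Point(x, y, newZ); }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2, 3);
        System.out.println(p.withX(p.x() * 2)); // Point[x=2, y=2, z=3]
    }
}
```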


> Record values can be nested, where components are themselves record
> values. Derived instance creation expressions can be nested in order to
> transform nested record values. For example:
>
> record Marker(Point loc, String label, Icon icon) { }
>
> Marker m = new Marker(new Point(...), ..., ...);
> Marker scaled = m with { loc = loc with { x *= 2; y *= 2; z *= 2; }};
>

In fact, this is such a common need (in my experience), and what you have
to do today is such a horror show, that you might want to illustrate it as
part of the value proposition of the feature.
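
A rough sketch of what the nested case looks like without the feature,
assuming the draft's Marker and Point (here `icon` is a plain String
standing in for the draft's Icon type, and the class name is made up):

```java
// Sketch only: NestedCopyDemo is a hypothetical name; String stands in for Icon.
class NestedCopyDemo {
    record Point(int x, int y, int z) { }
    record Marker(Point loc, String label, String icon) { }

    // Today's equivalent of:
    //   m with { loc = loc with { x *= 2; y *= 2; z *= 2; }}
    // Every component of both records has to be spelled out.
    static Marker scale(Marker m) {
        return new Marker(
            new Point(m.loc().x() * 2, m.loc().y() * 2, m.loc().z() * 2),
            m.label(),
            m.icon());
    }

    public static void main(String[] args) {
        Marker m = new Marker(new Point(1, 2, 3), "home", "pin");
        System.out.println(scale(m));
    }
}
```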


> Derived instance creation expressions can be used in record classes to
> simplify the implementation of basic operations. For example:
>
> record Complex(double re, double im) {
>     Complex conjugate() { return this with { im = -im; }; }
>     Complex realOnly()  { return this with { im = 0; }; }
>     Complex imOnly()    { return this with { re = 0; }; }
> }
>

This is very nice because now `conjugate()` has no relationship with `re`
at all, just as it should be. And it makes the essence of what each method
is for crystal-clear.


> Any assignment statements that occur within the transformation block have
> the following constraint: If the left-hand side of the assignment is an
> unqualified name, that name must be either (i) the name of a local
> component variable, or (ii) the name of a local variable that is declared
> explicitly in the transformation block.
>

And because there's no way to qualify a local variable from the surrounding
scope, reassigning such variables is simply impossible within this block.
Right?

That's "no great loss" of course, although I'm missing why the restriction
is necessary. The notion of variables that (at least in userspeak) are "in
scope for reading but not for writing" seems weird; does it have precedent?


> The transformation block need only express the parts of the state being
> modified. If the transformation block is empty then the result of the
> derived instance creation expression is a copy of the value of the origin
> expression (the expression on the left-hand side).
>

This could be interpreted as saying that in this case the record's
constructor isn't even run, which I suspect isn't what you mean, and which
could make a difference (if best practices aren't being followed). Do you
need to say anything at all about this case?


> If the origin value is null then evaluation of the derived instance
> creation expression completes abruptly with a NullPointerException.
>

... and if it isn't, then we can talk about the "origin instance" it refers
to.

I'd suggest avoiding the term "origin value" completely except for the
above, preferring to talk about the origin instance instead. I think that's
the way to be as clear as possible that none of what we're talking about
here cares whether the record class is a value class or not. But we can
dissect this further if need be.


> Before executing the contents of the transformation block, a number of
> implicit local variable declaration statements are executed. These local
> variable declaration statements are derived from each record component in
> the header of the record class R, in order, as follows:
>
> The local variable declaration has the same name and declared type as the
> record component.
>

Overall, there have been several references here to the record class R, but
I would think it's the record *type* we really need to talk about. That
type post-substitution is what determines these variable types, no?

That also suggests we need to discuss wildcard capture here - or is that
addressed elsewhere?


> A new instance of record class R is created as if by evaluating a new class
> instance creation expression (new) with the compile-time type of the origin
> expression and an argument list containing the local component variables,
> if any, in the order that they appear in the header of record class R.
>

Likewise, should this talk about type arguments too? This would I think
mean duplicating them from the record type but doing whatever fancy
footwork is required to deal with wildcards? (I assume that record
constructors themselves can't be generic.)

Implied in all this: I would think a record type like `MyRecord<String, ?>`
*should* be usable with `with` (of course, trying to assign to some
variables inside the transformation block isn't going to go well, but
likely the user just isn't referring to those variables at all in this
case).
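
To illustrate the kind of footwork involved, here is a sketch of how one
can copy a wildcard-typed record today. The `Pair` record and the
`withFirst` helper are hypothetical; the point is only that a generic
helper method lets the compiler capture the unknown type argument:

```java
// Sketch only: Pair and withFirst are hypothetical names for illustration.
class WildcardCopyDemo {
    record Pair<A, B>(A first, B second) { }

    // Generic helper that names the unknown type argument as B, so the
    // second component can be carried over unchanged.
    static <B> Pair<String, B> withFirst(Pair<String, B> p, String newFirst) {
        return new Pair<>(newFirst, p.second());
    }

    public static void main(String[] args) {
        Pair<String, ?> p = new Pair<>("a", 42);
        Pair<String, ?> q = withFirst(p, "b"); // wildcard is captured at the call
        System.out.println(q); // Pair[first=b, second=42]
    }
}
```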


> The use of a derived instance creation expression:
> can be thought of a switch expression:
>

imho this would be useful to state earlier!

What goes wrong if we think of this feature as *exactly* desugaring to that
switch code?


> The structure and behavior of the transformation block in a derived
> instance creation expression is similar to the body of a compact
> constructor in a record class. Both have the same control flow restrictions
> (must complete normally or throw an exception); both have a set of
> pre-initialized variables in scope, which are expected to be mutated by the
> block; and both take the final values of those variables and pass them as
> arguments to a constructor invocation.
>

This would've been useful to state earlier too, to me anyway. The only
difference I thought of is that one can refer to `this` inside the
constructor (uh, right?) but there is no syntax to access the origin
expression in the transformation block. And that seems as it should be.


> Alternatives
> Instead of supporting an expression form for use-site creation of new
> record values, we could support it at the declaration site with some form
> of special support for wither methods. We prefer the flexibility of
> use-site creation, whereas declaring wither methods would add bloat to
> record class declarations, which currently enjoy a high degree of
> succinctness.
>

This would also introduce a lot of potential for unpredictability. The
whole deal with records is that they act in highly predictable ways.

~~

Nano-scale details aside... this will be a very helpful feature for working
with records and I hope it happens!

From brian.goetz at oracle.com  Sat Mar  2 16:08:37 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 2 Mar 2024 11:08:37 -0500
Subject: Draft JEP: Derived Record Creation (Preview)
In-Reply-To: <CAFMUOF7m=7enqHJ8ezvGb8VHDr2wGRbr+Os69QY69uZqMVuM6A@mail.gmail.com>
References: <CAFMUOF7m=7enqHJ8ezvGb8VHDr2wGRbr+Os69QY69uZqMVuM6A@mail.gmail.com>
Message-ID: <1386995c-0735-41fa-9aae-b74473643477@oracle.com>


>     Any assignment statements that occur within the transformation
>     block have the following constraint: If the left-hand side of the
>     assignment is an unqualified name, that name must be either (i)
>     the name of a local component variable, or (ii) the name of a
>     local variable that is declared explicitly in the transformation
>     block.
>
>
> And because there's no way to qualify a local variable from the 
> surrounding scope, reassigning such variables is simply impossible 
> within this block. Right?
>
> That's "no great loss" of course, although I'm missing why the 
> restriction is necessary. The notion of variables that (at least in 
> userspeak) are "in scope for reading but not for writing" seems weird; 
> does it have precedent?

There is some precedent with lambdas/inner classes, where you can only 
access effectively final locals, though that wasn't really in our mind 
when we crafted this restriction.

The motivation for the restriction is twofold:

 - This is a functional idiom (think "state monad"); side-effecting the
environment would be weird. (Of course, you could launder side-effects
through any of the usual means, probably including a qualified
access (Foo.x = 3; this.y = 4), but you shouldn't.)

 - We intend to extend this to classes in the future. This idiom is
basically "take apart with deconstructor + transform state + reconstruct
with constructor". There's an overload selection problem buried in
there, and the names of variables involved in the transform may be
important inputs to that selection decision.

We're not sure that we'll want to do overload selection nominally in 
this manner, but we're not ready to say "we will never be able to"; 
having this restriction in place keeps the flexibility to do so.

>
>     The transformation block need only express the parts of the state
>     being modified. If the transformation block is empty then the
>     result of the derived instance creation expression is a copy of
>     the value of the origin expression (the expression on the
>     left-hand side).
>
>
> This could be interpreted as saying that in this case the record's 
> constructor isn't even run, which I suspect isn't what you mean, and 
> which could make a difference (if best practices aren't being 
> followed). Do you need to say anything at all about this case?

I interpret this question as "is the result guaranteed to have a
distinct identity from the origin expression."  (Obviously, for value
types, the answer is "huh, what's identity?")  But we probably do want
to say that the constructor is always invoked to produce the result,
even if the block is empty; that "copy" is more of an analogy.
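
A sketch of why this can matter when best practices aren't followed. The
`with` syntax isn't available outside the preview, so the empty block is
emulated here as a plain canonical-constructor call on the accessor
values; that emulation, and the deliberately badly behaved `Doubling`
record, are assumptions made for illustration:

```java
// Sketch only: Doubling and emulateEmptyWith are hypothetical names.
class EmptyBlockDemo {
    // Deliberately badly behaved: the compact constructor is not idempotent.
    record Doubling(int v) {
        Doubling { v = v * 2; }
    }

    // Emulates `d with { }`: extract the components and pass them back
    // through the canonical constructor.
    static Doubling emulateEmptyWith(Doubling d) {
        return new Doubling(d.v());
    }

    public static void main(String[] args) {
        Doubling d = new Doubling(1);        // stored v is 2
        Doubling copy = emulateEmptyWith(d); // stored v is 4, not a faithful copy
        System.out.println(d + " vs " + copy);
        System.out.println(d.equals(copy));  // false
    }
}
```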

>
> Overall, there have been several references here to the record class
> R, but I would think it's the record /type/ we really need to talk
> about. That type post-substitution is what determines these variable
> types, no?

Yes.  It is probably a little more complicated than "the static type of
the origin expression is the static type of the with expression",
because of, as you say, wildcards (and other weirdo types).  You
probably have to do an upward projection on the type of the origin
expression, or something like that.

>     The use of a derived instance creation expression:
>     can be thought of a switch expression:
>
>
> imho this would be useful to state earlier!
>
> What goes wrong if we think of this feature as /exactly/ desugaring to
> that switch code?

The set of statements permissible in the two contexts is probably
slightly different; you can do a `yield <expression>` in the RHS of a
switch case, but not in a reconstruction block.  Probably other subtle
reasons too.  Our experience with "specify by syntactic expansion"
frequently runs into annoying roadblocks because of things that are
expressible in one context but not in the desugared context, or vice versa.
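
For reference, a sketch of the switch-based analogy for
`oldLoc with { x *= 2; }`, written with record patterns (this assumes
Java 21 or later; the class name is made up). Note the explicit `yield`,
which has no counterpart in a transformation block:

```java
// Sketch only: an approximation of the analogy, not the specified semantics.
class SwitchSketch {
    record Point(int x, int y, int z) { }

    static Point doubleX(Point oldLoc) {
        // Like the derived creation expression, this throws
        // NullPointerException if oldLoc is null.
        return switch (oldLoc) {
            case Point(var x, var y, var z) -> {
                x *= 2;                   // pattern variables are assignable
                yield new Point(x, y, z); // explicit reconstruction
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(doubleX(new Point(1, 2, 3))); // Point[x=2, y=2, z=3]
    }
}
```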


From ccherlin at gmail.com  Thu Mar  7 17:31:02 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Thu, 7 Mar 2024 11:31:02 -0600
Subject: Generic StringTemplates?
Message-ID: <CALEU8=znOVXBkAz70XTiW2m-u+WNg0Wg+rH-YHhkhufO2Gq3gA@mail.gmail.com>

I can envision some use cases for StringTemplate where I would want to
restrict the type of the interpolated expressions, either for static
type safety, or for the convenience of providing an implicit static
type to interpolated lambda expressions, or both.

For example, consider a Processor that takes Suppliers as interpolated
values and returns a Supplier, allowing for lazy evaluation:

public static final StringTemplate.Processor<Supplier<String>,
RuntimeException> LAZY =
    stringTemplate -> () -> StringTemplate.interpolate(
    stringTemplate.fragments(),
    stringTemplate.values().stream()
        .map(o -> ((Supplier<?>)o).get()).toList());

Using such a processor is awkward and not type-safe, requiring both an
unchecked cast in the processor and an explicit cast in every value
expression:

final Supplier<String> lazy = LAZY."Now: \{(Supplier<Instant>) Instant::now}";

You can use various workarounds, like defining a static method that
coerces its argument to Supplier<T>, but String Templates are supposed
to reduce existing boilerplate, not create new boilerplate.

Could StringTemplate have a type argument? I envision something like

public interface StringTemplate<T> {
    List<T> values();
    public interface Processor<T, R, E extends Throwable> {
        R process(StringTemplate<? extends T> stringTemplate) throws E;
        ...
    }
    ...
}

To support creating generic StringTemplates, the following syntax
could be legal, returning a StringTemplate with static type
StringTemplate<type>.

RAW.<type>"template with \{value}s of type..."

Without a type argument, the return value would have static type
StringTemplate<Object>.
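
As a rough illustration of the typing benefit, here is a sketch that
runs on a stock JDK. `Template<T>` is a hypothetical stand-in record,
not the real StringTemplate, and `lazy` plays the role of the LAZY
processor; with the value type in the signature, neither side needs a
cast:

```java
import java.util.List;
import java.util.function.Supplier;

// Sketch only: Template<T> and lazy(...) are hypothetical stand-ins.
class GenericTemplateSketch {
    // n + 1 literal fragments surrounding n values of type T.
    record Template<T>(List<String> fragments, List<T> values) { }

    // The values are known to be Suppliers, so no unchecked cast is needed.
    static Supplier<String> lazy(Template<? extends Supplier<?>> t) {
        return () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < t.values().size(); i++) {
                sb.append(t.fragments().get(i)).append(t.values().get(i).get());
            }
            return sb.append(t.fragments().get(t.values().size())).toString();
        };
    }

    public static void main(String[] args) {
        int[] counter = {0};
        Supplier<Integer> next = () -> ++counter[0];
        Template<Supplier<Integer>> t =
            new Template<>(List.of("Call #", ""), List.of(next));
        Supplier<String> s = lazy(t);
        System.out.println(s.get()); // Call #1
        System.out.println(s.get()); // Call #2 (values are evaluated on each get)
    }
}
```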

Cheers,
Clement Cherlin

From brian.goetz at oracle.com  Fri Mar  8 18:35:03 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 8 Mar 2024 18:35:03 +0000
Subject: Update on String Templates (JEP 459)
Message-ID: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>


Time to check in with where we are with String Templates.  We've gone through two rounds of preview, and have received some feedback.

As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know.  This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc.  And the best feedback usually comes from using the feature "in anger", that is, trying to actually write code with it.  ("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)

In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature.  Specifically, the role of processors is "outsized" relative to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all!  This is a very positive development.

First, I want to affirm that the goals of the project have not changed.  From JEP 459:

Goals

- Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
- Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
- Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
- Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
- Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
- Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.

Non-Goals
- It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
- It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.

Another thing that has not changed is our view on the syntax for embedding expressions.  While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round.  (In fact, while syntax disagreements are often purely subjective, this one was far more clear: the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild.  This has all been more than adequately covered elsewhere, so I won't rehash it here.)


Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.

Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else).  However, Java already has a well-established means of abstracting behavior: methods.  (In fact, a processor application can be viewed as merely a new syntax for a method call.)  Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:

  DB."... template ..."

When we could use an ordinary Java library:

  Query q = Query.of("...template...")

Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood.  (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.
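
A minimal sketch of the "ordinary API" shape, runnable on a stock JDK.
The `Template` record is a hypothetical stand-in for StringTemplate
(fragments plus values), and `Query` is an invented result type; nothing
here is the actual JDK or JDBC API:

```java
import java.util.List;

// Sketch only: Template and Query are hypothetical stand-ins.
class QuerySketch {
    // n + 1 literal fragments surrounding n embedded values.
    record Template(List<String> fragments, List<Object> values) { }

    // SQL text with placeholders, plus the values to bind separately.
    record Query(String sql, List<Object> params) {
        static Query of(Template t) {
            // Values are never spliced into the SQL text; each becomes a `?`.
            return new Query(String.join("?", t.fragments()), List.copyOf(t.values()));
        }
    }

    public static void main(String[] args) {
        String name = "Robert'); DROP TABLE students;--";
        // Stands in for the template "SELECT * FROM students WHERE name = \{name}"
        Template t = new Template(
            List.of("SELECT * FROM students WHERE name = ", ""), List.of(name));
        Query q = Query.of(t);
        System.out.println(q.sql());    // SELECT * FROM students WHERE name = ?
        System.out.println(q.params()); // the raw value, to be bound as a parameter
    }
}
```

Template capture (building the `Template`) and processing (`Query.of`)
are separate steps here, which is the composability the paragraph above
argues for.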

This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place.  And it turned out, this choice had been influenced, perhaps overly so, by early implementation experiments.  (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper.  Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting.  The FMT processor brought that cost back in line with the equivalent concatenation.)  These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time.  This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.

At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:

  StringTemplate st = "Hello \{name}";

String and StringTemplate remain unrelated types.  (We explored a number of ways to interconvert them, but they caused more trouble than they solved.)  Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.

For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:

   void println(String) { ... }
   void println(StringTemplate) { ... interpolate and delegate to println(String) ... }

The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:

   System.out.println("Hello \{name}");

In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template.  This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code, but it allows that choice to be made by libraries when the library is comfortable doing so.
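
A sketch of the overload pattern, again with a hypothetical `Template`
record standing in for StringTemplate and an invented `Printer` class
standing in for PrintWriter, so that it runs on a stock JDK:

```java
import java.util.List;

// Sketch only: Template and Printer are hypothetical stand-ins.
class OverloadSketch {
    record Template(List<String> fragments, List<Object> values) {
        // Plain interpolation: alternate fragments and values.
        String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.size(); i++) {
                sb.append(fragments.get(i)).append(values.get(i));
            }
            return sb.append(fragments.get(values.size())).toString();
        }
    }

    static class Printer {
        final StringBuilder out = new StringBuilder();

        void println(String s) { out.append(s).append('\n'); }

        // The library, not the caller, decides that interpolation is safe here.
        void println(Template t) { println(t.interpolate()); }
    }

    public static void main(String[] args) {
        Printer p = new Printer();
        // Stands in for: println("Hello \{name}") with name = "Duke"
        p.println(new Template(List.of("Hello ", ""), List.of("Duke")));
        System.out.print(p.out); // Hello Duke
    }
}
```

Overload resolution picks `println(Template)` from the argument's static
type, so the call site needs no extra processing step.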

Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):

  String format(String formatString, Object... parameters) { ... same as today ... }
  String format(StringTemplate template) {... equivalent of FMT ...}

And users can call this as:

  String s = String.format("Hello %12s\{name}");

Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required.  The user already chose to pass it to String::format; that's all the processing selection that is needed.

Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation).

The result is:

- The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
- Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
- StringTemplate is just another type that APIs can support if they want.  The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
- APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
- It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
- Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
- The language feature and API surface get considerably smaller, which is good.  Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.

The remaining question that everyone is probably asking is: "so how do we do interpolation?"  The answer there is "ordinary library methods".  This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).

This is a sketch of direction, so feel free to pose questions/comments on the direction.  We'll discuss the details as we go.



From ccherlin at gmail.com  Fri Mar  8 21:22:41 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Fri, 8 Mar 2024 15:22:41 -0600
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <CALEU8=yqFwY1fbFxyexfQSZ-OTHMoJXKquCL=ovVt2JMySfURA@mail.gmail.com>

On Fri, Mar 8, 2024 at 2:54 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>
>
> Time to check in with where we are with String Templates.  We've gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know.  This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc.  And the best feedback usually comes from using the feature "in anger", that is, trying to actually write code with it.  ("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature.  Specifically, the role of processors is "outsized" relative to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all!  This is a very positive development.
>
> First, I want to affirm that the goals of the project have not changed.  From JEP 459:
>
> Goals
>
> - Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
> - Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
> - Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
> - Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
> - Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
> - Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.
>
> Non-Goals
> - It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
> - It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for embedding expressions.  While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round.  (In fact, while syntax disagreements are often purely subjective, this one was far more clear: the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild.  This has all been more than adequately covered elsewhere, so I won't rehash it here.)
>
>
> Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else).  However, Java already has a well-established means of abstracting behavior: methods.  (In fact, a processor application can be viewed as merely a new syntax for a method call.)  Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:
>
>   DB."... template ..."
>
> When we could use an ordinary Java library:
>
>   Query q = Query.of("...template...")
>
> Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood.  (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place.  And it turned out, this choice had been influenced, perhaps overly so, by early implementation experiments.  (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper.  Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting.  The FMT processor brought that cost back in line with the equivalent concatenation.)  These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time.  This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:
>
>   StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types.  (We explored a number of ways to interconvert them, but they caused more trouble than they solved.)  Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:
>
>    void println(String) { ... }
>    void println(StringTemplate) { ... interpolate and delegate to println(String) ... }
>
> The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:
>
>    System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template.  This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code, but it allows that choice to be made by libraries when the library is comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):
>
>   String format(String formatString, Object... parameters) { ... same as today ... }
>   String format(StringTemplate template) {... equivalent of FMT ...}
>
> And users can call this as:
>
>   String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required.  The user already chose to pass it to String::format; that's all the processing selection that is needed.
>
> Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
> - StringTemplate is just another type that APIs can support if they want.  The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
> - It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
> - Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which is good.  Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do we do interpolation?"  The answer there is "ordinary library methods".  This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).
>
> This is a sketch of direction, so feel free to pose questions/comments on the direction.  We'll discuss the details as we go.

So this new approach is to make all template expressions return the
same unprocessed value that RAW."..." did previously? Excellent news!

I would still like a way to apply a static type to the interpolated
values at the point the StringTemplate is constructed, for reasons
such as constructing ergonomic DSLs using StringTemplates, and
implicitly typing lambdas.

My previous LAZY example simplifies to (presuming generic
StringTemplates are supported):

public static Supplier<String> lazy(StringTemplate<Supplier<?>>
stringTemplate) {
    return () -> Bikeshed.paint(
            stringTemplate.fragments(),
            stringTemplate.values().stream()
                    .map(Supplier::get)
                    .toList());
}

Usage:
Supplier<String> now = lazy("Now: \{ Instant::now }");

... some time later

System.out.println(now.get());

I imagine the existing type inference infrastructure is sufficient to
automatically derive the correct generic type of the template
expression (which will usually be StringTemplate<Object>) in almost
all cases.
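
A runnable approximation of the LAZY idea above, offered only as a sketch. It assumes a hypothetical generic record `TypedTemplate<T>` in place of the proposed `StringTemplate<T>` (which does not exist in any JDK), and a hand-written `interpolate` helper in place of `Bikeshed.paint`. Template literals are not available, so the template is built by hand.

```java
import java.time.Instant;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a generic StringTemplate<T>; not a JDK type.
record TypedTemplate<T>(List<String> fragments, List<T> values) {
    TypedTemplate {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
    }
}

final class Lazy {
    // Stand-in for Bikeshed.paint: alternate fragments and values.
    static String interpolate(List<String> fragments, List<?> values) {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    // Defers evaluation of every embedded Supplier until get() is called.
    static Supplier<String> lazy(TypedTemplate<Supplier<?>> template) {
        return () -> interpolate(
                template.fragments(),
                template.values().stream().map(s -> (Object) s.get()).toList());
    }

    public static void main(String[] args) {
        // Hand-built equivalent of lazy("Now: \{ Instant::now }")
        Supplier<?> clock = Instant::now;
        Supplier<String> now = lazy(new TypedTemplate<>(List.of("Now: ", ""), List.of(clock)));
        System.out.println(now.get());
    }
}
```

Each call to `now.get()` re-evaluates the embedded supplier, so the timestamp reflects the moment of the call rather than the moment the template was captured.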

Cheers,
Clement Cherlin

From amaembo at gmail.com  Sat Mar  9 11:48:26 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Sat, 9 Mar 2024 12:48:26 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>

The idea is interesting. There's a thing that disturbs me though.
Currently, proc."string" and proc."string \{template}" are uniformly
processed, and the processor may not care much about whether it's a string
or a template: both can be processed uniformly. After this change, removing
the last embedded expression from the template (e.g., after inlining a
constant) will implicitly change the type of the literal from
StringTemplate to String. This may either cause a compilation error, or
silently bind to another overload which may or may not behave like a
template overload with a single-fragment-template. For API authors, this
means that every method accepting StringTemplate should have a counterpart
accepting String. The logic inside both methods would likely be very
similar, so probably both will eventually call a third private method. For
an API user, it could be unclear how to call a method accepting StringTemplate
if I have a simple string in hand but there's no String method (or it does
a slightly different thing due to poor API design). Should I use some ugly
construct like "This is a string but the API wants a template, so I append
an empty embedded expression\{""}"?
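
To make the concern concrete, here is a small sketch. It assumes a hypothetical `Template` record standing in for the proposed StringTemplate (not a JDK type), and builds the template by hand because template literals are not available. The `log` methods are illustrative names only.

```java
import java.util.List;

// Hypothetical stand-in for the proposed StringTemplate; not a JDK type.
record Template(List<String> fragments, List<Object> values) {}

final class OverloadDemo {
    // Overload for plain strings.
    static String log(String message) {
        return "String overload: " + message;
    }

    // Overload for templates; this one simply interpolates.
    static String log(Template template) {
        StringBuilder sb = new StringBuilder(template.fragments().get(0));
        for (int i = 0; i < template.values().size(); i++) {
            sb.append(template.values().get(i)).append(template.fragments().get(i + 1));
        }
        return "Template overload: " + sb;
    }

    public static void main(String[] args) {
        String user = "Duke";
        // Hand-built equivalent of the literal "Hello \{user}".
        System.out.println(log(new Template(List.of("Hello ", ""), List.of(user))));
        // After inlining the constant, the literal has no holes and is a String,
        // so the call silently binds to the other overload.
        System.out.println(log("Hello Duke"));
    }
}
```

The two calls produce the same text here only because both overloads happen to agree; nothing in the language would force them to.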

Note that we already have an inspection that warns about kinda useless
templates like STR."Hello \{"world"}" suggesting to replace them with
STR."Hello world". Such an inspection would not work after the proposed
change, as the expression type will differ.

I can still imagine that StringTemplate could be an interface providing
methods like fragments() and values() (like now), but String may implement
it, returning an empty list from values() and List.of(this) from
fragments(). Of course, method names could differ, to fit the String class
better. It would not be a problem if they are more verbose like
stringTemplateValues() and stringTemplateFragments(), because they are used
not very often. Anyway, not bikeshedding now. This would allow API
designers to provide only a StringTemplate-accepting method, and API users
would not need to think much when a string template suddenly becomes a string.
Also, automatic refactorings from StringTemplate to String, like shown
above, will still work.
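
A sketch of how this could look, under stated assumptions: since String cannot be changed here, a hypothetical `Plain` record plays the role of String-as-template, and a hypothetical `TemplateLike` interface plays the role of StringTemplate. None of these names come from the JDK or the proposal.

```java
import java.util.List;

// Hypothetical interface playing the role of StringTemplate; not a JDK type.
interface TemplateLike {
    List<String> fragments();
    List<Object> values();
}

// A template with embedded values.
record WithHoles(List<String> fragments, List<Object> values) implements TemplateLike {}

// Plays the role of String-as-template: one fragment, no values.
record Plain(String text) implements TemplateLike {
    public List<String> fragments() { return List.of(text); }
    public List<Object> values() { return List.of(); }
}

final class SingleEntryPoint {
    // The only method an API author would need under this scheme.
    static String render(TemplateLike t) {
        StringBuilder sb = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            sb.append(t.values().get(i)).append(t.fragments().get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(new WithHoles(List.of("Hello ", "!"), List.of("world"))));
        System.out.println(render(new Plain("Hello world!")));
    }
}
```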

Another advantage is that IDEs may suggest converting string concatenation
into a template if the expected type is StringTemplate. With the current
proposal, we may suggest converting "Hello "+name into "Hello
\{name}".join(), which is a questionable improvement. However, if we are at
method call argument position, where the expected type is StringTemplate,
then we can suggest simply "Hello \{name}", which is much better.
Otherwise, we would need to check whether there's a
StringTemplate-accepting overload and hope that it does the same thing.

By the way, I assume that we agree on the toString() implementation of
non-String-StringTemplate: it should be a technical debug string, like now,
not doing the automatic interpolation. There would be a discrepancy with
String-StringTemplate if my suggestion is accepted, but I think it's not a
big problem.

With best regards,
Tagir Valeev.

On Fri, Mar 8, 2024 at 7:35 PM Brian Goetz <brian.goetz at oracle.com> wrote:

>
> Time to check in with where we are with String Templates.  We've gone
> through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn things
> about the design or implementation that we don't already know.  This could
> be bug reports, experience reports, code review, careful analysis, novel
> alternatives, etc.  And the best feedback usually comes from using the
> feature "in anger": trying to actually write code with it.  ("Some
> people would prefer a different syntax" or "some people would prefer we
> focused on string interpolation only" fall squarely in the "things we
> already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did
> learn quite a few things we didn't already know, and this was conclusive
> enough that it has motivated us to adjust our approach in this feature.
> Specifically, the role of processors is "outsized" to the value they
> offer, and, after further exploration, we now believe it is possible to
> achieve the goals of the feature without an explicit "processor"
> abstraction at all!  This is a very positive development.
>
> First, I want to affirm that the goals of the project have not
> changed.  From JEP 459:
>
> Goals
>
> - Simplify the writing of Java programs by making it easy to express
> strings that include values computed at run time.
> - Enhance the readability of expressions that mix text and expressions,
> whether the text fits on a single source line (as with string literals) or
> spans several source lines (as with text blocks).
> - Improve the security of Java programs that compose strings from
> user-provided values and pass them to other systems (e.g., building queries
> for databases) by supporting validation and transformation of both the
> template and the values of its embedded expressions.
> - Retain flexibility by allowing Java libraries to define the formatting
> syntax used in string templates.
> - Simplify the use of APIs that accept strings written in non-Java
> languages (e.g., SQL, XML, and JSON).
> - Enable the creation of non-string values computed from literal text and
> embedded expressions without having to transit through an intermediate
> string representation.
>
> Non-Goals
> - It is not a goal to introduce syntactic sugar for Java's string
> concatenation operator (+), since that would circumvent the goal of
> validation.
> - It is not a goal to deprecate or remove the StringBuilder and
> StringBuffer classes, which have traditionally been used for complex or
> programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for embedding
> expressions.  While many people did express the opinion of "why not 'just'
> do what Kotlin/Scala does", this issue was more than fully explored during
> the initial design round.  (In fact, while syntax disagreements are often
> purely subjective, this one was far more clear: the $-syntax is
> objectively worse, and would be doubly so if injected into an existing
> language where there were already string literals in the wild.  This has
> all been more than adequately covered elsewhere, so I won't rehash it here.)
>
>
> Now, let's talk about what we do think should change: the role of
> processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation of
> templates to their final form (whether string, or something else).
> However, Java already has a well established means of abstracting
> behavior: methods.  (In fact, a processor application can be viewed as
> merely a new syntax for a method call.)  Our experience using the feature
> highlighted the question: When converting a SQL query expressed as a
> template to the form required by the database (such as PreparedStatement),
> why do we need to say:
>
>   DB."... template ..."
>
> When we could use an ordinary Java library:
>
>   Query q = Query.of("...template...")
>
> Indeed, one of the worst things about having processors in the language is
> that API designers are put in the difficult situation of not knowing
> whether to write a processor or an ordinary API, and often have to make
> that choice before the consequences are fully understood.  (To add to this,
> processors raise similar questions at the use site.) But the real criticism
> here is that template capture and processing are complected, when they
> should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were so
> central to the initial design in the first place.  And it turned out, this
> choice had been influenced, perhaps overly so, by early implementation
> experiments.  (One of the background design goals was to enable expensive
> operations like `String::format` to be (much) cheaper.  Without digressing
> too deeply on performance, String::format can be more than an order of
> magnitude worse than the equivalent concatenation operation, and this in
> turn sometimes motivates developers to use worse idioms for formatting.
> The FMT processor brought that cost back in line with the equivalent
> concatenation.)  These early experiments biased the design towards needing
> to know the processor at the point of template capture, but upon
> reexamination we realized that there are other ways to achieve the desired
> performance goals without requiring processors to be known at capture
> time.  This, in turn, enabled us to revisit a point in the design space we
> had transited through earlier, where string templates were "just a new kind
> of literal" and the job performed by processors could instead be performed
> by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met the
> semantic, correctness, and performance goals: template literals ("Hello
> \{name}") are simply the literal form of StringTemplate:
>
>   StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types.  (We explored a number
> of ways to interconvert them, but they caused more trouble than they
> solved.)  Processing of string templates, including interpolation, is done
> by ordinary APIs that deal in StringTemplate, aided by some clever
> implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such as
> PrintWriter, APIs can make that choice on behalf of the domain, by
> providing overloads to embody this design choice:
>
>    void println(String) { ... }
>    void println(StringTemplate) { ... interpolate and delegate to
> println(String) ... }
>
> The upshot is that for interpolation-safe APIs like println, we can use a
> template directly without giving up any safety:
>
>    System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not
> String (no implicit interpolation), and chooses the StringTemplate overload
> of println, which in turn chooses how to process the template.  This
> stays true to the design principle that interpolation is dangerous enough
> that it should be an explicit choice in the code, but it allows that
> choice to be made by libraries when the library is comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of String::format
> that interprets templates with embedded format specifiers (e.g., "%d"):
>
>   String format(String formatString, Object... parameters) { ... same as today
> ... }
>   String format(StringTemplate template) {... equivalent of FMT ...}
>
> And users can call this as:
>
>   String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates
> according to the rules previously specified in the FMT processor (not
> ordinary interpolation), but that choice is embedded in the library
> semantics so no further explicit choice at the use site is required.  The
> user already chose to pass it to String::format; that's all the processing
> selection that is needed.
>
> Where APIs do not express a choice of what template expansion means, users
> continue to be free to process them explicitly before passing them, using
> APIs that do (such as String::format or ordinary interpolation).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now,
> static or instance methods to process a template) goes away entirely when
> dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more intrusive
> than before, and can be reduced over time as APIs get with the program.
> - StringTemplate is just another type that APIs can support if they want.
> The "DB" processor becomes an ordinary factory method that accepts a string
> template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of template
> processing, because we are not biasing so strongly towards early processing.
> - It becomes easier to abstract over template processing (i.e., combine or
> manipulate templates as templates before processing)
> - Interpolation remains an explicit choice, but ST-aware libraries can
> make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which is
> good.  Core JDK APIs (e.g., println, format, exception constructors) get
> upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do we
> do interpolation?"  The answer there is "ordinary library methods".  This
> might be a static method (String.join(StringTemplate)) or an instance
> method (template.join()), shed to be painted (but please, not right now).
>
> This is a sketch of direction, so feel free to pose questions/comments on
> the direction.  We'll discuss the details as we go.
>
>
>

From brian.goetz at oracle.com  Sat Mar  9 17:03:32 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2024 17:03:32 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
Message-ID: <636B984E-A544-4155-81D1-8752037A973B@oracle.com>

> The idea is interesting. There's a thing that disturbs me though. Currently, proc."string" and proc."string \{template}" are uniformly processed, and the processor may not care much about whether it's a string or a template: both can be processed uniformly.

Yes, this is one of the tradeoffs of this evolution (and was one of the advantages of the processor-is-required version).  The PROC. prefix is a strong syntactic hint that whatever comes next is a template, even if it has zero holes.  In the current proposal, we have string literals and string template literals, and there are cases where we would like to use a String as the degenerate form of a template.  As mentioned, we transited through "processors are optional; if no processor, it's a string template" earlier in the design, and this was one of the reasons we thought that making the processor required all the time was preferable.  But now that processors are _gone_, the calculus shifts.

We experimented with various ways to address this, including "String extends StringTemplate", a boxing conversion from String to StringTemplate, and alternate literal forms like t"..." that say "it's a template, dammit".  But in the end, these either create new problems, or just don't carry their weight.  So instead, we'll just make sure there are conversion methods (StringTemplate::of, String::asTemplate) that users can insert to say what they mean.
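
As an illustration of the explicit-conversion route, the sketch below assumes a hypothetical `Template` record with a static `of(String)` factory, loosely modelled on the StringTemplate::of name mentioned above. The real method's shape is not specified in this thread, so the signature here is an assumption.

```java
import java.util.List;

// Hypothetical stand-in for the proposed StringTemplate; not a JDK type.
record Template(List<String> fragments, List<Object> values) {
    // Assumed analogue of the StringTemplate::of conversion named above.
    static Template of(String text) {
        return new Template(List.of(text), List.of());
    }
}

final class TemplateOnlyApi {
    // An API point that accepts templates only, with no String overload.
    static String describe(Template t) {
        return t.fragments().size() + " fragment(s), " + t.values().size() + " value(s)";
    }

    public static void main(String[] args) {
        // describe("plain text");   // would not compile: String is not a Template
        System.out.println(describe(Template.of("plain text")));
    }
}
```

The commented-out call is the case where a compiler diagnostic would point the user at the conversion.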

> For API authors, this means that every method accepting StringTemplate should have a counterpart accepting String.

I think this is overstated.  If you have a ST-accepting method only and pass a string, compiler diagnostics will remind you to convert it.  (And all of this discussion is about string *literals*; ordinary string expressions will still require explicit conversion, and should.)  Many API points may choose to have both, but I don't think this rises nearly to the level of a requirement.

> I can still imagine that StringTemplate could be an interface providing methods like fragments() and values() (like now), but String may implement it, returning an empty list from values() and List.of(this) from fragments().

As mentioned, we explored this, but I think this cure is worse than the disease.  At root, this is a workaround for "a string *literal* with no holes might want to be a template."  I don't think it makes sense to interpret *all* strings as templates.  And this led to some annoying overload selection / inference decisions (tinkering with the super types of String ripples throughout the JDK).

> By the way, I assume that we agree on the toString() implementation of non-String-StringTemplate: it should be a technical debug string, like now, not doing the automatic interpolation. There would be a discrepancy with String-StringTemplate if my suggestion is accepted, but I think it's not a big problem.

100%.  Interpolation is always an explicit choice of how to convert a ST to a String.
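
A minimal sketch of that distinction, assuming a hypothetical `Template` record (not a JDK type) whose `join()` method is named after one of the options floated in the original message. The record's default `toString()` serves as the debug form.

```java
import java.util.List;

// Hypothetical stand-in for the proposed StringTemplate; not a JDK type.
record Template(List<String> fragments, List<Object> values) {
    // Explicit interpolation: alternate fragments and values.
    String join() {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }
    // toString() is left as the record default: a structural debug string.
}

final class ExplicitInterpolation {
    public static void main(String[] args) {
        Template t = new Template(List.of("Hello ", "!"), List.of("Duke"));
        System.out.println(t);         // debug form showing fragments and values
        System.out.println(t.join());  // Hello Duke!
    }
}
```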



From guy.steele at oracle.com  Sat Mar  9 20:45:57 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Sat, 9 Mar 2024 20:45:57 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
Message-ID: <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>


> On Mar 9, 2024, at 12:03 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> . . .
> At root, this is a workaround for "a string *literal* with no holes might want to be a template."  I don't think it makes sense to interpret *all* strings as templates.  And this led to some annoying overload selection / inference decisions (tinkering with the super types of String ripples throughout the JDK).

Right, we don?t want to interpret all strings as templates, and we don?t want to have a conversion that lets any String expression to be converted to a template. 

But what about a more targeted conversion?

Recall that assignment conversion has a special case that allows a narrowing primitive conversion under certain circumstances where the right-hand side is a constant expression, thus allowing assignment of, for example, the constant literal 1 (nominally of type int) to a variable of type byte, short, or char.

Have you considered allowing conversion of _constant_ expressions of type String to templates in assignment contexts and invocation contexts? (Presumably this could be implemented by having the compiler automatically wrap the constant expression within an invocation of something like StringTemplate.of(...).) This would of course kick in for a method invocation only if there is no applicable overloading that does not need the conversion.

As a rule I don't like enlarging the can of worms known as "special-case conversions", but I think this would have sufficient utility that it would be worth doing, especially given the precedent that the compiler already knows a great deal of special information about class java.lang.String.
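
For readers who want to see the existing precedent mentioned above, here is a small sketch of the constant-expression narrowing rule as it works in Java today. The class and method names are made up for illustration; only the assignment rule itself comes from the language.

```java
final class ConstantNarrowingDemo {
    // A constant variable: its initializer is a constant expression of type int.
    static final int SMALL = 100;

    static byte fromConstant() {
        byte b = SMALL;   // accepted: the constant value is representable in byte
        return b;
    }

    static char charFromLiteral() {
        char c = 65;      // accepted for the same reason; 65 is the code of 'A'
        return c;
    }

    static byte fromVariable(int i) {
        // byte b = i;    // rejected by the compiler: i is not a constant expression
        return (byte) i;  // a non-constant int needs an explicit cast
    }

    public static void main(String[] args) {
        System.out.println(fromConstant());
        System.out.println(charFromLiteral());
        System.out.println(fromVariable(100));
    }
}
```

The proposed String-to-template conversion would follow the same shape: available for constant expressions only, and never for an arbitrary String-typed expression.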


From brian.goetz at oracle.com  Sat Mar  9 23:52:19 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2024 23:52:19 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
Message-ID: <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>

Maurizio did prototype almost exactly this: treat compile-time constant string expressions as poly expressions whose standalone type is String but which could be treated as ST as well.  I'll let him recap the details, but I think the upshot was that we had overload selection problems with m(String) vs m(StringTemplate), as both were applicable, unless we wanted to treat this as akin to a boxing conversion where we preferred "unboxed" overloads as we do with loose vs strict method invocation contexts.

There were other things we tried too for the special case of string literals as degenerate templates - I'll let Maurizio give the details, because I'm sure I will have forgotten one or two.



From guy.steele at oracle.com  Sun Mar 10 01:38:34 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Sun, 10 Mar 2024 01:38:34 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
Message-ID: <E463738E-3796-427B-8B46-FE3B1853BBD8@oracle.com>


> On Mar 9, 2024, at 6:52 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> Maurizio did prototype almost exactly this: treat compile-time constant string expressions as poly expressions whose standalone type is String but which could be treated as ST as well.  I'll let him recap the details, but I think the upshot was that we had overload selection problems with m(String) vs m(StringTemplate), as both were applicable, unless we wanted to treat this as akin to a boxing conversion where we preferred "unboxed" overloads as we do with loose vs strict method invocation contexts.

Yep, that is exactly what you would have to do: give preference to overloads that do not require the conversion. I don't doubt that this would require special finagling in the compiler's overload resolution code, since it is not perfectly analogous to anything already in the language.
> 
> There were other things we tried too for the special case of string literals as degenerate templates - I'll let Maurizio give the details, because I'm sure I will have forgotten one or two.

I'm all ears!



From attila.kelemen85 at gmail.com  Sun Mar 10 20:41:43 2024
From: attila.kelemen85 at gmail.com (Attila Kelemen)
Date: Sun, 10 Mar 2024 21:41:43 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <CAKDaPBevSJhZyYp_Z_DksLV6MBqk0+VB7HskkEiHy0Y1o0baxA@mail.gmail.com>

If the string processing burden is now pushed to the consumer API side,
then wouldn't it be worthwhile to make `StringTemplate` simpler given that
this means a lot more people are forced to implement processors? I mean
that having two lists where you have to alternate between the two is rather
unintuitive which is proven by the fact that it forces `StringTemplate` to
do the empty string hacks to support alternating between the two lists.

Given that we have these nice pattern matching syntaxes, wouldn't it be
much nicer to make `StringTemplate` a simple wrapper for a
`List<StringTemplate.Part>`, where `StringTemplate.Part` is a sealed
interface implemented by `String` and `StringTemplate.ValueRef` (or
whatever equivalent)? In this case, you could just write a processor with a
simple loop like this:

```
var sb = new StringBuilder();
st.parts().forEach(part -> {
  switch (part) {
    // type patterns need a binding variable to refer to the matched value
    case String s -> sb.append(s);
    case StringTemplate.ValueRef valueRef ->
        sb.append(formatValue(valueRef.value()));
  }
});
```

Processor logic like this would be much easier to read than the double
iterator counterpart (and, in my opinion, even easier than trying to use the
stencil). An added benefit is that there would be little need to ban a
character from ST in this case. Of course, the flip side is that we would
need all values to be wrapped, but that doesn't seem like a high cost to me
(especially if `ValueRef` eventually becomes a value type, in which case I'm
guessing this extra cost could mostly be optimized away), because an ST is
unlikely to have so many values that this matters. Not to mention that
double iterators would have an additional cost of their own.
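
To make the idea concrete, here is a self-contained sketch that compiles on Java 21 without any template support. Everything in it is a stand-in invented for illustration: `Part`, `Text`, `ValueRef`, `formatValue` and `interpolate` are not JDK APIs, and `Text` wraps the fragment because today's `String` cannot implement a new sealed interface.

```java
import java.util.List;

final class PartsSketch {
    // Hypothetical stand-ins for the proposed StringTemplate.Part hierarchy.
    sealed interface Part permits Text, ValueRef {}
    record Text(String text) implements Part {}
    record ValueRef(Object value) implements Part {}

    // Placeholder formatting policy; a real processor would choose its own.
    static String formatValue(Object value) {
        return String.valueOf(value);
    }

    // A "processor" is a single loop over the parts with an exhaustive switch.
    static String interpolate(List<Part> parts) {
        var sb = new StringBuilder();
        for (Part part : parts) {
            switch (part) {
                case Text t -> sb.append(t.text());
                case ValueRef v -> sb.append(formatValue(v.value()));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Part> parts =
            List.of(new Text("Hello "), new ValueRef("world"), new Text("!"));
        System.out.println(interpolate(parts));
    }
}
```

Because `Part` is sealed, the compiler checks that the switch covers every kind of part, so no default branch is needed.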

Attila



From maurizio.cimadamore at oracle.com  Mon Mar 11 12:15:51 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 11 Mar 2024 12:15:51 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
Message-ID: <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>

Hi all,
we tried three main approaches to allow smoother interop between
strings and string templates: (a) make String a subtype of
StringTemplate; (b) make constant strings /convertible/ to string
templates; or (c) use target-typing. All these approaches have some
issues, discussed below.

The first approach is slightly simpler, because it can be achieved
entirely outside of the Java language. Unfortunately, adding "String
implements StringTemplate" adds overload ambiguities in cases such as this:

    format(StringTemplate)      // 1
    format(String, Object...)   // 2

This is actually a very important case, as we predict that
StringTemplate will serve as a great replacement for methods out there
accepting a String/Object... pack.

Unfortunately, if String <: StringTemplate, this means that calling
format with a string literal will resolve to (1), not (2) as before. The
problem here is that (2) is not even applicable during the first two
overload resolution phases (which are only allowed to use subtyping and
conversions, respectively), as it is a varargs method. Because of this,
(1) will now take precedence, as it is not varargs. While for
String::format this is probably harmless, changing the results of overload
selection is something that should be done with care (esp. if different
overloads have different return types), as it could lead to source
compatibility issues.
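
The resolution order described above can be reproduced in today's Java with stand-in types. In this sketch `Template` and `Str` are hypothetical substitutes for StringTemplate and for a String that implements it; the two `format` overloads mirror (1) and (2), and the return values only exist to show which overload was selected.

```java
final class VarargsOverloadSketch {
    interface Template {}                             // stand-in for StringTemplate
    record Str(String value) implements Template {}   // stand-in for "String <: StringTemplate"

    static String format(Template t) {                // (1)
        return "template overload";
    }

    static String format(Str s, Object... args) {     // (2)
        return "varargs overload";
    }

    // One argument: (1) is applicable by subtyping in the first phase,
    // so the varargs method (2) is never considered.
    static String callWithoutArgs() {
        return format(new Str("hi"));
    }

    // Two arguments: only (2) is applicable, in the variable-arity phase.
    static String callWithArgs() {
        return format(new Str("%d"), 1);
    }

    public static void main(String[] args) {
        System.out.println(callWithoutArgs());
        System.out.println(callWithArgs());
    }
}
```

Before `Str` implemented `Template`, the one-argument call could only have meant (2); that silent switch is the source compatibility risk.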

On top of these issues, making all strings be string templates has the
disadvantage of also treating "messy" strings obtained via
concatenation of non-constant values as string templates too, which seems bad.

To overcome these issues, we attempted to add an implicit conversion
from /constant/ strings to StringTemplate. As it was observed by Guy, in 
case of ambiguities, the non-converting variants (e.g. m(String)) would 
be preferred. That said, in the above example (with varargs) we would 
still get a potentially incompatible change - as a string literal would 
be applicable in (1) before (2) is even considered, so the same concerns 
surrounding overload resolution changes would remain.

Another thing that came up is that conversions automatically bring in 
casting conversions. E.g. if you can go from A to B using assignment 
conversion, you can typically go the same direction using casting 
conversion. This raises two issues. The first is that casting conversion
is generally a symmetric type relationship (e.g. if you can cast from A
to B, then you can cast from B to A), while here we're mostly discussing
one direction. But this is, perhaps, not a big deal - after all,
"constant strings" don't have a denotable type, so perhaps it should
come as no surprise that you can't use them as a /target/ type for a cast.

The second "issue" is that casting conversion brings about patterns, as
that's how pattern applicability is defined. For instance:

    switch("Hello") { case StringTemplate st ... }

To make this work we would need at least to tweak exhaustiveness 
(otherwise javac would think the above switch is not exhaustive, and ask 
you to add a default). Secondly, some tweaks to the runtime tests would 
be required also. Not impossible, but would require some more work to 
make sure we're ok with this direction.

Another issue with the conversion is that it would expose a sharp edge
in the current overload resolution and inference machinery. For
instance, this program does not compile:

    List<Integer> li = List.of(1, 1L)

Similarly, this program would also not compile:

    List<StringTemplate> li = List.of("Hello", "Hello \{world}");

The last possibility would be to say that a string literal is a /poly 
expression/. As such, a string literal can be typed to either String or 
StringTemplate depending on the target type (for instance, this is close 
to how int literals also work).

This approach would still suffer from the same incompatible overload 
changes with varargs methods as the other approaches. But, by not
adding a conversion, it makes things a little easier: for instance, in the
case of pattern matching, nothing needs to be done, as the string
literal will be turned into a string template /before/ the switch even
takes place (meaning that existing exhaustiveness and runtime checks
would still work). But there are still dragons and irregularities when it
comes to inference - for instance:

    List<StringTemplate> lst = List.of("hello", "world");

This would not type-check: we need a target-type to know which way the 
literal is going (List::of just accepts a type-variable X). Note that 
overload resolution happens at a time where the target-type is not 
known, so here we'd probably pick X = String, which will then fail to
type-check against the target.

Another issue with target-typing is that if you have two overloads:

    m(String)
    m(StringTemplate)

And you call this with a string literal, you get an ambiguity: you can
go both ways, but String and StringTemplate are unrelated types, so we
can't pick one as "most specific". This issue could be addressed, in
principle, by adding an ad-hoc most specific rule that, in case of an
ambiguity, always gave precedence to String over StringTemplate. We do a
similar trick for lambda expressions, where, if two methods accept
similar-looking functional interfaces, we give precedence to the
non-boxing one.

Anyway, the general message here is that it's a bit of a "pick your
poison" situation. Adding a more fluid relationship between strings and
templates is definitely possible, but there are risks that this will
negatively impact other areas of the language, risks that would need to
be assessed very carefully.

Another, simpler, option we considered was to use some kind of prefix to
mark a string template literal (i.e. make that explicit, instead of
resorting to language wizardry). That works, but has the disadvantage of
breaking the spell that there is only "one string literal", which is
something we have worked quite hard to achieve.

Cheers
Maurizio

On 09/03/2024 23:52, Brian Goetz wrote:

> I'll let Maurizio give the details, because I'm sure I will have forgotten one or two.


From ccherlin at gmail.com  Mon Mar 11 13:54:46 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Mon, 11 Mar 2024 08:54:46 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CAKDaPBevSJhZyYp_Z_DksLV6MBqk0+VB7HskkEiHy0Y1o0baxA@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAKDaPBevSJhZyYp_Z_DksLV6MBqk0+VB7HskkEiHy0Y1o0baxA@mail.gmail.com>
Message-ID: <CALEU8=x_e55MbJvyh-sfS5MAE=JMesiYkyXbeiGCedPEYeq0sw@mail.gmail.com>

On Sun, Mar 10, 2024 at 3:42 PM Attila Kelemen
<attila.kelemen85 at gmail.com> wrote:
>
> If the string processing burden is now pushed to the consumer API side, then wouldn't it be worthwhile to make `StringTemplate` simpler given that this means a lot more people are forced to implement processors? I mean that having two lists where you have to alternate between the two is rather unintuitive which is proven by the fact that it forces `StringTemplate` to do the empty string hacks to support alternating between the two lists.
>
> Given that we have these nice pattern matching syntaxes, wouldn't it be much nicer to make `StringTemplate` a simple wrapper for a `List<StringTemplate.Part>`, where `StringTemplate.Part` is a sealed interface implemented by `String` and `StringTemplate.ValueRef` (or whatever equivalent)? In this case, you could just write a processor with a simple loop like this:
>
> ```
> var sb = new StringBuilder();
> st.parts().forEach(part -> {
>   switch (part) {
>     case String s -> sb.append(s);
>     case StringTemplate.ValueRef valueRef -> sb.append(formatValue(valueRef.value()));
>   }
> });
> ```
>
> Processor logic like this would be much easier to read than the double iterator counterpart (and, in my opinion, even easier than trying to use the stencil). An added benefit is that there would be little need to ban a character from ST in this case. Of course, the flip side is that we would need all values to be wrapped, but that doesn't seem like a high cost to me (especially if `ValueRef` eventually becomes a value type, in which case I'm guessing this extra cost could mostly be optimized away), because an ST is unlikely to have so many values that this matters. Not to mention that double iterators would have an additional cost of their own.
>
> Attila

I like where you're going, but I think it can be done in a more
straightforward and simple way by moving the process() method to
StringTemplate and having it take a pair of Consumers:

public interface StringTemplate {
    default void process(Consumer<String> fragmentConsumer,
                         Consumer<Object>[1] valueConsumer) {
        // iterate through both lists, alternately calling
        // fragmentConsumer and valueConsumer
    }
}

[1] or Consumer<? super T>, see my previous posts about generic string
templates.

Using that method would look like:

var sb = new StringBuilder();
st.process(
    sb::append,
    value -> sb.append(formatValue(value))
);
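
For anyone who wants to try the shape of this API, below is a runnable sketch using a stand-in type. `Template` here is a hypothetical record holding the two alternating lists (with one more fragment than values); it is not the JDK's StringTemplate, and `interpolate` is just one example caller.

```java
import java.util.List;
import java.util.function.Consumer;

final class ConsumerProcessSketch {
    // Hypothetical stand-in: fragments.size() must equal values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {
        Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need one more fragment than values");
            }
        }

        // Walks both lists, alternately handing out a fragment and a value,
        // and finishes with the trailing fragment.
        void process(Consumer<String> fragmentConsumer, Consumer<Object> valueConsumer) {
            for (int i = 0; i < values.size(); i++) {
                fragmentConsumer.accept(fragments.get(i));
                valueConsumer.accept(values.get(i));
            }
            fragmentConsumer.accept(fragments.get(fragments.size() - 1));
        }
    }

    // Plain interpolation expressed through the two consumers.
    static String interpolate(Template st) {
        var sb = new StringBuilder();
        st.process(sb::append, value -> sb.append(value));
        return sb.toString();
    }

    public static void main(String[] args) {
        var st = new Template(List.of("Hello ", "!"), List.of("world"));
        System.out.println(interpolate(st));
    }
}
```

The alternation bookkeeping lives in one place (`process`), so callers never index into the two lists themselves.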

Cheers,
Clement Cherlin

From maurizio.cimadamore at oracle.com  Mon Mar 11 14:28:28 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 11 Mar 2024 14:28:28 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <d4ca02c0-3884-429f-a9e2-eff506c6867c@oracle.com>


On 11/03/2024 13:01, Remi Forax wrote:
> It's not a real boxing conversion, because it's a one way conversion, 
> i.e. there is a boxing conversion between StringTemplate to String but 
> no boxing conversion from String to StringTemplate. We can add it, but 
> i do not think it's necessary given that with a String s, it can 
> always be converted to a StringTemplate using t"\{s}".

This approach goes against the goal of making template -> string 
conversion explicit.

While turning a string into a template is totally safe (after all, a 
string is a degenerate case of a template with no values), the reverse 
is not true: there are many ways to go from a template to a string and 
either the user (at the use site) or the library (at the decl site) will 
have to "say what they mean". Now, you might disagree with this, but, as 
stated by Brian, this change is not about relaxing the design goals in 
JEP 465.

This is why, in my email, I'm specifically only speaking about the 
String -> StringTemplate direction.
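
A small sketch may help show the asymmetry. The `Template` record below is a hypothetical stand-in (not the JDK type), and the two rendering methods are only examples of the different meanings a library might give to the same template.

```java
import java.util.List;

final class TemplateDirectionSketch {
    // Hypothetical stand-in: fragments.size() == values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {
        // String -> template is lossless: one fragment, no values.
        static Template of(String s) {
            return new Template(List.of(s), List.of());
        }
    }

    // One possible template -> String meaning: plain interpolation.
    static String interpolate(Template t) {
        var sb = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            sb.append(t.values().get(i)).append(t.fragments().get(i + 1));
        }
        return sb.toString();
    }

    // Another meaning: keep the values out of the text, as a query API might.
    static String withPlaceholders(Template t) {
        return String.join("?", t.fragments());
    }

    public static void main(String[] args) {
        var query = new Template(List.of("name = ", ""), List.of("O'Brien"));
        System.out.println(interpolate(query));
        System.out.println(withPlaceholders(query));
    }
}
```

For a template built from a plain string both renderings agree, which is why that direction is safe; once values are present they diverge, and someone has to pick.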

> Apart from the fact that adding overloads in a lot of existing 
> projects looks like a Sisyphean task, doing the conversion at use site 
> also has the advantage of allowing the compiler to generate an 
> invokedynamic at use site so the boxing from a StringTemplate to a 
> String will be as fast as the string concatenation using '+' (see 
> Duncan's email on amber-dev).
We can make things fast in other ways. For instance, given that string 
interpolation will be rather common, we might cache the string 
interpolation MH in the literal directly (after all, such literal is 
associated with an indy callsite). Other, more dynamic, approaches are 
possible too. I believe Jim might provide more details on how exactly 
this can be achieved, but I think that for now it would be better not to 
let the performance considerations drive the discussion.

Maurizio


From forax at univ-mlv.fr  Mon Mar 11 13:01:29 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 11 Mar 2024 14:01:29 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To: <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
Message-ID: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>

Hello, 

> Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only "one string literal", which is something we have worked quite hard to achieve.

I vote for making string templates explicit.
Yes, it can be seen as complex at first because not everything is a String, but at the same time, I believe it makes the conversion rules far easier to understand.

For me, we already have a lot of methods that take a String as a parameter, so seeing String as a StringTemplate is not really a solution because it means adding a lot of overloads with all of the incompatibilities you describe.
I see the conversion of a StringTemplate to a String as a boxing conversion: if the current type is a StringTemplate and the target type is a String, the compiler will generate code that is equivalent to calling ".interpolate()" implicitly.
It's not a real boxing conversion, because it's a one-way conversion, i.e. there is a boxing conversion from StringTemplate to String but no boxing conversion from String to StringTemplate. We can add it, but I do not think it's necessary given that with a String s, it can always be converted to a StringTemplate using t"\{s}".

One question is whether we prefer a call-site conversion like the boxing conversion described above or a declaration-site conversion, i.e. asking all developers who want to support StringTemplate to add a new overload.
Apart from the fact that adding overloads in a lot of existing projects looks like a Sisyphean task, doing the conversion at use site also has the advantage of allowing the compiler to generate an invokedynamic at use site so the boxing from a StringTemplate to a String will be as fast as the string concatenation using '+' (see Duncan's email on amber-dev).

regards,
Rémi

> From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>, "Guy Steele" <guy.steele at oracle.com>
> Cc: "Tagir Valeev" <amaembo at gmail.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.org>
> Sent: Monday, March 11, 2024 1:15:51 PM
> Subject: Re: Update on String Templates (JEP 459)

> Hi all,
> we tried mainly three approaches to allow smoother interop between strings and
> string templates: (a) make String a subclass of StringTemplate. Or (b) make
> constant strings be convertible to string templates. Or, (c) use target-typing.
> All these approaches have some issues, discussed below.

> The first approach is slightly simpler, because it can be achieved entirely
> outside of the Java language. Unfortunately, adding "String implements
> StringTemplate" adds overload ambiguities in cases such as this:
> format(StringTemplate) // 1
> format(String, Object...) // 2

> This is actually a very important case, as we predict that StringTemplate will
> serve as a great replacement for methods out there accepting a string/Object...
> pack.

> Unfortunately, if String <: StringTemplate, this means that calling format with a
> string literal will resolve to (1), not (2) as before. The problem here is that
> (2) is not even applicable during the first two overload resolution phases (which
> are only allowed to use subtyping and conversions, respectively), as it is a
> varargs method. Because of this, (1) will now take precedence, as that's
> not varargs. While for String::format this is probably harmless, changing
> results of overload selection is something that should be done with care (esp.
> if different overloads have different return types), as it could lead to source
> compatibility issues.
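
The resolution order described in the quoted paragraph can be reproduced today with a stand-in type. The sketch below is an assumption-laden illustration, not code from the thread: CharSequence plays the role that StringTemplate would play if String were a subtype of it, and the class and method names are made up for the example.

```java
public class OverloadPhases {
    // Stand-in for format(StringTemplate), assuming String were a subtype.
    // CharSequence is used here because String already implements it.
    public static String format(CharSequence template) {
        return "(1) single-argument overload";
    }

    // Stand-in for the existing format(String, Object...).
    public static String format(String pattern, Object... args) {
        return "(2) varargs overload";
    }

    public static void main(String[] args) {
        // With one argument, (1) is applicable by subtyping in the first
        // phase, so the varargs method (2) is never considered.
        System.out.println(format("hello"));       // (1) single-argument overload

        // With extra arguments only (2) fits, via variable-arity invocation.
        System.out.println(format("hello %s", 1)); // (2) varargs overload
    }
}
```

If overload (1) did not exist, the first call would select (2), which is the change in behavior the paragraph warns about.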

> On top of these issues, making all strings be string templates has the
> disadvantage of also considering "messy" strings obtained via concatenation of
> non-constant values string templates too, which seems bad.

> To overcome these issues, we attempted to add an implicit conversion from
> constant strings to StringTemplate. As it was observed by Guy, in case of
> ambiguities, the non-converting variants (e.g. m(String)) would be preferred.
> That said, in the above example (with varargs) we would still get a potentially
> incompatible change - as a string literal would be applicable in (1) before (2)
> is even considered, so the same concerns surrounding overload resolution
> changes would remain.

> Another thing that came up is that conversions automatically bring in casting
> conversions. E.g. if you can go from A to B using assignment conversion, you
> can typically go the same direction using casting conversion. This raises two
> issues. The first is that casting conversion is generally a symmetric type
> relationship (e.g. if you can cast from A to B, then you can cast from B to A),
> while here we're mostly discussing one direction. But this is, perhaps,
> not a big deal - after all, "constant strings" don't have a denotable type, so
> perhaps it should come as no surprise that you can't use them as a target type
> for a cast.

> The second "issue" is that casting conversion brings about patterns, as that's
> how pattern applicability is defined. For instance:
> switch("Hello") {
>     case StringTemplate st ...
> }

> To make this work we would need at least to tweak exhaustiveness (otherwise
> javac would think the above switch is not exhaustive, and ask you to add a
> default). Secondly, some tweaks to the runtime tests would be required also.
> Not impossible, but would require some more work to make sure we're ok with
> this direction.

> Another issue with the conversion is that it would expose a sharp edge in the
> current overload resolution and inference machinery. For instance, this program
> doesn't compile correctly:
> List<Integer> li = List.of(1, 1L);

> Similarly, this program would also not compile correctly:
> List<StringTemplate> li = List.of("Hello", "Hello \{world}");

> The last possibility would be to say that a string literal is a poly expression.
> As such, a string literal can be typed to either String or StringTemplate
> depending on the target type (for instance, this is close to how int literals
> also work).

> This approach would still suffer from the same incompatible overload changes
> with varargs methods as the other approaches. But, by avoiding adding a
> conversion, it makes things a little easier: for instance, in the case of
> pattern matching, nothing needs to be done, as the string literal will be
> turned into a string template before the switch even takes place (meaning that
> existing exhaustiveness and runtime checks would still work). But, there are
> still dragons and irregularities when it comes to inference - for instance:
> List<StringTemplate> lst = List.of("hello", "world");

> This would not type-check: we need a target-type to know which way the literal
> is going (List::of just accepts a type-variable X). Note that overload
> resolution happens at a time where the target-type is not known, so here we'd
> probably pick X = String, which will then fail to type-check against the
> target.

> Another issue with target-typing is that if you have two overloads:
> m(String)
> m(StringTemplate)

> And you call this with a string literal, you get an ambiguity: you can go both
> ways, but String and StringTemplate are unrelated types, so we can't pick one
> as "most specific". This issue could be addressed, in principle, by adding an
> ad-hoc most specific rule that, in case of an ambiguity, always gave precedence
> to String over StringTemplate. We do a similar trick for lambda expressions,
> where if two methods accept similar-looking functional interfaces, we give
> precedence to the non-boxing one.

> Anyway, the general message here is that it's a bit of a "pick your poison"
> situation. Adding a more fluid relationship between strings and templates is
> definitely possible, but there are risks that this will impact negatively
> other areas of the language, risks that would need to be assessed very
> carefully.

> Another, simpler, option we considered was to use some kind of prefix to mark a
> string template literal (e.g. make that explicit, instead of resorting to
> language wizardry). That works, but has the disadvantage of breaking the spell
> that there is only "one string literal", which is something we have worked
> quite hard to achieve.

> Cheers
> Maurizio

> On 09/03/2024 23:52, Brian Goetz wrote:

>> I'll let Maurizio give the details, because I'm sure I will have forgotten one
>> or two.


From ccherlin at gmail.com  Mon Mar 11 15:50:52 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Mon, 11 Mar 2024 10:50:52 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <CALEU8=wVVKtKjGKwgYJrRHRRM32EKfJszygga_SnsSRc5525Eg@mail.gmail.com>

On Mon, Mar 11, 2024 at 9:37 AM Remi Forax <forax at univ-mlv.fr> wrote:
>
> Hello,
>
> > Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only "one string literal", which is something we have worked quite hard to achieve.
>
> I vote for making string templates explicit.
> Yes, it can be seen as complex at first because not everything is a String, but at the same time, I believe it makes the conversion rules far easier to understand.

I agree, and suggest `backquotes` (and ```triple backquotes``` for
Template Blocks) to denote template strings. They were already
considered for Raw String Literals, which implies they're a viable
option. Raw String Literals didn't get adopted, so the character
remains available for use.

>
> For me, we already have a lot of methods that take a String as a parameter, so seeing String as a StringTemplate is not really a solution because it means adding a lot of overloads with all of the incompatibilities you describe.
> I see the conversion of a StringTemplate to a String as a boxing conversion: if the current type is a StringTemplate and the target type is a String, the compiler will generate code that is equivalent to calling ".interpolate()" implicitly.
> It's not a real boxing conversion, because it's a one-way conversion, i.e. there is a boxing conversion from StringTemplate to String but no boxing conversion from String to StringTemplate. We can add it, but I do not think it's necessary given that with a String s, it can always be converted to a StringTemplate using t"\{s}".
>
> One question is whether we prefer a call-site conversion like the boxing conversion described above or a declaration-site conversion, i.e. asking all developers who want to support StringTemplate to add a new overload.
> Apart from the fact that adding overloads in a lot of existing projects looks like a Sisyphean task, doing the conversion at use site also has the advantage of allowing the compiler to generate an invokedynamic at use site so the boxing from a StringTemplate to a String will be as fast as the string concatenation using '+' (see Duncan's email on amber-dev).
>
> regards,
> Rémi
>
> ________________________________

I do not think implicit conversion from either String to String
Template or String Template to String is wise, given the complications
with overload resolution and potential for surprising undesired
effects cited below. When a method doesn't accept a String Template,
you simply process it to String first (or another type, remember that
a StringTemplate doesn't have to be converted to a String!). This is
no different than what you would do with a StringBuilder or other
String-like-but-not-actually-String object.

Cheers,
Clement

> From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>, "Guy Steele" <guy.steele at oracle.com>
> Cc: "Tagir Valeev" <amaembo at gmail.com>, "amber-spec-experts" <amber-spec-experts at openjdk.org>
> Sent: Monday, March 11, 2024 1:15:51 PM
> Subject: Re: Update on String Templates (JEP 459)
>
> Hi all,
> we tried mainly three approaches to allow smoother interop between strings and string templates: (a) make String a subclass of StringTemplate. Or (b) make constant strings be convertible to string templates. Or, (c) use target-typing. All these approaches have some issues, discussed below.
>
> The first approach is slightly simpler, because it can be achieved entirely outside of the Java language. Unfortunately, adding "String implements StringTemplate" adds overload ambiguities in cases such as this:
>
> format(StringTemplate) // 1
> format(String, Object...) // 2
>
> This is actually a very important case, as we predict that StringTemplate will serve as a great replacement for methods out there accepting a string/Object... pack.
>
> Unfortunately, if String <: StringTemplate, this means that calling format with a string literal will resolve to (1), not (2) as before. The problem here is that (2) is not even applicable during the first two overload resolution phases (which are only allowed to use subtyping and conversions, respectively), as it is a varargs method. Because of this, (1) will now take precedence, as that's not varargs. While for String::format this is probably harmless, changing results of overload selection is something that should be done with care (esp. if different overloads have different return types), as it could lead to source compatibility issues.
>
> On top of these issues, making all strings be string templates has the disadvantage of also considering "messy" strings obtained via concatenation of non-constant values string templates too, which seems bad.
>
> To overcome these issues, we attempted to add an implicit conversion from constant strings to StringTemplate. As it was observed by Guy, in case of ambiguities, the non-converting variants (e.g. m(String)) would be preferred. That said, in the above example (with varargs) we would still get a potentially incompatible change - as a string literal would be applicable in (1) before (2) is even considered, so the same concerns surrounding overload resolution changes would remain.
>
> Another thing that came up is that conversions automatically bring in casting conversions. E.g. if you can go from A to B using assignment conversion, you can typically go the same direction using casting conversion. This raises two issues. The first is that casting conversion is generally a symmetric type relationship (e.g. if you can cast from A to B, then you can cast from B to A), while here we're mostly discussing one direction. But this is, perhaps, not a big deal - after all, "constant strings" don't have a denotable type, so perhaps it should come as no surprise that you can't use them as a target type for a cast.
>
> The second "issue" is that casting conversion brings about patterns, as that's how pattern applicability is defined. For instance:
>
> switch("Hello") {
>     case StringTemplate st ...
> }
>
> To make this work we would need at least to tweak exhaustiveness (otherwise javac would think the above switch is not exhaustive, and ask you to add a default). Secondly, some tweaks to the runtime tests would be required also. Not impossible, but would require some more work to make sure we're ok with this direction.
>
> Another issue with the conversion is that it would expose a sharp edge in the current overload resolution and inference machinery. For instance, this program doesn't compile correctly:
>
>  List<Integer> li = List.of(1, 1L);
>
> Similarly, this program would also not compile correctly:
>
>  List<StringTemplate> li = List.of("Hello", "Hello \{world}");
>
> The last possibility would be to say that a string literal is a poly expression. As such, a string literal can be typed to either String or StringTemplate depending on the target type (for instance, this is close to how int literals also work).
>
> This approach would still suffer from the same incompatible overload changes with varargs methods as the other approaches. But, by avoiding adding a conversion, it makes things a little easier: for instance, in the case of pattern matching, nothing needs to be done, as the string literal will be turned into a string template before the switch even takes place (meaning that existing exhaustiveness and runtime checks would still work). But, there are still dragons and irregularities when it comes to inference - for instance:
>
> List<StringTemplate> lst = List.of("hello", "world");
>
> This would not type-check: we need a target-type to know which way the literal is going (List::of just accepts a type-variable X). Note that overload resolution happens at a time where the target-type is not known, so here we'd probably pick X = String, which will then fail to type-check against the target.
>
> Another issue with target-typing is that if you have two overloads:
>
> m(String)
> m(StringTemplate)
>
> And you call this with a string literal, you get an ambiguity: you can go both ways, but String and StringTemplate are unrelated types, so we can't pick one as "most specific". This issue could be addressed, in principle, by adding an ad-hoc most specific rule that, in case of an ambiguity, always gave precedence to String over StringTemplate. We do a similar trick for lambda expressions, where if two methods accept similar-looking functional interfaces, we give precedence to the non-boxing one.
>
> Anyway, the general message here is that it's a bit of a "pick your poison" situation. Adding a more fluid relationship between strings and templates is definitely possible, but there are risks that this will impact negatively other areas of the language, risks that would need to be assessed very carefully.
>
> Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only "one string literal", which is something we have worked quite hard to achieve.
>
> Cheers
> Maurizio
>
> On 09/03/2024 23:52, Brian Goetz wrote:
>
> I'll let Maurizio give the details, because I'm sure I will have forgotten one or two.
>
>

From archie.cobbs at gmail.com  Mon Mar 11 16:07:58 2024
From: archie.cobbs at gmail.com (Archie Cobbs)
Date: Mon, 11 Mar 2024 11:07:58 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <CANSoFxuX4UysrD-uERP_HagPbbVndP4CDw5ENUF4mGFvMHOV9A@mail.gmail.com>

On Mon, Mar 11, 2024 at 9:37 AM Remi Forax <forax at univ-mlv.fr> wrote:

> I vote for making string templates explicit.
>

Caveat: I've been following this discussion only loosely so I'm likely to
say something stupid/ignorant/redundant; if so please ignore.

But I am tending to agree with Remi. The recent simplifications Brian
described are a definite improvement, but now we're left with a new
question:

What is the advantage of having the language literals for String and
StringTemplate look so confusingly similar?

Reversing that question, I'm not seeing the big downside of having a simple
prefix for literals like this:

    var s = "this is a string";
    var st1 = $"this is a (degenerate) template";
    var st2 = $"this is also a \{template}";
    var x = "this is a \{lexical_error}";
    myobj.someOverloadedMethod($"this is definitely a template");
    myobj.someOverloadedMethod("this is definitely a string!");     // no need to consult javadoc here

Seems like the trade-off is straightforward:

Cost: one character
Benefit: instant disambiguation clarity in the developer's mind

At least, it makes the whole API design/overload question straightforward.

Put another way, StringTemplates are a cool new language feature, and as
such it seems like they deserve a "first-class" allotment in the syntax of
the language.

-Archie

From brian.goetz at oracle.com  Mon Mar 11 17:36:49 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 11 Mar 2024 17:36:49 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CANSoFxuX4UysrD-uERP_HagPbbVndP4CDw5ENUF4mGFvMHOV9A@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
 <CANSoFxuX4UysrD-uERP_HagPbbVndP4CDw5ENUF4mGFvMHOV9A@mail.gmail.com>
Message-ID: <EA547C4E-D8A6-4B4D-9591-626301966769@oracle.com>

The overlap between string literals and string template literals is indeed a tricky one, and bears some review of the options.  Obviously string templates and strings have some things in common (it's in the name!), but they are also different and evaluate to different types.  So how "same" or "different" should they look?

Simplistic arguments in favor of "different":
 - Ambiguity is bad, clarity is good
 - String / string template literals can be both wide and tall; having to examine the entirety to know which it is could be confusing
 - Simpler for compiler and specification writers

Simplistic arguments in favor of "same":
 - Will be perceived as "fussy" or distracting
 - Users are already grumpy that we're not doing "string interpolation" and calling it a day
 - Most of the time, it is perfectly obvious which one it is
 - Have to make up yet another new and unfamiliar syntax to disambiguate, think of the bike shedding

There are probably others, but none of these seem like slam-dunks one way or the other.

There are a few choices here:

 - Keep the current syntax approach
 - Give STs a new syntax
 - Give both STs and string literals an _optional_ new syntax, such as I_IZ_STRING"..." and TEMPLATZ"...", but allow the current approach when disambiguation is not needed

The last seeks a compromise between the current path and the desire for explicitness.  Suppose we allowed s"..." and t"..." literals, where the sigils were optional.  What then?

Obviously in the cases which are currently ambiguous-seeming, users could disambiguate explicitly.  The prefix sigil means no one has to "buffer" when interpreting the code.  That's nice.  Having two ways to write classical string literals might confuse people who haven't seen them before, or stimulate unproductive "style wars".  That's probably not too big a problem here.

Overall, though, I am not so enthused about creating yet another new lexical mechanism for having different kinds of stringy things.  The value is ... meh, and it seems an attractive nuisance.  In other languages with multiple "flavors" of string, there is a tendency to proliferate more flavors.  (Raw strings, anyone?)

My take is that this is something that is bothering us a lot because it is new, but I'm skeptical that it carries its weight.



From guy.steele at oracle.com  Mon Mar 11 18:24:31 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 11 Mar 2024 18:24:31 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <EA547C4E-D8A6-4B4D-9591-626301966769@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
 <CANSoFxuX4UysrD-uERP_HagPbbVndP4CDw5ENUF4mGFvMHOV9A@mail.gmail.com>
 <EA547C4E-D8A6-4B4D-9591-626301966769@oracle.com>
Message-ID: <5922B9E8-BB56-4212-8F43-48A676AB0E25@oracle.com>

My thinking pretty much matched Brian's analysis below, until I saw Archie's examples and thought about them. Four points:

(1) I would add one more simplistic argument in favor of "different": I see some value in a reader of code having fair warning, quite visible and up front, that what looks like a string may actually contain executable code (possibly having side effects). (Maybe this is related to what Brian meant by "The prefix sigil means no one has to 'buffer' when interpreting the code.".)

I think we do want such a warning, if present, to be concise but hard to overlook, and I think the choice of "$" fits that bill admirably. (Pro: The character "$" is associated with string interpolation in a number of other languages, including C#, Dart, Groovy, JavaScript, Julia, Kotlin, PHP, TCL, TypeScript, and Visual Basic. Con: Of the languages just listed, those that use "$" before the opening double quote are C# and Visual Basic, and the proposed Java syntax is not otherwise identical to the syntax of C# and Visual Basic, which enclose expressions in _unescaped_ braces.)

(2) Because "$" is an identifier in Java, it suggests that we can hold open a possible future where we allow other string-prefix sigils having the syntax of an identifier, but without really committing to that generality at this time.

(3) Because "$" is a _discouraged_ identifier in Java (see JLS §3.8: "The dollar sign should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems."), in practice all occurrences of dollar signs would in fact flag string templates.

(4) Archie's suggestion does not create an alternate syntax for the Plain Old String Literals we have had in Java since its inception.
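
Points (2) and (3) rest on "$" already being a legal, though discouraged, identifier. A small sketch of what that means in current Java follows; the class name and the method named $ are made up for illustration and are exactly the kind of usage JLS §3.8 discourages.

```java
public class DollarIdentifier {
    // '$' on its own is a legal (if discouraged) identifier, here a method name.
    public static String $(String s) {
        return "called $ with " + s;
    }

    public static void main(String[] args) {
        int $ = 41;                     // also legal as a local variable name
        System.out.println($ + 1);      // 42
        System.out.println($("text"));  // called $ with text
        System.out.println(Character.isJavaIdentifierStart('$')); // true
    }
}
```

Note that the call needs parentheses: in current Java an identifier directly followed by a string literal is not a valid expression, so code like this would not collide with a prefixed literal form.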

For these reasons, I recommend that Archie's suggestion (and perhaps also the C#/Visual Basic variation) be given careful (re-)consideration at this time.

On Mar 11, 2024, at 1:36 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

The overlap between string literals and string template literals is indeed a tricky one, and bears some review of the options.  Obviously string templates and strings have some things in common (it's in the name!), but they are also different and evaluate to different types.  So how "same" or "different" should they look?

Simplistic arguments in favor of "different":
 - Ambiguity is bad, clarity is good
 - String / string template literals can be both wide and tall; having to examine the entirety to know which it is could be confusing
 - Simpler for compiler and specification writers

Simplistic arguments in favor of "same":
 - Will be perceived as "fussy" or distracting
 - Users are already grumpy that we're not doing "string interpolation" and calling it a day
 - Most of the time, it is perfectly obvious which one it is
 - Have to make up yet another new and unfamiliar syntax to disambiguate, think of the bike shedding

There are probably others, but none of these seem like slam-dunks one way or the other.

There are a few choices here:

 - Keep the current syntax approach
 - Give STs a new syntax
 - Give both STs and string literals an _optional_ new syntax, such as I_IZ_STRING"..." and TEMPLATZ"...", but allow the current approach when disambiguation is not needed

The last seeks a compromise between the current path and the desire for explicitness.  Suppose we allowed s"..." and t"..." literals, where the sigils were optional.  What then?

Obviously in the cases which are currently ambiguous-seeming, users could disambiguate explicitly.  The prefix sigil means no one has to "buffer" when interpreting the code.  That's nice.  Having two ways to write classical string literals might confuse people who haven't seen them before, or stimulate unproductive "style wars".  That's probably not too big a problem here.

Overall, though, I am not so enthused about creating yet another new lexical mechanism for having different kinds of stringy things.  The value is ... meh, and it seems an attractive nuisance.  In other languages with multiple "flavors" of string, there is a tendency to proliferate more flavors.  (Raw strings, anyone?)

My take is that this is something that is bothering us a lot because it is new, but I'm skeptical that it carries its weight.


On Mar 11, 2024, at 9:07 AM, Archie Cobbs <archie.cobbs at gmail.com> wrote:

On Mon, Mar 11, 2024 at 9:37 AM Remi Forax <forax at univ-mlv.fr> wrote:
I vote for making string templates explicit.

Caveat: I've been following this discussion only loosely so I'm likely to say something stupid/ignorant/redundant; if so please ignore.

But I am tending to agree with Remi. The recent simplifications Brian described are a definite improvement, but now we're left with a new question:

What is the advantage of having the language literals for String and StringTemplate look so confusingly similar?

Reversing that question, I'm not seeing the big downside of having a simple prefix for literals like this:

    var s = "this is a string";
    var st1 = $"this is a (degenerate) template";
    var st2 = $"this is also a \{template}";
    var x = "this is a \{lexical_error}";
    myobj.someOverloadedMethod($"this is definitely a template");
    myobj.someOverloadedMethod("this is definitely a string!");     // no need to consult javadoc here

Seems like the trade-off is straightforward:

Cost: one character
Benefit: instant disambiguation clarity in the developer's mind

At least, it makes the whole API design/overload question straightforward.

Put another way, StringTemplates are a cool new language feature, and as such it seems like they deserve a "first-class" allotment in the syntax of the language.

-Archie



From alex.buckley at oracle.com  Mon Mar 11 20:24:51 2024
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 11 Mar 2024 13:24:51 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <EA547C4E-D8A6-4B4D-9591-626301966769@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
 <CANSoFxuX4UysrD-uERP_HagPbbVndP4CDw5ENUF4mGFvMHOV9A@mail.gmail.com>
 <EA547C4E-D8A6-4B4D-9591-626301966769@oracle.com>
Message-ID: <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com>

On 3/11/2024 10:36 AM, Brian Goetz wrote:
> Overall, though, I am not so enthused about creating yet another new 
> lexical mechanism for having different kinds of stringy things.
All strings -- not just string literals, and not just constant 
expressions of type String -- can be composed with +. Is there an 
equivalent composition operator for string templates? (That is, all 
values of type StringTemplate, not just template literals.)

I ask because the more lexical similarity between a template literal and 
a string literal, the more I think people will try to use + with two 
template literals, or with one template literal and one string literal. 
AIUI the result will be a surprise:

String s = "Hello" + "\{x}";
   // Second operand to + undergoes string conversion a.k.a. toString()
print(s);  // Hello0x12345678
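
The surprise described above can be reproduced today without the preview feature. The following is only a sketch: `Template` is a hypothetical stand-in for StringTemplate (fragments plus embedded values), not the real API, and `interpolate()` is an assumed name for an explicit interpolation method.

```java
import java.util.List;

public class ConcatSurprise {
    // Hypothetical stand-in for StringTemplate: literal fragments plus embedded values.
    record Template(List<String> fragments, List<Object> values) {
        // Explicit interpolation: interleave fragments and values.
        String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.size(); i++) {
                sb.append(fragments.get(i)).append(values.get(i));
            }
            return sb.append(fragments.get(fragments.size() - 1)).toString();
        }
    }

    // String + template: the template operand undergoes string conversion,
    // i.e. toString(), so no interpolation takes place.
    static String viaPlus(Template t) {
        return "Hello " + t;
    }

    // The interpolation has to be requested explicitly.
    static String viaInterpolate(Template t) {
        return "Hello " + t.interpolate();
    }

    public static void main(String[] args) {
        // Models the template literal "\{x}!" with x = "world".
        Template t = new Template(List.of("", "!"), List.of("world"));
        System.out.println(viaPlus(t));        // prints the record's toString() form
        System.out.println(viaInterpolate(t)); // prints "Hello world!"
    }
}
```

With the stand-in, `viaPlus` yields the record's debug representation rather than the interpolated text, which is the same class of surprise as the `Hello0x12345678` output above.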

Alex

From james.laskey at oracle.com  Mon Mar 11 20:47:10 2024
From: james.laskey at oracle.com (Jim Laskey)
Date: Mon, 11 Mar 2024 20:47:10 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
 <CANSoFxuX4UysrD-uERP_HagPbbVndP4CDw5ENUF4mGFvMHOV9A@mail.gmail.com>
 <EA547C4E-D8A6-4B4D-9591-626301966769@oracle.com>
 <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com>
Message-ID: <4F9D3A91-FD9E-41C6-99DB-59E1C8587FA0@oracle.com>

String plus isn't needed. Just as templates remove the need for string plus, the combining of string templates and strings can be done with nested embedded expressions.
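
One way to picture "nested embedded expressions" is a template whose embedded values are themselves templates or strings. The sketch below assumes a hypothetical `Template` stand-in for StringTemplate and an assumed `interpolate()` method that expands nested templates recursively; whether the real API would flatten nested templates this way is not settled in this thread.

```java
import java.util.List;

public class NestedTemplates {
    // Hypothetical stand-in for StringTemplate.
    record Template(List<String> fragments, List<Object> values) {
        String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.size(); i++) {
                sb.append(fragments.get(i));
                Object v = values.get(i);
                // A nested template is expanded recursively instead of via toString().
                sb.append(v instanceof Template inner ? inner.interpolate() : String.valueOf(v));
            }
            return sb.append(fragments.get(fragments.size() - 1)).toString();
        }
    }

    // Models the template literal "Hello \{name}".
    static Template greeting(String name) {
        return new Template(List.of("Hello ", ""), List.of(name));
    }

    // Models "\{greeting(name)}, welcome to \{project}!": a template and a
    // plain string are combined by embedding both, with no + involved.
    static Template combined(String name, String project) {
        return new Template(List.of("", ", welcome to ", "!"),
                            List.of(greeting(name), project));
    }

    public static void main(String[] args) {
        System.out.println(combined("Duke", "Amber").interpolate());
        // prints "Hello Duke, welcome to Amber!"
    }
}
```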

> On Mar 11, 2024, at 5:25 PM, Alex Buckley <alex.buckley at oracle.com> wrote:
> 
> On 3/11/2024 10:36 AM, Brian Goetz wrote:
>> Overall, though, I am not so enthused about creating yet another new lexical mechanism for having different kinds of stringy things.
> All strings -- not just string literals, and not just constant expressions of type String -- can be composed with +. Is there an equivalent composition operator for string templates? (That is, all values of type StringTemplate, not just template literals.)
> 
> I ask because the more lexical similarity between a template literal and a string literal, the more I think people will try to use + with two template literals, or with one template literal and one string literal. AIUI the result will be a surprise:
> 
> String s = "Hello" + "\{x}";
>  // Second operand to + undergoes string conversion a.k.a. toString()
> print(s);  // Hello0x12345678
> 
> Alex

From alex.buckley at oracle.com  Mon Mar 11 21:40:07 2024
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 11 Mar 2024 14:40:07 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4F9D3A91-FD9E-41C6-99DB-59E1C8587FA0@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <1848509621.25722001.1710162089780.JavaMail.zimbra@univ-eiffel.fr>
 <CANSoFxuX4UysrD-uERP_HagPbbVndP4CDw5ENUF4mGFvMHOV9A@mail.gmail.com>
 <EA547C4E-D8A6-4B4D-9591-626301966769@oracle.com>
 <1d47196c-2597-46d5-af21-7fc360b35775@oracle.com>
 <4F9D3A91-FD9E-41C6-99DB-59E1C8587FA0@oracle.com>
Message-ID: <2cfe511b-b729-43c8-8348-479b9d6efbbf@oracle.com>

Given that few APIs will take StringTemplate on Day 1, I've been 
wondering how people will approach making string templates interoperate 
with APIs that only take String. They'll try the converts-to-String 
functionality of + --  "" + <<string template>>  -- and find it's a dead 
end.

BTW, it's always been true that there's no empty template literal, but 
with the removal of template processors there will be more "standalone" 
StringTemplate variables, and this will be a common error:

StringTemplate st = "";  // Error, can't assign String to StringTemplate


I also wondered if it could ever mean anything to compose two string 
templates with +. It feels similar to embedding a string template in a 
template literal. If the two templates being composed are in the same 
"language" then perhaps the Java language could help combine them.

Alex

On 3/11/2024 1:47 PM, Jim Laskey wrote:
> String plus isn't needed. Just as templates remove the need for string plus, the combining of string templates and strings can be done with nested embedded expressions.
> 
>> On Mar 11, 2024, at 5:25 PM, Alex Buckley <alex.buckley at oracle.com> wrote:
>>
>> On 3/11/2024 10:36 AM, Brian Goetz wrote:
>>> Overall, though, I am not so enthused about creating yet another new lexical mechanism for having different kinds of stringy things.
>> All strings -- not just string literals, and not just constant expressions of type String -- can be composed with +. Is there an equivalent composition operator for string templates? (That is, all values of type StringTemplate, not just template literals.)
>>
>> I ask because the more lexical similarity between a template literal and a string literal, the more I think people will try to use + with two template literals, or with one template literal and one string literal. AIUI the result will be a surprise:
>>
>> String s = "Hello" + "\{x}";
>>   // Second operand to + undergoes string conversion a.k.a. toString()
>> print(s);  // Hello0x12345678
>>
>> Alex

From brian.goetz at oracle.com  Tue Mar 12 17:08:47 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 12 Mar 2024 13:08:47 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>

OK, so let's summarize the EG discussion so far.  (As a reminder, 
syntax-heavy features like this are even more subject to "armchair 
theorization" than most, so please, take that into account when 
commenting.  As a further reminder, the best thing we could do right now 
is write more API code that manipulates string templates.)

Overall, I think everyone agrees that the "make string templates the 
star of the show" approach is a winning direction.  No one seems too 
busted up at the loss of processors.

I'm going to try and focus for now on "potential problems that might 
prompt further adjustment", rather than specific solutions.

There is some ambient discomfort that the "sublanguage" of a template 
becomes a dynamic property of a template, introducing new opportunities 
for users to make mistakes with unprocessed templates.  (This was 
present before as well using the RAW processor, but much less 
prominent.)  But, I don't think this is a significant issue, it's just 
something new to get used to.

Most of the concerns have to do with the visual similarity between 
string literals and template literals.? While this is of course 
intended, there are some concerns that they may be "too similar". 
Concerns raised include:

 - In a code-generation scenario that leans on templates, sometimes we 
want to use a string literal as a degenerate form of template.  It may 
be surprising that this doesn't "just work", and alternatives (e.g., 
conversion functions, casting, etc) may have varying degrees of 
discoverability and yuck-factor.

 - Given (a) the visual similarity of string and template literals and 
(b) the lenient treatment of concatenation between strings and 
everything else, users may well be tempted to concatenate string 
literals with template literals, and may be surprised at the outcome.

 - Because template literals may be broad and wide, and their 
evaluation may involve side effects, we may want to give a lexical 
heads-up of "weird thing coming", rather than having template literals 
be framed more like "strings with benefits."

Have I covered the concerns raised so far?

Before we get too caught up in solutions, let's try to get on the same 
page about which of these are problems that need to be solved right now.


(As a small matter of housekeeping, given that the preview train is 
already rolling, we will soon have to make a decision to (a) withdraw 
the current preview entirely, (b) re-preview the current design even 
though we know it will change, or (c) gain the requisite confidence in a 
new design in time to preview that. From my vantage point, (c) is 
starting to look increasingly unlikely, and I suspect (a) is a better 
choice than (b).  But I bring this up not to start a project management 
discussion, as much as to raise awareness that there are project 
management constraints.)


On 3/8/2024 1:35 PM, Brian Goetz wrote:
>
> Time to check in with where we are with String Templates.  We've 
> gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn 
> things about the design or implementation that we don't already know. 
> This could be bug reports, experience reports, code review, careful 
> analysis, novel alternatives, etc.  And the best feedback usually 
> comes from using the feature "in anger" -- trying to actually write 
> code with it. ("Some people would prefer a different syntax" or "some 
> people would prefer we focused on string interpolation only" fall 
> squarely in the "things we already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did 
> learn quite a few things we didn't already know, and this was 
> conclusive enough that it has motivated us to adjust our approach in 
> this feature.  Specifically, the role of processors is "outsized" to 
> the value they offer, and, after further exploration, we now believe 
> it is possible to achieve the goals of the feature without an explicit 
> "processor" abstraction at all!  This is a very positive development.
>
> First, I want to affirm that the goals of the project have not 
> changed.  From JEP 459:
>
> Goals
>
> - Simplify the writing of Java programs by making it easy to express 
> strings that include values computed at run time.
> - Enhance the readability of expressions that mix text and 
> expressions, whether the text fits on a single source line (as with 
> string literals) or spans several source lines (as with text blocks).
> - Improve the security of Java programs that compose strings from 
> user-provided values and pass them to other systems (e.g., building 
> queries for databases) by supporting validation and transformation of 
> both the template and the values of its embedded expressions.
> - Retain flexibility by allowing Java libraries to define the 
> formatting syntax used in string templates.
> - Simplify the use of APIs that accept strings written in non-Java 
> languages (e.g., SQL, XML, and JSON).
> - Enable the creation of non-string values computed from literal text 
> and embedded expressions without having to transit through an 
> intermediate string representation.
>
> Non-Goals
> - It is not a goal to introduce syntactic sugar for Java's string 
> concatenation operator (+), since that would circumvent the goal of 
> validation.
> - It is not a goal to deprecate or remove the StringBuilder and 
> StringBuffer classes, which have traditionally been used for complex 
> or programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for 
> embedding expressions.  While many people did express the opinion of 
> "why not 'just' do what Kotlin/Scala does", this issue was more than 
> fully explored during the initial design round.  (In fact, while 
> syntax disagreements are often purely subjective, this one was far 
> more clear -- the $-syntax is objectively worse, and would be doubly so 
> if injected into an existing language where there were already string 
> literals in the wild.  This has all been more than adequately covered 
> elsewhere, so I won't rehash it here.)
>
>
> Now, let's talk about what we do think should change: the role of 
> processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation 
> of templates to their final form (whether string, or something else.) 
> However, Java already has a well established means of abstracting 
> behavior: methods.  (In fact, a processor application can be viewed 
> as merely a new syntax for a method call.)  Our experience using the 
> feature highlighted the question: When converting a SQL query 
> expressed as a template to the form required by the database (such as 
> PreparedStatement), why do we need to say:
>
>   DB."... template ..."
>
> When we could use an ordinary Java library:
>
>   Query q = Query.of("...template...")
>
> Indeed, one of the worst things about having processors in the 
> language is that API designers are put in the difficult situation of 
> not knowing whether to write a processor or an ordinary API, and often 
> have to make that choice before the consequences are fully understood. 
> (To add to this, processors raise similar questions at the use site.) 
> But the real criticism here is that template capture and processing 
> are complected, when they should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were 
> so central to the initial design in the first place.  And it turned 
> out, this choice had been influenced -- perhaps overly so -- by early 
> implementation experiments.  (One of the background design goals was 
> to enable expensive operations like `String::format` to be (much) 
> cheaper.  Without digressing too deeply on performance, String::format 
> can be more than an order of magnitude worse than the equivalent 
> concatenation operation, and this in turn sometimes motivates 
> developers to use worse idioms for formatting.  The FMT processor 
> brought that cost back in line with the equivalent concatenation.) 
> These early experiments biased the design towards needing to know the 
> processor at the point of template capture, but upon reexamination we 
> realized that there are other ways to achieve the desired performance 
> goals without requiring processors to be known at capture time.  This, 
> in turn, enabled us to revisit a point in the design space we had 
> transited through earlier, where string templates were "just a new 
> kind of literal" and the job performed by processors could instead be 
> performed by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met 
> the semantic, correctness, and performance goals: template literals 
> ("Hello \{name}") are simply the literal form of StringTemplate:
>
>   StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types.  (We explored a 
> number of ways to interconvert them, but they caused more trouble than 
> they solved.)  Processing of string templates, including 
> interpolation, is done by ordinary APIs that deal in StringTemplate, 
> aided by some clever implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such 
> as PrintWriter, APIs can make that choice on behalf of the domain, by 
> providing overloads to embody this design choice:
>
>    void println(String) { ... }
>    void println(StringTemplate) { ... interpolate and delegate to 
> println(String) ... }
>
> The upshot is that for interpolation-safe APIs like println, we can 
> use a template directly without giving up any safety:
>
>    System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not 
> String (no implicit interpolation), and chooses the StringTemplate 
> overload of println, which in turn chooses how to process the 
> template. This stays true to the design principle that interpolation 
> is dangerous enough that it should be an explicit choice in the code -- 
> but it allows that choice to be made by libraries when the library is 
> comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of 
> String::format that interprets templates with embedded format 
> specifiers (e.g., "%d"):
>
>   String format(String formatString, Object... parameters) { ... same as 
> today ... }
>   String format(StringTemplate template) {... equivalent of FMT ...}
>
> And users can call this as:
>
>   String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates 
> according to the rules previously specified in the FMT processor (not 
> ordinary interpolation), but that choice is embedded in the library 
> semantics so no further explicit choice at the use site is required. 
> The user already chose to pass it to String::format; that's all the 
> processing selection that is needed.
>
> Where APIs do not express a choice of what template expansion means, 
> users continue to be free to process them explicitly before passing 
> them, using APIs that do (such as String::format or ordinary 
> interpolation.).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now, 
> static or instance methods to process a template) goes away entirely 
> when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more 
> intrusive than before, and can be reduced over time as APIs get with 
> the program.
> - StringTemplate is just another type that APIs can support if they 
> want.  The "DB" processor becomes an ordinary factory method that 
> accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of 
> template processing, because we are not biasing so strongly towards 
> early processing.
> - It becomes easier to abstract over template processing (i.e., 
> combine or manipulate templates as templates before processing)
> - Interpolation remains an explicit choice, but ST-aware libraries can 
> make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which 
> is good.  Core JDK APIs (e.g., println, format, exception 
> constructors) get upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do 
> we do interpolation."  The answer there is "ordinary library methods". 
> This might be a static method (String.join(StringTemplate)) or an 
> instance method (template.join()), shed to be painted (but please, not 
> right now.).
>
> This is a sketch of direction, so feel free to pose questions/comments 
> on the direction.  We'll discuss the details as we go.
>
>
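
The "ordinary APIs instead of processors" direction in the quoted message can be sketched with plain Java. Everything here is an assumption for illustration: `Template` is a hypothetical stand-in for StringTemplate, `Console` stands in for an interpolation-safe API such as PrintWriter, and `Query.of` is a made-up factory in the spirit of the quoted `Query.of(...)` example.

```java
import java.util.List;

public class TemplateApis {
    // Hypothetical stand-in for StringTemplate.
    record Template(List<String> fragments, List<Object> values) {
        String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.size(); i++) {
                sb.append(fragments.get(i)).append(values.get(i));
            }
            return sb.append(fragments.get(fragments.size() - 1)).toString();
        }
    }

    // An API where interpolation is safe: the library makes that choice
    // once, by offering a template overload that delegates to the String one.
    static final class Console {
        final StringBuilder out = new StringBuilder();
        void println(String s) { out.append(s).append('\n'); }
        void println(Template t) { println(t.interpolate()); }
    }

    // An API where interpolation is NOT safe: values never become part of
    // the SQL text, they are kept apart as parameters.
    record Query(String sql, List<Object> params) {
        static Query of(Template t) {
            return new Query(String.join("?", t.fragments()), t.values());
        }
    }

    public static void main(String[] args) {
        String name = "Robert'); DROP TABLE students;--";
        // Models "SELECT * FROM students WHERE name = \{name}".
        Template t = new Template(
                List.of("SELECT * FROM students WHERE name = ", ""), List.of(name));

        Console console = new Console();
        console.println(t);                       // interpolated, library's choice
        System.out.print(console.out);

        Query q = Query.of(t);
        System.out.println(q.sql());              // SELECT * FROM students WHERE name = ?
        System.out.println(q.params().size());    // 1
    }
}
```

The same template value is handed to both APIs; each one decides what "processing" means, which is the role processors used to play at the use site.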

From amaembo at gmail.com  Tue Mar 12 17:24:10 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Tue, 12 Mar 2024 18:24:10 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
Message-ID: <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>

Hello, Maurizio!

Thank you for the detailed explanation!

On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> Hi all,
> we tried mainly three approaches to allow smoother interop between strings
> and string templates: (a) make String a subclass of StringTemplate. Or (b)
> make constant strings be *convertible* to string templates. Or, (c) use
> target-typing. All these approaches have some issues, discussed below.
>
> The first approach is slightly simpler, because it can be achieved
> entirely outside of the Java language. Unfortunately, adding "String
> implements StringTemplate" adds overload ambiguities in cases such as this:
>
> format(StringTemplate) // 1
> format(String, Object...) // 2
>
> This is actually a very important case, as we predict that StringTemplate
> will serve as a great replacement for methods out there accepting a
> string/Object... pack.
>
> Unfortunately, if String <: StringTemplate, this means that calling format
> with a string literal will resolve to (1), not (2) as before. The problem
> here is that (2) is not even applicable during the first two overload resolution
> phases (which are only allowed to use subtyping and conversions,
> respectively), as it is a varargs method. Because of this, (1) will now
> take precedence, as that's not varargs. While for String::format this
> is probably harmless, changing results of overload selection is something
> that should be done with care (esp. if different overloads have different
> return types), as it could lead to source compatibility issues.
>
I would still like to advocate for the String <: StringTemplate solution. I
think that the overloading is not a big problem. Simply making String
implement StringTemplate will not break any existing code because there
are no APIs yet that accept a StringTemplate instance. The problem may
appear only when an API author actually adds such an overload and does this
in a way that is incompatible with an existing String overload. This would be an
extremely bad design choice, and the blame goes to the API author. You've
correctly mentioned that for String::format this is harmless because the
API is well-designed. We may suggest in the StringTemplate documentation that
API designers should provide the same behavior for foo(String) and
foo(StringTemplate) when they add an overload.
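
The overload-selection change under discussion can be observed with existing types. In this sketch CharSequence plays the role that StringTemplate would play if String <: StringTemplate held (String already implements CharSequence); the two `format` methods are illustrative stand-ins, not the real String::format.

```java
public class OverloadDemo {
    // (1) Stand-in for format(StringTemplate): takes a supertype of String.
    static String format(CharSequence template) {
        return "fixed-arity";
    }

    // (2) Stand-in for format(String, Object...).
    static String format(String fmt, Object... args) {
        return "varargs";
    }

    public static void main(String[] args) {
        // (1) is applicable by subtyping in the first resolution phase;
        // (2) only becomes applicable in the variable-arity phase, which
        // is never reached once an earlier phase finds a candidate.
        System.out.println(format("hello"));        // prints "fixed-arity"

        // With an extra argument only (2) is applicable.
        System.out.println(format("hello %s", 1));  // prints "varargs"
    }
}
```

Before overload (1) exists, `format("hello")` compiles against (2); once it is added, the same call silently binds to (1). That is harmless only if both overloads behave the same for a plain string.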

I must say that we already have experience of introducing new interfaces
in the hierarchy of widely-used library classes. Closeable got an AutoCloseable
parent, StringBuilder became Comparable, and so on. So far, the
compatibility issues introduced were tolerable. Well, probably I'm missing
something but we have preview rounds just for this purpose: to find out the
disadvantages of the approach.


> On top of these issues, making all strings be string templates has the
> disadvantage of also considering "messy" strings obtained via concatenation
> of non-constant values string templates too, which seems bad.
>
I think that most of the APIs will still provide a String overload. E.g., for
preparing an SQL statement, it's a perfectly reasonable scenario to have a
constant string as the input. So prepareStatement(String) will stay along
with prepareStatement(StringTemplate). And people will still be able to use
concatenation. I don't think that the absence of the String <: StringTemplate
relation will protect anybody from using concatenation. On the other
hand, if String actually implements StringTemplate, it will be a very
simple static analysis rule to warn if concatenation occurs in this
context. If the expected type for a concatenation is StringTemplate, then
something is definitely wrong. Without 'String implements StringTemplate',
one will not be able to write a concatenation directly in a StringTemplate
context. Instead, the String-accepting overload will be used, and the expected
type will be String, so a static analyzer will have to guess whether it's
dangerous to use concatenation here. In short, I think that it's
actually an advantage: we have an additional hint here that concatenation
is undesired. Even a compilation warning would be possible to implement.

So, I don't see these points as real disadvantages. I definitely like this
approach much more than adding any kind of implicit conversion or another
literal syntax, which would complicate the specification much more.

With best regards,
Tagir Valeev.

>

From brian.goetz at oracle.com  Tue Mar 12 17:32:20 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 12 Mar 2024 13:32:20 -0400
Subject: Does String extend StringTemplate? (Was: Update on String Templates
 (JEP 459))
In-Reply-To: <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>
Message-ID: <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com>

Splitting off into a separate thread.

I would like to redirect this discussion from the mechanical challenges 
and consequences to the goals and semantics.

If we are considering "String extends StringTemplate", we are making a 
semantic statement that a String *is-a* StringTemplate. While I can 
imagine convincing oneself that this is true "if you look at it right", 
this sets off all my "self-justification" detectors.

So, I recommend we step back and examine why we think this is a good 
idea before we descend into the mechanics.  My suspicion is that this is 
motivated by "I want to be able to automatically use String where a 
StringTemplate is desired", and that this seems a clever-enough hack to 
get there.  (I think we probably also need to drill further, into "why 
do we think it is important to be able to use String where 
StringTemplate is desired", and I suspect further that part of it will 
be "but the APIs are not yet fully equilibrated" (which would be a truly 
bad reason to give String a new supertype.))


On 3/12/2024 1:24 PM, Tagir Valeev wrote:
> Hello, Maurizio!
>
> Thank you for the detailed explanation!
>
> On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore 
> <maurizio.cimadamore at oracle.com> wrote:
>
>     Hi all,
>     we tried mainly three approaches to allow smoother interop between
>     strings and string templates: (a) make String a subclass of
>     StringTemplate. Or (b) make constant strings be /convertible/ to
>     string templates. Or, (c) use target-typing. All these approaches
>     have some issues, discussed below.
>
>     The first approach is slightly simpler, because it can be achieved
>     entirely outside of the Java language. Unfortunately, adding
>     "String implements StringTemplate" adds overload ambiguities in
>     cases such as this:
>
>     format(StringTemplate) // 1
>     format(String, Object...) // 2
>
>     This is actually a very important case, as we predict that
>     StringTemplate will serve as a great replacement for methods out
>     there accepting a string/Object... pack.
>
>     Unfortunately, if String <: StringTemplate, this means that calling
>     format with a string literal will resolve to (1), not (2) as
>     before. The problem here is that (2) is not even applicable during
>     the two overload resolution phases (which is only allowed to use
>     subtyping and conversions, respectively), as it is a varargs
>     method. Because of this, (1) will now take the precedence, as
>     that's not varargs. While for String::format this is probably
>     harmless, changing results of overload selection is something that
>     should be done with care (esp. if different overloads have
>     different return types), as it could lead to source compatibility
>     issues.
>
> I would still like to advocate for String <: StringTemplate solution. 
> I think that the overloading is not a big problem. Simply making 
> String implements StringTemplate will not break any of existing code 
> because there are no APIs yet that accept the StringTemplate instance. 
> The problem may appear only when an API author actually adds such an 
> overload and does this in an incompatible way with an existing String 
> overload. This would be an extremely bad design choice, and the blame 
> goes to the API author. You've correctly mentioned that for 
> String::format this is harmless because the API is well-designed. We 
> may suggest in StringTemplate documentation that the API designers 
> should provide the same behavior for foo(String) and 
> foo(StringTemplate) when they add an overload.
>
> I must say that we already have experience of introducing new 
> interfaces in the hierarchy of widely-used library classes. Closeable 
> got an AutoCloseable parent, StringBuilder became Comparable, and so on. 
> So far, the compatibility issues introduced were tolerable. Well, 
> probably I'm missing something but we have preview rounds just for 
> this purpose: to find out the disadvantages of the approach.
>
>     On top of these issues, making all strings be string templates has
>     the disadvantage of also considering "messy" strings, obtained via
>     concatenation of non-constant values, to be string templates too,
>     which seems bad.
>
> I think that most of the APIs will still provide a String overload.
> E.g., for preparing an SQL statement, it's a perfectly reasonable
> scenario to have a constant string as the input. So
> prepareStatement(String) will stay along with 
> prepareStatement(StringTemplate). And people will still be able to use 
> concatenation. I don't think that the absence of String <: 
> StringTemplate relation will protect anybody from using the 
> concatenation. On the other hand, if String actually implements 
> StringTemplate, it will be a very simple static analysis rule to warn 
> if the concatenation occurs in this context. If the expected type for 
> concatenation is StringTemplate, then something is definitely wrong. 
> Without 'String implements StringTemplate', one will not be able to
> write a concatenation directly in a StringTemplate context. Instead,
> the String-accepting overload will be used, and the expected type will
> be String, so a static analyzer will have to guess whether it's
> dangerous to use the concatenation here. In short, I think that it's
> actually an advantage: we have an additional hint here that
> concatenation is undesired. Even a compilation warning would be possible.
>
> So, I don't see these points as real disadvantages. I definitely like 
> this approach much more than adding any kind of implicit conversion or 
> another literal syntax, which would complicate the specification much 
> more.
>
> With best regards,
> Tagir Valeev.
>
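
The overload-selection concern quoted above can be reproduced with today's Java, without any preview feature. In the sketch below, CharSequence is only a stand-in for a hypothetical supertype of String (the role StringTemplate would play if String <: StringTemplate), and the class and method names are invented for illustration. A fixed-arity overload that accepts the supertype is applicable in the first resolution phase, so it wins over the varargs overload, which only becomes applicable in the last phase.

```java
// Illustration only: CharSequence stands in for a hypothetical supertype of String.
class OverloadDemo {
    // (1) new fixed-arity overload taking the supertype
    static String format(CharSequence template) {
        return "(1) format(CharSequence)";
    }

    // (2) existing varargs overload, shaped like String.format(String, Object...)
    static String format(String format, Object... args) {
        return "(2) format(String, Object...)";
    }

    public static void main(String[] args) {
        // One argument: (1) is applicable by subtyping, so (2) is never considered.
        System.out.println(format("hello"));
        // Two arguments: only (2) is applicable, via variable-arity invocation.
        System.out.println(format("hello %s", "world"));
    }
}
```

Before overload (1) existed, the first call would have resolved to (2); that silent change of target is the source-compatibility risk being discussed.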

From guy.steele at oracle.com  Tue Mar 12 17:41:56 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 12 Mar 2024 17:41:56 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
Message-ID: <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>

Now that Maurizio has provided a detailed explanation of prior investigations and some good examples, I am now convinced that the approach of providing a special-case conversion from String to StringTemplate is probably not a good idea.

Then here is the decision tree that I would suggest:

(1) If we decide that we do want, on its own merits, some up-front visual indication that distinguishes string literals from string templates, then it becomes easier to just say that strings and string templates are different beasts, neither a subtype of the other, and in particular (a) $"foo" (for example) is a degenerate string template, which is not the same as the string "foo"; (b) $"" is a simple way to write an empty string template, in case you need to initialize a variable of type StringTemplate to something non-null; and (c) APIs should consider providing pairs of methods, where in each pair one takes a String argument and the other takes a StringTemplate argument.

(2) If we decide we do not want that visual distinction, then we have the problem of whether "foo" can be used as both a string and a string template.

(2a) "foo" is only a string, not a string template. This leads to some of the overloading problems that Maurizio has described, though I note that instead of

   StringTemplate x = "";

we could recommend

  StringTemplate x = StringTemplate.EMPTY;

where StringTemplate provides a public static member named EMPTY.

(2b) "foo" can be used as both a string and a string template. In the absence of a special conversion, this would seem to require that String <: StringTemplate as Tagir suggests.


On Mar 12, 2024, at 1:08 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

OK, so let's summarize the EG discussion so far.  (As a reminder, syntax-heavy features like this are even more subject to "armchair theorization" than most, so please, take that into account when commenting.  As a further reminder, the best thing we could do right now is write more API code that manipulates string templates.)

Overall, I think everyone agrees that the "make string templates the star of the show" approach is a winning direction.  No one seems too busted up at the loss of processors.

I'm going to try and focus for now on "potential problems that might prompt further adjustment", rather than specific solutions.

There is some ambient discomfort that the "sublanguage" of a template becomes a dynamic property of a template, introducing new opportunities for users to make mistakes with unprocessed templates.  (This was present before as well using the RAW processor, but much less prominent.)  But, I don't think this is a significant issue, it's just something new to get used to.

Most of the concerns have to do with the visual similarity between string literals and template literals.  While this is of course intended, there are some concerns that they may be "too similar".  Concerns raised include:

 - In a code-generation scenario that leans on templates, sometimes we want to use a string literal as a degenerate form of template.  It may be surprising that this doesn't "just work", and alternatives (e.g., conversion functions, casting, etc) may have varying degrees of discoverability and yuck-factor.

 - Given (a) the visual similarity of string and template literals and (b) the lenient treatment of concatenation between strings and everything else, users may well be tempted to concatenate string literals with template literals, and may be surprised at the outcome.

 - Because template literals may be broad and wide, and their evaluation may involve side effects, we may want to give a lexical heads-up of "weird thing coming", rather than having template literals be framed more like "strings with benefits."

Have I covered the concerns raised so far?

Before we get too caught up in solutions, let's try to get on the same page about which of these are problems that need to be solved right now.


(As a small matter of housekeeping, given that the preview train is already rolling, we will soon have to make a decision to (a) withdraw the current preview entirely, (b) re-preview the current design even though we know it will change, or (c) gain the requisite confidence in a new design in time to preview that.  From my vantage point, (c) is starting to look increasingly unlikely, and I suspect (a) is a better choice than (b).  But I bring this up not to start a project management discussion, as much as to raise awareness that there are project management constraints.)


On 3/8/2024 1:35 PM, Brian Goetz wrote:

Time to check in with where we are with String Templates.  We've gone through two rounds of preview, and have received some feedback.

As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know.  This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc.  And the best feedback usually comes from using the feature "in anger", that is, trying to actually write code with it.  ("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)

In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature.  Specifically, the role of processors is "outsized" relative to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all!  This is a very positive development.

First, I want to affirm that the goals of the project have not changed.  From JEP 459:

Goals

- Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
- Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
- Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
- Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
- Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
- Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.

Non-Goals

- It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
- It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.

Another thing that has not changed is our view on the syntax for embedding expressions.  While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round.  (In fact, while syntax disagreements are often purely subjective, this one was far more clear: the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild.  This has all been more than adequately covered elsewhere, so I won't rehash it here.)


Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.

Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.)  However, Java already has a well established means of abstracting behavior: methods.   (In fact, a processor application can be viewed as merely a new syntax for a method call.)  Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:

  DB."... template ..."

When we could use an ordinary Java library:

  Query q = Query.of("...template...")

Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood.  (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.
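
To make the "ordinary Java library" alternative concrete, here is a minimal sketch of what a factory like Query.of might do. Everything in it is an assumption made for illustration: SqlTemplate is a hand-rolled stand-in for StringTemplate (a list of literal fragments plus a list of embedded values), and Query is a hypothetical value type, not an existing API. The point is only that the literal text and the embedded values stay separate, so values can be bound as parameters instead of being spliced into the SQL text.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate: literal fragments and embedded values.
// Assumes fragments.size() == values.size() + 1.
record SqlTemplate(List<String> fragments, List<Object> values) {}

// Hypothetical query type: parameterized SQL text plus the values to bind.
record Query(String sql, List<Object> params) {
    static Query of(SqlTemplate template) {
        // Each embedded value becomes a "?" placeholder between two fragments.
        String sql = String.join("?", template.fragments());
        return new Query(sql, template.values());
    }
}
```

With a template literal this would presumably be written as Query.of("SELECT * FROM users WHERE name = \{name}"); in the sketch the same thing is spelled new SqlTemplate(List.of("SELECT * FROM users WHERE name = ", ""), List.of(name)).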

This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place.  And it turned out, this choice had been influenced (perhaps overly so) by early implementation experiments.  (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper.  Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting.  The FMT processor brought that cost back in line with the equivalent concatenation.)  These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time.  This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.

At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:

  StringTemplate st = "Hello \{name}";

String and StringTemplate remain unrelated types.  (We explored a number of ways to interconvert them, but they caused more trouble than they solved.)  Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.

For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:

   void println(String) { ... }
   void println(StringTemplate) { ... interpolate and delegate to println(String) ... }

The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:

   System.out.println("Hello \{name}");

In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and overload resolution chooses the StringTemplate overload of println, which in turn chooses how to process the template.  This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code, but it allows that choice to be made by libraries when the library is comfortable doing so.
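
As a rough sketch of that overload pair, under stated assumptions: SimpleTemplate below is a hypothetical stand-in for StringTemplate (it is constructed by hand, since the template literal syntax is not available outside the preview), interpolate() is an invented name for plain interpolation, and Printer is a toy class that collects output in memory rather than the real PrintWriter.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate.
// Assumes fragments.size() == values.size() + 1.
record SimpleTemplate(List<String> fragments, List<Object> values) {
    // Plain interpolation: fragment, value, fragment, value, ..., fragment.
    String interpolate() {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }
}

// Toy printer that records its output so the behavior is easy to inspect.
class Printer {
    private final StringBuilder out = new StringBuilder();

    void println(String s) {
        out.append(s).append('\n');
    }

    // The library, not the caller, decides that interpolation is safe here.
    void println(SimpleTemplate template) {
        println(template.interpolate());
    }

    String output() {
        return out.toString();
    }
}
```

The caller never asks for interpolation explicitly; passing a template to println is the whole selection, and the two overloads cannot be confused because the two argument types are unrelated.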

Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):

  String format(String formatString, Object... parameters) { ... same as today ... }
  String format(StringTemplate template) { ... equivalent of FMT ... }

And users can call this as:

  String s = String.format("Hello %12s\{name}");

Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required.  The user already chose to pass it to String::format; that's all the processing selection that is needed.
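
One way such an overload could work is sketched below. This is a guess at the general shape, not the FMT specification: FmtTemplate is a hypothetical stand-in for StringTemplate, TemplateFormat.format is an invented name, and the regular expression only recognizes simple specifiers (such as %d, %12s or %.2f) at the very end of a fragment. A fragment that ends in a specifier applies it to the value that follows; otherwise the value is formatted with %s. As with the existing format strings, a literal percent sign in a fragment would have to be written as %%.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for StringTemplate.
// Assumes fragments.size() == values.size() + 1.
record FmtTemplate(List<String> fragments, List<Object> values) {}

class TemplateFormat {
    // Simplified: a format specifier sitting at the very end of a fragment.
    private static final Pattern TRAILING_SPEC =
            Pattern.compile("%[-#+ 0,(]*\\d*(\\.\\d+)?[a-zA-Z]$");

    static String format(FmtTemplate template) {
        StringBuilder fmt = new StringBuilder();
        int valueCount = template.values().size();
        for (int i = 0; i < valueCount; i++) {
            String fragment = template.fragments().get(i);
            fmt.append(fragment);
            // No explicit specifier before the value: fall back to %s.
            if (!TRAILING_SPEC.matcher(fragment).find()) {
                fmt.append("%s");
            }
        }
        fmt.append(template.fragments().get(valueCount));
        // Delegate to the existing formatter with the rebuilt format string.
        return String.format(fmt.toString(), template.values().toArray());
    }
}
```

For the call shown above, the fragments would be "Hello %12s" and "", so the rebuilt format string is "Hello %12s" and the name is right-justified in a 12-character field.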

Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation).

The result is:

- The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
- Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
- StringTemplate is just another type that APIs can support if they want.  The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
- APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
- It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
- Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
- The language feature and API surface get considerably smaller, which is good.  Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.

The remaining question that everyone is probably asking is: "so how do we do interpolation?"  The answer there is "ordinary library methods".  This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).

This is a sketch of direction, so feel free to pose questions/comments on the direction.  We'll discuss the details as we go.



From guy.steele at oracle.com  Tue Mar 12 17:54:03 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 12 Mar 2024 17:54:03 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
Message-ID: <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>

I think I got my description of (2a) slightly wrong. Let me try again:

-----
(2a) "foo" is only a string, not a string template. In the absence of a special conversion, once again we are led to recommend that APIs provide pairs of methods, and I think we avoid most of the overloading problems that Maurizio has described. I note that instead of

   StringTemplate x = "";

we could recommend

  StringTemplate x = StringTemplate.EMPTY;

where StringTemplate provides a public static member named EMPTY. We do have the burden of explaining to users that "foo" is not a string template.
-----

Now, all that said, I will now provide my best attempt to support the idea that (2b) is better than (2a):

It will be difficult to explain to the user a design such that the syntax of strings appears to be an obvious edge case of the syntax of string templates (the case where the number of interpolated expressions is zero) but the semantics of strings are not the obvious and analogous edge case of the semantics of string templates.

(1) avoids this problem by making the syntaxes different. (2b) avoids the problem by making the semantics match. But (2a) totally has this problem.


On Mar 12, 2024, at 1:41?PM, Guy Steele <guy.steele at oracle.com> wrote:

Now that Maurizio has provided a delated explanation of prior investigations and some good examples, I am now convinced that the approach of providing a special-case conversion from String to StringTemplate is probably not a good idea.

Then here is the decision tree that I would suggest:

(1) If we decide that we do want, on its own merits, some up-front visual indication that distinguishes string literals from string templates, then it becomes easier to just say that strings and string templates are different beasts, neither a subtype of the other, and in particular (a) $?foo? (for example) is a degenerate string template, which is not the same as the string ?foo?; (b) $?? is a simple way to write an empty string template, in case you need to initialize a variable of type StringTemplate to something non-null; and (c) APIs should consider providing pairs of methods, where in each pair one takes a String argument and the other takes a StringTemplate argument.

(2) If we decide we do not want that visual distinction, then we have the problem of whether ?foo? can be used as both a string and a string template.

(2a) ?foo? is only a string, not a string template. This leads to some of the overloading problems that Maurizio has described, though I note that instead of

   StringTemplate x = ??;

we could recommend

  StringTemplate x = StringTemplate.EMPTY;

where StringTemplate provides a public static member named EMPTY.

(2b) ?foo? can be used as both a string and a string template. In the absence of a special conversion, this would seem to require that String <: StringTemplate as Tagir suggests.


On Mar 12, 2024, at 1:08?PM, Brian Goetz <brian.goetz at oracle.com> wrote:

OK, so let's summarize the EG discussion so far.  (As a reminder, syntax-heavy features like this are even more subject to "armchair theorization" than most, so please, take that into account when commenting.  As a further reminder, the best thing we could do right now is write more API code that manipulates string templates.)

Overall, I think everyone agrees that the "make string templates the star of the show" approach is a winning direction.  No one seems too busted up at the loss of processors.

I'm going to try and focus for now on "potential problems that might prompt further adjustment", rather than specific solutions.

There is some ambient discomfort that the "sublanguage" of a template becomes a dynamic property of a template, introducing new opportunities for users to make mistakes with unprocessed templates.  (This was present before as well using the RAW processor, but much less prominent.)  But, I don't think this is a significant issue, its just something new to get used to.

Most of the concerns have to do with the visual similarity between string literals and template literals.  While this is of course intended, there are some concerns that they may be "too similar".  Concerns raised include:

 - In a code-generation scenario that leans on templates, sometimes we want to use a string literal as a degenerate form of template.  It may be surprising that this doesn't "just work", and alternatives (e.g., conversion functions, casting, etc) may have varying degrees of discoverability and yuck-factor.

 - Given (a) the visual similarity of string and template literals and (b) the lenient treatment of concatenation between strings and everything else, users may well be tempted to concatenate string literals with template literals, and may be surprised at the outcome.

 - Because template literals may be broad and wide, and their evaluation may involve side effects, we may want to give a lexical heads-up of "weird thing coming", rather than having template literals be framed more like "strings with benefits."

Have I covered the concerns raised so far?

Before we get too caught up in solutions, let's try to get on the same page about which of these are problems that need to be solved right now.


(As a small matter of housekeeping, given that the preview train is already rolling, we will soon have to make a decision to (a) withdraw the current preview entirely, (b) re-preview the current design even though we know it will change, or (c) gain the requisite confidence in a new design in time to preview that.  From my vantage point, (c) is starting to look increasingly unlikely, and I suspect (a) is a better choice than (b).  But I bring this up not to start a project management discussions, as much as to raise awareness that there are project management constraints.)


On 3/8/2024 1:35 PM, Brian Goetz wrote:

Time to check in with where were are with String Templates.  We?ve gone through two rounds of preview, and have received some feedback.

As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don?t already know.  This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc.    And the best feedback usually comes from using the feature ?in anger? ? trying to actually write code with it.  (?Some people would prefer a different syntax? or ?some people would prefer we focused on string interpolation only? fall squarely in the ?things we already knew? camp.)

In the course of using this feature in the `jextract` project, we did learn quite a few things we didn?t already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature.  Specifically, the role of processors is ?outsized? to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit ?processor? abstraction at all!  This is a very positive development.

First, I want to affirm that that the goals of the project have not changed.  From JEP 459:

Goals

? Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
? Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
? Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
? Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
? Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
? Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.

Non-Goals
? It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
? It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.

Another thing that has not changed is our view on the syntax for embedding expressions.  While many people did express the opinion of ?why not ?just' do what Kotlin/Scala does?, this issue was more than fully explored during the initial design round.  (In fact, while syntax disagreements are often purely subjective, this one was far more clear ? the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild.  This has all been more than adequately covered elsewhere, so I won?t rehash it here.)


Now, let?s talk about what we do think should change: the role of processors and the StringTemplate type.

Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.)  However, Java already has a well established means of abstracting behavior: methods.   (In fact, a processor application can be viewed as merely a new syntax for a method call.)  Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:

  DB.?? template ??

When we could use an ordinary Java library:

  Query q = Query.of(??template??)

Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood.  (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.

This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place.  And it turned out, this choice had been influenced ? perhaps overly so ? by early implementation experiments.  (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper.  Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting.  The FMT processor brough that cost back in line with the equivalent concatenation.)  These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time.  This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were ?just a new kind of literal? and the job performed by processors could instead be performed by ordinary APIs.

At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals (?Hello \{name}?) are simply the literal form of StringTemplate:

  StringTemplate st = ?Hello \{name}?;

String and StringTemplate remain unrelated types.  (We explored a number of ways to interconvert them, but they caused more trouble than they solved.)  Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.

For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:

   void println(String) { ? }
   void println(StringTemplate) { ? interpolate and delegate to println(String) ?. }

The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:

   System.out.println(?Hello \{name}?);

In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template.  This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code ? but it allows that choice to be made by libraries when the library is comfortable doing so.

Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., ?%d?):

  String format(String formatString, Object? parameters) { ? same as today ? }
  String format(StringTemplate template) {... equivalent of FMT ...}

And users can call this as:

  String s = String.format(?Hello %12s\{name}?);

Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required.  The user already chose to pass it to String::format; that?s all the processing selection that is needed.

Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation.).

The result is:

- The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
- Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
- StringTemplate is just another type that APIs can support if they want.  The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
- APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
- It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
- Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
- The language feature and API surface get considerably smaller, which is good.  Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.

The remaining question that everyone is probably asking is: ?so how do we do interpolation.?  The answer there is ?ordinary library methods?.  This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now.).

This is a sketch of direction, so feel free to pose questions/comments on the direction.  We'll discuss the details as we go.



From ccherlin at gmail.com  Tue Mar 12 22:08:07 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Tue, 12 Mar 2024 17:08:07 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
Message-ID: <CALEU8=xyA86F_60fhQO46_4X97tG42OMZr7OjxD2R50C+e1EDw@mail.gmail.com>

I agree overall with the problem statement, with the following
specific concerns:

1. Template literals (even ones with no embedded expressions) should
be visually and syntactically distinct from string literals, because
they are different types with different semantics. This visual
distinction should be immediately and obviously apparent when reading
code.

2. There should be an easy way to write a template literal with no
embedded expressions, including an empty one.

Motivation: A constant StringTemplate could, for example, be
concatenated (via method call, not '+') with non-constant
StringTemplate(s) without having to mix String and StringTemplate types.

3. It should not be too easy to accidentally write a template literal
when you mean to write a string literal, or vice-versa.

4. Changing a string literal to a template literal or vice-versa
should be an explicit decision, not an implicit one.

Conclusions:

- There should be no implicit conversions, and context should not be
required to determine whether a literal creates a String or a
StringTemplate value.

- Template literals should not differ from string literals solely by
their contents. A template literal should either have a different
quote character than a string literal, or have a mandatory prefix.

As a consequence, inserting an embedded expression into a string
literal without changing the quote character or adding the prefix
should produce a compile-time error.

Cheers,
Clement Cherlin

On Tue, Mar 12, 2024 at 2:20 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>
> OK, so let's summarize the EG discussion so far.  (As a reminder, syntax-heavy features like this are even more subject to "armchair theorization" than most, so please, take that into account when commenting.  As a further reminder, the best thing we could do right now is write more API code that manipulates string templates.)
>
> Overall, I think everyone agrees that the "make string templates the star of the show" approach is a winning direction.  No one seems too busted up at the loss of processors.
>
> I'm going to try and focus for now on "potential problems that might prompt further adjustment", rather than specific solutions.
>
> There is some ambient discomfort that the "sublanguage" of a template becomes a dynamic property of a template, introducing new opportunities for users to make mistakes with unprocessed templates.  (This was present before as well using the RAW processor, but much less prominent.)  But, I don't think this is a significant issue, it's just something new to get used to.
>
> Most of the concerns have to do with the visual similarity between string literals and template literals.  While this is of course intended, there are some concerns that they may be "too similar".  Concerns raised include:
>
>  - In a code-generation scenario that leans on templates, sometimes we want to use a string literal as a degenerate form of template.  It may be surprising that this doesn't "just work", and alternatives (e.g., conversion functions, casting, etc) may have varying degrees of discoverability and yuck-factor.
>
>  - Given (a) the visual similarity of string and template literals and (b) the lenient treatment of concatenation between strings and everything else, users may well be tempted to concatenate string literals with template literals, and may be surprised at the outcome.
>
>  - Because template literals may be broad and wide, and their evaluation may involve side effects, we may want to give a lexical heads-up of "weird thing coming", rather than having template literals be framed more like "strings with benefits."
>
> Have I covered the concerns raised so far?
>
> Before we get too caught up in solutions, let's try to get on the same page about which of these are problems that need to be solved right now.
>
>
> (As a small matter of housekeeping, given that the preview train is already rolling, we will soon have to make a decision to (a) withdraw the current preview entirely, (b) re-preview the current design even though we know it will change, or (c) gain the requisite confidence in a new design in time to preview that.  From my vantage point, (c) is starting to look increasingly unlikely, and I suspect (a) is a better choice than (b).  But I bring this up not to start a project management discussion, as much as to raise awareness that there are project management constraints.)
>
>
>
>
> On 3/8/2024 1:35 PM, Brian Goetz wrote:
>
>
> Time to check in with where we are with String Templates.  We've gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know.  This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc.  And the best feedback usually comes from using the feature "in anger": trying to actually write code with it.  ("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature.  Specifically, the role of processors is "outsized" to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all!  This is a very positive development.
>
> First, I want to affirm that the goals of the project have not changed.  From JEP 459:
>
> Goals
>
> - Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
> - Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
> - Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
> - Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
> - Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
> - Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.
>
> Non-Goals
> - It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
> - It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for embedding expressions.  While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round.  (In fact, while syntax disagreements are often purely subjective, this one was far more clear: the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild.  This has all been more than adequately covered elsewhere, so I won't rehash it here.)
>
>
> Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.)  However, Java already has a well established means of abstracting behavior: methods.   (In fact, a processor application can be viewed as merely a new syntax for a method call.)  Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:
>
>   DB."... template ..."
>
> When we could use an ordinary Java library:
>
>   Query q = Query.of("... template ...")
>
> Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood.  (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place.  And it turned out, this choice had been influenced, perhaps overly so, by early implementation experiments.  (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper.  Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting.  The FMT processor brought that cost back in line with the equivalent concatenation.)  These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time.  This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:
>
>   StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types.  (We explored a number of ways to interconvert them, but they caused more trouble than they solved.)  Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:
>
>    void println(String) { ... }
>    void println(StringTemplate) { ... interpolate and delegate to println(String) ... }
>
> The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:
>
>    System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template.  This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code, but it allows that choice to be made by libraries when the library is comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):
>
>   String format(String formatString, Object... parameters) { ... same as today ... }
>   String format(StringTemplate template) { ... equivalent of FMT ... }
>
> And users can call this as:
>
>   String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required.  The user already chose to pass it to String::format; that's all the processing selection that is needed.
>
> Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
> - StringTemplate is just another type that APIs can support if they want.  The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
> - It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
> - Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which is good.  Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do we do interpolation?"  The answer there is "ordinary library methods".  This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).
>
> This is a sketch of direction, so feel free to pose questions/comments on the direction.  We'll discuss the details as we go.
>
>
>

From maurizio.cimadamore at oracle.com  Wed Mar 13 10:29:21 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 13 Mar 2024 10:29:21 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>
Message-ID: <c8054d31-ebf5-49b3-a9e1-7553a4b52533@oracle.com>

Hi Tagir,
while subclassing is handy, I think it actively works against the goal 
of trying to make string handling any safer.

Let's consider the case of a /new/ API, that wants to do things the
/right/ way. This API will provide a StringTemplate-accepting factory.

But if clients can supply a value using |"foo" + bar|, then we're back
to where we started: the new API is no safer than a String-accepting
factory.

Note: there is a big difference between passing |"foo" + bar| and 
|"foo\{bar}"|. In the former, the library only gets a string. It has no 
way to distinguish between which values were user-provided, and which 
ones were constant. In the latter, the string template has a value. The 
library might need to analyze that value more carefully as it might come 
from outside.

The main value of string templates is to allow clients to capture what 
/can change/ and separate it from what /cannot/ change. Attacks 
typically lurk in the variable part. But eager interpolation (e.g. 
string +) destroys this separation.
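
To make the fragment/value separation concrete, here is a small sketch. `QueryTemplate` is a hypothetical stand-in for StringTemplate, and `Query` with its two `of` factories is an invented API used only for illustration. The template-accepting factory turns every embedded value into a bind parameter, while the String-accepting factory receives text in which the variable part can no longer be told apart from the constant part.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate: constant fragments plus captured values.
record QueryTemplate(List<String> fragments, List<Object> values) {}

// Hypothetical query type: SQL text with '?' placeholders plus bound parameters.
record Query(String sql, List<Object> params) {
    // Template factory: fragments become SQL text, values become parameters only.
    static Query of(QueryTemplate t) {
        return new Query(String.join("?", t.fragments()), List.copyOf(t.values()));
    }

    // String factory: the library sees one opaque string and nothing to vet.
    static Query of(String sql) {
        return new Query(sql, List.of());
    }
}

class SeparationDemo {
    public static void main(String[] args) {
        String bar = "x' OR '1'='1";

        // Stands in for Query.of("SELECT * FROM users WHERE name = \{bar}")
        Query viaTemplate = Query.of(new QueryTemplate(
                List.of("SELECT * FROM users WHERE name = ", ""),
                List.<Object>of(bar)));

        // Eager concatenation: the user-provided value is already part of the SQL text.
        Query viaConcat = Query.of("SELECT * FROM users WHERE name = '" + bar + "'");

        System.out.println(viaTemplate.sql() + "  params=" + viaTemplate.params());
        System.out.println(viaConcat.sql() + "  params=" + viaConcat.params());
    }
}
```

In the first case the suspicious value stays isolated in `params`; in the second it has already been merged into the SQL text before the library is called.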

> I think that most of the APIs will still provide String overload. 
> E.g., for preparing an SQL statement, it's a perfectly reasonable 
> scenario to have a constant string as the input. So
> prepareStatement(String) will stay along with 
> prepareStatement(StringTemplate). And people will still be able to use 
> concatenation. I don't think that the absence of String <: 
> StringTemplate relation will protect anybody from using the 
> concatenation. On the other hand, if String actually implements 
> StringTemplate, it will be a very simple static analysis rule to warn 
> if the concatenation occurs in this context. If the expected type for 
> concatenation is StringTemplate, then something is definitely wrong. 
> Without 'String implements StringTemplate', one will not be able to 
> write a concatenation directly in StringTemplate context. Instead, 
> String-accepting overload will be used, and the expected type will be 
> String, so static analyzer will have to guess whether it's dangerous 
> to use the concatenation here. In short, I think that it's actually an 
> advantage: we have an additional hint here that concatenation is 
> undesired. Even compilation warning could be possible to implement.
>
> So, I don't see these points as real disadvantages. I definitely like 
> this approach much more than adding any kind of implicit conversion or 
> another literal syntax, which would complicate the specification much 
> more.

I don't buy the argument that, since there are already String-accepting
APIs in the wild, we can never be safer than that. The String-accepting
variant can be deprecated, if need be.

Maurizio


From maurizio.cimadamore at oracle.com  Wed Mar 13 14:36:28 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 13 Mar 2024 14:36:28 +0000
Subject: Does String extend StringTemplate? (Was: Update on String
 Templates (JEP 459))
In-Reply-To: <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>
 <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com>
Message-ID: <82613b9b-0c51-4e30-a3c8-30843f1691f1@oracle.com>

Hi Brian.
I believe this is ultimately a bad idea. Note that I've been a strong
supporter of this position in the past.

Now, onto the reason I think it's a bad idea. Let's ignore legacy API
for now. Let's assume the world has already moved on and adopted
StringTemplates. A new library that needs to parse strings which contain
sensitive user-defined values has already got the memo, and will provide a
StringTemplate-accepting factory (not merely a String-accepting one).

This world is inherently safer than the world we have today - because a 
string template is typically composed of two facets:

  * a "variable part" (the template arguments)
  * a "constant part" (the template fragments)

Libraries should focus their validation/escaping efforts on the variable 
part of a string template. But, for this assumption to hold water, we 
need to be able to guarantee that the user cannot accidentally sneak in 
some "variable parts" into the "constant part" of a string template.
Unfortunately, the String <: StringTemplate approach seems to allow 
exactly that:

    Foo of(StringTemplate) { ... }
    Foo.of("Hello!")          // ok, string is a constant
    Foo.of("Hello" + world)   // what?

If messy, concatenated strings can be treated as degenerate templates, I 
believe we'd be no better off than we are today - even in the case of
brand new API that fully bought into the idea of StringTemplate.

Maurizio

On 12/03/2024 17:32, Brian Goetz wrote:

> Splitting off into a separate thread.
>
> I would like to redirect this discussion from the mechanical 
> challenges and consequences to the goals and semantics.
>
> If we are considering "String extends StringTemplate", we are making a
> semantic statement that a String *is-a* StringTemplate.  While I can
> imagine convincing oneself that this is true "if you look at it
> right", this sets off all my "self-justification" detectors.
>
> So, I recommend we step back and examine why we think this is a good
> idea before we descend into the mechanics.  My suspicion is that this
> is motivated by "I want to be able to automatically use String where a
> StringTemplate is desired", and that this seems a clever-enough hack
> to get there.  (I think we probably also need to drill further, into
> "why do we think it is important to be able to use String where
> StringTemplate is desired", and I suspect further that part of it will
> be "but the APIs are not yet fully equilibrated" (which would be a
> truly bad reason to give String a new supertype.))
>
>
>
>
> On 3/12/2024 1:24 PM, Tagir Valeev wrote:
>> Hello, Maurizio!
>>
>> Thank you for the detailed explanation!
>>
>> On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com> wrote:
>>
>>     Hi all,
>>     we tried mainly three approaches to allow smoother interop
>>     between strings and string templates: (a) make String a subclass
>>     of StringTemplate, (b) make constant strings be /convertible/
>>     to string templates, or (c) use target-typing. All these
>>     approaches have some issues, discussed below.
>>
>>     The first approach is slightly simpler, because it can be
>>     achieved entirely outside of the Java language. Unfortunately,
>>     adding "String implements StringTemplate" adds overload
>>     ambiguities in cases such as this:
>>
>>         format(StringTemplate)      // 1
>>         format(String, Object...)   // 2
>>
>>     This is actually a very important case, as we predict that
>>     StringTemplate will serve as a great replacement for methods out
>>     there accepting a string/Object... pack.
>>
>>     Unfortunately, if String <: StringTemplate, this means that
>>     calling format with a string literal will resolve to (1), not (2)
>>     as before. The problem here is that (2) is not even applicable
>>     during the first two overload resolution phases (which are only
>>     allowed to use subtyping and conversions, respectively), as it is
>>     a varargs method. Because of this, (1) will now take precedence,
>>     as that's not varargs. While for String::format this
>>     is probably harmless, changing results of overload selection is
>>     something that should be done with care (esp. if different
>>     overloads have different return types), as it could lead to
>>     source compatibility issues.
>>
>> I would still like to advocate for String <: StringTemplate solution. 
>> I think that the overloading is not a big problem. Simply making 
>> String implements StringTemplate will not break any of existing code 
>> because there are no APIs yet that accept the StringTemplate 
>> instance. The problem may appear only when an API author actually 
>> adds such an overload and does this in an incompatible way with an 
>> existing String overload. This would be an extremely bad design 
>> choice, and the blame goes to the API author. You've correctly 
>> mentioned that for String::format this is harmless because the API is 
>> well-designed. We may suggest in StringTemplate documentation that 
>> the API designers should provide the same behavior for foo(String) 
>> and foo(StringTemplate) when they add an overload.
>>
>> I must say that we already had an experience of introducing new 
>> interfaces in the hierarchy of widely-used library classes. Closeable
>> got an AutoCloseable parent, StringBuilder became Comparable, and so on.
>> So far, the compatibility issues introduced were tolerable. Well, 
>> probably I'm missing something but we have preview rounds just for 
>> this purpose: to find out the disadvantages of the approach.
>>
>>     On top of these issues, making all strings be string templates
>>     has the disadvantage of also treating "messy" strings obtained
>>     via concatenation of non-constant values as string templates too,
>>     which seems bad.
>>
>> I think that most of the APIs will still provide String overload. 
>> E.g., for preparing an SQL statement, it's a perfectly reasonable 
>> scenario to have a constant string as the input. So
>> prepareStatement(String) will stay along with 
>> prepareStatement(StringTemplate). And people will still be able to 
>> use concatenation. I don't think that the absence of String <: 
>> StringTemplate relation will protect anybody from using the 
>> concatenation. On the other hand, if String actually implements 
>> StringTemplate, it will be a very simple static analysis rule to warn 
>> if the concatenation occurs in this context. If the expected type for 
>> concatenation is StringTemplate, then something is definitely wrong. 
>> Without 'String implements StringTemplate', one will not be able to 
>> write a concatenation directly in StringTemplate context. Instead, 
>> String-accepting overload will be used, and the expected type will be 
>> String, so static analyzer will have to guess whether it's dangerous 
>> to use the concatenation here. In short, I think that it's actually 
>> an advantage: we have an additional hint here that concatenation is 
>> undesired. Even compilation warning could be possible to implement.
>>
>> So, I don't see these points as real disadvantages. I definitely like 
>> this approach much more than adding any kind of implicit conversion 
>> or another literal syntax, which would complicate the specification 
>> much more.
>>
>> With best regards,
>> Tagir Valeev.
>>
>
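
The overload-selection change described in the quoted message can be reproduced with today's types. In this sketch CharSequence stands in for StringTemplate (an assumption made so the example compiles on a current JDK, since String is already a subtype of CharSequence), and the class and method names are invented for illustration.

```java
class OverloadDemo {
    // (1) fixed arity, accepts a supertype of String
    static String format(CharSequence template) {
        return "(1) fixed-arity";
    }

    // (2) variable arity, accepts String exactly
    static String format(String fmt, Object... args) {
        return "(2) varargs";
    }

    public static void main(String[] args) {
        // One argument: (1) is applicable by subtyping in the first phase,
        // while (2) is only considered in the later variable-arity phase,
        // so (1) is selected even though (2) names String exactly.
        System.out.println(format("hello"));

        // Two arguments: only (2) is applicable, via variable-arity invocation.
        System.out.println(format("hello %s", 1));
    }
}
```

If overload (1) were removed, the first call would bind to (2) instead, which is the kind of silent change in selection the quoted message warns about.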

From archie.cobbs at gmail.com  Wed Mar 13 14:48:45 2024
From: archie.cobbs at gmail.com (Archie Cobbs)
Date: Wed, 13 Mar 2024 09:48:45 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
Message-ID: <CANSoFxtLXQ52b+LWQBhoUTz6HPjaCsy=ufnZOYeyQsgH6EwZEQ@mail.gmail.com>

On Tue, Mar 12, 2024 at 12:08 PM Brian Goetz <brian.goetz at oracle.com> wrote:

> Have I covered the concerns raised so far?
>

Thanks for the helpful discussion check-point.

This thread has touched on a lot of different bits so for a moment I want
to focus on one narrow question. Forget for a moment all the stuff about
method resolution, varargs, and whether String <: StringTemplate.

I was intrigued by this comment (Maurizio):

> Another, simpler, option we considered was to use some kind of prefix to
> mark a string template literal (e.g. make that explicit, instead of
> resorting to language wizardry). That works, but has the disadvantage of
> breaking the spell that there is only "one string literal", which is
> something we have worked quite hard to achieve.

What exactly is the advantage, in terms of the mental model of the
programmer, of having "one string literal"?

Maybe I'm just not seeing it.

I can understand the advantage of having String <: StringTemplate - that
gives me more flexibility when passing around objects - great! But do I
need that same flexibility with *literals*?

Consider how we handle float vs. double literals. They overlap for 32-bit
values, which is very convenient, but you can also "force" a narrower
interpretation by adding an "f" suffix. That seems like pretty much the
best of both worlds to me.

So is this an analogous situation? Then we'd allow a StringTemplate literal
to have an *optional* "$" prefix:

obj.takingString("abcd");             // ok - string
obj.takingTemplate("abcd");           // ok - template
obj.takingStringOrTemplate($"abcd");  // ok - template
obj.takingStringOrTemplate("abcd");   // ok - string or template (personally I don't care)
obj.takingString($"abcd");            // fail
obj.takingTemplate($"abcd");          // ok - template
obj.takingString("x = \{var}");       // fail
obj.takingTemplate("x = \{var}");     // ok - template

Thanks,
-Archie

-- 
Archie L. Cobbs

From maurizio.cimadamore at oracle.com  Wed Mar 13 15:40:12 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 13 Mar 2024 15:40:12 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CANSoFxtLXQ52b+LWQBhoUTz6HPjaCsy=ufnZOYeyQsgH6EwZEQ@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <CANSoFxtLXQ52b+LWQBhoUTz6HPjaCsy=ufnZOYeyQsgH6EwZEQ@mail.gmail.com>
Message-ID: <1d408e14-c41e-454f-94f1-3e0b597520e7@oracle.com>

I don't disagree.

After some more pondering, while text blocks and string literals clearly
have a lot of overlap - e.g. they both end up being String objects,
that's not the case with string templates.

So, if the following assignment fails:

|String s = "foo \{bar}"; |

One might argue that perhaps the syntax should be "more obviously
different".

Maurizio

On 13/03/2024 14:48, Archie Cobbs wrote:

> Consider how we handle float vs. double literals. They overlap for 
> 32-bit values, which is very convenient, but you can also "force" a 
> narrower interpretation by adding an "f" suffix. That seems like 
> pretty much the best of both worlds to me.
>
> So is this an analogous situation? Then we'd allow a StringTemplate 
> literal to have an /optional/ "$" prefix:
>
> obj.takingString("abcd"); // ok - string
> obj.takingTemplate("abcd"); // ok - template
> obj.takingStringOrTemplate($"abcd"); // ok - template
> obj.takingStringOrTemplate("abcd"); // ok - string or template (personally I don't care)
> obj.takingString($"abcd"); // fail
> obj.takingTemplate($"abcd"); // ok - template
> obj.takingString("x = \{var}"); // fail
> obj.takingTemplate("x = \{var}"); // ok - template


From maurizio.cimadamore at oracle.com  Wed Mar 13 15:45:49 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 13 Mar 2024 15:45:49 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
Message-ID: <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>

Hi Guy,

On 12/03/2024 17:54, Guy Steele wrote:
> (1) avoids this problem by making the syntaxes different. (2b) avoids 
> the problem by making the semantics match. But (2a) totally has this 
> problem.

I agree that 2a leaves us in a place that is suboptimal.

I think 2b is also undesirable (as I explained elsewhere), as it would 
compromise the design goals of the feature too much IMHO.

So, the choice is (also IMHO) between an ad-hoc conversion (with the 
problems that I described in my previous email) and a different literal 
syntax (your (1)).

Maurizio


From guy.steele at oracle.com  Wed Mar 13 18:03:02 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2024 18:03:02 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
Message-ID: <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>

Now that Maurizio has made quite clear the need for string templates to be understood as something distinct from strings, when used for security-related purposes (the need to ensure that material from interpolated expressions is vetted by the template processor), I agree that 2b is undesirable. We do not want to tempt users to think that

"Hello, \{x}"

and

"Hello " + x

are completely interchangeable.

On the other hand, there are applications where vetting is not important, and while the rest of this sentence is explicitly stated as a non-goal of JEP 459, I suspect we do, secretly, actually want users to feel free to use templates rather than "+" concatenation to construct "plain old unvetted strings". In the current state of JEP 459, this can be indicated in a clear way:

STR."Hello, \{x}"

And of course STR can be replaced by the name of some other template processor.

Brian has now proposed that the template processor mechanism is clunky and redundant, and would be better handled by just providing methods that take arguments of type StringTemplate. Sounds good to me. In that world, we would probably want a template processor method that takes a StringTemplate and just does obvious, unvetted string concatenation after doing `toString` on each of the expression values. An obvious name for this method is `String.of`. So we would write

String.of("Hello, \{x}")

But this is unsatisfying because it is verbose.

I suggest that, rather than having a bit of prefix syntax that allows specification of any template processor, all we really need is a very concise prefix syntax that distinguishes the STR case from all other cases, the assumption being that all other cases do vetting of some sort (else they would just accept strings rather than string templates). That, plus Archie's recent suggestion that "$" be optional, leads me to suggest the following approach (which I suspect might be a good compromise because I expect that nearly everyone in this discussion will dislike some aspect of it :-) :

---------
String is not a subtype of StringTemplate; they are disjoint types.

        $"foo"             is a (trivial) string template literal
        "foo"              is a string literal
        $"Hello, \{x}"     is a (nontrivial) string template literal
        "Hello, \{x}"      is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
---------

Thus you always need "$" to be present before the leading double quote to get a template value. If there is no "$" before the leading double quote, you get a string value. String literals are constant expressions, but if what otherwise looks like a string literal (no leading "$") contains "\{", then it is not a constant expression (and if having a constant expression is important to some user, that user should use "+" concatenation instead). We would need to think of a good name for "what otherwise looks like a string literal (no leading '$') but contains '\{'"; right now, the best I can think of is "string interpolation literal", but it isn't really a literal, it's an expression. Maybe the right terms are:

        $"foo"             trivial string template expression
        "foo"              string literal
        $"Hello, \{x}"     nontrivial string template expression
        "Hello, \{x}"      string interpolation expression

APIs that need to vet things can provide methods that accept string templates but no methods that accept strings; type checking will then prevent the accident of writing

SQL.process("INSERT INTO Students (name) VALUES (\{new name});");

when it should have been

SQL.process($"INSERT INTO Students (name) VALUES (\{new name});");

(This example is of course borrowed from the explanation of "Little Bobby Tables" over at the Explain XKCD wiki https://www.explainxkcd.com/wiki/index.php/Robert%27);_DROP_TABLE_Students;-- .)
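
The type-checking argument can be tried out in today's Java without any new syntax. What follows is only a minimal sketch under stated assumptions: `Template`, `Query` and `process` are hypothetical stand-ins invented for illustration (they are not the JEP 459 API), and the template is built by hand because the `$"..."` form is only a proposal in this thread.

```java
import java.util.List;

public class VettedApiSketch {
    // Hypothetical stand-in for StringTemplate: n values sit between n + 1 fragments.
    // List.copyOf rejects null elements, which is acceptable for this sketch.
    record Template(List<String> fragments, List<Object> values) {
        Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need exactly one more fragment than values");
            }
            fragments = List.copyOf(fragments);
            values = List.copyOf(values);
        }
    }

    // Hypothetical result type: SQL text with placeholders, values kept separate.
    record Query(String sql, List<Object> params) {}

    // The only overload takes a Template. There is deliberately no process(String),
    // so an already-concatenated String cannot reach this method.
    static Query process(Template t) {
        StringBuilder sql = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            // Each embedded value becomes a placeholder instead of being spliced into the text.
            sql.append('?').append(t.fragments().get(i + 1));
        }
        return new Query(sql.toString(), t.values());
    }

    public static void main(String[] args) {
        String name = "Robert'); DROP TABLE Students;--";
        Template t = new Template(
                List.of("INSERT INTO Students (name) VALUES (", ");"),
                List.of(name));
        Query q = process(t);
        System.out.println(q.sql());    // INSERT INTO Students (name) VALUES (?);
        System.out.println(q.params()); // [Robert'); DROP TABLE Students;--]
        // process("INSERT INTO Students (name) VALUES (" + name + ");");
        // The line above would not compile: no process(String) overload exists.
    }
}
```

The point of the sketch is only that the absence of a `String` overload turns the mistake into a compile-time error; how a real processor would vet or bind the values is left open.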


On Mar 13, 2024, at 11:45 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> Hi Guy,
>
> On 12/03/2024 17:54, Guy Steele wrote:
>> (1) avoids this problem by making the syntaxes different. (2b) avoids the problem by making the semantics match. But (2a) totally has this problem.
>
> I agree that 2a leaves us in a place that is suboptimal.
>
> I think 2b is also undesirable (as I explained elsewhere), as it would compromise the design goals of the feature too much IMHO.
>
> So, the choice is (also IMHO) between an ad-hoc conversion (with the problems that I described in my previous email) and a different literal syntax (your (1)).
>
> Maurizio



From brian.goetz at oracle.com  Wed Mar 13 19:05:59 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 13 Mar 2024 15:05:59 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CANSoFxtLXQ52b+LWQBhoUTz6HPjaCsy=ufnZOYeyQsgH6EwZEQ@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <CANSoFxtLXQ52b+LWQBhoUTz6HPjaCsy=ufnZOYeyQsgH6EwZEQ@mail.gmail.com>
Message-ID: <ecedb63d-4f62-4549-9000-e6868580a934@oracle.com>


On 3/13/2024 10:48 AM, Archie Cobbs wrote:
>
> I was intrigued by this comment (Maurizio):
>
> > Another, simpler, option we considered was to use some kind of 
> prefix to mark a string template literal (e.g. make that explicit, 
> instead of resorting to language wizardry). That works, but has the 
> disadvantage of breaking the spell that there is only "one string 
> literal", which is something we have worked quite hard to achieve.
>
> What exactly is the advantage, in terms of the mental model of the 
> programmer, of having "one string literal"?

When we started doing text blocks, we did a survey of string 
literal-like features in other languages, and were a little concerned 
that a lot of languages had a proliferation of different kinds of 
strings with different rules.  (An example of "different rules for 
different kinds of strings" would be that $ is a regular character in a 
string literal, but an escape character in an interpolated string.)

Before we figured out the design center of text blocks (think 
"two-dimensional string literals"), there were a number of envisioned 
extension directions for string literals -- multi-line, raw, embedded 
expressions, etc.  And because these extension directions are 
orthogonal, there could easily be 2^n kinds of string literal.  We 
didn't want to put users in a position of having to choose between e.g., 
"raw" and "multi-line", nor did we want to risk there being interactions 
between the rules for these different sub-kinds.

One technique we use to tie together these various forms is by having a 
common sub-language within the quotes; each of the forms uses the same 
set of escape sequences (though this set is extended with 
context-specific options, such as \{ for templates.)  Another is the 
delimiters; they are all "double-quote flavored", again to provide a 
sense that these are all projections of the same core literal feature.  
The more we wander from this center, the more we risk ending up with 
locally-sane but globally-inconsistent sub-features.


> Maybe I'm just not seeing it.
>
> I can understand the advantage of having String <: StringTemplate - 
> that gives me more flexibility when passing around objects - great! 
> But do I need that same flexibility with /literals/?
>
> Consider how we handle float vs. double literals. They overlap for 
> 32-bit values, which is very convenient, but you can also "force" a 
> narrower interpretation by adding an "f" suffix. That seems like 
> pretty much the best of both worlds to me.
>
> So is this an analogous situation? Then we'd allow a StringTemplate 
> literal to have an /optional/ "$" prefix:
>
> obj.takingString("abcd"); // ok - string
> obj.takingTemplate("abcd"); // ok - template
> obj.takingStringOrTemplate($"abcd"); // ok - template
> obj.takingStringOrTemplate("abcd"); // ok - string or template 
> (personally I don't care)
> obj.takingString($"abcd"); // fail
> obj.takingTemplate($"abcd"); // ok - template
> obj.takingString("x = \{var}"); // fail
> obj.takingTemplate("x = \{var}"); // ok - template
>
> Thanks,
> -Archie
>
> -- 
> Archie L. Cobbs

From john.r.rose at oracle.com  Wed Mar 13 19:33:19 2024
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 13 Mar 2024 12:33:19 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
Message-ID: <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>

On 9 Mar 2024, at 3:48, Tagir Valeev wrote:

> The idea is interesting. There's a thing that disturbs me though.
> Currently, proc."string" and proc."string \{template}" are uniformly
> processed, and the processor may not care much about whether it's a string
> or a template: both can be processed uniformly. After this change, removing
> the last embedded expression from the template (e.g., after inlining a
> constant) will implicitly change the type of the literal from
> StringTemplate to String. This may either cause a compilation error, or
> silently bind to another overload which may or may not behave like a
> template overload with a single-fragment-template. For API authors, this
> means that every method accepting StringTemplate should have a counterpart
> accepting String. The logic inside both methods would likely be very
> similar, so probably both will eventually call a third private method. For
> API user, it could be unclear how to call a method accepting StringTemplate
> if I have simple string in hands but there's no String method (or it does
> slightly different thing due to poor API design). Should I use some ugly
> construct like "This is a string but the API wants a template, so I append
> an empty embedded expression\{""}"?

This is a huge thread that I hesitate to dive into, but here's me putting in one toe:  Why do we care so much about no-arg string templates?  It's a small corner case!  The workarounds (for the no-arg case) are totally straightforward even if the string template literals (as a syntax) are required to have at least one argument.

Can we have a plausible use case, please, for why a ST with no arguments would be important, so important that we are motivated to invent a sigil syntax or special type system rules, to avoid requiring the user to invoke a static factory?

Also, Tagir's workaround of adding a fake argument looks like it would work just fine, of course depending on which processor was eventually used.

And in that vein let me add one new (very bike-sheddy) suggestion before I beat a hasty retreat:  Instead of in (1) a sigil before the quote like Guy's $"hello", put it (1b) after the quote, and in the ST case only.  The ST syntax could explicitly allow that a no-arg string template would be spelled with a leading sequence "\{}... which looks like the coder started writing a ST argument, but in fact dropped it.  So "hello" is a 5-char string, in any context.  And "\{}hello" is a 5-char no-arg string template, in any context.  That's Tagir's workaround, elevated a bit into a new corner case of (existing) syntax.

But even that teeny bit of syntax strikes me as overkill, because I don't see the importance of the use cases (no-arg STs) it helps.  Just call ST.of("hello") and call it a day.

In any case, it seems fine to let the IDE take the lead with no-arg STs, helping the user decide when and how to disambiguate strings from no-arg STs.  Putting in syntax or type system help for this is surely more expensive than punting to the IDE, unless there is going to be heavy use of no-arg STs for some use cases I am not seeing.

From guy.steele at oracle.com  Wed Mar 13 20:13:30 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2024 20:13:30 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
Message-ID: <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>


> On Mar 13, 2024, at 3:33 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> On 9 Mar 2024, at 3:48, Tagir Valeev wrote:
> 
>> The idea is interesting. There's a thing that disturbs me though.
>> Currently, proc."string" and proc."string \{template}" are uniformly
>> processed, and the processor may not care much about whether it's a string
>> or a template: both can be processed uniformly. After this change, removing
>> the last embedded expression from the template (e.g., after inlining a
>> constant) will implicitly change the type of the literal from
>> StringTemplate to String. This may either cause a compilation error, or
>> silently bind to another overload which may or may not behave like a
>> template overload with a single-fragment-template. For API authors, this
>> means that every method accepting StringTemplate should have a counterpart
>> accepting String. The logic inside both methods would likely be very
>> similar, so probably both will eventually call a third private method. For
>> API user, it could be unclear how to call a method accepting StringTemplate
>> if I have simple string in hands but there's no String method (or it does
>> slightly different thing due to poor API design). Should I use some ugly
>> construct like "This is a string but the API wants a template, so I append
>> an empty embedded expression\{""}"?
> 
> This is a huge thread that I hesitate to dive into, but here's me putting in one toe:  Why do we care so much about no-arg string templates?  It's a small corner case!  The workarounds (for the no-arg case) are totally straightforward even if the string template literals (as a syntax) are required to have at least one argument.
> 
> Can we have a plausible use case, please, for why a ST with no arguments would be important, so important that we are motivated to invent a sigil syntax or special type system rules, to avoid requiring the user to invoke a static factory?
> 
> Also, Tagir's workaround of adding a fake argument looks like it would work just fine, of course depending on which processor was eventually used.
> 
> And in that vein let me add one new (very bike-sheddy) suggestion before I beat a hasty retreat:  Instead of in (1) a sigil before the quote like Guy's $"hello", put it (1b) after the quote, and in the ST case only.  The ST syntax could explicitly allow that a no-arg string template would be spelled with a leading sequence "\{}... which looks like the coder started writing a ST argument, but in fact dropped it.  So "hello" is a 5-char string, in any context.  And "\{}hello" is a 5-char no-arg string template, in any context.  That's Tagir's workaround, elevated a bit into a new corner case of (existing) syntax.
> 
> But even that teeny bit of syntax strikes me as overkill, because I don't see the importance of the use cases (no-arg STs) it helps.  Just call ST.of("hello") and call it a day.
> 
> In any case, it seems fine to let the IDE take the lead with no-arg STs, helping the user decide when and how to disambiguate strings from no-arg STs.  Putting in syntax or type system help for this is surely more expensive than punting to the IDE, unless there is going to be heavy use of no-arg STs for some use cases I am not seeing.

Well, just off the top of my head as a thought experiment, if I had a series of SQL commands to process, some with arguments and some not, I would rather write

SQL.process($"CREATE TABLE foo;");
SQL.process($"ALTER TABLE foo ADD name varchar(40);");
SQL.process($"ALTER TABLE foo ADD title varchar(30);");
SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

than

SQL.process(ST.of("CREATE TABLE foo;"));
SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

especially if I thought that maybe down the road I might want to change the constants 30 and 40 and 'Hacker' to variables. I don't want to have to keep adding and deleting calls to ST.of as I edit the template strings during program development to have different numbers of interpolated expressions.

-Guy

From forax at univ-mlv.fr  Wed Mar 13 20:34:37 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 13 Mar 2024 21:34:37 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To: <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
Message-ID: <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "Guy Steele" <guy.steele at oracle.com>
> To: "John Rose" <john.r.rose at oracle.com>
> Cc: "Tagir Valeev" <amaembo at gmail.com>, "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.org>
> Sent: Wednesday, March 13, 2024 9:13:30 PM
> Subject: Re: Update on String Templates (JEP 459)

>> On Mar 13, 2024, at 3:33 PM, John Rose <john.r.rose at oracle.com> wrote:
>> 
>> On 9 Mar 2024, at 3:48, Tagir Valeev wrote:
>> 
>>> The idea is interesting. There's a thing that disturbs me though.
>>> Currently, proc."string" and proc."string \{template}" are uniformly
>>> processed, and the processor may not care much about whether it's a string
>>> or a template: both can be processed uniformly. After this change, removing
>>> the last embedded expression from the template (e.g., after inlining a
>>> constant) will implicitly change the type of the literal from
>>> StringTemplate to String. This may either cause a compilation error, or
>>> silently bind to another overload which may or may not behave like a
>>> template overload with a single-fragment-template. For API authors, this
>>> means that every method accepting StringTemplate should have a counterpart
>>> accepting String. The logic inside both methods would likely be very
>>> similar, so probably both will eventually call a third private method. For
>>> API user, it could be unclear how to call a method accepting StringTemplate
>>> if I have simple string in hands but there's no String method (or it does
>>> slightly different thing due to poor API design). Should I use some ugly
>>> construct like "This is a string but the API wants a template, so I append
>>> an empty embedded expression\{""}"?
>> 
>> This is a huge thread that I hesitate to dive into, but here's me putting in one
>> toe:  Why do we care so much about no-arg string templates?  It's a small
>> corner case!  The workarounds (for the no-arg case) are totally straightforward
>> even if the string template literals (as a syntax) are required to have at
>> least one argument.
>> 
>> Can we have a plausible use case, please, for why a ST with no arguments would
>> be important, so important that we are motivated to invent a sigil syntax or
>> special type system rules, to avoid requiring the user to invoke a static
>> factory?
>> 
>> Also, Tagir's workaround of adding a fake argument looks like it would work just
>> fine, of course depending on which processor was eventually used.
>> 
>> And in that vein let me add one new (very bike-sheddy) suggestion before I beat
>> a hasty retreat:  Instead of in (1) a sigil before the quote like Guy's
>> $"hello", put it (1b) after the quote, and in the ST case only.  The ST syntax
>> could explicitly allow that a no-arg string template would be spelled with a
>> leading sequence "\{}... which looks like the coder started writing a ST
>> argument, but in fact dropped it.  So "hello" is a 5-char string, in any
>> context.  And "\{}hello" is a 5-char no-arg string template, in any context.
>> That's Tagir's workaround, elevated a bit into a new corner case of (existing)
>> syntax.
>> 
>> But even that teeny bit of syntax strikes me as overkill, because I don't see
>> the importance of the use cases (no-arg STs) it helps.  Just call
>> ST.of("hello") and call it a day.
>> 
>> In any case, it seems fine to let the IDE take the lead with no-arg STs, helping
>> the user decide when and how to disambiguate strings from no-arg STs.  Putting
>> in syntax or type system help for this is surely more expensive than punting to
>> the IDE, unless there is going to be heavy use of no-arg STs for some use cases
>> I am not seeing.
> 
> Well, just off the top of my head as a thought experiment, if I had a series of
> SQL commands to process, some with arguments and some not, I would rather write
> 
> SQL.process($"CREATE TABLE foo;");
> SQL.process($"ALTER TABLE foo ADD name varchar(40);");
> SQL.process($"ALTER TABLE foo ADD title varchar(30);");
> SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other
> job});");
> 
> than
> 
> SQL.process(ST.of("CREATE TABLE foo;"));
> SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
> SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
> SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
> SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other
> job});");
> 
> especially if I thought that maybe down the road I might want to change the
> constants 30 and 40 and 'Hacker' to variables. I don't want to have to keep
> adding and deleting calls to ST.of as I edit the template strings during
> program development to have different numbers of interpolated expressions.

Given what Maurizio said and this, I think the only missing piece of the puzzle is what to do about existing methods that take a String parameter.

We know that for SQL.process(), we do not want process() to take a String but only a StringTemplate.
But what about the existing methods that take a String?

Given a method Logger.warning(String), should
  LOG.warning($"CREATE TABLE foo;");
  LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

be legal? Is there an auto-conversion (a kind of boxing conversion) from StringTemplate to String?


> 
> -Guy

Rémi

From guy.steele at oracle.com  Wed Mar 13 21:04:46 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2024 21:04:46 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>


> On Mar 13, 2024, at 4:34 PM, Remi Forax <forax at univ-mlv.fr> wrote:
> 
> ----- Original Message -----
>> From: "Guy Steele" <guy.steele at oracle.com>
>> To: "John Rose" <john.r.rose at oracle.com>
>> Cc: "Tagir Valeev" <amaembo at gmail.com>, "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts"
>> <amber-spec-experts at openjdk.org>
>> Sent: Wednesday, March 13, 2024 9:13:30 PM
>> Subject: Re: Update on String Templates (JEP 459)
> 
>>> On Mar 13, 2024, at 3:33 PM, John Rose <john.r.rose at oracle.com> wrote:
>>> 
>>> On 9 Mar 2024, at 3:48, Tagir Valeev wrote:
>>> 
>>>> The idea is interesting. There's a thing that disturbs me though.
>>>> Currently, proc."string" and proc."string \{template}" are uniformly
>>>> processed, and the processor may not care much about whether it's a string
>>>> or a template: both can be processed uniformly. After this change, removing
>>>> the last embedded expression from the template (e.g., after inlining a
>>>> constant) will implicitly change the type of the literal from
>>>> StringTemplate to String. This may either cause a compilation error, or
>>>> silently bind to another overload which may or may not behave like a
>>>> template overload with a single-fragment-template. For API authors, this
>>>> means that every method accepting StringTemplate should have a counterpart
>>>> accepting String. The logic inside both methods would likely be very
>>>> similar, so probably both will eventually call a third private method. For
>>>> API user, it could be unclear how to call a method accepting StringTemplate
>>>> if I have simple string in hands but there's no String method (or it does
>>>> slightly different thing due to poor API design). Should I use some ugly
>>>> construct like "This is a string but the API wants a template, so I append
>>>> an empty embedded expression\{""}"?
>>> 
>>> This is a huge thread that I hesitate to dive into, but here's me putting in one
>>> toe:  Why do we care so much about no-arg string templates?  It's a small
>>> corner case!  The workarounds (for the no-arg case) are totally straightforward
>>> even if the string template literals (as a syntax) are required to have at
>>> least one argument.
>>> 
>>> Can we have a plausible use case, please, for why a ST with no arguments would
>>> be important, so important that we are motivated to invent a sigil syntax or
>>> special type system rules, to avoid requiring the user to invoke a static
>>> factory?
>>> 
>>> Also, Tagir's workaround of adding a fake argument looks like it would work just
>>> fine, of course depending on which processor was eventually used.
>>> 
>>> And in that vein let me add one new (very bike-sheddy) suggestion before I beat
>>> a hasty retreat:  Instead of in (1) a sigil before the quote like Guy's
>>> $"hello", put it (1b) after the quote, and in the ST case only.  The ST syntax
>>> could explicitly allow that a no-arg string template would be spelled with a
>>> leading sequence "\{}... which looks like the coder started writing a ST
>>> argument, but in fact dropped it.  So "hello" is a 5-char string, in any
>>> context.  And "\{}hello" is a 5-char no-arg string template, in any context.
>>> That's Tagir's workaround, elevated a bit into a new corner case of (existing)
>>> syntax.
>>> 
>>> But even that teeny bit of syntax strikes me as overkill, because I don't see
>>> the importance of the use cases (no-arg STs) it helps.  Just call
>>> ST.of("hello") and call it a day.
>>> 
>>> In any case, it seems fine to let the IDE take the lead with no-arg STs, helping
>>> the user decide when and how to disambiguate strings from no-arg STs.  Putting
>>> in syntax or type system help for this is surely more expensive than punting to
>>> the IDE, unless there is going to be heavy use of no-arg STs for some use cases
>>> I am not seeing.
>> 
>> Well, just off the top of my head as a thought experiment, if I had a series of
>> SQL commands to process, some with arguments and some not, I would rather write
>> 
>> SQL.process($"CREATE TABLE foo;");
>> SQL.process($"ALTER TABLE foo ADD name varchar(40);");
>> SQL.process($"ALTER TABLE foo ADD title varchar(30);");
>> SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
>> SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other
>> job});");
>> 
>> than
>> 
>> SQL.process(ST.of("CREATE TABLE foo;"));
>> SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
>> SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
>> SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
>> SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other
>> job});");
>> 
>> especially if I thought that maybe down the road I might want to change the
>> constants 30 and 40 and 'Hacker' to variables. I don't want to have to keep
>> adding and deleting calls to ST.of as I edit the template strings during
>> program development to have different numbers of interpolated expressions.
> 
> Given what Maurizio said and this, i think the only missing piece in the puzzle is what about existing methods taking a String as parameter.
> 
> We know that for SQL.process(), we do not want process() to take a String but only a StringTemplate.
> But what about the existing methods that takes a String.
> 
> Given a method Logger.warning(String), should
>  LOG.warning($"CREATE TABLE foo;");
>  LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
> 
> be legal ? Is there an auto-conversion (a kind of boxing conversion) from StringTemplate to String ?

In my proposal, the answer would be "no". Instead you would have two choices:

(1) Instead of string template expressions as in the example just given, you could use string literals or string interpolation expressions (omit the "$" characters):

 LOG.warning("CREATE TABLE foo;");
 LOG.warning("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

(2) If instead you have some other sort of expression (such as a variable) whose type is StringTemplate, you can write

 LOG.warning(String.of(myStringTemplate));

This makes quite explicit that a conversion is happening from StringTemplate to String.


From guy.steele at oracle.com  Wed Mar 13 21:12:15 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2024 21:12:15 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>
 <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
Message-ID: <B258FE1D-E85F-4BE0-A758-942800C6120F@oracle.com>


> On Mar 13, 2024, at 5:04 PM, Guy Steele <guy.steele at oracle.com> wrote:
> 
>> On Mar 13, 2024, at 4:34 PM, Remi Forax <forax at univ-mlv.fr> wrote:
>> 
>> Given what Maurizio said and this, i think the only missing piece in the puzzle is what about existing methods taking a String as parameter.
>> 
>> We know that for SQL.process(), we do not want process() to take a String but only a StringTemplate.
>> But what about the existing methods that takes a String.
>> 
>> Given a method Logger.warning(String), should
>> LOG.warning($"CREATE TABLE foo;");
>> LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>> 
>> be legal ? Is there an auto-conversion (a kind of boxing conversion) from StringTemplate to String ?
> 
> In my proposal, the answer would be "no". Instead you would have two choices:
> 
> (1) Instead of string template expressions as in the example just given, you could use string literals or string interpolation expressions (omit the "$" characters):
> 
> LOG.warning("CREATE TABLE foo;");
> LOG.warning("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
> 
> (2) If instead you have some other sort of expression (such as a variable) whose type is StringTemplate, you can write
> 
> LOG.warning(String.of(myStringTemplate));
> 
> This makes quite explicit that a conversion is happening from StringTemplate to String.

That reminds me: I would recommend that the instance method `toString` for class StringTemplate _not_ be the same as `String.of(Template)`; rather, it should print in some form that shows the internal structure of the StringTemplate.
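
To make the distinction concrete, here is a minimal sketch. It assumes a hypothetical stand-in type, SketchTemplate, rather than the real StringTemplate class, and uses an interpolate() method in the role of the proposed String.of(template); both names are illustrative only. The point is simply that the explicit conversion and toString can return different things.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate (not the JDK class):
// n values are surrounded by n + 1 literal fragments.
record SketchTemplate(List<String> fragments, List<Object> values) {

    SketchTemplate {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need one more fragment than values");
        }
    }

    // Explicit conversion to String by plain concatenation; this plays the
    // role of the proposed String.of(template).
    String interpolate() {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    // Diagnostic form: exposes the structure instead of interpolating.
    @Override
    public String toString() {
        return "SketchTemplate{fragments=" + fragments + ", values=" + values + "}";
    }
}

class ToStringDemo {
    public static void main(String[] args) {
        SketchTemplate t = new SketchTemplate(List.of("Hello, ", "!"), List.of("Guy"));
        System.out.println(t.interpolate()); // Hello, Guy!
        System.out.println(t);               // SketchTemplate{fragments=[Hello, , !], values=[Guy]}
    }
}
```

With this split, accidentally printing or concatenating a template yields an obviously structural string rather than a silently interpolated one.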

--Guy


From brian.goetz at oracle.com  Wed Mar 13 21:25:21 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 13 Mar 2024 17:25:21 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <B258FE1D-E85F-4BE0-A758-942800C6120F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>
 <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
 <B258FE1D-E85F-4BE0-A758-942800C6120F@oracle.com>
Message-ID: <280a40e9-2c80-4699-8464-22fca2944b4b@oracle.com>

That is how it works in the current version, and this behavior would be 
carried forward. Otherwise, it is a form of implicit interpolation, 
which goes against the goals of the project.

On 3/13/2024 5:12 PM, Guy Steele wrote:
> That reminds me: I would recommend that the instance method `toString` for
> class StringTemplate _not_ be the same as `String.of(Template)`; rather, it
> should print in some form that shows the internal structure of the StringTemplate.

From forax at univ-mlv.fr  Wed Mar 13 22:00:57 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 13 Mar 2024 23:00:57 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To: <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <446018562.28334858.1710362077464.JavaMail.zimbra@univ-eiffel.fr>
 <5241ED01-588E-45DD-9C59-2F6A3D62F3B8@oracle.com>
Message-ID: <665665054.28373576.1710367257784.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "Guy Steele" <guy.steele at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts" <amber-spec-experts at openjdk.org>
> Sent: Wednesday, March 13, 2024 10:04:46 PM
> Subject: Re: Update on String Templates (JEP 459)

>> On Mar 13, 2024, at 4:34 PM, Remi Forax <forax at univ-mlv.fr> wrote:
>> 
>> ----- Original Message -----
>>> From: "Guy Steele" <guy.steele at oracle.com>
>>> To: "John Rose" <john.r.rose at oracle.com>
>>> Cc: "Tagir Valeev" <amaembo at gmail.com>, "Brian Goetz" <brian.goetz at oracle.com>,
>>> "amber-spec-experts"
>>> <amber-spec-experts at openjdk.org>
>>> Sent: Wednesday, March 13, 2024 9:13:30 PM
>>> Subject: Re: Update on String Templates (JEP 459)
>> 
>>>> On Mar 13, 2024, at 3:33 PM, John Rose <john.r.rose at oracle.com> wrote:
>>>> 
>>>> On 9 Mar 2024, at 3:48, Tagir Valeev wrote:
>>>> 
>>>>> The idea is interesting. There's a thing that disturbs me though.
>>>>> Currently, proc."string" and proc."string \{template}" are uniformly
>>>>> processed, and the processor may not care much about whether it's a string
>>>>> or a template: both can be processed uniformly. After this change, removing
>>>>> the last embedded expression from the template (e.g., after inlining a
>>>>> constant) will implicitly change the type of the literal from
>>>>> StringTemplate to String. This may either cause a compilation error, or
>>>>> silently bind to another overload which may or may not behave like a
>>>>> template overload with a single-fragment-template. For API authors, this
>>>>> means that every method accepting StringTemplate should have a counterpart
>>>>> accepting String. The logic inside both methods would likely be very
>>>>> similar, so probably both will eventually call a third private method. For
>>>>> API user, it could be unclear how to call a method accepting StringTemplate
>>>>> if I have simple string in hands but there's no String method (or it does
>>>>> slightly different thing due to poor API design). Should I use some ugly
>>>>> construct like "This is a string but the API wants a template, so I append
>>>>> an empty embedded expression\{""}"?
>>>> 
>>>> This is a huge thread that I hesitate to dive into, but here's me putting in one
>>>> toe:  Why do we care so much about no-arg string templates?  It's a small
>>>> corner case!  The workarounds (for the no-arg case) are totally straightforward
>>>> even if the string template literals (as a syntax) are required to have at
>>>> least one argument.
>>>> 
>>>> Can we have a plausible use case, please, for why a ST with no arguments would
>>>> be important, so important that we are motivated to invent a sigil syntax or
>>>> special type system rules, to avoid requiring the user to invoke a static
>>>> factory?
>>>> 
>>>> Also, Tagir's workaround of adding a fake argument looks like it would work just
>>>> fine, of course depending on which processor was eventually used.
>>>> 
>>>> And in that vein let me add one new (very bike-sheddy) suggestion before I beat
>>>> a hasty retreat:  Instead of in (1) a sigil before the quote like Guy's
>>>> $"hello", put it (1b) after the quote, and in the ST case only.  The ST syntax
>>>> could explicitly allow that a no-arg string template would be spelled with a
>>>> leading sequence "\{}... which looks like the coder started writing a ST
>>>> argument, but in fact dropped it.  So "hello" is a 5-char string, in any
>>>> context.  And "\{}hello" is a 5-char no-arg string template, in any context.
>>>> That's Tagir's workaround, elevated a bit into a new corner case of (existing)
>>>> syntax.
>>>> 
>>>> But even that teeny bit of syntax strikes me as overkill, because I don't see
>>>> the importance of the use cases (no-arg STs) it helps.  Just call
>>>> ST.of("hello") and call it a day.
>>>> 
>>>> In any case, it seems fine to let the IDE take the lead with no-arg STs, helping
>>>> the user decide when and how to disambiguate strings from no-arg STs.  Putting
>>>> in syntax or type system help for this is surely more expensive than punting to
>>>> the IDE, unless there is going to be heavy use of no-arg STs for some use cases
>>>> I am not seeing.
>>> 
>>> Well, just off the top of my head as a thought experiment, if I had a series of
>>> SQL commands to process, some with arguments and some not, I would rather write
>>> 
>>> SQL.process($"CREATE TABLE foo;");
>>> SQL.process($"ALTER TABLE foo ADD name varchar(40);");
>>> SQL.process($"ALTER TABLE foo ADD title varchar(30);");
>>> SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
>>> SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>>> 
>>> than
>>> 
>>> SQL.process(ST.of("CREATE TABLE foo;"));
>>> SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
>>> SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
>>> SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
>>> SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>>> 
>>> especially if I thought that maybe down the road I might want to change the
>>> constants 30 and 40 and 'Hacker' to variables. I don't want to have to keep
>>> adding and deleting calls to ST.of as I edit the template strings during
>>> program development to have different numbers of interpolated expressions.
>> 
>> Given what Maurizio said and this, i think the only missing piece in the puzzle
>> is what about existing methods taking a String as parameter.
>> 
>> We know that for SQL.process(), we do not want process() to take a String but
>> only a StringTemplate.
>> But what about the existing methods that takes a String.
>> 
>> Given a method Logger.warning(String), should
>>  LOG.warning($"CREATE TABLE foo;");
>>  LOG.warning($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>> 
>> be legal ? Is there an auto-conversion (a kind of boxing conversion) from
>> StringTemplate to String ?
> 
> In my proposal, the answer would be "no". Instead you would have two choices:
> 
> (1) Instead of string template expressions as in the example just given, you
> could use string literals or string interpolation expressions (omit the "$"
> characters):
> 
> LOG.warning("CREATE TABLE foo;");
> LOG.warning("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
> 
> (2) If instead you have some other sort of expression (such as a variable) whose
> type is StringTemplate, you can write
> 
> LOG.warning(String.of(myStringTemplate));
> 
> This makes quite explicit that a conversion is happening from StringTemplate to
> String.

Makes sense, I like it.

(1) makes the string interpolation explicit, and it can be fully optimized using an invokedynamic (the same trick as STR."...")

(2) The current method is interpolate(), so
  LOG.warning(myStringTemplate.interpolate());

Compared to the previous iteration, there is no Processor interface and no weird calling syntax, but instead two new literals: string interpolation and string template.
I think the only missing optimization was FMT."..." but it can be done if necessary by specializing String.format(StringTemplate) at the compiler level.

Rémi

From john.r.rose at oracle.com  Wed Mar 13 22:22:20 2024
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 13 Mar 2024 15:22:20 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
Message-ID: <D63380BB-ED30-4286-B26E-FFD6677A79EC@oracle.com>

On 13 Mar 2024, at 13:13, Guy Steele wrote:

> ... Well, just off the top of my head as a thought experiment, if I had a series of SQL commands to process, some with arguments and some not, I would rather write
>
> SQL.process($"CREATE TABLE foo;");
> SQL.process($"ALTER TABLE foo ADD name varchar(40);");
> SQL.process($"ALTER TABLE foo ADD title varchar(30);");
> SQL.process($"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process($"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");
>
> than
>
> SQL.process(ST.of("CREATE TABLE foo;"));
> SQL.process(ST.of("ALTER TABLE foo ADD name varchar(40);"));
> SQL.process(ST.of("ALTER TABLE foo ADD title varchar(30);"));
> SQL.process(ST.of("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');"));
> SQL.process("INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

OK, yes.  I think a simpler example is needed to answer my question more fully.  In this example, the name "foo" is given as a literal.  But, even if only as a workaround, it would probably not hurt code like that to quote such a name as an argument.  So:

> var foo = "foo";  // or static final String FOO = "foo";
> SQL.process("CREATE TABLE \{foo};");
> SQL.process("ALTER TABLE \{foo} ADD name varchar(40);");
> SQL.process("ALTER TABLE \{foo} ADD title varchar(30);");
> SQL.process("INSERT INTO \{foo} (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process("INSERT INTO \{foo} (name, title) VALUES (\{other name}, \{other job});");

And it's not just a workaround here, it's arguably better style (D.R.Y.) to factor out the name foo that links everything together.  Non-support of no-arg STs would possibly push users towards a more D.R.Y. style, possibly a good thing.

I think such multi-command examples, in many little languages, will tend to have some term like foo shared across phrases.  The very small example I'm looking for would ideally be non-factorable, just a little string with not much substructure.  Because if it's factorable, then maybe the user should just factor it, and then it's a ST with arguments.  And if it's not factorable, then maybe it is some stand-alone thing that won't be harmed by making it a canned constant, or making it via a factory method, or making it a true string which is introduced into the processor by other means.

Not all languages offend against D.R.Y. as much as SQL.  A more contextual stateful little language, like Forth or Turtle graphics or Postscript, might have lots of little fixed commands (like "left" for a turtle).  When we work with such little languages we sometimes have lots of static final strings to help us find the commands and spell them correctly.  (Like static final String LEFT = "left" in class TurtleGraphics.)  That would maybe morph into lots of static final STs?

--------

OVERLOADS

Another possible answer, in the use case with SQL above, is that if the language processor expects lots of ad hoc non-factorable (or non-factored) strings, it should cater to that expectation by taking String as an overload option.  That places pressure on the API designer to perform the conversion (ST.of) on the fly.  And overloads can expand non-linearly when there are several arguments in play.  And there are sometimes ambiguity risks in some corner cases, as Maurizio has shown.  Still sometimes it's a good tradeoff to add an overload, if the problems are in truly minor corner cases.  Or maybe allowing strings instead of STs gives up some optimizations?  But often it's better to let the chips fall with API design and do the work to optimize whichever API turns out to be most user-friendly.  I don't see (maybe I missed it) a decisive objection to overloading across ST and String, at least for some processing APIs.

--------

ALGEBRA

These examples also lead me to a different source of questions, which is whether or how the existing practice of string constant expressions (like static final FOO above) can or should connect to STs as well.  It's an interesting line of thought, so I'll write something here, but (bottom line) I don't think we want to act on it, at least at first.

String constants have a privileged role in the JLS, and also in programmer practice (as with FOO above).  Can/should STs leverage this somehow?  Should a "constant ST expression" be an alternative to a ST literal?  I'm thinking of a String or ST constant like MY_FORTH_PROLOGUE which I stick at the front of some ST that I'm building.

But that would seem to require some way to concatenate such a string to an ST, an expression like ST "+" String -> ST, which seems disturbing to me, but might actually make sense.  Or would nesting be better, something like `(define tp (foo ,@sub-tp bar))?  A variation of \{x} like \@{subtp}?  (And would there be javac constant folding rules for it, as well as dynamic rules for evaluation?)  This is speculative brainstorming; I'm not seriously recommending it for now.

Still, continuing... If MY_FORTH_PROLOGUE should be a static final ST, then I want options for prepending it locally to ad hoc strings.  So the question about "what about constants" turns into a larger question, "what about ST algebra on ST expressions?"  If you allow constants to be defined non-locally, you need a way to combine them with "more stuff" locally.  This relates to the issue raised earlier of whether nested STs should be part of the ST API:  Whether you concatenate two STs or nest one inside another, it seems you are doing some kind of generic ST algebra, generic across all uses of ST, not just for some processors.
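
As a rough illustration of what concatenating two STs could mean, here is a sketch under stated assumptions: AlgebraTemplate is a hypothetical stand-in (not the JDK type), and of() and concat() are invented names. The only real content is the rule itself: the last fragment of the left operand fuses with the first fragment of the right operand, and the value lists are appended in order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ST: n values surrounded by n + 1 fragments.
record AlgebraTemplate(List<String> fragments, List<Object> values) {

    // A no-arg template: one fragment, no values (the role of ST.of("...")).
    static AlgebraTemplate of(String text) {
        return new AlgebraTemplate(List.of(text), List.of());
    }

    // Concatenation: fuse this template's last fragment with the other's
    // first fragment, then append the remaining fragments and all values.
    AlgebraTemplate concat(AlgebraTemplate other) {
        int last = fragments.size() - 1;
        List<String> f = new ArrayList<>(fragments.subList(0, last));
        f.add(fragments.get(last) + other.fragments.get(0));
        f.addAll(other.fragments.subList(1, other.fragments.size()));

        List<Object> v = new ArrayList<>(values);
        v.addAll(other.values);
        return new AlgebraTemplate(f, v);
    }
}

class AlgebraDemo {
    // A non-local constant template, in the spirit of MY_FORTH_PROLOGUE.
    static final AlgebraTemplate PROLOGUE = AlgebraTemplate.of(": square dup * ; ");

    public static void main(String[] args) {
        // Stands for a local template such as "\{n} square ." with n = 7.
        AlgebraTemplate body = new AlgebraTemplate(List.of("", " square ."), List.of(7));
        AlgebraTemplate program = PROLOGUE.concat(body);
        System.out.println(program.fragments());
        System.out.println(program.values());
    }
}
```

The result still has one more fragment than values, so concatenation stays inside the template world and never flattens the values into a string.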

And, circling back, if there were a way to fold ST literals together (with some non-local parts) then that would lead to another alternative to a "sigil" to disambiguate a plain string from a no-arg ST.  You'd use the concatenation operation (whatever that is) to combine an empty ST into the string that needs markup.  Kind of like when we say ""+x to abbreviate String.valueOf(x).

Given nesting or concatenation syntax, no-arg ST literals could be disambiguated by a prefix like ST.EMPTY+"..." or like "\@{}...", which either prepends or nests a degenerate ST.  That could serve a role like $"..." in your examples, Guy, although of course a single-char sigil looks nicer.

Maybe we want some more algebra like that someday, but I am not enthusiastic enough to recommend it now.  I guess the most I'd recommend is somehow leave room for building up nested or concatenated literals, as a future addition.  Allowing STs to start like "\{}..." would solve today's disambiguation problem with a kludge like $"...", and also a hint of more "algebra" in the future.

> SQL.process("\{}CREATE TABLE foo;");
> SQL.process("\{}ALTER TABLE foo ADD name varchar(40);");
> SQL.process("\{}ALTER TABLE foo ADD title varchar(30);");
> SQL.process("\{}INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
> SQL.process("\{}INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

HTH

From john.r.rose at oracle.com  Wed Mar 13 22:37:57 2024
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 13 Mar 2024 15:37:57 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <D63380BB-ED30-4286-B26E-FFD6677A79EC@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <D63380BB-ED30-4286-B26E-FFD6677A79EC@oracle.com>
Message-ID: <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>

On 13 Mar 2024, at 15:22, John Rose wrote:

> ... OVERLOADS ...
>
> I don't see (maybe I missed it) a decisive objection to overloading across ST
> and String, at least for some processing APIs.

Perhaps it is this:  A language processor API that takes STs and never Strings is making it clear that all inputs should be properly vetted, nothing taken on trust as a bare string.

Doing that MIGHT require a performance model which permits expensive vetting operations to be memoized on particular OCCURRENCES of inputs (not just the input strings viewed in and of themselves).

If that's true, then I guess that's support for Guy's proposal: That STs (even trivial ones) should never look identical to strings.  Maybe they should always be preceded by a sigil $, or (per my suggestion) they should always have at least one occurrence of \{ inside, even if it's a trivial nop.

I kind of like Guy's offensive-to-everyone suggestion that $ is required to make a true ST.  Then it's clear how the vetting APIs mate up with their vetted inputs.  And if $ is not placed in front, we surrender to the string-pasters, but at least the resulting true-string expressions won't be accepted by the vetting APIs.

From maurizio.cimadamore at oracle.com  Wed Mar 13 23:47:32 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 13 Mar 2024 23:47:32 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <D63380BB-ED30-4286-B26E-FFD6677A79EC@oracle.com>
 <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
Message-ID: <0003bf81-07cc-46ac-9771-a3d362c06a7a@oracle.com>

There is a problem/slippery slope with overloads, which I think should 
be discussed (and that discussion seems, at least to me, more important 
than the discussion on how we spell string literals).

Consider the case of a /new/ API, that perhaps wants to build SQL 
queries (or any other kind of injection-sensitive factory):

|Query makeQuery(???) |

What should be the natural parameter type for this query? Well, we know 
that String is flawed here. Easy to reach for, but also too easy to 
abuse. StringTemplate is a much better type because it allows 
user-injectable values and constant parts to be carried in separate parts 
of the string template, so that the library has a chance at looking at 
what's going on.
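
For illustration only, here is a sketch of what "looking at what's going on" could mean for such a factory. QueryTemplate and Query are hypothetical stand-ins (not JDK or JDBC types), and the placeholder strategy is an assumption: each injected value becomes a "?" placeholder and is carried separately from the constant SQL text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for StringTemplate: n values between n + 1 fragments.
record QueryTemplate(List<String> fragments, List<Object> values) {}

// Hypothetical result: SQL text with placeholders, plus the bound values.
record Query(String sql, List<Object> parameters) {}

class QueryFactory {
    // Accepts only a template, so every dynamic value is visible to the
    // library and is never spliced into the SQL text.
    static Query makeQuery(QueryTemplate template) {
        StringBuilder sql = new StringBuilder(template.fragments().get(0));
        List<Object> parameters = new ArrayList<>();
        for (int i = 0; i < template.values().size(); i++) {
            sql.append('?');                              // placeholder for value i
            parameters.add(template.values().get(i));     // value kept out of the text
            sql.append(template.fragments().get(i + 1));
        }
        return new Query(sql.toString(), parameters);
    }

    public static void main(String[] args) {
        String hostile = "42 OR 1=1";
        // Stands for a template like "SELECT foo FROM bar WHERE foo = \{hostile}".
        Query q = makeQuery(new QueryTemplate(
                List.of("SELECT foo FROM bar WHERE foo = ", ""),
                List.of(hostile)));
        System.out.println(q.sql());        // SELECT foo FROM bar WHERE foo = ?
        System.out.println(q.parameters()); // [42 OR 1=1]
    }
}
```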

Ok, so let's say we write the factory as:

|Query makeQuery(StringTemplate) |

As that is clearly the safer option. This obviously works well /as long 
as clients are passing template with arguments/.

No-argument templates might be a corner case, but, sooner or later 
somebody might want to do this:

|makeQuery("SELECT foo FROM bar WHERE foo = 42"); |

Only to discover that this doesn't compile. What then? There are a 
couple of alternatives I can think of. The first is to add a 
String-accepting overload:

|Query makeQuery(StringTemplate) |
|Query makeQuery(String) |

The second is to use some use-site factory call to turn the string into 
a degenerate string template:

|makeQuery(StringTemplate.fromString("SELECT foo FROM bar WHERE foo = 
42")); |

IMHO, both approaches have problems: they force the user to go from the 
safer StringTemplate world, to the more unsafe String world. It's sort 
of like crossing the Rubicon: once you're in String-land, it then becomes 
easier to introduce potentially very costly mistakes. If we have overloads:

|makeQuery("SELECT " + foo + " FROM " + bar + " WHERE " + condition); |

This would now compile just fine. Effectively, safety-wise we'd be back 
at square one. The factory case is only marginally better: because 
using the factory is more convoluted, it would perhaps be easier to 
spot that something fishy is going on. That said, as the expression gets 
more complicated, it's easier for bugs to sneak in:

|makeQuery(StringTemplate.fromString("SELECT " + foo + "FROM bar WHERE 
foo = 42")); |
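
A small sketch of why both escape hatches lose information, using hypothetical names throughout (RawTemplate, fromString, and visibleValues are invented for illustration). It counts how many injected values the library can still see: through the template path the value is visible, while through the String overload the concatenation has already happened and nothing is left to vet.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate: n values between n + 1 fragments.
record RawTemplate(List<String> fragments, List<Object> values) {
    // Hypothetical use-site factory: wraps a finished string as a
    // degenerate template with a single fragment and no values.
    static RawTemplate fromString(String s) {
        return new RawTemplate(List.of(s), List.of());
    }
}

class OverloadHazard {
    // Template path: the library sees each injected value separately.
    static int visibleValues(RawTemplate template) {
        return template.values().size();
    }

    // String overload: converts on the fly, but any concatenation at the
    // call site has already erased the boundary between text and values.
    static int visibleValues(String raw) {
        return visibleValues(RawTemplate.fromString(raw));
    }

    public static void main(String[] args) {
        String foo = "x; DROP TABLE bar; --";

        // Stands for a template like "SELECT \{foo} FROM bar".
        RawTemplate safe = new RawTemplate(List.of("SELECT ", " FROM bar"), List.of(foo));
        System.out.println(visibleValues(safe));                         // 1

        // Compiles just as happily once the String overload exists.
        System.out.println(visibleValues("SELECT " + foo + " FROM bar")); // 0
    }
}
```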

So, at least in my opinion, having a string template literal, or some 
kind of compiler-controlled promotion from string /constants/ to string 
templates, is not just something we need to type fewer characters (I 
honestly couldn't care less about that, at least not at this stage). 
These things are needed to allow developers to remain in 
StringTemplate-land.

That is, the best /overall/ outcome is for the library /not/ to have an 
overload, /and/ for the client to either say this:

|makeQuery("SELECT foo FROM bar WHERE foo = 42"); // works because of 
implicit promotion of constant String -> StringTemplate |

or this:

|makeQuery(<insert your favourite "I'M A TEMPLATE" char here>"SELECT foo 
FROM bar WHERE foo = 42"); // works because it's a string template all 
along |

Maurizio


From john.r.rose at oracle.com  Thu Mar 14 01:22:53 2024
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 13 Mar 2024 18:22:53 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <0003bf81-07cc-46ac-9771-a3d362c06a7a@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <D63380BB-ED30-4286-B26E-FFD6677A79EC@oracle.com>
 <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
 <0003bf81-07cc-46ac-9771-a3d362c06a7a@oracle.com>
Message-ID: <6D8BC382-E9EC-40AB-A6EB-273FD326E2B4@oracle.com>

Thanks, Maurizio.  I find your arguments helpful and persuasive.  They 
indicate that "autoboxing" is the wrong model, since it would lift 
ad hoc strings into places that want only STs.

The poly-expression move, applied only to string literals, is not so 
bad, since the only ad hoc strings liftable to STs are those right next 
to the API points that demand STs.

But, if we are going to make ST-demanding APIs the lock and STs the 
keys, it might be reasonable to demand that all STs look distinctive 
(with that extra sigil), which is an argument even against the 
poly-expression move.

Guy's disruptive suggestion, of having both kinds of interpolation 
expressions, would play out as two tiers of vetting and security.  The 
lower tier is inhabited by strings.  You have to drive carefully on 
those streets, where dodgy APIs accept all kinds of strings, and there 
are no $ sigils to indicate vetted inputs.  The higher tier would be API 
points that demand STs (and do not welcome plain strings).  To get into 
that safer tier, you pay a cover charge, the $ sigils (or API points 
which manufacture STs explicitly).

It might seem wrong to ask a cover charge for a tier we want users to 
prefer, but the IDE will surely help pay it as needed.  (The $ is 
visible in the code, as a reminder the security is enabled.  Like the 
wrist band you get when you pay the cover?)

On the other hand, if we try to make everything be one tier (everything 
potentially vettable, but with loopholes for raw strings), the security 
guarantees get muddier.  If everything is equally secure, and there are 
loopholes (for string concat and the like) then everything is also 
equally insecure, in some hand-wavy sense.

More hand-waving:  Distinct tiers is a more honest design, allowing for 
better invariants within the higher tier, and relaxed behavior in the 
lower tier.  Also, maybe, having the distinct tiers be visibly connected 
by syntax encourages folks muddling around with string-concat to lift 
their code to work on STs instead of strings.  Switch the APIs and add 
the dollar signs.

OK, I'll stop now.  I'm past the point where I need to try the API 
on some serious project, before I speculate more.

On 13 Mar 2024, at 16:47, Maurizio Cimadamore wrote:

> There is a problem/slippery slope with overloads, which I think should 
> be discussed (and that discussion seems, at least to me, more 
> important than the discussion on how we spell string literals).
>
> Consider the case of a /new/ API, that perhaps wants to build SQL 
> queries (or any other kind of injection-sensitive factory):
>
> |Query makeQuery(???) |
>
> What should be the natural parameter type for this query? Well, we 
> know that String is flawed here. Easy to reach for, but also too easy 
> to abuse. StringTemplate is a much better type because it allows 
> user-injectable values and constant parts to carried in separate parts 
> of the string template, so that the library has a chance at looking at 
> what?s going on.
>
> Ok, so let?s say we write the factory as:
>
> |Query makeQuery(StringTemplate) |
>
> As that is clearly the safer option. This obviously works well /as 
> long as clients are passing template with arguments/.
>
> No-argument templates might be a corner case, but, sooner or later 
> somebody might want to do this:
>
> |makeQuery("SELECT foo FROM bar WHERE foo = 42"); |
>
> Only to discover that this doesn?t compile. What then? There are a 
> couple of alternatives I can think of. The first is to add a 
> String-accepting overload:
>
> |Query makeQuery(StringTemplate) Query makeQuery(String) |
>
> The second is to use some use-site factory call to turn the string 
> into a degenerate string template:
>
> |makeQuery(StringTemplate.fromString("SELECT foo FROM bar WHERE foo = 
> 42")); |
>
> IMHO, both approaches have problems: they force the user to go from 
> the safer StringTemplate world, to the more unsafe String world. 
> It?s sort of like crossing the Rubicon: once you?re in 
> String-land, it then become easier to introduce potentially very 
> costly mistakes. If we have overloads:
>
> |makeQuery("SELECT " + foo + " FROM " + bar + " WHERE " + condition); 
> |
>
> This would now compile just fine. Effectively, safety-wise we?d be 
> back at square one. The factory case is only marginally better - 
> because using the factory is more convoluted, so it would perhaps be 
> easier to spot that something fishy is going on. That said, as the 
> expression got more complicated, it?s easier for bugs to sneak in:
>
> |makeQuery(StringTemplate.fromString("SELECT " + foo + "FROM bar WHERE 
> foo = 42")); |
>
> So, at least in my opinion, having a string template literal, or some 
> kind of compiler-controlled promotion from string /constants/ to 
> string templates, is not just something we need to type less 
> characters (I honestly couldn?t care less about that, at least not 
> at this stage). These things are needed to allow developers to remain 
> in StringTemplate-land.
>
> That is, the best /overall/ outcome is for the library /not/ to have 
> an overload, /and/ for the client to either say this:
>
> |makeQuery("SELECT foo FROM bar WHERE foo = 42"); // works because of 
> implicit promotion of constant String -> StringTemplate |
>
> or this:
>
> |makeQuery(<insert your favourite "I'M A TEMPLATE" char here>"SELECT 
> foo FROM bar WHERE foo = 42"); // works because it's a string template 
> all along |
>
> Maurizio
>
> On 13/03/2024 22:37, John Rose wrote:
>
>    On 13 Mar 2024, at 15:22, John Rose wrote:
>
>        ? OVERLOADS ?
>
>        I don't see (maybe I missed it) a decisive objection to
>        overloading across ST and String, at least for some processing
>        APIs. Perhaps it is this: A language processor API that takes
>        STs and never Strings is making it clear that all inputs should
>        be properly vetted, nothing taken on trust as a bare string.
>
>    Doing that MIGHT require a performance model which permits
>    expensive vetting operations to be memoized on particular
>    OCCURRENCES of inputs (not just the input strings viewed in and of
>    themselves).
>
>    If that's true, then I guess that's support for Guy's proposal:
>    That STs (even trivial ones) should never look identical to strings.
>    Maybe they should always be preceded by a sigil $, or (per my
>    suggestion) they should always have at least one occurrence of \{
>    inside, even if it's a trivial nop.
>
>    I kind of like Guy's offensive-to-everyone suggestion that $ is
>    required to make a true ST. Then it's clear how the vetting APIs
>    mate up with their vetted inputs. And if $ is not placed in front,
>    we surrender to the string-pasters, but at least the resulting
>    true-string expressions won't be accepted by the vetting APIs.
>

From guy.steele at oracle.com  Thu Mar 14 15:08:15 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 14 Mar 2024 15:08:15 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
Message-ID: <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>

Second thoughts about how to explain a string interpolation literal:

> On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
> . . .
> 
> ---------
> String is not a subtype of StringTemplate; they are disjoint types.
> 
>         $"foo"             is a (trivial) string template literal
>         "foo"              is a string literal
>         $"Hello, \{x}"     is a (nontrivial) string template literal
>         "Hello, \{x}"      is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
> ---------

Given that the intent is that String.of (or whatever we want to call it; possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,

         "Hello, \{x}."

(I have added a period to the example to make the point clearer) is expanded into

        "Hello, " + x + "."

and in general

        "c0\{e1}c1\{e2}c2...\{en}cn"

(where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into

        "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"

The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
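
To make the constant-expression point concrete, here is a minimal sketch that uses only today's "+" concatenation, i.e. what the proposed literal would expand into. The class and member names (ConstantInterpolationDemo, NAME, GREETING, classify) are made up for illustration; nothing here depends on the preview API.

```java
// Sketch only: the interpolation literal "Hello, \{NAME}." is written here in
// its proposed expanded form, "Hello, " + NAME + ".". All names are hypothetical.
class ConstantInterpolationDemo {
    // A constant variable: final, initialized with a constant expression.
    static final String NAME = "world";

    // Every operand is a constant expression, so the concatenation is one too.
    static final String GREETING = "Hello, " + NAME + ".";

    static String classify(String s) {
        switch (s) {
            case GREETING: // legal only because GREETING is a constant expression
                return "greeting";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("Hello, world."));
        System.out.println(classify("Goodbye."));
    }
}
```

If NAME were not a constant variable (say, a method parameter), the `case GREETING:` label would no longer compile, which is the "iff" in the rule above.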

--Guy


From ccherlin at gmail.com  Thu Mar 14 16:24:57 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Thu, 14 Mar 2024 11:24:57 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <D63380BB-ED30-4286-B26E-FFD6677A79EC@oracle.com>
 <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
Message-ID: <CALEU8=yu356uAS3zUsvEMj1c16SgRg5OYKRrUrENGtZXmRkmGQ@mail.gmail.com>

On Wed, Mar 13, 2024 at 6:45 PM John Rose <john.r.rose at oracle.com> wrote:
>
> On 13 Mar 2024, at 15:22, John Rose wrote:
>
> > ? OVERLOADS ?
> >
> > I don't see (maybe I missed it) a decisive objection to overloading across ST
> > and String, at least for some processing APIs.
>
> Perhaps it is this:  A language processor API that takes STs and never Strings is making it clear that all inputs should be properly vetted, nothing taken on trust as a bare string.
>
> Doing that MIGHT require a performance model which permits expensive vetting operations to be memoized on particular OCCURRENCES of inputs (not just the input strings viewed in and of themselves).
>
> If that's true, then I guess that's support for Guy's proposal: That STs (even trivial ones) should never look identical to strings.  Maybe they should always be preceded by a sigil $, or (per my suggestion) they should always have at least one occurrence of \{ inside, even if it's a trivial nop.
>
> I kind of like Guy's offensive-to-everyone suggestion that $ is required to make a true ST.  Then it's clear how the vetting APIs mate up with their vetted inputs.  And if $ is not placed in front, we surrender to the string-pasters, but at least the resulting true-string expressions won't be accepted by the vetting APIs.

Adding an empty interpolated value to signal a template is not a
viable solution, because "\{}abc" is not equivalent to ST.of("abc").
Running the current preview,

RAW."\{}abc" produces StringTemplate{ fragments = [ "", "abc" ],
values = [null] } which interpolates to "nullabc".

RAW."abc" produces StringTemplate{ fragments = [ "abc" ], values = []
} which interpolates to "abc".
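
For anyone who wants to see why the two shapes differ without installing the preview, here is a rough model of the fragments/values interleaving. It is not the preview's StringTemplate class; FragmentsDemo and its interpolate method are hypothetical stand-ins that only assume fragments.size() == values.size() + 1.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a template as alternating fragments and values.
// This is NOT the JDK preview API, just the same interleaving rule.
class FragmentsDemo {
    // Assumes fragments.size() == values.size() + 1.
    static String interpolate(List<String> fragments, List<Object> values) {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            // A null value is appended as the text "null", as with "+".
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Shape reported for RAW."\{}abc": two fragments, one null value.
        System.out.println(interpolate(List.of("", "abc"), Arrays.asList((Object) null)));
        // Shape reported for RAW."abc": one fragment, no values.
        System.out.println(interpolate(List.of("abc"), List.of()));
    }
}
```

The first call yields "nullabc" and the second "abc", matching the preview behaviour described above.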

I strongly support using different quotes or a prefixed sigil over any
form of linguistic magic like "if an interpolated value is empty we
pretend it's not there but still treat the literal as a template" or
"a string literal can be implicitly converted to a template literal in
{context}".

Cheers,
Clement Cherlin

From maurizio.cimadamore at oracle.com  Thu Mar 14 17:40:55 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 17:40:55 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
Message-ID: <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>

Not to pour too much cold water on the idea of having string 
interpolation literal, but I'd like to mention a few points here.

First, it was a deliberate design goal of the string template feature to 
make interpolation an explicit act. Note that, if we had the syntax you 
describe, we actually achieve the opposite effect: string interpolation 
is now the default, and implicit, and actually /cheaper/ (to type) than 
the safer template alternative. This is a bit of a red herring, I think.

The second problem is that interpolation literals can sometimes be 
deceiving. Consider this example:

    String.format("Hello, my name is %s\{name}"); // can you spot the bug?

Where String::format has a new overload which accepts a StringTemplate.

Basically, since here we forgot the leading "$" (or whatever char that 
is), the whole thing is just a big interpolation. Semantically 
equivalent to:

    String.format("Hello, my name is %s" + name); // whoops!

This will fail, as String::format will be waiting for an argument (a 
string), but none is provided. So:

|  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
|        at Formatter.format (Formatter.java:2672)
|        at Formatter.format (Formatter.java:2609)
|        at String.format (String.java:2897)
|        at (#2:1)

This is a very odd (and new!) failure mode, that I'm sure is gonna 
surprise developers.
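
The concatenated form can be tried on any current JDK, since it needs no template support at all. A small sketch follows; the class name FormatPitfallDemo and the helper describe are invented for the example.

```java
import java.util.MissingFormatArgumentException;

// Sketch of the failure mode: the name is pasted into the format string
// instead of being passed as an argument. Names here are hypothetical.
class FormatPitfallDemo {
    static String describe(String name) {
        try {
            // The format string becomes "Hello, my name is %s<name>",
            // so "%s" has no argument to consume.
            return String.format("Hello, my name is %s" + name);
        } catch (MissingFormatArgumentException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("Duke"));
    }
}
```

Note that a name containing "%" would make things stranger still, because it would itself be parsed as part of the format string.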

Maurizio


From ccherlin at gmail.com  Thu Mar 14 19:04:02 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Thu, 14 Mar 2024 14:04:02 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
Message-ID: <CALEU8=ypsj+2rXRRkzLa9fRvWPKe4PQ1fdh6ZM9ki3scDcZPtA@mail.gmail.com>

I think there are a few basic use cases which everyone wants to be
safe and ergonomic.

1. New APIs that accept StringTemplate, not String, and do processing
with the value above and beyond direct interpolation (SQL queries,
HTML/XML escaping, transforming to JSON, etc.).
2. Existing APIs that accept String or (String, Object...) that have
StringTemplate support added, such as PrintWriter::println or
String::format.
3. Old APIs that have not been (and may never be) updated to accept
StringTemplate, but we want to pass interpolated strings to.

# Problems

Use case #1:
Issues passing constant templates if there is no explicit syntactic
distinction between string and template literals.

Use case #2:
Complicated and potentially erroneous overload selection if there is
no explicit syntactic distinction between string and template
literals.

Use case #3:
Passing interpolated templates to APIs that only support String,
without excess ceremony.

# Proposed Solution

I believe there is a common solution to these problems that
(hopefully) addresses all of these issues.

Prefixing a template with an explicit processor was nice in one way,
because the processor made the semantics of the interpolation
explicit. However, processors were more trouble than they were worth.

What if instead of the extremes of a myriad of processors, or a single
template prefix, or no prefix and complex/confusing context rules, we
have exactly two prefixes? To avoid bikeshedding (obviously, the final
names would be much shorter), I will call them TEMPLATE and
INTERPOLATE. These are semantically identical to the old RAW and STR
processors respectively, but syntactically have no "." between them
and the leading quote.

TEMPLATE"hey \{name}" -> StringTemplate
INTERPOLATE"hey \{name}" -> String

Unlike processors, these two are the *only* valid prefixes.

This brings back the clarity of RAW and STR without the complexity of
processor classes. Processing of TEMPLATE literals is done by normal
methods that take StringTemplate. INTERPOLATE literals evaluate
directly to regular Strings.

The two kinds of expressions can have different translation
strategies, like constant-ification of INTERPOLATE expressions with
constant values, as Guy suggests.

# Examples

Use case #1
generateQuery(TEMPLATE"update table \{tableName} set \{column} =
\{value} where \{whereExp}"); // OK
generateQuery(INTERPOLATE"update table \{tableName} set \{column} =
\{value} where \{whereExp}"); // incompatible type error

Use case #2
System.out.println(TEMPLATE"Hello, \{world}!"); // OK
System.out.println(INTERPOLATE"Hello, \{world}!"); // OK, and if
'world' is constant, it may be folded
String.format(TEMPLATE"I am %d\{age} years old"); // OK
String.format(INTERPOLATE"I am %d\{age} years old"); // IDE warning
and runtime exception because format string doesn't match number of
parameters.

Use case #3
someOldStringMethod(TEMPLATE"some runtime values go here: \{value1}
and here: \{value2}"); // incompatible type error
someOldStringMethod(INTERPOLATE"some runtime values go here: \{value1}
and here: \{value2}"); // OK
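
Since TEMPLATE and INTERPOLATE do not exist, here is a rough approximation in plain Java of how the two result types keep the use cases apart. Everything in it is hypothetical: the Template record stands in for StringTemplate, constructing it by hand stands in for a TEMPLATE literal, and calling interpolate() stands in for an INTERPOLATE literal.

```java
import java.util.List;

// Hypothetical stand-ins only; this is not the preview API.
class TwoPrefixDemo {
    // Stand-in for StringTemplate: fragments.size() == values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {
        // Stand-in for what an INTERPOLATE literal would evaluate to.
        String interpolate() {
            StringBuilder sb = new StringBuilder(fragments.get(0));
            for (int i = 0; i < values.size(); i++) {
                sb.append(values.get(i)).append(fragments.get(i + 1));
            }
            return sb.toString();
        }
    }

    // Use case #1: a new API that accepts only templates and keeps the
    // values separate from the query text (here as "?" placeholders).
    static String generateQuery(Template t) {
        return String.join("?", t.fragments()) + " " + t.values();
    }

    // Use case #3: an old API that only knows about String.
    static int someOldStringMethod(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        String name = "O'Brien";
        Template t = new Template(
                List.of("select * from users where name = ", ""), List.of(name));

        System.out.println(generateQuery(t));                      // template -> template API
        System.out.println(someOldStringMethod(t.interpolate()));  // string -> string API

        // generateQuery(t.interpolate());  // does not compile: String is not a Template
        // someOldStringMethod(t);          // does not compile: Template is not a String
    }
}
```

The two commented-out calls are the "incompatible type error" cases from the examples above: with disjoint types and no overloads, the compiler catches the mix-up.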

What do you think?

Cheers,
Clement Cherlin


From maurizio.cimadamore at oracle.com  Thu Mar 14 19:24:37 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 19:24:37 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CALEU8=ypsj+2rXRRkzLa9fRvWPKe4PQ1fdh6ZM9ki3scDcZPtA@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <CALEU8=ypsj+2rXRRkzLa9fRvWPKe4PQ1fdh6ZM9ki3scDcZPtA@mail.gmail.com>
Message-ID: <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>

On 14/03/2024 19:04, Clement Cherlin wrote:

> What if instead of the extremes of a myriad of processors, or a single
> template prefix, or no prefix and complex/confusing context rules, we
> have exactly two prefixes? To avoid bikeshedding (obviously, the final
> names would be much shorter), I will call them TEMPLATE and
> INTERPOLATE. These are semantically identical to the old RAW and STR
> processors respectively, but syntactically have no "." between them
> and the leading quote.
>
> TEMPLATE"hey \{name}" -> StringTemplate
> INTERPOLATE"hey \{name}" -> String

See my latest email to Guy.

Having a /different/ prefix for interpolated vs. raw template literals 
does help a bit with the case I brought up there - as here we're 
basically in a world where a string literal with embedded arguments 
/must/ have a suitable prefix.

A possible point which is not too far from where we are today is to just 
reuse STR and RAW as prefixes, but also make RAW /optional/, so that:

  * it can be used to disambiguate interpretation of strings w/o
    embedded expressions;
  * it can be the /default/ if you type something that does have
    embedded expressions (e.g. nudge towards the safer route if
    there's embedded expressions)

Another idea that came up was, instead of just using prefixes, use /types/:

  * the STR prefix is written String
  * the RAW prefix is written StringTemplate

This is slightly better (say what you mean!) - but a potential problem 
is that one might wonder why a special syntax is needed given a cast is 
just a pair of ( and ) away?

Maurizio


From guy.steele at oracle.com  Thu Mar 14 19:39:17 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 14 Mar 2024 19:39:17 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
Message-ID: <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>

This is a very important example to consider. I observe, however, that there are at least two possible ways to avoid the unpleasant surprise:

(1) Don't have string interpolation literals, because accidentally using a string interpolation literal instead of a string template literal can result in invoking the wrong overload of a method.

(2) Don't overload methods so as to accept either a string or a string template.

If we were to take approach (2), then:

(a) We would keep `println` as is, and not allow it to accept a template, but that's okay: if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.

(b) A SQL processor would accept a template but not a string: if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.

(c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others. I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing

    String.format("Hello, my name is %s\{name}"); // can you spot the bug?

but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.

Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.

--Guy


From john.r.rose at oracle.com  Thu Mar 14 20:21:03 2024
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 14 Mar 2024 13:21:03 -0700
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CALEU8=yu356uAS3zUsvEMj1c16SgRg5OYKRrUrENGtZXmRkmGQ@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <F39C31BF-ABBB-49CA-8A70-DF0169F8A732@oracle.com>
 <DD00D360-14C9-481B-80C7-AA5C47625FE0@oracle.com>
 <D63380BB-ED30-4286-B26E-FFD6677A79EC@oracle.com>
 <2636E7B0-2A41-4034-8367-A20687FABCF9@oracle.com>
 <CALEU8=yu356uAS3zUsvEMj1c16SgRg5OYKRrUrENGtZXmRkmGQ@mail.gmail.com>
Message-ID: <286722DC-1BD8-46E8-9FF5-32D8E6C06624@oracle.com>

On 14 Mar 2024, at 9:24, Clement Cherlin wrote:

> ...
> Adding an empty interpolated value to signal a template is not a
> viable solution, because "\{}abc" is not equivalent to ST.of("abc").
> Running the current preview,
>
> RAW."\{}abc" produces StringTemplate{ fragments = [ "", "abc" ],
> values = [null] } which interpolates to "nullabc".

Surely that is a bug in the preview.  In making that suggestion I assumed that omitting the expression altogether was illegal in the current syntax.  Had empty brackets been illegal (instead of an obscure way to say "\{null}"), they would have been a way to force a string to be a template, as a compatible extension.  But it's not my favorite suggestion; just something I put out there FWIW.

From ccherlin at gmail.com  Thu Mar 14 20:53:09 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Thu, 14 Mar 2024 15:53:09 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <CALEU8=ypsj+2rXRRkzLa9fRvWPKe4PQ1fdh6ZM9ki3scDcZPtA@mail.gmail.com>
 <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>
Message-ID: <CALEU8=xVddJCMKnuqo4w=stSiarWuHaK9mSY9RPT+DLChBrRAQ@mail.gmail.com>

On Thu, Mar 14, 2024 at 2:24 PM Maurizio Cimadamore
<maurizio.cimadamore at oracle.com> wrote:
>
> On 14/03/2024 19:04, Clement Cherlin wrote:
>
> What if instead of the extremes of a myriad of processors, or a single
> template prefix, or no prefix and complex/confusing context rules, we
> have exactly two prefixes? To avoid bikeshedding (obviously, the final
> names would be much shorter), I will call them TEMPLATE and
> INTERPOLATE. These are semantically identical to the old RAW and STR
> processors respectively, but syntactically have no "." between them
> and the leading quote.
>
> TEMPLATE"hey \{name}" -> StringTemplate
> INTERPOLATE"hey \{name}" -> String
>
> See my latest email to Guy.
>
> Having different prefix for interpolated vs. raw template literals does help a bit with the case I brought up there - as here we're basically in a world where a string literal with embedded arguments must have a suitable prefix.

Yes, that's what I had in mind when writing this proposal.

> A possible point which is not too far from where we are today is just reuse STR and RAW as prefixes,

I considered that, but while I have no problem with STR, I think the
name RAW is too much a legacy of the Processor API. I think RAW should
be replaced by something that signifies "this is a template", not
"process this template with the processor that does nothing". I will
continue to resist the urge to present alternative prefixes until the
appropriate time.

>  but also make RAW optional, so that:
>
> it can be used to disambiguate interpretation of strings w/o embedded expressions;
>
> it can be the default if you type something that does have embedded expressions (e.g. nudge towards the safer route if there's embedded expressions)

That's certainly an option. I don't prefer it, but it's not terrible,
and would reduce clutter somewhat when you are dealing exclusively
with templates and don't need the reminder. Like "final" on an
effectively final value, it could be a matter of taste and convention
whether to include the prefix for a template with embedded
expressions. However, it has the downside that you may accidentally
convert a template back to a string literal by removing the last
embedded expression.

> Another idea that came up was, instead of just using prefixes, use types:
>
> the STR prefix is written String
> the RAW prefix is written StringTemplate
>
> This is slightly better (say what you mean!) - but a potential problem is that one might wonder why a special syntax is needed given a cast is just a pair of ( and ) away?
>
> Maurizio

I'm going to assume you're joking here, so I don't feel the need to
write a thousand words about how terrible Java's casting syntax is.

Cheers,
Clement Cherlin

From maurizio.cimadamore at oracle.com  Thu Mar 14 22:00:17 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 22:00:17 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
Message-ID: <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>


On 14/03/2024 19:39, Guy Steele wrote:
> This is a very important example to consider. I observe, however, that 
> there are at least two possible ways to avoid the unpleasant surprise:
>
> (1) Don't have string interpolation literals, because accidentally 
> using a string interpolation literal instead of a string template 
> literal can result in invoking the wrong overload of a method.
>
> (2) Don't overload methods so as to accept either a string or a string 
> template.

I agree with your analysis, but note that there is also a third option:

(3) make it so that both string interpolation literals and string 
template literals have a prefix.

I believe that is enough to solve the issue (because the program I wrote 
would no longer compile: the compiler would require an explicit prefix).

Maurizio

>
> If we were to take approach (2), then:
>
> (a) We would keep `println` as is, and not allow it to accept a 
> template, but that's okay: if you thought you wanted a template, what 
> you really want is plain old string interpolation, and the type 
> checking will make sure you don't use the wrong one.
>
> (b) A SQL processor would accept a template but not a string: if you 
> thought you wanted string interpolation, what you really want is a 
> template, and the type checking will make sure you don't use the wrong 
> one.
>
> (c) I think `format` is a special case that we tend to get hung up on, 
> and I think that, in this particular branch of the design space we are 
> exploring, perhaps a name other than `String.format` should be chosen 
> for the method that does string formatting on templates. Possible 
> names are `StringTemplate.format` and `String.format$`, but I will 
> leave further bikeshedding on this to others. I do recognize that this 
> move will not enable the type system per se to absolutely prevent 
> programmers from writing
> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
> but, as Clement has observed, such cases will probably provoke a 
> warning about a mismatch between the number of arguments and the 
> number of %-specifiers that require parameters, so maybe overloading 
> would be okay anyway for `String.format`.
>
> Anyway, my point is that whether to overload a method to accept either 
> a string or a string template can be evaluated on a case-by-case basis 
> according to a small number of principles that I think we could 
> enumerate and explain pretty easily.
>
> --Guy
>
>> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore 
>> <maurizio.cimadamore at oracle.com> wrote:
>>
>> Not to pour too much cold water on the idea of having string 
>> interpolation literals, but I'd like to mention a few points here.
>>
>> First, it was a deliberate design goal of the string template feature 
>> to make interpolation an explicit act. Note that, if we had the 
>> syntax you describe, we actually achieve the opposite effect: string 
>> interpolation is now the default, and implicit, and actually 
>> /cheaper/ (to type) than the safer template alternative. This is a 
>> bit of a red herring, I think.
>>
>> The second problem is that interpolation literals can sometimes be 
>> deceiving. Consider this example:
>>
>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>
>> Where |String::format| has a new overload which accepts a StringTemplate.
>>
>> Basically, since here we forgot the leading '$' (or whatever char 
>> that is), the whole thing is just a big interpolation. Semantically 
>> equivalent to:
>>
>> |String.format("Hello, my name is %s" + name); // whoops! |
>>
>> This will fail, as |String::format| will be waiting for an argument 
>> (a string), but none is provided. So:
>>
>> |  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
>> |        at Formatter.format (Formatter.java:2672)
>> |        at Formatter.format (Formatter.java:2609)
>> |        at String.format (String.java:2897)
>> |        at (#2:1)
>>
>> This is a very odd (and new!) failure mode, that I'm sure is gonna 
>> surprise developers.
>>
>> Maurizio
>>
>> On 14/03/2024 15:08, Guy Steele wrote:
>>
>>
>>
>>> Second thoughts about how to explain a string interpolation literal:
>>>
>>>> On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
>>>> . . .
>>>>
>>>> ---------
>>>> String is not a subtype of StringTemplate; they are disjoint types.
>>>>
>>>>         $"foo"             is a (trivial) string template literal
>>>>         "foo"              is a string literal
>>>>         $"Hello, \{x}"     is a (nontrivial) string template literal
>>>>         "Hello, \{x}"      is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
>>>> ---------
>>> Given that the intent is that String.of (or whatever we want to call it -- possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,
>>>
>>>          "Hello, \{x}."
>>>
>>> (I have added a period to the example to make the point clearer) is expanded into
>>>
>>>          "Hello, " + x + "."
>>>
>>> and in general
>>>
>>>          "c0\{e1}c1\{e2}c2...\{en}cn"
>>>
>>> (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
>>>
>>>          "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"
>>>
>>> The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
>>>
>>> --Guy
>>>
>>
>>
>>
>
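
The constant-expression point in the quoted expansion rule can be checked today with ordinary `+` concatenation, since string templates themselves are not needed for it. The following is a minimal sketch, not part of the proposal; the class and member names (`ConstantConcat`, `GREETING`, `NAME`, `MESSAGE`) are made up for illustration. It shows that a concatenation whose operands are all constant is itself a constant expression, and so is usable as a `case` label.

```java
public class ConstantConcat {
    // Constant variables: final, initialized with constant expressions.
    static final String GREETING = "Hello, ";
    static final String NAME = "world";

    // Every operand is constant, so the whole concatenation is a
    // constant expression as well.
    static final String MESSAGE = GREETING + NAME + ".";

    static boolean isGreeting(String s) {
        switch (s) {
            // Legal only because the label is a constant expression.
            case GREETING + NAME + ".":
                return true;
            default:
                return false;
        }
    }

    static boolean sameObjectAsLiteral() {
        // Constant expressions of type String are interned, so this
        // compares two references to the same object.
        return MESSAGE == "Hello, world.";
    }

    public static void main(String[] args) {
        System.out.println(isGreeting("Hello, world."));
        System.out.println(sameObjectAsLiteral());
    }
}
```

If `NAME` were a non-final field or a method parameter, the `case` label above would no longer compile, which is the "iff" in the quoted rule.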

From guy.steele at oracle.com  Thu Mar 14 22:05:27 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 14 Mar 2024 22:05:27 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
Message-ID: <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>

Is your intent that a string interpolation literal would have a type other than String? If so, I agree that this is a third option -- with the consequence that each API designer now needs to contemplate three-way overloading.

If that is not your intent, then I am not seeing how the prefix helps -- so please explain?

Thanks,
Guy


From maurizio.cimadamore at oracle.com  Thu Mar 14 22:07:03 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 22:07:03 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CALEU8=xVddJCMKnuqo4w=stSiarWuHaK9mSY9RPT+DLChBrRAQ@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <CALEU8=ypsj+2rXRRkzLa9fRvWPKe4PQ1fdh6ZM9ki3scDcZPtA@mail.gmail.com>
 <80744de8-103f-4a51-8df6-52642aff00a6@oracle.com>
 <CALEU8=xVddJCMKnuqo4w=stSiarWuHaK9mSY9RPT+DLChBrRAQ@mail.gmail.com>
Message-ID: <a04fded2-ee97-4bf1-b2be-1719038985d4@oracle.com>

On 14/03/2024 20:53, Clement Cherlin wrote:

> I'm going to assume you're joking here, so I don't feel the need to
> write a thousand words about how terrible Java's casting syntax is.

Honestly no, I wasn't joking, at least not from a semantic perspective. 
Let me see if I can explain myself.

Let's say a string interpolation literal is spelled like:

|String"my name is \{name}" |

What is this expression doing? Well, it's taking some literal that looks 
like a string, but has some embedded expression, and explicitly asking to 
turn that thing into a String.

Now, isn't that what (morally) a cast is for? E.g. the object you are 
casting has a type (StringTemplate) and you want to turn it into 
something else (String). Seems quite close. And, while casts between 
reference types don't do much (besides changing the type), casts between 
primitives, or between primitives and references (boxed types) do end up 
changing the underlying representation. So again, semantically we're not 
too far. Where I think cast is a bad fit is that we can't use the cast 
syntax just as a syntactic device. If cast syntax works, it means 
there's a casting conversion between String and StringTemplate, which 
means (as I mentioned the other day) that pattern matching will need to 
come along for the ride too.
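
To make the distinction concrete, here is a small sketch of how casts behave in Java today (the class and method names are made up for illustration). A reference cast hands back the very same object under a different static type, while primitive and boxing casts produce a value with a different representation.

```java
public class CastDemo {
    static boolean referenceCastKeepsIdentity() {
        Object o = "hello";
        String s = (String) o;      // only the static type changes
        return s == o;              // same object
    }

    static int doubleToInt() {
        return (int) 3.9;           // fractional part is discarded
    }

    static byte intToByte() {
        return (byte) 300;          // only the low 8 bits are kept
    }

    static boolean castCanBox() {
        int i = 1000;
        Object boxed = (Integer) i; // cast performs a boxing conversion
        return boxed instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(referenceCastKeepsIdentity());
        System.out.println(doubleToInt());
        System.out.println(intToByte());
        System.out.println(castCanBox());
    }
}
```

A cast from a template to `String` that performs interpolation would sit in the second group: the result is a new value, not the original object viewed at another type.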

In terms of syntax, I might agree with you that it's not a great option, 
but the "conversion vibe" that a cast gives isn't totally off the mark.

Maurizio


From maurizio.cimadamore at oracle.com  Thu Mar 14 22:15:01 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 22:15:01 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
Message-ID: <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>


On 14/03/2024 22:05, Guy Steele wrote:
> Is your intent that a string interpolation literal would have a type 
> other than String? If so, I agree that this is a third option -- with the 
> consequence that each API designer now needs to contemplate three-way 
> overloading.
>
> If that is not your intent, then I am not seeing how the prefix 
> helps -- so please explain?

Let's go back to the example I mentioned:

|String.format("Hello, my name is %s\{name}"); // can you spot the bug? |

There's a string with an embedded expression here. The compiler might 
require a prefix here (e.g. do you want a string, or a string 
template?). If no prefix is added (as in the above code) it might just 
be an error, and this won't compile.
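
For reference, the failure mode behind this example can be reproduced without any template syntax, by writing out the concatenation that the interpolation reading would produce. This is only a sketch; the class and method names are made up for illustration.

```java
import java.util.MissingFormatArgumentException;

public class FormatPitfall {
    // What the unprefixed literal would mean if it were interpolated:
    // the name is glued onto the format string and no argument is passed.
    static String concatenated(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // What the author meant: the name is the argument for %s.
    static String intended(String name) {
        return String.format("Hello, my name is %s", name);
    }

    static boolean concatenatedFails(String name) {
        try {
            concatenated(name);
            return false;
        } catch (MissingFormatArgumentException e) {
            return true;    // %s has no matching argument
        }
    }

    public static void main(String[] args) {
        System.out.println(intended("Duke"));
        System.out.println(concatenatedFails("Duke"));
    }
}
```

The mistake only shows up at run time here, which is why turning the unprefixed form into a compile-time error changes the picture.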

This means that if I do:

|String.format(INTERPOLATED"Hello, my name is %s\{name}"); |

I will select String.format(String, Object...) - but I will do so 
deliberately - it's not just what happens "by default" (as was the case 
before).

Or, if I want the template version, I do:

|String.format(TEMPLATE"Hello, my name is %s\{name}");|

Basically, requiring all literals that have embedded expressions to have 
a prefix removes the problem of defaulting on the String side of the 
fence. Then, personally I'd also prefer if the default was actually on 
the StringTemplate side of the fence, so that the above was actually 
identical to this:

|String.format("Hello, my name is %s\{name}"); // ok, this is a template|

Note that these two prefixes might also come in handy when 
disambiguating a literal with no embedded expressions. Only, in that 
case the default would point the other way.

To summarize:

  * template literal with arguments -> defaults to StringTemplate. User
    can ask for interpolation explicitly, by adding a prefix
  * template literal w/o arguments -> defaults to String. User can ask for
    a degenerate template explicitly, by adding a prefix

This doesn't sound too bad, and it feels like it has the defaults 
pointing the right way?

Maurizio


From robbepincket at live.be  Thu Mar 14 22:36:55 2024
From: robbepincket at live.be (Robbe Pincket)
Date: Thu, 14 Mar 2024 22:36:55 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
Message-ID: <AS8PR10MB7798C0AB2D5844BCE6A3730AD3292@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>

Hi experts

I thought I'd give my 2 cents here for a sec.

I just looked through this long email chain. I was busy with other things in life so I haven't checked it out earlier.

First of all, I was surprised it took so long for someone to suggest applying implicit conversion from `String` to `StringTemplate` only for constant `String`s, given that there is already a similar case in the compiler. An `int` can't be implicitly converted to a `byte`, except that (some) constant `int` expressions **can** be implicitly converted to `byte`.
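
As a reminder of the existing rule being referred to, here is a short sketch (names made up for illustration). In an assignment, an `int` constant expression is narrowed to `byte` implicitly when its value fits; anything else needs an explicit cast.

```java
public class ConstantNarrowing {
    static final int SMALL = 42;    // constant variable whose value fits in a byte

    static byte fromLiteral() {
        byte b = 42;                // OK: constant int expression, representable as byte
        return b;
    }

    static byte fromConstant() {
        byte b = SMALL;             // OK for the same reason
        return b;
    }

    static byte fromVariable(int i) {
        // byte b = i;              // does not compile: i is not a constant expression
        return (byte) i;            // explicit cast required
    }

    // byte tooBig = 200;           // does not compile: 200 is not representable as byte

    public static void main(String[] args) {
        System.out.println(fromLiteral());
        System.out.println(fromConstant());
        System.out.println(fromVariable(42));
    }
}
```

The suggestion for strings follows the same shape: only constant `String` expressions would convert implicitly, so a string built at run time could not silently become a template.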

This is by far my favorite suggestion posted so far (and if it hadn't been suggested yet, I would have suggested it). So I'm a bit surprised it seems to have disappeared again.

On another idea going around: using a `$` prefix for string templates, and having an implicit `String.of(...)` applied to the "template" if the prefix isn't there, is at the bottom of my list.
The fact that forgetting the `$` prefix just opens you up to an SQL injection attack, while the feature is being advertised as "safe", is for me unacceptable.
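
To illustrate why the two readings differ so much for SQL, here is a sketch that models a template with a hypothetical `Template` record (fragments plus values). It is not the `StringTemplate` API from the JEP, and `query`, `interpolate` and `toParameterizedSql` are made-up names. Interpolating splices the value into the query text, while a template-aware consumer can keep the value separate and emit a placeholder.

```java
import java.util.List;

public class TemplateSketch {
    // Hypothetical stand-in for a string template:
    // fragments.size() == values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {

        // The interpolation reading: values are spliced into the text.
        String interpolate() {
            StringBuilder sb = new StringBuilder(fragments.get(0));
            for (int i = 0; i < values.size(); i++) {
                sb.append(values.get(i)).append(fragments.get(i + 1));
            }
            return sb.toString();
        }

        // The template reading: each hole becomes a placeholder, and the
        // values travel separately (for example as statement parameters).
        String toParameterizedSql() {
            return String.join("?", fragments);
        }
    }

    // Models "SELECT * FROM users WHERE name = \{name}".
    static Template query(String name) {
        return new Template(
                List.of("SELECT * FROM users WHERE name = ", ""),
                List.of(name));
    }

    public static void main(String[] args) {
        Template t = query("'' OR 1=1");
        System.out.println(t.interpolate());
        System.out.println(t.toParameterizedSql() + "  params=" + t.values());
    }
}
```

With the interpolation reading as the default, the unsafe first form is what a forgotten prefix would produce.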

(I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either, as it's just so much longer than `"Foo: " + bar`)

I haven't seen anyone suggest the opposite, though: have a `$` prefix for standard String interpolation (for those APIs that don't accept a template) and, when it's not there, it's a normal `StringTemplate`. Adding an extra char by accident feels much less likely to me than forgetting one. But I wouldn't be against having it be something like `STR"..."` instead of `$"..."`.

Combining these would give the following:

```
String s1 = "test" // still a string literal
StringTemplate st2 = "test" // allowed, constant strings can be implicitly converted to templates
StringTemplate st3 = "Foo: \{bar}" // Simple string template

// either
String s4a = $"Foo: \{bar}" // short for String.of("Foo: \{bar}")
String s4b = STR"Foo: \{bar}" // short for String.of("Foo: \{bar}")
```

Kind regards
Robbe Pincket

From maurizio.cimadamore at oracle.com  Thu Mar 14 23:44:20 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 14 Mar 2024 23:44:20 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <AS8PR10MB7798C0AB2D5844BCE6A3730AD3292@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <AS8PR10MB7798C0AB2D5844BCE6A3730AD3292@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>
Message-ID: <ef9dae24-a90d-4772-b228-4448acb5c126@oracle.com>


On 14/03/2024 22:36, Robbe Pincket wrote:
> (I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either as it's just so 
> much longer than `"Foo: " + bar`)

Note that when I suggested TEMPLATE as a prefix I was obviously not 
being super serious :-)

Let's do a test (bear with me). Let's assume the two prefixes were S and 
T (not saying I like them, just trying them out for size). Let's also 
assume there's no conversion. Then your examples become:

```
String s1 = "test" // still a string literal
StringTemplate st2 = T"test" // allowed, constant strings can be implicitly converted to templates
StringTemplate st3 = "Foo: \{bar}" // Simple string template

String s4c = S"Foo: \{bar}" // short for String.of("Foo: \{bar}")

```

I think that's not too bad? (please don't focus too much on the letters).

In the sense: the rare case (st2) has a prefix. And the operation we 
want to be explicit (s4c) also has a prefix. Everything else is fine.

Control question #1: does the conversion here change things much? Or, 
are we reaching for conversions just to have something "shorter" ?

Control question #2: let's now assume that S and T were spelled (String) 
and (StringTemplate), respectively. How do we feel about this?

```
String s1 = "test" // still a string literal
StringTemplate st2 = (StringTemplate)"test" // allowed, cast from constant string to template
StringTemplate st3 = "Foo: \{bar}" // Simple string template

String s4c = (String)"Foo: \{bar}" // allowed, cast from template back to String (interpolation)

```

Maurizio

From robbepincket at live.be  Fri Mar 15 00:24:11 2024
From: robbepincket at live.be (Robbe Pincket)
Date: Fri, 15 Mar 2024 00:24:11 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <ef9dae24-a90d-4772-b228-4448acb5c126@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <AS8PR10MB7798C0AB2D5844BCE6A3730AD3292@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>
 <ef9dae24-a90d-4772-b228-4448acb5c126@oracle.com>
Message-ID: <AS8PR10MB77984941E17CA8589BE4E2C7D3282@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>


On 14/03/2024 23:44 UTC, Maurizio Cimadamore wrote:
           On 14/03/2024 22:36, Robbe Pincket wrote:
                      (I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either as it's just so much longer than `"Foo: " + bar`)

           Note that when I suggested TEMPLATE as a prefix I was obviously not being super serious :-)

           Let's do a test (bear with me). Let's assume the two prefixes were S and T (not saying I like them, just trying them out for size). Let's also assume there's no conversion. Then your examples become:

           ```
           String s1 = "test" // still a string literal
           StringTemplate st2 = T"test" // allowed, constant strings can be implicitly converted to templates
           StringTemplate st3 = "Foo: \{bar}" // Simple string template
           String s4c = S"Foo: \{bar}" // short for String.of("Foo: \{bar}")
           ```

           I think that's not too bad? (please don't focus too much on the letters).
           In the sense: the rare cases (st2) has a prefix. And the operation we want explicit (s4c) also has a prefix. Everything else is fine.

So the difference is that `T` (or something else) has to be used for templates without any holes?
To me it feels a bit weird to have a prefix for the special case of a hole-less template.

I think I saw an argument passing by, saying something along the lines that `String` and `StringTemplate` are semantically different, so implicit conversion in either direction would be bad because it would be ambiguous.
If this `T` idea is based on that, I don't really see why it would be that bad. If an API accepts either, I would intuitively expect that passing a string and passing a hole-less template with the same string would give me the same result.

           Control question #1: does the conversion here change things much? Or, are we reaching for conversions just to have something "shorter" ?
           Control question #2: let's now assume that S and T were spelled (String) and (StringTemplate), respectively. How do we feel about this?

           ```
           String s1 = "test" // still a string literal
           StringTemplate st2 = (StringTemplate)"test" // allowed, cast from constant string to template
           StringTemplate st3 = "Foo: \{bar}" // Simple string template
           String s4c = (String)"Foo: \{bar}" // allowed, cast from template back to String (interpolation)
           ```

I think I answered #1. Having the 'T' *just* for the "hole-less" template feels a bit odd? I don't think you can sell me on #2.
* Why would I prefer using `(String)"Foo: \{bar}"` over `"Foo: " + bar`? This is not just a "length issue", as templates would win with more holes, but there is also a cost of switching habits.
* (Ignoring primitives), casts have up until now always just returned the input, but now with a different static type. Using the casting syntax to do actual interpolation (or create a template from a string) feels weird to me.

If `(StringTemplate)"test"` and `(String)"Foo: \{bar}"` are valid, will the following things work too? `"test" instanceof StringTemplate template` and `"Foo: \{bar}" instanceof String str`. The second one I'd assume no, the first one is a bit unclear to me.

           Maurizio

Kind regards
Robbe Pincket

From maurizio.cimadamore at oracle.com  Fri Mar 15 00:49:48 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 15 Mar 2024 00:49:48 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <AS8PR10MB77984941E17CA8589BE4E2C7D3282@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <AS8PR10MB7798C0AB2D5844BCE6A3730AD3292@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>
 <ef9dae24-a90d-4772-b228-4448acb5c126@oracle.com>
 <AS8PR10MB77984941E17CA8589BE4E2C7D3282@AS8PR10MB7798.EURPRD10.PROD.OUTLOOK.COM>
Message-ID: <8882415e-aa07-4738-b6ee-1926bfb420b1@oracle.com>


On 15/03/2024 00:24, Robbe Pincket wrote:
>
> On 14/03/2024 23:44 UTC, Maurizio Cimadamore wrote:
>
>            On 14/03/2024 22:36, Robbe Pincket wrote:
>
> (I'm not a big fan of `TEMPLATE"Foo: \{bar}"` either as it's just so
> much longer than `"Foo: " + bar`)
>
>            Note that when I suggested TEMPLATE as a prefix I was
> obviously not being super serious :-)
>
>            Let's do a test (bear with me). Let's assume the two
> prefixes were S and T (not saying I like them, just trying them out
> for size). Let's also assume there's no conversion. Then your examples
> become:
>
>            ```
>            String s1 = "test" // still a string literal
>            StringTemplate st2 = T"test" // allowed, constant strings can be implicitly converted to templates
>            StringTemplate st3 = "Foo: \{bar}" // Simple string template
>            String s4c = S"Foo: \{bar}" // short for String.of("Foo: \{bar}")
>            ```
>
>            I think that's not too bad? (please don't focus too much on
> the letters).
>
>            In the sense: the rare cases (st2) has a prefix. And the
> operation we want explicit (s4c) also has a prefix. Everything else is
> fine.
>
> So the difference is that `T` (or something else) has to be used for 
> templates without any holes?
>
> To me it feels a bit weird to have a prefix for the special case of a 
> hole-less template.
>
Ok. Note that T would be required in that case, but one might also use 
it as a visual delimiter: if a template is very long, it might not be 
too readable to leave it implicit as to whether the thing in quotes is a 
template or not.
>
> I think I saw an argument passing by, saying something along the lines
> that `String` and `StringTemplate` are semantically different, so
> implicit conversion in either direction would be bad because it would
> be ambiguous.
>
> If this `T` idea is based on that, I don't really see why it would be
> that bad. If an API accepts either, I would intuitively expect that
> passing a string and passing a hole-less template with the same string
> would give me the same result.
>
>            Control question #1: does the conversion here change things
> much? Or, are we reaching for conversions just to have something
> "shorter" ?
>
>            Control question #2: let's now assume that S and T were
> spelled (String) and (StringTemplate), respectively. How do we feel
> about this?
>
>            ```
>            String s1 = "test" // still a string literal
>            StringTemplate st2 = (StringTemplate)"test" // allowed, cast from constant string to template
>            StringTemplate st3 = "Foo: \{bar}" // Simple string template
>            String s4c = (String)"Foo: \{bar}" // allowed, cast from template back to String (interpolation)
>            ```
>
> I think I answered #1. Having the 'T' *just* for the "hole-less"
> template feels a bit odd? I don't think you can sell me on #2.
>
Fair enough - I had to ask :-)
>
> If `(StringTemplate)"test"` and `(String)"Foo: \{bar}"` are valid,
> will the following things work too? `"test" instanceof StringTemplate 
> template` and `"Foo: \{bar}" instanceof String str`. The second one 
> I'd assume no, the first one is a bit unclear to me.
>
These are questions I raised even in the context of the implicit 
conversion you are advocating for: once you add an assignment 
conversion, cast comes with it, and with cast, patterns and instanceof. 
In other words, that's the price we have to pay for eliminating the T in 
the hole-less template in the way you proposed. Casts just make that 
trade-off more explicit.

Another thing I don't love about implicit conversions is that they don't
play too well with inference:

```
List<StringTemplate> ls = List.of("Hello");
```

The above would be an error. The type-variable (X) for List::of is 
seeing two different constraints:

* X = StringTemplate (from the target type)
* String <: X (from the argument)

This fails, because we'd infer StringTemplate, which is not a supertype of
String. Even if we could somehow "convince" inference that
StringTemplate is a valid "more general" type, I see lots and lots of
dragons here:

* inference would have to be careful only to do certain moves if 
constant strings are involved
* if we pick StringTemplate, we're basically saying that the method is 
applicable by conversion, so in overload step 2. But is this the 
overload step we used to pick that candidate in the first place? 
Probably not, because String <: X requires only subtyping, not conversion.

Ultimately, implicit conversion would only "kind of work" and will lead
to issues when interacting with generics. While it's tempting to sweep
issues under the rug (after all they do not seem very important for the
examples we're discussing), such compromises have a tendency to bite us
back when features "grow up" and start playing more with other features:
any loss of compositionality there costs quite a bit. Which is why I'm
not in love with implicit conversions.
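
The inference point can be tried out in today's Java with a stand-in type. What follows is only a sketch: `Template` is a hypothetical record invented for illustration (it is not `StringTemplate`), chosen because, like `StringTemplate`, it is unrelated to `String` by subtyping. The version that wraps the string explicitly satisfies both constraints on the type variable; the `CharSequence` version works because `String <: CharSequence` really holds; the commented-out line is the analogue of the failing case above.

```java
import java.util.List;

class InferenceSketch {
    // Hypothetical stand-in for StringTemplate: not a supertype of String.
    record Template(String text) {
        static Template of(String s) {
            return new Template(s);
        }
    }

    static List<Template> explicit() {
        // Constraints: X = Template (target type), Template <: X (argument). Both hold.
        return List.of(Template.of("Hello"));
    }

    static List<CharSequence> viaSubtyping() {
        // Constraints: X = CharSequence, String <: X. Holds, since String implements CharSequence.
        return List.of("Hello");
    }

    // List<Template> broken = List.of("Hello");
    // Rejected by javac: X = Template and String <: X cannot both be satisfied.

    public static void main(String[] args) {
        System.out.println(explicit());
        System.out.println(viaSubtyping());
    }
}
```

Making the commented-out line compile would need exactly the kind of conversion-aware inference step described in the bullets above.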

Maurizio


>            Maurizio
>
> Kind regards
>
> Robbe Pincket
>

From guy.steele at oracle.com  Fri Mar 15 01:05:46 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 01:05:46 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
Message-ID: <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>

Thanks for these derails, but they don't quite answer my question: how does the compiler make the decision to require the prefix? Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job, but maybe I am missing something.

-Guy

On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:


On 14/03/2024 22:05, Guy Steele wrote:
Is your intent that a string interpolation literal would have a type other than String? If so, I agree that this is a third option, with the consequence that each API designer now needs to contemplate three-way overloading.

If that is not your intent, then I am not seeing how the prefix helps, so please explain?

Let's go back to the example I mentioned:

String.format("Hello, my name is %s\{name}"); // can you spot the bug?


There's a string with an embedded expression here. The compiler might require a prefix here (e.g. do you want a string, or a string template?). If no prefix is added (as in the above code) it might just be an error, and this won't compile.

This means that if I do:

String.format(INTERPOLATED"Hello, my name is %s\{name}");


I will select String.format(String, Object...) - but I will do so deliberately - it's not just what happens "by default" (as was the case before).

Or, if I want the template version, I do:

String.format(TEMPLATE"Hello, my name is %s\{name}");


Basically, requiring all literals that have embedded expression to have a prefix removes the problem of defaulting on the String side of the fence. Then, personally I'd also prefer if the default was actually on the StringTemplate side of the fence, so that the above was actually identical to this:

String.format("Hello, my name is %s\{name}"); // ok, this is a template


Note that these two prefixes might also come in handy when disambiguating a literal with no embedded expressions. Only, in that case the default would point the other way.

To summarize:

  *   template literal with arguments -> defaults to StringTemplate. User can ask interpolation explicitly, by adding a prefix
  *   template literal w/o arguments -> defaults to String. User can ask a degenerate template explicitly, by adding a prefix

This doesn't sound too bad, and it feels like it has the defaults pointing the right way?

Maurizio

Thanks,
Guy

On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:


On 14/03/2024 19:39, Guy Steele wrote:
This is a very important example to consider. I observe, however, that there are at least two possible ways to avoid the unpleasant surprise:

(1) Don't have string interpolation literals, because accidentally using a string interpolation literal instead of a string template literal can result in invoking the wrong overload of a method.

(2) Don't overload methods so as to accept either a string or a string template.

I agree with your analysis, but note that there is also a third option:

(3) make it so that both string interpolation literal and string template literal have a prefix.

I believe that is enough to solve the issue (because the program I wrote would no longer compile: the compiler would require an explicit prefix).

Maurizio

If we were to take approach (2), then:

(a) We would keep `println` as is, and not allow it to accept a template, but that's okay: if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.

(b) A SQL processor would accept a template but not a string: if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.

(c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others. I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing

String.format("Hello, my name is %s\{name}"); // can you spot the bug?


but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.

Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.

-Guy

On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:


Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here.

First, it was a deliberate design goal of the string template feature to make interpolation an explicit act. Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually cheaper (to type) than the safer template alternative. This is a bit of a red herring, I think.

The second problem is that interpolation literals can sometimes be deceiving. Consider this example:

String.format("Hello, my name is %s\{name}"); // can you spot the bug?


Where String::format has a new overload which accepts a StringTemplate.

Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to:

 String.format("Hello, my name is %s" + name); // whoops!


This will fail, as String::format will be waiting for an argument (a string), but none is provided. So:

|  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
|        at Formatter.format (Formatter.java:2672)
|        at Formatter.format (Formatter.java:2609)
|        at String.format (String.java:2897)
|        at (#2:1)


This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers.
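
The failure mode can be reproduced in current Java without any template syntax by writing out the concatenation that the un-prefixed literal would stand for. This is only a sketch: the class and method names are made up for illustration, and the first method is the hand-expanded form of the hypothetical interpolation literal, not anything the compiler produces today.

```java
import java.util.MissingFormatArgumentException;

class FormatPitfall {
    // Hand-expanded form of the un-prefixed literal: the embedded expression
    // is concatenated into the format string, so %s has no argument.
    static String interpolatedThenFormatted(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // What was most likely intended: name is the argument for %s.
    static String formatted(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(formatted("Duke"));
        try {
            interpolatedThenFormatted("Duke");
        } catch (MissingFormatArgumentException e) {
            // Same exception type as in the trace above.
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Both calls compile and pick `String.format(String, Object...)`; only the second argument list differs, which is why nothing flags the mistake before run time.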

Maurizio

On 14/03/2024 15:08, Guy Steele wrote:


Second thoughts about how to explain a string interpolation literal:


On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
. . .

---------
String is not a subtype of StringTemplate; they are disjoint types.

        $"foo"             is a (trivial) string template literal
        "foo"              is a string literal
        $"Hello, \{x}"     is a (nontrivial) string template literal
        "Hello, \{x}"      is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
---------


Given that the intent is that String.of (or whatever we want to call it, possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,

         "Hello, \{x}."

(I have added a period to the example to make the point clearer) is expanded into

        "Hello, " + x + "."

and in general

        "c0\{e1}c1\{e2}c2...\{en}cn"

(where each ck is a possibly empty sequence of string characters and each ek is an expression)  is expanded into

        "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"

The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
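
The constant-expression property can be checked against the expanded form in today's Java. A small sketch, with the expansion of the hypothetical literal "Hello, \{NAME}." written out by hand; the class and member names are invented for illustration.

```java
class ConstantInterpolation {
    // A constant variable: final, initialized with a constant expression.
    static final String NAME = "world";

    // Hand-written expansion of the hypothetical literal "Hello, \{NAME}."
    static final String GREETING = "Hello, " + (NAME) + ".";

    static String classify(String s) {
        switch (s) {
            case GREETING: // a case label must be a constant expression
                return "greeting";
            default:
                return "other";
        }
    }

    // Constant string expressions are evaluated at compile time and interned,
    // so the identity comparison with the equivalent literal is true.
    static boolean foldedAtCompileTime() {
        return GREETING == "Hello, world.";
    }

    public static void main(String[] args) {
        System.out.println(classify("Hello, world."));
        System.out.println(foldedAtCompileTime());
    }
}
```

If `NAME` were not a constant variable (for example, not `final`), `GREETING` would no longer be a constant expression and the `case GREETING:` label would be rejected at compile time.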

-Guy




From guy.steele at oracle.com  Fri Mar 15 01:07:16 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 01:07:16 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
Message-ID: <FFD0AA50-A66A-4B4C-BA37-C68C3302EF34@oracle.com>


On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:

Thanks for these derails,

Sorry: "details"

but they don't quite answer my question: how does the compiler make the decision to require the prefix? Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job, but maybe I am missing something.

-Guy


From guy.steele at oracle.com  Fri Mar 15 02:10:07 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 02:10:07 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
Message-ID: <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>

Oh, I think I get it now; I misinterpreted "The compiler might require a prefix here" to mean "The compiler might require a prefix on a literal that is a method argument", but I now see, from your later sentence "Basically, requiring all literals that have embedded expression to have a prefix . . ." that maybe you just want to adjust the syntax of literals to be roughly what Clement suggested:

"..."                  plain string literal, cannot contain \{...}, type is String
INTERPOLATION"..."     string interpolation, may contain \{...}, type is String
TEMPLATE"..."          string template, may contain \{...}, type is StringTemplate

where the precise syntax for the prefixed INTERPOLATION and TEMPLATE is to be determined. Do I understand your proposal correctly now?
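
Under this reading, every literal has a definite type before overload resolution starts, so the ordinary rules pick the method. A sketch of that in today's Java, using a hypothetical `Template` record as a stand-in for `StringTemplate` and a made-up `describe` method overloaded on both types; none of these names come from the proposal.

```java
import java.util.List;

class OverloadByType {
    // Hypothetical stand-in for StringTemplate: fragments and embedded values kept apart.
    // Assumes fragments.size() == values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {
        // Plain interpolation: fragments and values joined in order.
        String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.size(); i++) {
                sb.append(fragments.get(i)).append(values.get(i));
            }
            return sb.append(fragments.get(values.size())).toString();
        }
    }

    // Stand-in for what TEMPLATE"Hello, \{name}." might denote.
    static Template sample(String name) {
        return new Template(List.of("Hello, ", "."), List.of(name));
    }

    static String describe(String s) {
        return "String overload: " + s;
    }

    static String describe(Template t) {
        return "Template overload: " + t.fragments() + " " + t.values();
    }

    public static void main(String[] args) {
        Template t = sample("Duke");
        System.out.println(describe(t));               // static type Template
        System.out.println(describe(t.interpolate())); // static type String
    }
}
```

The choice between the two `describe` methods depends only on the static type of the argument, which is the "existing story" about overloading; no inspection of the argument's syntax is involved.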

-Guy

On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:

Thanks for these derails, but they don't quite answer my question: how does the compiler make the decision to require the prefix? Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job, but maybe I am missing something.

-Guy

On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:


On 14/03/2024 22:05, Guy Steele wrote:
Is your intent that a string interpolation literal would have a type other than String? If so, I agree that this is a third option, with the consequence that each API designer now needs to contemplate three-way overloading.

If that is not your intent, then I am not seeing how the prefix helps, so please explain?

Let's go back to the example I mentioned:

String.format("Hello, my name is %s\{name}"); // can you spot the bug?


There's a string with an embedded expression here. The compiler might require a prefix here (e.g. do you want a string, or a string template?). If no prefix is added (as in the above code) it might just be an error, and this won't compile.

This means that if I do:

String.format(INTERPOLATED"Hello, my name is %s\{name}");


I will select String.format(String, Object...) - but I will do so deliberately - it's not just what happens "by default" (as was the case before).

Or, if I want the template version, I do:

String.format(TEMPLATE"Hello, my name is %s\{name}");


Basically, requiring all literals that have embedded expression to have a prefix removes the problem of defaulting on the String side of the fence. Then, personally I'd also prefer if the default was actually on the StringTemplate side of the fence, so that the above was actually identical to this:

String.format("Hello, my name is %s\{name}"); // ok, this is a template


Note that these two prefixes might also come in handy when disambiguating a literal with no embedded expressions. Only, in that case the default would point the other way.

To summarize:

  *   template literal with arguments -> defaults to StringTemplate. User can ask interpolation explicitly, by adding a prefix
  *   template literal w/o arguments -> defaults to String. User can ask a degenerate template explicitly, by adding a prefix

This doesn't sound too bad, and it feels like it has the defaults pointing the right way?

Maurizio

Thanks,
Guy

On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:


On 14/03/2024 19:39, Guy Steele wrote:
This is a very important example to consider. I observe, however, that there are at least two possible ways to avoid the unpleasant surprise:

(1) Don't have string interpolation literals, because accidentally using a string interpolation literal instead of a string template literal can result in invoking the wrong overload of a method.

(2) Don't overload methods so as to accept either a string or a string template.

I agree with your analysis, but note that there is also a third option:

(3) make it so that both string interpolation literal and string template literal have a prefix.

I believe that is enough to solve the issue (because the program I wrote would no longer compile: the compiler would require an explicit prefix).

Maurizio

If we were to take approach (2), then:

(a) We would keep `println` as is, and not allow it to accept a template, but that's okay -- if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.

(b) A SQL processor would accept a template but not a string -- if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.

(c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others. I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing

String.format("Hello, my name is %s\{name}"); // can you spot the bug?


but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.

Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.
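
A minimal sketch of cases (a) and (b) under approach (2), written in today's Java. The `Template` record is a hypothetical stand-in for StringTemplate, and `toSql` and `print` are illustrative names, not proposed API. The point is only that, with no overloading, the type checker rejects the wrong kind of argument:

```java
import java.util.List;

class NoOverloadSketch {
    // Hypothetical stand-in for StringTemplate; fragments has one more
    // element than values.
    record Template(List<String> fragments, List<Object> values) {}

    // Case (b): only a Template is accepted, so each value becomes a
    // placeholder instead of being spliced into the SQL text.
    static String toSql(Template t) {
        StringBuilder sql = new StringBuilder(t.fragments().get(0));
        for (int k = 0; k < t.values().size(); k++) {
            sql.append('?').append(t.fragments().get(k + 1));
        }
        return sql.toString();
    }

    // Case (a): only a String is accepted, like println today.
    static void print(String s) {
        System.out.println(s);
    }

    public static void main(String[] args) {
        Template q = new Template(
                List.of("SELECT * FROM people WHERE name = ", ""),
                List.of("Duke"));
        print(toSql(q));
        // toSql("SELECT ...");  // would not compile: String is not a Template
        // print(q);             // would not compile: Template is not a String
    }
}
```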

--Guy

On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:


Not to pour too much cold water on the idea of having string interpolation literal, but I'd like to mention a few points here.

First, it was a deliberate design goal of the string template feature to make interpolation an explicit act. Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually cheaper (to type) than the safer template alternative. This is a bit of a red herring, I think.

The second problem is that interpolation literals can sometimes be deceiving. Consider this example:

String.format("Hello, my name is %s\{name}"); // can you spot the bug?


Where String::format has a new overload which accepts a StringTemplate.

Basically, since here we forgot the leading "$" (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to:

 String.format("Hello, my name is %s" + name); // whoops!


This will fail, as String::format will be waiting for an argument (a string), but none is provided. So:

|  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
|        at Formatter.format (Formatter.java:2672)
|        at Formatter.format (Formatter.java:2609)
|        at String.format (String.java:2897)
|        at (#2:1)


This is a very odd (and new!) failure mode, that I'm sure is gonna surprise developers.
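
A minimal sketch of this failure mode in today's Java, assuming the unprefixed literal would expand to plain concatenation as described above. The class and method names are illustrative only:

```java
import java.util.MissingFormatArgumentException;

class FormatPitfall {
    // Mimics the accidental interpolation: the value is concatenated into
    // the format string, so the %s specifier has no argument to consume.
    static String interpolatedByMistake(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // What was meant: the value is passed as a format argument.
    static String intended(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(intended("Duke"));
        try {
            interpolatedByMistake("Duke");
        } catch (MissingFormatArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The mistake only shows up at run time, which is what makes it surprising: both calls compile against the existing String.format(String, Object...) signature.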

Maurizio

On 14/03/2024 15:08, Guy Steele wrote:


Second thoughts about how to explain a string interpolation literal:


On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
. . .

---------
String is not a subtype of StringTemplate; they are disjoint types.

        $"foo"             is a (trivial) string template literal
        "foo"              is a string literal
        $"Hello, \{x}"     is a (nontrivial) string template literal
        "Hello, \{x}"      is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
---------


Given that the intent is that String.of (or whatever we want to call it -- possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,

         "Hello, \{x}."

(I have added a period to the example to make the point clearer) is expanded into

        "Hello, " + x + "."

and in general

        "c0\{e1}c1\{e2}c2...\{en}cn"

(where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into

        "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"

The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
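
A small sketch of why the constant-expression property is handy, using the hand-expanded "+" form, since the interpolation literal itself is only a proposal. The names `X`, `GREETING` and `classify` are illustrative:

```java
class ConstantInterpolation {
    static final String X = "world";

    // Hand-expanded form of the hypothetical literal "Hello, \{X}."
    // Every operand is a constant expression, so the whole thing is one too.
    static final String GREETING = "Hello, " + X + ".";

    static String classify(String s) {
        // A case label must be a constant expression, so this switch
        // compiles only because GREETING is one.
        switch (s) {
            case GREETING: return "greeting";
            default:       return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("Hello, world."));
        System.out.println(classify("Goodbye."));
    }
}
```

If `X` were not a constant variable (say, a non-final field), the declaration of `GREETING` would still compile but the `case GREETING:` label would be rejected.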

--Guy





From maurizio.cimadamore at oracle.com  Fri Mar 15 09:56:42 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 15 Mar 2024 09:56:42 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
Message-ID: <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>


On 15/03/2024 02:10, Guy Steele wrote:
> Oh, I think I get it now; I misinterpreted "The compiler might require
> a prefix here" to mean "The compiler might require a prefix on a
> literal that is a method argument", but I now see, from your later
> sentence "Basically, requiring all literals that have embedded
> expression to have a prefix . . ." that maybe you just want to adjust
> the syntax of literals to be roughly what Clement suggested:
>
> "..."               plain string literal, cannot contain \{...}, type is String
> INTERPOLATION"..."  string interpolation, may contain \{...}, type is String
> TEMPLATE"..."       string template, may contain \{...}, type is StringTemplate
>
> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE
> is to be determined. Do I understand your proposal correctly now?
Yes, with the further tweak that the prefix (with syntax TBD) might be 
omitted in the "obvious cases" (but kept for clarity):

* "Hello" w/o prefix is just String
* "Hello \{world}" without prefix is just StringTemplate

Does this help? (I'm basically trying to get to a world where use of 
prefix will be relatively rare, as common cases have the right defaults).

Maurizio

>
> --Guy
>
>> On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:
>>
>> Thanks for these details, but they don't quite answer my question:
>> how does the compiler make the decision to require the prefix?
>> Specifically, is it done purely by examining the types of the
>> literals (in which case the existing story, about how method
>> overloading decides which of several methods with the same name to
>> call, is adequate), or are you imagining some additional ad-hoc
>> mechanism that is somehow examining the syntax of method arguments
>> (in which case some care will be needed to ensure that it interacts
>> properly with the rest of the method overloading resolution
>> mechanism)? I ask because, given your explanation below, I am not
>> seeing how types alone can do the job -- but maybe I am missing something.
>>
>> --Guy
>>
>>> On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore
>>> <maurizio.cimadamore at oracle.com> wrote:
>>>
>>>
>>> On 14/03/2024 22:05, Guy Steele wrote:
>>>> Is your intent that a string interpolation literal would have a
>>>> type other than String? If so, I agree that this is a third
>>>> option -- with the consequence that each API designer now needs to
>>>> contemplate three-way overloading.
>>>>
>>>> If that is not your intent, then I am not seeing how the prefix
>>>> helps -- so please explain?
>>>
>>> Let's go back to the example I mentioned:
>>>
>>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>> There's a string with an embedded expression here. The compiler 
>>> might require a prefix here (e.g. do you want a string, or a string 
>>> template?). If no prefix is added (as in the above code) it might 
>>> just be an error, and this won't compile.
>>>
>>> This means that if I do:
>>> |String.format(INTERPOLATED"Hello, my name is %s\{name}"); |
>>>
>>> I will select String.format(String, Object...) - but I will do so 
>>> deliberately - it's not just what happens "by default" (as was the 
>>> case before).
>>>
>>> Or, if I want the template version, I do:
>>>
>>> |String.format(TEMPLATE"Hello, my name is %s\{name}");|
>>>
>>> Basically, requiring all literals that have embedded expression to 
>>> have a prefix removes the problem of defaulting on the String side 
>>> of the fence. Then, personally I'd also prefer if the default was 
>>> actually on the StringTemplate side of the fence, so that the above 
>>> was actually identical to this:
>>>
>>> |String.format("Hello, my name is %s\{name}"); // ok, this is a template|
>>>
>>> Note that these two prefixes might also come in handy when 
>>> disambiguating a literal with no embedded expressions. Only, in that 
>>> case the default would point the other way.
>>>
>>> To summarize:
>>>
>>>   * template literal with arguments -> defaults to StringTemplate.
>>>     User can ask interpolation explicitly, by adding a prefix
>>>   * template literal w/o arguments -> defaults to String. User can
>>>     ask a degenerate template explicitly, by adding a prefix
>>>
>>> This doesn't sound too bad, and it feels like it has the defaults 
>>> pointing the right way?
>>>
>>> Maurizio
>>>
>>>> Thanks,
>>>> Guy
>>>>
>>>>> On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore
>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>
>>>>>
>>>>> On 14/03/2024 19:39, Guy Steele wrote:
>>>>>> This is a very important example to consider. I observe, however, 
>>>>>> that there are at least two possible ways to avoid the unpleasant 
>>>>>> surprise:
>>>>>>
>>>>>> (1) Don't have string interpolation literals, because
>>>>>> accidentally using a string interpolation literal instead of a
>>>>>> string template literal can result in invoking the wrong
>>>>>> overload of a method.
>>>>>>
>>>>>> (2) Don't overload methods so as to accept either a string or a
>>>>>> string template.
>>>>>
>>>>> I agree with your analysis, but note that there is also a third 
>>>>> option:
>>>>>
>>>>> (3) make it so that both string interpolation literal and string 
>>>>> template literal have a prefix.
>>>>>
>>>>> I believe that is enough to solve the issue (because the program I 
>>>>> wrote would no longer compile: the compiler would require an 
>>>>> explicit prefix).
>>>>>
>>>>> Maurizio
>>>>>
>>>>>>
>>>>>> If we were to take approach (2), then:
>>>>>>
>>>>>> (a) We would keep `println` as is, and not allow it to accept a
>>>>>> template, but that's okay -- if you thought you wanted a template,
>>>>>> what you really want is plain old string interpolation, and the
>>>>>> type checking will make sure you don't use the wrong one.
>>>>>>
>>>>>> (b) A SQL processor would accept a template but not a string -- if
>>>>>> you thought you wanted string interpolation, what you really want
>>>>>> is a template, and the type checking will make sure you don't use
>>>>>> the wrong one.
>>>>>>
>>>>>> (c) I think `format` is a special case that we tend to get hung 
>>>>>> up on, and I think that, in this particular branch of the design 
>>>>>> space we are exploring, perhaps a name other than `String.format` 
>>>>>> should be chosen for the method that does string formatting on 
>>>>>> templates. Possible names are `StringTemplate.format` and 
>>>>>> `String.format$`, but I will leave further bikeshedding on this 
>>>>>> to others. I do recognize that this move will not enable the type 
>>>>>> system per se to absolutely prevent programmers from writing
>>>>>> |String.format("Hello, my name is %s\{name}"); // can you spot the
>>>>>> bug? |
>>>>>> but, as Clement has observed, such cases will probably provoke a 
>>>>>> warning about a mismatch between the number of arguments and the 
>>>>>> number of %-specifiers that require parameters, so maybe 
>>>>>> overloading would be okay anyway for `String.format`.
>>>>>>
>>>>>> Anyway, my point is that whether to overload a method to accept 
>>>>>> either a string or a string template can be evaluated on a 
>>>>>> case-by-case basis according to a small number of principles that 
>>>>>> I think we could enumerate and explain pretty easily.
>>>>>>
>>>>>> --Guy
>>>>>>
>>>>>>> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore
>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>
>>>>>>> Not to pour too much cold water on the idea of having string
>>>>>>> interpolation literal, but I'd like to mention a few points here.
>>>>>>>
>>>>>>> First, it was a deliberate design goal of the string template 
>>>>>>> feature to make interpolation an explicit act. Note that, if we 
>>>>>>> had the syntax you describe, we actually achieve the opposite 
>>>>>>> effect: string interpolation is now the default, and implicit, 
>>>>>>> and actually /cheaper/ (to type) than the safer template 
>>>>>>> alternative. This is a bit of a red herring, I think.
>>>>>>>
>>>>>>> The second problem is that interpolation literals can sometimes 
>>>>>>> be deceiving. Consider this example:
>>>>>>>
>>>>>>> |String.format("Hello, my name is %s\{name}"); // can you spot
>>>>>>> the bug? |
>>>>>>>
>>>>>>> Where |String::format| has a new overload which accepts a 
>>>>>>> StringTemplate.
>>>>>>>
>>>>>>> Basically, since here we forgot the leading "$" (or whatever
>>>>>>> char that is), the whole thing is just a big interpolation.
>>>>>>> Semantically equivalent to:
>>>>>>>
>>>>>>> |String.format("Hello, my name is %s" + name); // whoops! |
>>>>>>>
>>>>>>> This will fail, as |String::format| will be waiting for an 
>>>>>>> argument (a string), but none is provided. So:
>>>>>>>
>>>>>>> |  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
>>>>>>> |        at Formatter.format (Formatter.java:2672)
>>>>>>> |        at Formatter.format (Formatter.java:2609)
>>>>>>> |        at String.format (String.java:2897)
>>>>>>> |        at (#2:1)
>>>>>>>
>>>>>>> This is a very odd (and new!) failure mode, that I'm sure is
>>>>>>> gonna surprise developers.
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 14/03/2024 15:08, Guy Steele wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Second thoughts about how to explain a string interpolation literal:
>>>>>>>>
>>>>>>>>> On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
>>>>>>>>> . . .
>>>>>>>>>
>>>>>>>>> ---------
>>>>>>>>> String is not a subtype of StringTemplate; they are disjoint types.
>>>>>>>>>
>>>>>>>>>         $"foo"             is a (trivial) string template literal
>>>>>>>>>         "foo"              is a string literal
>>>>>>>>>         $"Hello, \{x}"     is a (nontrivial) string template literal
>>>>>>>>>         "Hello, \{x}"      is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
>>>>>>>>> ---------
>>>>>>>> Given that the intent is that String.of (or whatever we want to call it -- possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,
>>>>>>>>
>>>>>>>>           "Hello, \{x}."
>>>>>>>>
>>>>>>>> (I have added a period to the example to make the point clearer) is expanded into
>>>>>>>>
>>>>>>>>          "Hello, " + x + "."
>>>>>>>>
>>>>>>>> and in general
>>>>>>>>
>>>>>>>>          "c0\{e1}c1\{e2}c2...\{en}cn"
>>>>>>>>
>>>>>>>> (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
>>>>>>>>
>>>>>>>>          "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"
>>>>>>>>
>>>>>>>> The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
>>>>>>>>
>>>>>>>> --Guy
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
>

From asviraspossible at gmail.com  Fri Mar 15 13:48:56 2024
From: asviraspossible at gmail.com (Victor Nazarov)
Date: Fri, 15 Mar 2024 14:48:56 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
Message-ID: <CAFOkWZbF5m-LB6zCqYrEYugd3GNyouVTXCpLCqn9uwx5vQhLag@mail.gmail.com>

Hello experts,

I'm not sure if we need one more voice in this thread, but maybe my summary
can be a small contribution.

I've read the whole thread and I saw only two goals that were named for the
StringTemplates-feature.

1) safety, as explained very thoroughly by Maurizio Cimadamore, and
2) avoiding proliferation of String-literal sublanguages, as advocated by
Brian Goetz.

As Maurizio Cimadamore explained in one of the messages, from the safety
point of view, the only solutions among those mentioned in the thread that
fit the bill are

a) either special syntax for string-templates that is distinct from
plain-strings, or
b) automatic promotion from string-*literal* (without any placeholders
inside) into StringTemplate.

If we take into account the goal stated by Brian Goetz, then we can see
that (b) looks better than (a), because we avoid different-looking
language elements.
The problem with (b), though, is overload selection and the many other
problems that Maurizio Cimadamore already stated in the original message of
this thread.

My observation is that if all these problems were completely new, it would
probably be hard to pick our poison, but
in my opinion the Java language has already had these problems before and
solved them, so why not solve this problem for StringTemplates in exactly the
same way?

The already-solved problem is numeric types; let us look at the relationship
between int and long:

* numeric-literal that fits in 32 bits can be both int and long
* numeric-literal outside the 32-bit range can only be long
* m(int), m(long) with numeric-literal that can be both int and long selects
the int-overload
* n = i works when n is long and i is int
* i = n is a compile-time error
* i = (int) n succeeds when i is int and n is long
* n instanceof int is soon to succeed on a long variable n, as long as n fits
within 32 bits
* i instanceof long succeeds when i is int
* additionally, numeric-literal can use the "l" or "L" suffix to denote that it
is really long; this can be used to tweak overload-selection
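
A minimal sketch of the int/long rules above that compile in current Java; the instanceof items are left out because primitive type patterns are not a final feature. The class name and the helper names `widen` and `narrow` are illustrative:

```java
class IntLongRules {
    static String m(int i)  { return "int"; }
    static String m(long n) { return "long"; }

    static long widen(int i) {
        return i;        // n = i: implicit widening is always allowed
    }

    static int narrow(long n) {
        // int i = n;    // would not compile: possible lossy conversion
        return (int) n;  // i = (int) n: the explicit cast is accepted
    }

    public static void main(String[] args) {
        System.out.println(m(42));              // prints "int"
        System.out.println(m(42L));             // prints "long": the suffix tweaks overload selection
        // A literal outside the int range must carry the suffix in source.
        System.out.println(m(3_000_000_000L));  // prints "long"
        System.out.println(widen(7) + " " + narrow(7L));
    }
}
```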

I think the above can be translated almost word for word to StringTemplates
world:

* stringy-literal that doesn't have holes-with-values can be both String
and StringTemplate
* stringy-literal that has holes-with-values can only be StringTemplate
* m(String),m(StringTemplate) with stringy-literal that can be both String
and StringTemplate selects String-overload
* t = s works when t is StringTemplate and s is String
* s = t is compile-time error
* s = (String) t succeeds, when s is String and t is StringTemplate (and
does string concatenation)
* t instanceof String succeeds on StringTemplate variable t as long as t
doesn't have any holes-with-values
* s instanceof StringTemplate succeeds when s is String
* additionally stringy-literal can use "t" or "T" *suffix* to denote that
it is really a template, this can be used to tweak overload-selection and
to certify, that some processing of values is expected

For me the table for String-StringTemplates satisfies both (1) and (2)
goals and feels natural for Java-language, because most of these rules have
been present in the language for more than 20 years already.

--
Victor Nazarov


On Fri, Mar 15, 2024 at 12:59 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

>
> On 15/03/2024 02:10, Guy Steele wrote:
>
> Oh, I think I get it now; I misinterpreted "The compiler might require a
> prefix here" to mean "The compiler might require a prefix on a literal that
> is a method argument", but I now see, from your later sentence "Basically,
> requiring all literals that have embedded expression to have a prefix
> . . ." that maybe you just want to adjust the syntax of literals to be roughly
> what Clement suggested:
>
> "..."               plain string literal, cannot contain \{...}, type is String
> INTERPOLATION"..."  string interpolation, may contain \{...}, type is String
> TEMPLATE"..."       string template, may contain \{...}, type is StringTemplate
>
> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE is to
> be determined. Do I understand your proposal correctly now?
>
> Yes, with the further tweak that the prefix (with syntax TBD) might be
> omitted in the "obvious cases" (but kept for clarity):
>
> * "Hello" w/o prefix is just String
> * "Hello \{world}" without prefix is just StringTemplate
>
> Does this help? (I'm basically trying to get to a world where use of
> prefix will be relatively rare, as common cases have the right defaults).
>
> Maurizio
>
>
> --Guy
>
> On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:
>
> Thanks for these details, but they don't quite answer my question: how
> does the compiler make the decision to require the prefix? Specifically,
> is it done purely by examining the types of the literals (in which case the
> existing story, about how method overloading decides which of several
> methods with the same name to call, is adequate), or are you imagining some
> additional ad-hoc mechanism that is somehow examining the syntax of method
> arguments (in which case some care will be needed to ensure that it
> interacts properly with the rest of the method overloading resolution
> mechanism)? I ask because, given your explanation below, I am not seeing
> how types alone can do the job -- but maybe I am missing something.
>
> --Guy
>
> On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
>
> On 14/03/2024 22:05, Guy Steele wrote:
>
> Is your intent that a string interpolation literal would have a type other
> than String? If so, I agree that this is a third option -- with the
> consequence that each API designer now needs to contemplate three-way
> overloading.
>
> If that is not your intent, then I am not seeing how the prefix helps -- so
> please explain?
>
> Let's go back to the example I mentioned:
>
> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>
> There's a string with an embedded expression here. The compiler might
> require a prefix here (e.g. do you want a string, or a string template?).
> If no prefix is added (as in the above code) it might just be an error, and
> this won't compile.
>
> This means that if I do:
>
> String.format(INTERPOLATED"Hello, my name is %s\{name}");
>
>
> I will select String.format(String, Object...) - but I will do so
> deliberately - it's not just what happens "by default" (as was the case
> before).
>
> Or, if I want the template version, I do:
>
> String.format(TEMPLATE"Hello, my name is %s\{name}");
>
>
> Basically, requiring all literals that have embedded expression to have a
> prefix removes the problem of defaulting on the String side of the fence.
> Then, personally I'd also prefer if the default was actually on the
> StringTemplate side of the fence, so that the above was actually identical
> to this:
>
> String.format("Hello, my name is %s\{name}"); // ok, this is a template
>
>
> Note that these two prefixes might also come in handy when disambiguating
> a literal with no embedded expressions. Only, in that case the default
> would point the other way.
>
> To summarize:
>
>    - template literal with arguments -> defaults to StringTemplate. User
>    can ask interpolation explicitly, by adding a prefix
>    - template literal w/o arguments -> defaults to String. User can ask a
>    degenerate template explicitly, by adding a prefix
>
> This doesn't sound too bad, and it feels like it has the defaults pointing
> the right way?
>
> Maurizio
>
> Thanks,
> Guy
>
> On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
>
> On 14/03/2024 19:39, Guy Steele wrote:
>
> This is a very important example to consider. I observe, however, that
> there are at least two possible ways to avoid the unpleasant surprise:
>
> (1) Don't have string interpolation literals, because accidentally using a
> string interpolation literal instead of a string template literal can
> result in invoking the wrong overload of a method.
>
> (2) Don't overload methods so as to accept either a string or a string
> template.
>
> I agree with your analysis, but note that there is also a third option:
>
> (3) make it so that both string interpolation literal and string template
> literal have a prefix.
>
> I believe that is enough to solve the issue (because the program I wrote
> would no longer compile: the compiler would require an explicit prefix).
>
> Maurizio
>
>
> If we were to take approach (2), then:
>
> (a) We would keep `println` as is, and not allow it to accept a template,
> but that's okay -- if you thought you wanted a template, what you really want
> is plain old string interpolation, and the type checking will make sure you
> don't use the wrong one.
>
> (b) A SQL processor would accept a template but not a string -- if you
> thought you wanted string interpolation, what you really want is a
> template, and the type checking will make sure you don't use the wrong one.
>
> (c) I think `format` is a special case that we tend to get hung up on, and
> I think that, in this particular branch of the design space we are
> exploring, perhaps a name other than `String.format` should be chosen for
> the method that does string formatting on templates. Possible names are
> `StringTemplate.format` and `String.format$`, but I will leave further
> bikeshedding on this to others. I do recognize that this move will not
> enable the type system per se to absolutely prevent programmers from writing
>
> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>
> but, as Clement has observed, such cases will probably provoke a warning
> about a mismatch between the number of arguments and the number of
> %-specifiers that require parameters, so maybe overloading would be okay
> anyway for `String.format`.
>
> Anyway, my point is that whether to overload a method to accept either a
> string or a string template can be evaluated on a case-by-case basis
> according to a small number of principles that I think we could enumerate
> and explain pretty easily.
>
> --Guy
>
> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
> Not to pour too much cold water on the idea of having string interpolation
> literal, but I'd like to mention a few points here.
>
> First, it was a deliberate design goal of the string template feature to
> make interpolation an explicit act. Note that, if we had the syntax you
> describe, we actually achieve the opposite effect: string interpolation is
> now the default, and implicit, and actually *cheaper* (to type) than the
> safer template alternative. This is a bit of a red herring, I think.
>
> The second problem is that interpolation literals can sometimes be
> deceiving. Consider this example:
>
> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>
> Where String::format has a new overload which accepts a StringTemplate.
>
> Basically, since here we forgot the leading "$" (or whatever char that
> is), the whole thing is just a big interpolation. Semantically equivalent
> to:
>
>  String.format("Hello, my name is %s" + name); // whoops!
>
> This will fail, as String::format will be waiting for an argument (a
> string), but none is provided. So:
>
> |  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
> |        at Formatter.format (Formatter.java:2672)
> |        at Formatter.format (Formatter.java:2609)
> |        at String.format (String.java:2897)
> |        at (#2:1)
>
> This is a very odd (and new!) failure mode, that I'm sure is gonna
> surprise developers.
>
> Maurizio
>
> On 14/03/2024 15:08, Guy Steele wrote:
>
>
> Second thoughts about how to explain a string interpolation literal:
>
>
> On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
> . . .
>
> ---------
> String is not a subtype of StringTemplate; they are disjoint types.
>
>         $"foo"             is a (trivial) string template literal
>         "foo"              is a string literal
>         $"Hello, \{x}"     is a (nontrivial) string template literal
>         "Hello, \{x}"      is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
> ---------
>
> Given that the intent is that String.of (or whatever we want to call it -- possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of "+"; for example,
>
>          "Hello, \{x}."
>
> (I have added a period to the example to make the point clearer) is expanded into
>
>         "Hello, " + x + "."
>
> and in general
>
>         "c0\{e1}c1\{e2}c2...\{en}cn"
>
> (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
>
>         "c0" + (e1) + "c1" + (e2) + "c2" + ... + (en) + "cn"
>
> The point is that, with this definition, "c0\{e1}c1\{e2}c2...\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
>
> --Guy
>

From maurizio.cimadamore at oracle.com  Fri Mar 15 14:54:01 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 15 Mar 2024 14:54:01 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <CAFOkWZbF5m-LB6zCqYrEYugd3GNyouVTXCpLCqn9uwx5vQhLag@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
 <CAFOkWZbF5m-LB6zCqYrEYugd3GNyouVTXCpLCqn9uwx5vQhLag@mail.gmail.com>
Message-ID: <40a9ecc1-f9e7-4fc5-a81e-33a468d568d8@oracle.com>

Hi
It is not all that rosy :-) Comments inline.

On 15/03/2024 13:48, Victor Nazarov wrote:

> I think the above can be translated almost word for word to 
> StringTemplates world:
>
> * stringy-literal that doesn't have holes-with-values can be both 
> String and StringTemplate
> * stringy-literal that has holes-with-values can only be StringTemplate
> * m(String),m(StringTemplate) with stringy-literal that can be both 
> String and StringTemplate selects String-overload
> * t = s works when t is StringTemplate and s is String

I assume you mean "s is a /constant/ String" here.

> * s = t is compile-time error
> * s = (String) t succeeds, when s is String and t is StringTemplate 
> (and does string concatenation)

Ok, here I note that you are defining cast conversion from 
StringTemplate to String as always successful (via interpolation).

> * t instanceof String succeeds on StringTemplate variable t as long as 
> t doesn't have any holes-with-values

This is inconsistent. You now have cases where "t instanceof String" 
returns false, but where (String)t succeeds.

> * s instanceof StringTemplate succeeds when s is String

Again, probably you mean "constant String" here.

> * additionally stringy-literal can use "t" or "T" *suffix* to denote 
> that it is really a template, this can be used to tweak 
> overload-selection and to certify, that some processing of values is 
> expected

Overall, while I agree this is not completely terrible, we are signing 
up for a lot of work here. There's a new conversion, the relationship with 
pattern matching and instanceof to figure out, and possible issues with 
overload resolution and inference. For instance, yesterday I mentioned 
this example:

    List<StringTemplate> ls = List.of("Hello");

Which won't work. One way to look at it is that it's as broken as:

    List<Long> ls = List.of(1);

But another way to look at it is that we're adding more complexity to a 
part of the language that is already shaky. To me that feels like a big 
risk, especially given that the "payoff" is to leave an extra "t" out at 
the beginning of the template. In other words, we should be careful 
about right-sizing complexity.
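
The List<Long> comparison can be reproduced with today's language. The following is only a sketch (the class name InferenceDemo is made up for illustration): an int constant is widened to long in an assignment context, but as an argument to the generic List.of the same literal is boxed to Integer, so the result cannot be assigned to List<Long>. A String-to-StringTemplate conversion would presumably run into the same limitation.

```java
import java.util.List;

final class InferenceDemo {
    // Assignment context: the int constant 1 is widened to long.
    static long widened() {
        long l = 1;
        return l;
    }

    // Generic method argument: no such conversion is applied during inference.
    static List<Long> boxed() {
        // List<Long> ls = List.of(1);  // does not compile: 1 boxes to Integer, not Long
        List<Long> ls = List.of(1L);    // the literal itself has to say "long"
        return ls;
    }

    public static void main(String[] args) {
        System.out.println(widened());
        System.out.println(boxed());
    }
}
```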

Also, regarding:

> 2) avoiding proliferation of String-literal sublanguages as advocated 
> by Brian Goetz

I don't read that in the same way as you do. I think what Brian meant is 
that anything inside quotes should be uniform. We would not like to have 
different kinds of rules for escaping etc. depending on what kind of 
literal you use. In that sense, sticking a "t" in front is no different 
from using """ to denote that what's coming is a text block.

Maurizio


From guy.steele at oracle.com  Fri Mar 15 16:07:35 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2024 16:07:35 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
Message-ID: <942AC667-A175-4F72-9495-684F2FF9236E@oracle.com>


On Mar 15, 2024, at 5:56 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:

> On 15/03/2024 02:10, Guy Steele wrote:
>
>> Oh, I think I get it now; I misinterpreted "The compiler might require a prefix here" to mean "The compiler might require a prefix on a literal that is a method argument", but I now see, from your later sentence "Basically, requiring all literals that have embedded expression to have a prefix . . ." that maybe you just want to adjust the syntax of literals to be roughly what Clement suggested:
>>
>> "..."                 plain string literal, cannot contain \{...}, type is String
>> INTERPOLATION"..."    string interpolation, may contain \{...}, type is String
>> TEMPLATE"..."         string template, may contain \{...}, type is StringTemplate
>>
>> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE is to be determined. Do I understand your proposal correctly now?
>
> Yes, with the further tweak that the prefix (with syntax TBD) might be omitted in the "obvious cases" (but kept for clarity):
>
> * "Hello" w/o prefix is just String
> * "Hello \{world}" without prefix is just StringTemplate
>
> Does this help? (I'm basically trying to get to a world where use of prefix will be relatively rare, as common cases have the right defaults).

Yes, this helps immensely.

So in the model you suggest, string templates would mostly not need prefixes, but in the example I raised where one might foresee editing templates so as to cross into or out of the edge case of zero template expressions, I could choose, if I wish, to write

SQL.process(TEMPLATE"CREATE TABLE foo;");
SQL.process(TEMPLATE"ALTER TABLE foo ADD name varchar(40);");
SQL.process(TEMPLATE"ALTER TABLE foo ADD title varchar(30);");
SQL.process(TEMPLATE"INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
SQL.process(TEMPLATE"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

rather than

SQL.process("CREATE TABLE foo;");
SQL.process("ALTER TABLE foo ADD name varchar(40);");
SQL.process("ALTER TABLE foo ADD title varchar(30);");
SQL.process("INSERT INTO foo (name, title) VALUES ('Guy', 'Hacker');");
SQL.process(TEMPLATE"INSERT INTO foo (name, title) VALUES (\{other name}, \{other job});");

That makes sense to me in this two-prefix model.

Then again, now that I ponder the space of use cases, it may be that, despite my initial enthusiasm, having a separate string interpolation syntax may not carry its weight if its uses are relatively rare. We always have the option of using a string template and then applying an interpolation processor (which might be spelled `String.of(<template>)` or `(<template>).interpolate()` or some other way), and about all we lose from that approach is the ability to use string interpolation to specify a constant expression -- for which we still have the old-fashioned alternative of using `+` concatenation. If we drop string interpolation, we can then drop the INTERPOLATION prefix, and we are back to a single-prefix model, and the remaining question is whether that prefix is optional, at least in some cases. Okay, I think I now have a better understanding of the relationships among the various proposals in the design space. Thanks for your patience.
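
To make the "old-fashioned alternative" concrete, here is a small sketch of `+` concatenation producing a constant expression in current Java. The class and member names are invented for illustration; the only language fact relied on is that a case label must be a constant expression, and that concatenating string literals with constant variables yields one.

```java
final class ConstantConcat {
    // A constant variable: static final, initialized with a constant expression.
    static final String NAME = "world";

    // Still a constant expression, because every operand is constant.
    static final String GREETING = "Hello, " + NAME + ".";

    static String classify(String s) {
        switch (s) {
            // Case labels must be constant expressions; this compiles
            // only because the concatenation is folded by the compiler.
            case "Hello, " + NAME + ".":
                return "greeting";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(GREETING));
    }
}
```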


And now that I have that better understanding, I think I lean toward (a) abandoning string interpolation and (b) having a single, short, _non-optional_ prefix for templates ("$" would be a plausible choice), on the grounds that I think it makes code more readable if templates are always distinguished up front from strings -- and this is especially helpful when the templates are rather long and any `\{` present might be far from the beginning. It has a minimal number of cases to explain:

"..."      string literal, must not contain \{...}, type String
$"..."     template literal, may contain \{...}, type StringTemplate

I think we have all made an honest effort to explain string templates as a simple and clean superset of string literals, but now that we have considered the typing and overloading issues, my opinion is that it just isn't possible without some amount of unwanted complication. Strings and string templates are just different beasts, and we would do well to maintain that distinction rather than trying to conflate them. Yes, requiring a prefix on templates would impose a small cost -- perhaps we should regard it as a "syn-tax" rather than a "cover charge" -- on every template we write, but I judge that cost well worth it for the readability it would buy.

(By the way, I appreciate John's suggestion of allowing a template to begin with \{}, but this strikes me as kind of a hack rather than a natural use of the \{...} syntax. A distinctive single-character prefix would be better.)

--Guy



From maurizio.cimadamore at oracle.com  Fri Mar 15 16:31:28 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Fri, 15 Mar 2024 16:31:28 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <942AC667-A175-4F72-9495-684F2FF9236E@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
 <942AC667-A175-4F72-9495-684F2FF9236E@oracle.com>
Message-ID: <21e82bd3-a5a4-45da-ad68-57990533466b@oracle.com>

Hi

On 15/03/2024 16:07, Guy Steele wrote:

> Then again, now that I ponder the space of use cases, it may be that, 
> despite my initial enthusiasm, having a separate string interpolation 
> syntax may not carry its weight if its uses are relatively rare. We 
> always have the option of using a string template and then applying an 
> interpolation processor (which might be spelled 
> `String.of(<template>)` or `(<template>).interpolate()` or some other 
> way), and about all we lose from that approach is the ability to use 
> string interpolation to specify a constant expression -- for which we 
> still have the old-fashioned alternative of using `+` concatenation. 
> If we drop string interpolation, we can then drop the INTERPOLATION 
> prefix, and we are back to a single-prefix model, and the remaining 
> question is whether that prefix is optional, at least in some cases. 
> Okay, I think I now have a better understanding of the relationships 
> among the various proposals in the design space. Thanks for your patience.

I think the advantage of /not/ having a string interpolation prefix is 
that interpolation is then "just another processor", e.g. a static method 
somewhere that takes a string template and returns a String. Another 
String::format, in a way. So that leads to a rather uniform design.
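
A rough sketch of what "just another processor" could look like follows. Everything in it is an assumption for illustration: Template is a hypothetical stand-in (n+1 fragments around n values), not the actual StringTemplate API, and Processors.interpolate is an invented name.

```java
import java.util.List;

// Hypothetical stand-in for a string template: n+1 fragments around n values.
record Template(List<String> fragments, List<Object> values) {
    Template {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        // Defensive copies; note that List.copyOf rejects null elements.
        fragments = List.copyOf(fragments);
        values = List.copyOf(values);
    }
}

final class Processors {
    // Plain interpolation is nothing more than a static method
    // from a template to a String, alternating fragments and values.
    static String interpolate(Template t) {
        StringBuilder sb = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            sb.append(t.values().get(i)).append(t.fragments().get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Template t = new Template(List.of("Hello, ", "."), List.of("world"));
        System.out.println(interpolate(t));
    }
}
```

Under this reading no dedicated literal syntax is needed for interpolation; a caller who wants a String has to ask for one explicitly.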

>
>
> And now that I have that better understanding, I think I lean toward 
> (a) abandoning string interpolation and (b) having a single, short, 
> _non-optional_ prefix for templates ("$" would be a plausible choice), 
> on the grounds that I think it makes code more readable if templates 
> are always distinguished up front from strings -- and this is especially 
> helpful when the templates are rather long and any `\{` present might 
> be far from the beginning. It has a minimal number of cases to explain:
>
> "..."      string literal, must not contain \{...}, type String
> $"..."     template literal, may contain \{...}, type StringTemplate

Yep, I agree this is a very principled way to look at the problem.

>
> I think we have all made an honest effort to explain string templates 
> as a simple and clean superset of string literals, but now that we 
> have considered the typing and overloading issues, my opinion is that 
> it just isn't possible without some amount of unwanted complication. 
> Strings and string templates are just different beasts, and we would 
> do well to maintain that distinction rather than trying to conflate 
> them. Yes, requiring a prefix on templates would impose a small 
> cost -- perhaps we should regard it as a "syn-tax" rather than a "cover 
> charge" -- on every template we write, but I judge that cost well worth 
> it for the readability it would buy.
>
> (By the way, I appreciate John's suggestion of allowing a template to 
> begin with \{}, but this strikes me as kind of a hack rather than a 
> natural use of the \{...} syntax. A distinctive single-character prefix 
> would be better.)

I think there's a place for {}. E.g. where {} shines IMHO is in allowing 
/comments/ inside string templates:

t"""
   hello
   this is
   \{ /* a comment here */ }
   a commented
   template
"""

But as a "template" prefix, it's a bit of a lousy choice. For instance, 
one can argue that "1.0" is different from "1". But there's only one 
place to look for that ".0". Whereas, inside a text block, there are 
many different places where the {} can end up. So, while this solves 
the problem for compiler writers, {} is not a good solution for us humans.

Maurizio


From ccherlin at gmail.com  Fri Mar 15 18:53:00 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Fri, 15 Mar 2024 13:53:00 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <21e82bd3-a5a4-45da-ad68-57990533466b@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
 <942AC667-A175-4F72-9495-684F2FF9236E@oracle.com>
 <21e82bd3-a5a4-45da-ad68-57990533466b@oracle.com>
Message-ID: <CALEU8=y31m83ppREsvN-pxNiyEeTJYgfR0WueGCc3qh-j27=qg@mail.gmail.com>

On Fri, Mar 15, 2024 at 11:39 AM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> Hi
>
> On 15/03/2024 16:07, Guy Steele wrote:
>
> Then again, now that I ponder the space of use cases, it may be that,
> despite my initial enthusiasm, having a separate string interpolation
> syntax may not carry its weight if its uses are relatively rare. We always
> have the option of using a string template and then applying an
> interpolation processor (which might be spelled `String.of(<template>)` or
> `(<template>).interpolate()` or some other way), and about all we lose from
> that approach is the ability to use string interpolation to specify a
> constant expression -- for which we still have the old-fashioned alternative
> of using `+` concatenation. If we drop string interpolation, we can then
> drop the INTERPOLATION prefix, and we are back to a single-prefix model,
> and the remaining question is whether that prefix is optional, at least in
> some cases. Okay, I think I now have a better understanding of the
> relationships among the various proposals in the design space. Thanks for
> your patience.
>
> I think the advantage for *not* having a string interpolation prefix, is
> that then interpolation is "just another processor" e.g. a static method
> somewhere that takes a string template and returns a String. Another
> String::format, in a way. So that leads to a rather uniform design.
>
>
>
> And now that I have that better understanding, I think I lean toward (a)
> abandoning string interpolation and (b) having a single, short,
> _non-optional_ prefix for templates ("$" would be a plausible choice), on
> the grounds that I think it makes code more readable if templates are
> always distinguished up front from strings -- and this is especially helpful
> when the templates are rather long and any `\{` present might be far from
> the beginning. It has a minimal number of cases to explain:
>
> "..."      string literal, must not contain \{...}, type String
> $"..."     template literal, may contain \{...}, type StringTemplate
>
> Yep, I agree this is a very principled way to look at the problem.
>
It's definitely the simplest and most consistent solution. That said,
having used the current preview, I've found STR to be both useful and
clear: It permits easy interpolation while being upfront that there is no
safety guarantee. I'd hate to give that up without a suitable replacement.
I suppose the standard library could provide "public static String
str(StringTemplate)".

Is it overly greedy to want someStringMethod(STR"interpolation \{here}")
instead of someStringMethod(str($"interpolation \{here}")) just so I can
avoid writing the extra ($ and ) ?

>
> I think we have all made an honest effort to explain string templates as a
> simple and clean superset of string literals, but now that we have
> considered the typing and overloading issues, my opinion is that it just
> isn't possible without some amount of unwanted complication. Strings and
> string templates are just different beasts, and we would do well to
> maintain that distinction rather than trying to conflate them. Yes,
> requiring a prefix on templates would impose a small cost -- perhaps we should
> regard it as a "syn-tax" rather than a "cover charge" -- on every template we
> write, but I judge that cost well worth it for the readability it would buy.
>
I agree. I don't think the ease of being able to use an un-prefixed
template *some of the time* justifies the mental, specification, and
implementation overhead of context-sensitive string-ish literal typing.

>
> (By the way, I appreciate John's suggestion of allowing a template to
> begin with \{}, but this strikes me as kind of a hack rather than a
> natural use of the \{...} syntax. A distinctive single-character prefix would
> be better.)
>
> I think there's a place for {}. E.g. where {} shines IMHO, is to allow for
> *comments* inside string templates:
>
I can't argue with that.

In the preview, an empty interpolation creates an extra fragment and a null
reference in the value array, which is weird and not particularly useful.
\{ /* optional comment */ } should probably be removed as though it were
never there.

> t"""
>    hello
>    this is
>    \{ /* a comment here */ }
>   a commented
>   template
> """
>
> But as a "template" prefix, it's a bit of a lousy choice. For instance,
> one can argue that "1.0" is different from "1". But there's only one place
> where to look for that ".0". Whereas, inside a text block, you can have
> many different places where the {} ends up. So, while this solves the
> problem for compiler writers, {} is not a good solution for us humans.
>
> Maurizio
>
I agree.

Cheers,
Clement Cherlin


From msterner at openjdk.mxy.se  Sat Mar 16 04:29:47 2024
From: msterner at openjdk.mxy.se (Mikael Sterner)
Date: Sat, 16 Mar 2024 06:29:47 +0200
Subject: String Template processors vs Code Reflection?
Message-ID: <8b09da5b-5224-4428-8344-2816e01e15bf@app.fastmail.com>

Hi Experts!

What is the relationship between string template processors and code reflection, and would it influence the design of string templates and their literals?

Is the need to process string templates, seen as code snippets aggregating static and dynamic strings, just a special case of a more general pattern of processing code snippets semi-lazily using custom rules? (Such as safe handling of dynamic strings, or contextual operator overloading.)

Examples:

// String template processor

String table = "foo bar";
ResultSet r = s.executeQuery("SELECT * FROM \{table}"); // Escape dynamic table names

// Code reflection processors

String table = "foo bar";
ResultSet r = s.executeQuery(@CodeReflection () -> "SELECT * FROM " + table); // Escape dynamic table names

String value = "bar";
Pattern p = Pattern.compile(@CodeReflection () -> "foo = " + value); // Quote dynamic strings

Matrix m = Matrix.eval(@CodeReflection () -> Matrix.diag(1, 2, 3) * Matrix.col(2, 3, 4)); // Matrix multiplication

Document d = HTML.compile(@CodeReflection () -> Body.of(Div.of("Hello World!"))); // Escape strings

Each such code-reflecting processor API would define its own rules for how to handle code snippets, including the processing of any raw string templates when/if they are added to the language. Types other than String and StringTemplate would be handled as fits each API, and to the level of type safety wanted by the API user. Evaluation would be done lazily or eagerly as appropriate.

(In the above "@CodeReflection () ->" is short for some syntax allowing inline code reflecting snippets, passed to the API processors in a way that allows them to reflect on the snippet code.)

Yours,
Mikael Sterner

From forax at univ-mlv.fr  Sat Mar 16 07:18:56 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 16 Mar 2024 08:18:56 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To: <21e82bd3-a5a4-45da-ad68-57990533466b@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
 <942AC667-A175-4F72-9495-684F2FF9236E@oracle.com>
 <21e82bd3-a5a4-45da-ad68-57990533466b@oracle.com>
Message-ID: <1004847945.30672029.1710573536109.JavaMail.zimbra@univ-eiffel.fr>

> From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
> To: "Guy Steele" <guy.steele at oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.org>
> Sent: Friday, March 15, 2024 5:31:28 PM
> Subject: Re: Update on String Templates (JEP 459)

> Hi

> On 15/03/2024 16:07, Guy Steele wrote:

>> Then again, now that I ponder the space of use cases, it may be that, despite my
>> initial enthusiasm, having a separate string interpolation syntax may not carry
>> its weight if its uses are relatively rare. We always have the option of using
>> a string template and then applying an interpolation processor (which might be
>> spelled `String.of(<template>)` or `(<template>).interpolate()` or some other
>> way), and about all we lose from that approach is the ability to use string
>> interpolation to specify a constant expression -- for which we still have the
>> old-fashioned alternative of using `+` concatenation. If we drop string
>> interpolation, we can then drop the INTERPOLATION prefix, and we are back to a
>> single-prefix model, and the remaining question is whether that prefix is
>> optional, at least in some cases. Okay, I think I now have a better
>> understanding of the relationships among the various proposals in the design
>> space. Thanks for your patience.

> I think the advantage for not having a string interpolation prefix, is that then
> interpolation is "just another processor" e.g. a static method somewhere that
> takes a string template and returns a String. Another String::format, in a way.
> So that leads to a rather uniform design.

>> And now that I have that better understanding, I think I lean toward (a)
>> abandoning string interpolation and (b) having a single, short, _non-optional_
>> prefix for templates ("$" would be a plausible choice), on the grounds that I
>> think it makes code more readable if templates are always distinguished up
>> front from strings -- and this is especially helpful when the templates are rather
>> long and any `\{` present might be far from the beginning. It has a minimal
>> number of cases to explain:

>> "..." string literal, must not contain \{...}, type String
>> $"..." template literal, may contain \{...}, type StringTemplate

> Yep, I agree this is a very principled way to look at the problem.

[...] 

This is how I like to explain the design space to myself. 
We have two kinds of strings, tainted strings and untainted strings (this is not new, see [1]). 
An untainted string is a string that can be escaped properly, in our case a StringTemplate. A tainted string is just a String. 

We do not want a String to be a StringTemplate, because it would mean that all tainted strings are untainted strings. 
We do not want a StringTemplate to be a String, because it would mean that all untainted strings are tainted strings. 
So both are different types, with neither a subtype relationship nor an automatic conversion between them. 

For the literals, we need two different constructs; otherwise we would have a conversion between tainted and untainted strings. 
The literal that constructs an untainted string also needs to be visibly different, up front, so that it is easy to distinguish from a tainted string. So: 
- "..." constructs a String, a tainted string, 
- TEMPLATE"..." constructs a StringTemplate, an untainted string. 

String interpolation is another way to create a String, and it is not directly related to whether a string is tainted, so it is somewhat orthogonal in terms of design. 
It cannot be a prefix like INTERPOLATE, because that would be different in nature from TEMPLATE: TEMPLATE creates another kind of string, whereas interpolation creates just a String. 
Having a static method (a processor) that creates a String from a StringTemplate creates a common conduit for getting a tainted string from any untainted string, which makes the distinction between untainted and tainted strings less relevant. So I would advise against going in that direction. 

> Maurizio

Rémi 

[1] https://en.wikipedia.org/wiki/Taint_checking

From gavin.bierman at oracle.com  Sun Mar 17 17:14:10 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Sun, 17 Mar 2024 17:14:10 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <1004847945.30672029.1710573536109.JavaMail.zimbra@univ-eiffel.fr>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
 <942AC667-A175-4F72-9495-684F2FF9236E@oracle.com>
 <21e82bd3-a5a4-45da-ad68-57990533466b@oracle.com>
 <1004847945.30672029.1710573536109.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <43200EB3-9C78-407D-9F64-670036A4FE7D@oracle.com>

Hi Remi,

Yes, I think this is a good way to think about the design space. (It is a shame that the fact that this is NOT about string interpolation, but something much more general and focused on security - even though made explicit in the JEP - has been lost in some of the wider discussions.)

You can make the distinction even clearer - reading from the spec - a template "\{x} + \{y}" can be thought of as sugar for the expression new $HiddenClassImplementsStringTemplate(List.of("", " + ", ""), List.of(x, y)). So, sure, it's an object that has the potential to be a string, but it's an object with a couple of lists in it. The fact that the embedded values are kept as a separate list, and so can be validated and dealt with using domain-specific logic, is the key to safety. You need to write code to transform template values into something else (perhaps a string). In the old model, that was the role of the processor (and the reason why they came first - to remind you that the template needed processing to get a value), and with the new model will be a method. I agree with you that any design that makes it easy to conflate templates with strings is a road to another 30+ years of injection attacks.
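
As an illustration of why the separate value list matters, here is a minimal sketch of a domain-specific method over such an object. All names are hypothetical: TemplateValue stands in for the hidden StringTemplate implementation, and Query and SqlProcessor are invented for this example. The fragments, which come from source code, become the SQL text; the embedded values are never spliced into that text and are only handed over as bind parameters.

```java
import java.util.List;

// Hypothetical stand-in for the object a template literal would evaluate to.
record TemplateValue(List<String> fragments, List<Object> values) {}

// Hypothetical result type: SQL text with placeholders, plus the values to bind.
record Query(String sql, List<Object> params) {}

final class SqlProcessor {
    // Each gap between two fragments becomes a "?" placeholder;
    // the values travel separately and never reach the SQL text.
    static Query process(TemplateValue t) {
        return new Query(String.join("?", t.fragments()), t.values());
    }

    public static void main(String[] args) {
        String name = "x'; DROP TABLE users; --";
        Query q = process(new TemplateValue(
                List.of("SELECT * FROM users WHERE name = ", ""),
                List.of(name)));
        System.out.println(q.sql());
        System.out.println(q.params());
    }
}
```

A method that simply concatenated fragments and values would throw this separation away, which is the conflation the paragraph above warns against.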

Gavin


On 16 Mar 2024, at 07:18, Remi Forax <forax at univ-mlv.fr> wrote:


________________________________
From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
To: "Guy Steele" <guy.steele at oracle.com>
Cc: "amber-spec-experts" <amber-spec-experts at openjdk.org>
Sent: Friday, March 15, 2024 5:31:28 PM
Subject: Re: Update on String Templates (JEP 459)

Hi

On 15/03/2024 16:07, Guy Steele wrote:

Then again, now that I ponder the space of use cases, it may be that, despite my initial enthusiasm, having a separate string interpolation syntax may not carry its weight if its uses are relatively rare. We always have the option of using a string template and then applying an interpolation processor (which might be spelled `String.of(<template>)` or `(<template>).interpolate()` or some other way), and about all we lose from that approach is the ability to use string interpolation to specify a constant expression -- for which we still have the old-fashioned alternative of using `+` concatenation. If we drop string interpolation, we can then drop the INTERPOLATION prefix, and we are back to a single-prefix model, and the remaining question is whether that prefix is optional, at least in some cases. Okay, I think I now have a better understanding of the relationships among the various proposals in the design space. Thanks for your patience.

I think the advantage of not having a string interpolation prefix is that interpolation is then "just another processor", e.g. a static method somewhere that takes a string template and returns a String. Another String::format, in a way. So that leads to a rather uniform design.


And now that I have that better understanding, I think I lean toward (a) abandoning string interpolation and (b) having a single, short, _non-optional_ prefix for templates ("$" would be a plausible choice), on the grounds that I think it makes code more readable if templates are always distinguished up front from strings, and this is especially helpful when the templates are rather long and any `\{` present might be far from the beginning. It has a minimal number of cases to explain:

"..."      string literal, must not contain \{...}, type String
$"..."    template literal, may contain \{...}, type StringTemplate

Yep, I agree this is a very principled way to look at the problem.

[...]

This is how I like to explain the design space to myself.
We have two kinds of strings, tainted strings and untainted strings (this is not new, see [1]).
An untainted string is a string that can be escaped properly, in our case a StringTemplate. A tainted string is just a String.

We do not want a String to be a StringTemplate, because it means all untainted strings are tainted strings.
We do not want a StringTemplate to be a String, because it means that all tainted strings are untainted strings.
So both are different types, with neither a subtype relationship nor an automatic conversion between them.

For the literals, we need two different constructs, otherwise we will have a conversion between tainted and untainted strings.
We also need the literal that constructs an untainted string to be different and marked up front, to easily distinguish an untainted string from a tainted string, so
- "..." constructs a String, a tainted string,
- TEMPLATE"..." constructs a StringTemplate, an untainted string.

About string interpolation, this is another way to create a String and it is not directly related to a string being tainted or not, so it's kind of orthogonal in terms of design.
It cannot be a prefix like INTERPOLATE, because that is different in nature from TEMPLATE: TEMPLATE creates another kind of string, while interpolation creates just a String.
Having a static method (a processor) that creates a String from a StringTemplate creates a common conduit to get a tainted string from any untainted string, which makes the distinction between untainted and tainted strings less relevant. So I would advise not going in that direction.


Maurizio

Rémi

[1] https://en.wikipedia.org/wiki/Taint_checking



From brian.goetz at oracle.com  Mon Mar 18 13:38:34 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Mar 2024 09:38:34 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>

I think this has been a good discussion, and it looks like we're 
starting to see some convergence.

I think we keep trying to exploit ambiguity / implicitness, and it 
doesn't go well:

 - Many users want STR to be the "implicit processor", but that isn't
good for security
 - We tried reusing the String delimiters for string templates to
reduce the perception of how many different things there are here, but
that creates cognitive load (can't tell strings from templates without
parsing the entire contents), among other problems
 - We tried making String a poly expression (and other tricks) to
reduce the number of explicit conversions, but that created problems too

John's characterization captures the feeling and eventual conclusion 
that I think many of us share:

> I kind of like Guy's offensive-to-everyone suggestion that $ is required to make a true ST.

Indeed, my first reaction to the $ sigil was "please no", but I am 
grudgingly coming to the conclusion that we should stop trying to 
implicitly "just figure out what the user wants" and acknowledge the 
reality: templates are not strings, strings are not templates, and they 
can be converted to each other with ... methods, just like any other 
relatable types.  So string literals are as they always were; string
templates are a new thing, whose syntax and type is disjoint from that 
of strings, as Guy also seems to be converging on:

> And now that I have that better understanding, I think I lean toward
> (a) abandoning string interpolation and (b) having a single, short,
> _non-optional_ prefix for templates ("$" would be a plausible choice),
> on the grounds that I think it makes code more readable if templates
> are always distinguished up front from strings, and this is especially
> helpful when the templates are rather long and any `\{` present might
> be far from the beginning. It has a minimal number of cases to explain:
>
> "..."      string literal, must not contain \{...}, type String
> $"..."    template literal, may contain \{...}, type StringTemplate

(concrete syntax TBB (to be bikeshod), along with the spellings of S -> 
ST and ST -> S.)

Some more useful observations:

 - The toString behavior cannot be mere interpolation.  Besides the
principled objections and inevitable propping-open-the-security-door
that this would lead to, people will quickly learn to abuse "" + ST as
the "fewest characters required" way to get interpolation, which is
"clever" in the same way that John's "empty \{}" trick is clever, but
not good for clarity.
 - We need a story to tell for how to write good overloads, which seems
to be more subtle than initially thought.
 - If the only way to make a StringTemplate is the literal syntax, then
STs gain a valuable security property: all fragments in the ST are
strings that appeared literally in code, and therefore untainted.  This
is probably too restrictive but we should be aware of what we are giving
up as we explore the API options.
 - Processors should be encouraged to "flatten" embedded STs.

A few people have implied that only the tainted parts of an ST (the 
embedded expressions) need special processing, but I'll point out that 
the untainted parts may often require domain-specific validation.  For
example, an ST representing a SQL query wants balanced quotes, and might
want to require quotes around embedded expressions.
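
A rough sketch of what validating the literal parts could look like, assuming a hypothetical `SketchSql.of` factory that receives the fragments and values as two lists (all names here are made up for illustration, and the balanced-quote rule is just one possible policy, not anything specified by the JEP):

```java
import java.util.List;

// Hypothetical result type: SQL text with '?' placeholders plus the values to bind.
record SketchQuery(String sql, List<Object> params) {}

class SketchSql {
    // fragments: literal pieces of the template; values: results of the embedded expressions.
    static SketchQuery of(List<String> fragments, List<?> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected one more fragment than values");
        }
        long quotes = 0;
        StringBuilder sql = new StringBuilder();
        for (int i = 0; i < fragments.size(); i++) {
            String fragment = fragments.get(i);
            // Validation applies to the literal (untainted) text, not to the values.
            quotes += fragment.chars().filter(c -> c == '\'').count();
            sql.append(fragment);
            if (i < values.size()) {
                sql.append('?'); // the value is bound later, never spliced into the text
            }
        }
        if (quotes % 2 != 0) {
            throw new IllegalArgumentException("unbalanced quotes in template text");
        }
        return new SketchQuery(sql.toString(), List.copyOf(values));
    }
}
```

A real library would hand the placeholder text and parameters to something like a PreparedStatement; the sketch stops at producing them.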


On 3/8/2024 1:35 PM, Brian Goetz wrote:
>
> Time to check in with where we are with String Templates.  We've
> gone through two rounds of preview, and have received some feedback.
>
> As a reminder, the primary goal of gathering feedback is to learn
> things about the design or implementation that we don't already know.
> This could be bug reports, experience reports, code review, careful
> analysis, novel alternatives, etc.  And the best feedback usually
> comes from using the feature "in anger", trying to actually write
> code with it. ("Some people would prefer a different syntax" or "some
> people would prefer we focused on string interpolation only" fall
> squarely in the "things we already knew" camp.)
>
> In the course of using this feature in the `jextract` project, we did
> learn quite a few things we didn't already know, and this was
> conclusive enough that it has motivated us to adjust our approach in
> this feature.  Specifically, the role of processors is "outsized" to
> the value they offer, and, after further exploration, we now believe
> it is possible to achieve the goals of the feature without an explicit
> "processor" abstraction at all!  This is a very positive development.
>
> First, I want to affirm that the goals of the project have not
> changed.  From JEP 459:
>
> Goals
>
> - Simplify the writing of Java programs by making it easy to express
> strings that include values computed at run time.
> - Enhance the readability of expressions that mix text and
> expressions, whether the text fits on a single source line (as with
> string literals) or spans several source lines (as with text blocks).
> - Improve the security of Java programs that compose strings from
> user-provided values and pass them to other systems (e.g., building
> queries for databases) by supporting validation and transformation of
> both the template and the values of its embedded expressions.
> - Retain flexibility by allowing Java libraries to define the
> formatting syntax used in string templates.
> - Simplify the use of APIs that accept strings written in non-Java
> languages (e.g., SQL, XML, and JSON).
> - Enable the creation of non-string values computed from literal text
> and embedded expressions without having to transit through an
> intermediate string representation.
>
> Non-Goals
> - It is not a goal to introduce syntactic sugar for Java's string
> concatenation operator (+), since that would circumvent the goal of
> validation.
> - It is not a goal to deprecate or remove the StringBuilder and
> StringBuffer classes, which have traditionally been used for complex
> or programmatic string composition.
>
> Another thing that has not changed is our view on the syntax for
> embedding expressions.  While many people did express the opinion of
> "why not 'just' do what Kotlin/Scala does", this issue was more than
> fully explored during the initial design round.  (In fact, while
> syntax disagreements are often purely subjective, this one was far
> more clear: the $-syntax is objectively worse, and would be doubly so
> if injected into an existing language where there were already string
> literals in the wild.  This has all been more than adequately covered
> elsewhere, so I won't rehash it here.)
>
>
> Now, let's talk about what we do think should change: the role of
> processors and the StringTemplate type.
>
> Processors were envisioned as a means to abstract the transformation
> of templates to their final form (whether string, or something else.)
> However, Java already has a well established means of abstracting
> behavior: methods.  (In fact, a processor application can be viewed
> as merely a new syntax for a method call.)  Our experience using the
> feature highlighted the question: When converting a SQL query
> expressed as a template to the form required by the database (such as
> PreparedStatement), why do we need to say:
>
>   DB."... template ..."
>
> When we could use an ordinary Java library:
>
>   Query q = Query.of("...template...")
>
> Indeed, one of the worst things about having processors in the
> language is that API designers are put in the difficult situation of
> not knowing whether to write a processor or an ordinary API, and often
> have to make that choice before the consequences are fully understood.
> (To add to this, processors raise similar questions at the use site.)
> But the real criticism here is that template capture and processing
> are complected, when they should be separate, composable features.
>
> This motivated us to revisit some of the reasons why processors were
> so central to the initial design in the first place.  And it turned
> out, this choice had been influenced, perhaps overly so, by early
> implementation experiments.  (One of the background design goals was
> to enable expensive operations like `String::format` to be (much)
> cheaper.  Without digressing too deeply on performance, String::format
> can be more than an order of magnitude worse than the equivalent
> concatenation operation, and this in turn sometimes motivates
> developers to use worse idioms for formatting.  The FMT processor
> brought that cost back in line with the equivalent concatenation.)
> These early experiments biased the design towards needing to know the
> processor at the point of template capture, but upon reexamination we
> realized that there are other ways to achieve the desired performance
> goals without requiring processors to be known at capture time.  This,
> in turn, enabled us to revisit a point in the design space we had
> transited through earlier, where string templates were "just a new
> kind of literal" and the job performed by processors could instead be
> performed by ordinary APIs.
>
> At this point, a simpler design and implementation emerged that met
> the semantic, correctness, and performance goals: template literals
> ("Hello \{name}") are simply the literal form of StringTemplate:
>
>   StringTemplate st = "Hello \{name}";
>
> String and StringTemplate remain unrelated types.  (We explored a
> number of ways to interconvert them, but they caused more trouble than
> they solved.)  Processing of string templates, including
> interpolation, is done by ordinary APIs that deal in StringTemplate,
> aided by some clever implementation tricks to ensure good performance.
>
> For APIs where interpolation is known to be safe in the domain, such
> as PrintWriter, APIs can make that choice on behalf of the domain, by
> providing overloads to embody this design choice:
>
>    void println(String) { ... }
>    void println(StringTemplate) { ... interpolate and delegate to println(String) ... }
>
> The upshot is that for interpolation-safe APIs like println, we can
> use a template directly without giving up any safety:
>
>    System.out.println("Hello \{name}");
>
> In this example, the string template evaluates to StringTemplate, not
> String (no implicit interpolation), and chooses the StringTemplate
> overload of println, which in turn chooses how to process the
> template. This stays true to the design principle that interpolation
> is dangerous enough that it should be an explicit choice in the code,
> but it allows that choice to be made by libraries when the library is
> comfortable doing so.
>
> Similarly, the FMT processor is replaced by an overload of
> String::format that interprets templates with embedded format
> specifiers (e.g., "%d"):
>
>   String format(String formatString, Object... parameters) { ... same as today ... }
>   String format(StringTemplate template) { ... equivalent of FMT ... }
>
> And users can call this as:
>
>   String s = String.format("Hello %12s\{name}");
>
> Here, the String::format API has chosen to interpret string templates
> according to the rules previously specified in the FMT processor (not
> ordinary interpolation), but that choice is embedded in the library
> semantics so no further explicit choice at the use site is required.
> The user already chose to pass it to String::format; that's all the
> processing selection that is needed.
>
> Where APIs do not express a choice of what template expansion means,
> users continue to be free to process them explicitly before passing
> them, using APIs that do (such as String::format or ordinary
> interpolation).
>
> The result is:
>
> - The need for use-site "goop" (previously, the processor name; now, 
> static or instance methods to process a template) goes away entirely 
> when dealing with libraries that are already template-friendly.
> - Even with libraries that require use-site goop, it is no more 
> intrusive than before, and can be reduced over time as APIs get with 
> the program.
> - StringTemplate is just another type that APIs can support if they
> want.  The "DB" processor becomes an ordinary factory method that
> accepts a string template or an ordinary builder API.
> - APIs now can have _more_ control over the timing and meaning of 
> template processing, because we are not biasing so strongly towards 
> early processing.
> - It becomes easier to abstract over template processing (i.e., 
> combine or manipulate templates as templates before processing)
> - Interpolation remains an explicit choice, but ST-aware libraries can 
> make this choice on behalf of the user.
> - The language feature and API surface get considerably smaller, which
> is good.  Core JDK APIs (e.g., println, format, exception
> constructors) get upgraded to work with string templates.
>
> The remaining question that everyone is probably asking is: "so how do
> we do interpolation?"  The answer there is "ordinary library methods".
> This might be a static method (String.join(StringTemplate)) or an
> instance method (template.join()), shed to be painted (but please, not
> right now).
>
> This is a sketch of direction, so feel free to pose questions/comments
> on the direction.  We'll discuss the details as we go.
>
>

From ccherlin at gmail.com  Mon Mar 18 13:50:21 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Mon, 18 Mar 2024 08:50:21 -0500
Subject: String Template processors vs Code Reflection?
In-Reply-To: <8b09da5b-5224-4428-8344-2816e01e15bf@app.fastmail.com>
References: <8b09da5b-5224-4428-8344-2816e01e15bf@app.fastmail.com>
Message-ID: <CALEU8=xWvQtKHPbrtHLeEQGyGtfWZb8UObDofVCJqRxrcRWzzQ@mail.gmail.com>

Hi Mikael,

It looks to me like you're talking about a generic macro processor, not a
string template processor. While that's an interesting idea, I think the
scope is so much greater than string templates, it would make more sense as
its own proposal (see https://openjdk.org/jeps/1 and
https://cr.openjdk.org/~mr/jep/jep-2.0-02.html for details on the JEP
process).

String templates, as currently designed, do not capture code snippets, but
values. The value arguments to a template expression are evaluated to
ordinary objects.

As stated elsewhere, a template object is simply a wrapper for the N values
and N+1 string fragments of the template expression. The values are
evaluated just like the parameters of any other Java constructor/method
call. The source code or bytecode of the expressions that created the
values is simply not available.
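
To illustrate the point about values versus code, here is a small sketch assuming a hypothetical `Captured` record as a stand-in for a template object (the names `Captured`, `capture`, and `table` are invented for this example). The embedded expression runs once, at capture time, and only its result is stored:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class EagerCaptureSketch {
    // Hypothetical stand-in for a template object: N values and N + 1 fragments.
    record Captured(List<String> fragments, List<Object> values) {}

    // Counts how many times the embedded expression has been evaluated.
    static final AtomicInteger calls = new AtomicInteger();

    static String table() {
        calls.incrementAndGet();
        return "foo bar";
    }

    static Captured capture() {
        // Models "SELECT * FROM \{table()}": table() is evaluated right here,
        // like any other method argument; the template never sees the call itself.
        return new Captured(List.of("SELECT * FROM ", ""), List.of(table()));
    }

    public static void main(String[] args) {
        Captured c = capture();
        // Reading the value later does not re-run table(); only the String is held.
        System.out.println(c.values().get(0) + " / evaluations: " + calls.get());
    }
}
```

Whoever receives `c` can inspect the fragments and the resulting value, but has no access to the expression that produced it.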

That said, if you want to pass Java code as strings to the template and do
some magic with the Classfile API (now in preview) at runtime to generate
code, you can. You can also use an annotation processor (like Lombok) or
Java compiler plugin (like Manifold) to do all sorts of advanced
manipulation of source/bytecode at compile time.

Lombok: https://projectlombok.org/
Manifold: https://github.com/manifold-systems/manifold

Cheers,
Clement Cherlin

On Fri, Mar 15, 2024 at 11:32 PM Mikael Sterner <msterner at openjdk.mxy.se>
wrote:

> Hi Experts!
>
> What is the relationship between string template processors and code
> reflection, and would it influence the design of string templates and their
> literals?
>
> Is the need to process string templates, seen as code snippets aggregating
> static and dynamic strings, just a special case of a more general pattern
> of processing code snippets semi-lazily using custom rules? (Such as safe
> handling of dynamic strings, or contextual operator overloading.)
>
> Examples:
>
> // String template processor
>
> String table = "foo bar";
> ResultSet r = s.executeQuery("SELECT * FROM \{table}"); // Escape dynamic
> table names
>
> // Code reflection processors
>
> String table = "foo bar";
> ResultSet r = s.executeQuery(@CodeReflection () -> "SELECT * FROM " +
> table); // Escape dynamic table names
>
> String value = "bar";
> Pattern p = Pattern.compile(@CodeReflection () -> "foo = " + value); //
> Quote dynamic strings
>
> Matrix m = Matrix.eval(@CodeReflection () -> Matrix.diag(1, 2, 3) *
> Matrix.col(2, 3, 4)); // Matrix multiplication
>
> Document d = HTML.compile(@CodeReflection () -> Body.of(Div.of("Hello
> World!"))); // Escape strings
>
> Each such code reflecting processor API defining its own rules for how to
> handle code snippets, including processing of any raw string templates
> when/if they are added to the language. Other types than String and
> StringTemplate being handled as fits each API, and to the level of type
> safety wanted by the API user. Evaluation being done lazily or eagerly as
> appropriate.
>
> (In the above "@CodeReflection () ->" is short for some syntax allowing
> inline code reflecting snippets, passed to the API processors in a way that
> allows them to reflect on the snippet code.)
>
> Yours,
> Mikael Sterner
>

From forax at univ-mlv.fr  Mon Mar 18 14:32:37 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 18 Mar 2024 15:32:37 +0100 (CET)
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
Message-ID: <1220366784.33577819.1710772356999.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.org>
> Sent: Monday, March 18, 2024 2:38:34 PM
> Subject: Re: Update on String Templates (JEP 459)

> I think this has been a good discussion, and it looks like we're starting to see
> some convergence.

[...] 

> A few people have implied that only the tainted parts of an ST (the embedded
> expressions) need special processing, but I'll point out that the untainted
> parts may often require domain-specific validation. For example, a ST
> representing a SQL query wants balanced quotes, and might want to require
> quotes around embedded expressions.

The important word here is validation: the "fragments" part of a string template is needed for validation, while the "values" part should be escaped when doing the interpolation.
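
A small sketch of that split, assuming a hypothetical HTML renderer that takes fragments and values as two lists (`HtmlSketch`, `render`, and `escape` are illustrative names, and the angle-bracket check is only a toy validation rule):

```java
import java.util.List;

class HtmlSketch {
    // Escapes a value so it is rendered as text rather than interpreted as markup.
    static String escape(Object value) {
        String s = String.valueOf(value);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&' -> out.append("&amp;");
                case '<' -> out.append("&lt;");
                case '>' -> out.append("&gt;");
                case '"' -> out.append("&quot;");
                case '\'' -> out.append("&#39;");
                default -> out.append(c);
            }
        }
        return out.toString();
    }

    static String render(List<String> fragments, List<?> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected one more fragment than values");
        }
        // Validation looks only at the fragments (toy rule: '<' and '>' counts must match).
        long open = 0, close = 0;
        for (String f : fragments) {
            open += f.chars().filter(c -> c == '<').count();
            close += f.chars().filter(c -> c == '>').count();
        }
        if (open != close) {
            throw new IllegalArgumentException("unbalanced angle brackets in template text");
        }
        // Escaping applies only to the values; fragments are emitted verbatim.
        StringBuilder out = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            out.append(escape(values.get(i))).append(fragments.get(i + 1));
        }
        return out.toString();
    }
}
```

For example, `HtmlSketch.render(List.of("<p>Hello, ", "!</p>"), List.of("<b>Bob</b>"))` keeps the `<p>` markup from the fragments and turns the value's tags into `&lt;b&gt;Bob&lt;/b&gt;`.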

And I'm not sure we can get away with restricting ST creation to only literal STs, even if it would be nice.

Rémi


From ccherlin at gmail.com  Mon Mar 18 17:04:42 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Mon, 18 Mar 2024 12:04:42 -0500
Subject: String Template processors vs Code Reflection?
In-Reply-To: <CALEU8=xWvQtKHPbrtHLeEQGyGtfWZb8UObDofVCJqRxrcRWzzQ@mail.gmail.com>
References: <8b09da5b-5224-4428-8344-2816e01e15bf@app.fastmail.com>
 <CALEU8=xWvQtKHPbrtHLeEQGyGtfWZb8UObDofVCJqRxrcRWzzQ@mail.gmail.com>
Message-ID: <CALEU8=w_x950xqfWj7ZZryzCHt4EfxZtgKS40rV_QX6gRoBPAA@mail.gmail.com>

Oh, I see, you were talking about code reflection from Project Babylon
https://openjdk.org/projects/babylon/

You asked, "Is the need to process string templates, seen as code snippets
aggregating static and dynamic strings, just a special case of a more
general pattern of processing code snippets semi-lazily using custom rules?
(Such as safe handling of dynamic strings, or contextual operator
overloading.)"

No, because String Templates are not code snippets, they're simple
aggregations of strings and other values. Project Babylon deals in code
snippets. They are related in the sense of making Java more expressive and
able to deal with various DSLs and embedded expressions of various types,
but the way they go about it is very different.
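
To make the "aggregation of strings and other values" point concrete, here is
a minimal sketch in plain Java. The Template record, its interpolate() method
and the next() helper are hypothetical stand-ins invented for illustration;
they are not the preview StringTemplate API. The sketch only assumes the shape
described in the earlier message quoted below: N values and N+1 string
fragments, with the values evaluated like ordinary method arguments.

```java
import java.util.List;

public class TemplateSketch {
    // Hypothetical stand-in for a string template: N values, N+1 fragments.
    // Null values are not supported in this sketch (List.copyOf rejects them).
    public record Template(List<String> fragments, List<Object> values) {
        public Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need N+1 fragments for N values");
            }
            fragments = List.copyOf(fragments);
            values = List.copyOf(values);
        }

        // Plain interpolation: alternate fragments and values.
        public String interpolate() {
            StringBuilder sb = new StringBuilder(fragments.get(0));
            for (int i = 0; i < values.size(); i++) {
                sb.append(values.get(i)).append(fragments.get(i + 1));
            }
            return sb.toString();
        }
    }

    static int counter = 0;

    static int next() {
        return ++counter;
    }

    public static void main(String[] args) {
        // next() is evaluated once, here, like any other argument.
        // The template holds the resulting value, not the expression.
        Template t = new Template(List.of("id = ", ""), List.of(next()));
        next(); // a later call does not change what the template captured
        System.out.println(t.values());      // prints [1]
        System.out.println(t.interpolate()); // prints id = 1
    }
}
```

Nothing in the Template object records that the value came from a call to
next(), which is the difference from a code-reflection model where the
expression itself would be available to the processor.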

Cheers,
Clement Cherlin

On Mon, Mar 18, 2024 at 8:50 AM Clement Cherlin <ccherlin at gmail.com> wrote:

> Hi Mikael,
>
> It looks to me like you're talking about a generic macro processor, not a
> string template processor. While that's an interesting idea, I think the
> scope is so much greater than string templates, it would make more sense as
> its own proposal (see https://openjdk.org/jeps/1 and
> https://cr.openjdk.org/~mr/jep/jep-2.0-02.html for details on the JEP
> process).
>
> String templates, as currently designed, do not capture code snippets, but
> values. The value arguments to a template expression are evaluated to
> ordinary objects.
>
> As stated elsewhere, a template object is simply a wrapper for the N
> values and N+1 string fragments of the template expression. The values are
> evaluated just like the parameters of any other Java constructor/method
> call. The source code or bytecode of the expressions that created the
> values is simply not available.
>
> That said, if you want to pass Java code as strings to the template and do
> some magic with the Classfile API (now in preview) at runtime to generate
> code, you can. You can also use an annotation processor (like Lombok) or
> Java compiler plugin (like Manifold) to do all sorts of advanced
> manipulation of source/bytecode at compile time.
>
> Lombok: https://projectlombok.org/
> Manifold: https://github.com/manifold-systems/manifold
>
> Cheers,
> Clement Cherlin
>
> On Fri, Mar 15, 2024 at 11:32 PM Mikael Sterner <msterner at openjdk.mxy.se>
> wrote:
>
>> Hi Experts!
>>
>> What is the relationship between string template processors and code
>> reflection, and would it influence the design of string templates and their
>> literals?
>>
>> Is the need to process string templates, seen as code snippets
>> aggregating static and dynamic strings, just a special case of a more
>> general pattern of processing code snippets semi-lazily using custom rules?
>> (Such as safe handling of dynamic strings, or contextual operator
>> overloading.)
>>
>> Examples:
>>
>> // String template processor
>>
>> String table = "foo bar";
>> ResultSet r = s.executeQuery("SELECT * FROM \{table\}"); // Escape
>> dynamic table names
>>
>> // Code reflection processors
>>
>> String table = "foo bar";
>> ResultSet r = s.executeQuery(@CodeReflection () -> "SELECT * FROM " +
>> table); // Escape dynamic table names
>>
>> String value = "bar";
>> Pattern p = Pattern.compile(@CodeReflection () -> "foo = " + value); //
>> Quote dynamic strings
>>
>> Matrix m = Matrix.eval(@CodeReflection () -> Matrix.diag(1, 2, 3) *
>> Matrix.col(2, 3, 4)); // Matrix multiplication
>>
>> Document d = HTML.compile(@CodeReflection () -> Body.of(Div.of("Hello
>> World!"))); // Escape strings
>>
>> Each such code reflecting processor API defining its own rules for how
>> to handle code snippets, including processing of any raw string templates
>> when/if they are added to the language. Other types than String and
>> StringTemplate being handled as fits each API, and to the level of type
>> safety wanted by the API user. Evaluation being done lazily or eagerly as
>> appropriate.
>>
>> (In the above "@CodeReflection () ->" is short for some syntax allowing
>> inline code reflecting snippets, passed to the API processors in a way that
>> allows them to reflect on the snippet code.)
>>
>> Yours,
>> Mikael Sterner
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-spec-observers/attachments/20240318/37057999/attachment.htm>

From guy.steele at oracle.com  Mon Mar 18 17:53:43 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 18 Mar 2024 17:53:43 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
Message-ID: <550D7669-DEB4-4C74-AC16-12A4F1E35465@oracle.com>


> On Mar 18, 2024, at 9:38 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> . . .
> A few people have implied that only the tainted parts of an ST (the embedded expressions) need special processing, but I'll point out that the untainted parts may often require domain-specific validation.  For example, a ST representing a SQL query wants balanced quotes, and might want to require quotes around embedded expressions.

Thank you for mentioning this, especially in connection with SQL, which has been much on my mind this last week. Yes, for complete safety, an SQL processor really ought to do a proper parse of the entire SQL statement represented by the fragments and verify that the "holes" filled by the expressions make sense. In elaborate cases, it may be necessary to figure out what kind of thing is represented by the hole (value, name, data type) before it can properly validate and escape the associated expression.
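
As a rough illustration of checking the fragments as well as the values, the
sketch below validates a template before turning it into a parameterized
query. Everything in it is an assumption made for illustration: the Prepared
record and the prepare() method are hypothetical, the template is passed as
two plain lists, and the check is only a single-quote balance scan, which is
far short of the proper parse described above. It also picks one policy among
several (a hole inside a quoted literal is rejected, and values are always
bound as parameters); an API could equally well require the opposite.

```java
import java.util.List;

public class SqlTemplateSketch {
    // Hypothetical result: SQL text with '?' placeholders plus values to bind.
    public record Prepared(String sql, List<Object> params) {}

    public static Prepared prepare(List<String> fragments, List<Object> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need N+1 fragments for N values");
        }
        StringBuilder sql = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < fragments.size(); i++) {
            String fragment = fragments.get(i);
            // Track quote state across fragments; a doubled '' toggles twice,
            // so an escaped quote leaves the state unchanged.
            for (int j = 0; j < fragment.length(); j++) {
                if (fragment.charAt(j) == '\'') {
                    inQuote = !inQuote;
                }
            }
            sql.append(fragment);
            if (i < values.size()) {
                if (inQuote) {
                    throw new IllegalArgumentException(
                            "hole " + i + " sits inside a quoted literal");
                }
                sql.append('?'); // the value is bound later, never spliced in
            }
        }
        if (inQuote) {
            throw new IllegalArgumentException("unbalanced quote in fragments");
        }
        return new Prepared(sql.toString(), List.copyOf(values));
    }

    public static void main(String[] args) {
        Prepared p = prepare(
                List.of("SELECT * FROM t WHERE name = ", " AND note = 'it''s'"),
                List.of("O'Brien"));
        System.out.println(p.sql());
        // prints SELECT * FROM t WHERE name = ? AND note = 'it''s'
        System.out.println(p.params()); // prints [O'Brien]
    }
}
```

This sketch treats every hole as a value. Deciding whether a hole stands for a
value, a name, or a data type, as mentioned above, would need a real parser.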

--Guy


From msterner at openjdk.mxy.se  Mon Mar 18 21:29:04 2024
From: msterner at openjdk.mxy.se (Mikael Sterner)
Date: Mon, 18 Mar 2024 23:29:04 +0200
Subject: String Template processors vs Code Reflection?
In-Reply-To: <CALEU8=w_x950xqfWj7ZZryzCHt4EfxZtgKS40rV_QX6gRoBPAA@mail.gmail.com>
References: <8b09da5b-5224-4428-8344-2816e01e15bf@app.fastmail.com>
 <CALEU8=xWvQtKHPbrtHLeEQGyGtfWZb8UObDofVCJqRxrcRWzzQ@mail.gmail.com>
 <CALEU8=w_x950xqfWj7ZZryzCHt4EfxZtgKS40rV_QX6gRoBPAA@mail.gmail.com>
Message-ID: <c38147d9-3928-4de5-8933-edc534fc1222@app.fastmail.com>

Thanks for the response, and yes it was code reflection from Babylon I referred to, sorry for not being clearer.

Indeed, full code reflection would be more powerful. My curiosity about how the concepts relate was triggered by the fact that if the code reflection is limited to snippets with only aggregation (e.g. operator +) and two types (fragments and values), it seems to become very similar to string template processing:

// Code reflection processor
processor.apply(@CodeReflection () -> $"foo = " + bar); 

// String template processor
processor.apply($"foo = \{bar\}");

Where $"..." is shorthand for a fragment literal in the code reflection case, and (obviously) a string template in the string template case.

For the code reflection case you could in principle offer a shorthand $"foo = \{bar\}" to mean the same kind of aggregation as $"foo = " + bar.

And similarly for the string template case it seems you could offer $"foo = " + bar as an alternative aggregation syntax, i.e. translate it to the same string template as $"foo = \{bar\}".

(I guess in such a world it would be a matter of personal preference which aggregation style you would choose: the one with fragments and values delimited by + or the one with everything inline in one string template literal. One advantage of the former would be that string literal values wouldn't need escaping, i.e. $"foo = " + "bar" vs $"foo = \{\"bar\"\}", and also line breaks could be more naturally inserted between the parts without having to use multiline text blocks.)
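
The equivalence suggested above can be sketched in today's Java with methods
standing in for the + operator. The Template record and its of(), plusValue()
and plusFragment() methods are hypothetical names chosen for this
illustration. Neither the $"..." literal nor + on templates exists in the
language, and the sketch says nothing about how code reflection would model
the same expression.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregationSketch {
    // Hypothetical template shape: N values and N+1 fragments.
    public record Template(List<String> fragments, List<Object> values) {
        // Stands in for the fragment literal $"...".
        public static Template of(String fragment) {
            return new Template(List.of(fragment), List.of());
        }

        // Stands in for "+ value": add a hole followed by an empty fragment.
        public Template plusValue(Object value) {
            List<String> f = new ArrayList<>(fragments);
            f.add("");
            List<Object> v = new ArrayList<>(values);
            v.add(value);
            return new Template(f, v);
        }

        // Stands in for "+ fragment": extend the trailing fragment.
        public Template plusFragment(String fragment) {
            List<String> f = new ArrayList<>(fragments);
            int last = f.size() - 1;
            f.set(last, f.get(last) + fragment);
            return new Template(f, values);
        }
    }

    public static void main(String[] args) {
        String bar = "42";
        // $"foo = " + bar, written with method calls
        Template viaPlus = Template.of("foo = ").plusValue(bar);
        // what $"foo = \{bar\}" would hold: two fragments and one value
        Template viaLiteral = new Template(List.of("foo = ", ""), List.of(bar));
        System.out.println(viaPlus.equals(viaLiteral)); // prints true
    }
}
```

With methods, the caller states explicitly whether an operand is a fragment or
a value. With a + operator, that distinction would have to come from the
operand's type or syntax, which is the role the $"..." marker plays in the
examples above.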

Yours,
Mikael Sterner


On Mon, Mar 18, 2024, at 19:04, Clement Cherlin wrote:
> Oh, I see, you were talking about code reflection from Project Babylon https://openjdk.org/projects/babylon/
> 
> You asked, "Is the need to process string templates, seen as code snippets aggregating static and dynamic strings, just a special case of a more general pattern of processing code snippets semi-lazily using custom rules? (Such as safe handling of dynamic strings, or contextual operator overloading.)"
> 
> No, because String Templates are not code snippets, they're simple aggregations of strings and other values. Project Babylon deals in code snippets. They are related in the sense of making Java more expressive and able to deal with various DSLs and embedded expressions of various types, but the way they go about it is very different.
> 
> Cheers,
> Clement Cherlin
> 
> On Mon, Mar 18, 2024 at 8:50 AM Clement Cherlin <ccherlin at gmail.com> wrote:
>> Hi Mikael,
>> 
>> It looks to me like you're talking about a generic macro processor, not a string template processor. While that's an interesting idea, I think the scope is so much greater than string templates, it would make more sense as its own proposal (see https://openjdk.org/jeps/1 and https://cr.openjdk.org/~mr/jep/jep-2.0-02.html for details on the JEP process).
>> 
>> String templates, as currently designed, do not capture code snippets, but values. The value arguments to a template expression are evaluated to ordinary objects.
>> 
>> As stated elsewhere, a template object is simply a wrapper for the N values and N+1 string fragments of the template expression. The values are evaluated just like the parameters of any other Java constructor/method call. The source code or bytecode of the expressions that created the values is simply not available.
>> 
>> That said, if you want to pass Java code as strings to the template and do some magic with the Classfile API (now in preview) at runtime to generate code, you can. You can also use an annotation processor (like Lombok) or Java compiler plugin (like Manifold) to do all sorts of advanced manipulation of source/bytecode at compile time.
>> 
>> Lombok: https://projectlombok.org/
>> Manifold: https://github.com/manifold-systems/manifold
>> 
>> Cheers,
>> Clement Cherlin
>> 
>> On Fri, Mar 15, 2024 at 11:32 PM Mikael Sterner <msterner at openjdk.mxy.se> wrote:
>>> Hi Experts!
>>> 
>>> What is the relationship between string template processors and code reflection, and would it influence the design of string templates and their literals?
>>> 
>>> Is the need to process string templates, seen as code snippets aggregating static and dynamic strings, just a special case of a more general pattern of processing code snippets semi-lazily using custom rules? (Such as safe handling of dynamic strings, or contextual operator overloading.)
>>> 
>>> Examples:
>>> 
>>> // String template processor
>>> 
>>> String table = "foo bar";
>>> ResultSet r = s.executeQuery("SELECT * FROM \{table\}"); // Escape dynamic table names
>>> 
>>> // Code reflection processors
>>> 
>>> String table = "foo bar";
>>> ResultSet r = s.executeQuery(@CodeReflection () -> "SELECT * FROM " + table); // Escape dynamic table names
>>> 
>>> String value = "bar";
>>> Pattern p = Pattern.compile(@CodeReflection () -> "foo = " + value); // Quote dynamic strings
>>> 
>>> Matrix m = Matrix.eval(@CodeReflection () -> Matrix.diag(1, 2, 3) * Matrix.col(2, 3, 4)); // Matrix multiplication
>>> 
>>> Document d = HTML.compile(@CodeReflection () -> Body.of(Div.of("Hello World!"))); // Escape strings
>>> 
>>> Each such code reflecting processor API defining its own rules for how to handle code snippets, including processing of any raw string templates when/if they are added to the language. Other types than String and StringTemplate being handled as fits each API, and to the level of type safety wanted by the API user. Evaluation being done lazily or eagerly as appropriate.
>>> 
>>> (In the above "@CodeReflection () ->" is short for some syntax allowing inline code reflecting snippets, passed to the API processors in a way that allows them to reflect on the snippet code.)
>>> 
>>> Yours,
>>> Mikael Sterner

From john.mccourt000 at gmail.com  Mon Mar 18 21:41:09 2024
From: john.mccourt000 at gmail.com (John McCourt)
Date: Mon, 18 Mar 2024 21:41:09 +0000
Subject: String Template processors vs Code Reflection?
In-Reply-To: <c38147d9-3928-4de5-8933-edc534fc1222@app.fastmail.com>
References: <8b09da5b-5224-4428-8344-2816e01e15bf@app.fastmail.com>
 <CALEU8=xWvQtKHPbrtHLeEQGyGtfWZb8UObDofVCJqRxrcRWzzQ@mail.gmail.com>
 <CALEU8=w_x950xqfWj7ZZryzCHt4EfxZtgKS40rV_QX6gRoBPAA@mail.gmail.com>
 <c38147d9-3928-4de5-8933-edc534fc1222@app.fastmail.com>
Message-ID: <CAJZY4gnGtan4rrGx20Uj0voNSrfXEFKDsbGTLueaSPnmS8_FTw@mail.gmail.com>

unsubscribe


From amaembo at gmail.com  Tue Mar 19 12:55:39 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Tue, 19 Mar 2024 13:55:39 +0100
Subject: Does String extend StringTemplate? (Was: Update on String
 Templates (JEP 459))
In-Reply-To: <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>
 <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com>
Message-ID: <CAE+3fja8m3aVzNy047Gf==nND23idc5oBWUyrt4niPu0bB2+wA@mail.gmail.com>

Hello!

Thank you for splitting the thread. I think that String is a StringTemplate
in the same sense as zero is also a number, identity is also a function,
and empty set is also a set. A degenerate case is important for
generalization, as you don't have to think about it when it
actually appears.

That said, now I have started to doubt this idea. So far I advocated for
'string should be string template', but what I really want is that 'string
literal should be string template', which, while similar, is not the same.
Indeed, for unification we don't actually need to support non-literal
strings.
It would be interesting to create a subclass of String like StringLiteral,
which is constructed only from literals and implements StringTemplate.
However, it will be a huge compatibility disaster. Now, my thought goes
into some kind of implicit conversion from String literal (and only from
literal) to StringTemplate, which was already discussed elsewhere, so the
discussion is already far ahead of my thoughts :-)

To conclude, what I really wanted is a uniform way to specify
StringTemplate with 0 embedded expressions and StringTemplate with 1+
embedded expressions. E.g., if we require a prefix for all string templates
like ST"...", then this desire will be satisfied. If we don't introduce the
prefix but can use String literal in every context where StringTemplate
literal is possible (like it's in the current preview), then my desire is
also satisfied. I see the drawbacks in every solution, so for now I don't
have a strong preference.

With best regards,
Tagir Valeev.


On Tue, Mar 12, 2024 at 6:32 PM Brian Goetz <brian.goetz at oracle.com> wrote:

> Splitting off into a separate thread.
>
> I would like to redirect this discussion from the mechanical challenges
> and consequences to the goals and semantics.
>
> If we are considering "String extends StringTemplate", we are making a
> semantic statement that a String *is-a* StringTemplate.  While I can
> imagine convincing oneself that this is true "if you look at it right",
> this sets off all my "self-justification" detectors.
>
> So, I recommend we step back and examine why we think this is a good idea
> before we descend into the mechanics.  My suspicion is that this is
> motivated by "I want to be able to automatically use String where a
> StringTemplate is desired", and that this seems a clever-enough hack to get
> there.  (I think we probably also need to drill further, into "why do we
> think it is important to be able to use String where StringTemplate is
> desired", and I suspect further that part of it will be "but the APIs are
> not yet fully equilibrated" (which would be a truly bad reason to give
> String a new supertype.))
>
>
>
>
> On 3/12/2024 1:24 PM, Tagir Valeev wrote:
>
> Hello, Maurizio!
>
> Thank you for the detailed explanation!
>
> On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore <
> maurizio.cimadamore at oracle.com> wrote:
>
>> Hi all,
>> we tried mainly three approaches to allow smoother interop between
>> strings and string templates: (a) make String a subclass of StringTemplate.
>> Or (b) make constant strings be *convertible* to string templates. Or,
>> (c) use target-typing. All these approaches have some issues, discussed
>> below.
>>
>> The first approach is slightly simpler, because it can be achieved
>> entirely outside of the Java language. Unfortunately, adding "String
>> implements StringTemplate" adds overload ambiguities in cases such as this:
>>
>> format(StringTemplate) // 1
>> format(String, Object...) // 2
>>
>> This is actually a very important case, as we predict that StringTemplate
>> will serve as a great replacement for methods out there accepting a
>> string/Object... pack.
>>
>> Unfortunately, if String <: StringTemplate, this means that calling format
>> with a string literal will resolve to (1), not (2) as before. The problem
>> here is that (2) is not even applicable during the two overload resolution
>> phases (which are only allowed to use subtyping and conversions,
>> respectively), as it is a varargs method. Because of this, (1) will now
>> take the precedence, as that's not varargs. While for String::format this
>> is probably harmless, changing results of overload selection is something
>> that should be done with care (esp. if different overloads have different
>> return types), as it could lead to source compatibility issues.
>>
> I would still like to advocate for String <: StringTemplate solution. I
> think that the overloading is not a big problem. Simply making String
> implements StringTemplate will not break any existing code because there
> are no APIs yet that accept the StringTemplate instance. The problem may
> appear only when an API author actually adds such an overload and does this
> in an incompatible way with an existing String overload. This would be an
> extremely bad design choice, and the blame goes to the API author. You've
> correctly mentioned that for String::format this is harmless because the
> API is well-designed. We may suggest in StringTemplate documentation that
> the API designers should provide the same behavior for foo(String) and
> foo(StringTemplate) when they add an overload.
>
> I must say that we already had an experience of introducing new interfaces
> in the hierarchy of widely-used library classes. Closeable got an AutoCloseable
> parent, StringBuilder became Comparable, and so on. So far, the
> compatibility issues introduced were tolerable. Well, probably I'm missing
> something but we have preview rounds just for this purpose: to find out the
> disadvantages of the approach.
>
>
>
>> On top of these issues, making all strings be string templates has the
>> disadvantage of also considering ?messy? strings obtained via concatenation
>> of non-constant values string templates too, which seems bad.
>>
> I think that most of the APIs will still provide String overload. E.g.,
> for preparing an SQL statement, it's a perfectly reasonable scenario to
> have a constant string as the input. So prepareStatement(String) will stay
> along with prepareStatement(StringTemplate). And people will still be able
> to use concatenation. I don't think that the absence of String <:
> StringTemplate relation will protect anybody from using the concatenation.
> On the other hand, if String actually implements StringTemplate, it will be
> a very simple static analysis rule to warn if the concatenation occurs in
> this context. If the expected type for concatenation is StringTemplate,
> then something is definitely wrong. Without 'String implements
> StringTemplate', one will not be able to write a concatenation directly in
> StringTemplate context. Instead, String-accepting overload will be used,
> and the expected type will be String, so static analyzer will have to guess
> whether it's dangerous to use the concatenation here. In short, I think
> that it's actually an advantage: we have an additional hint here that
> concatenation is undesired. Even compilation warning could be possible to
> implement.
>
> So, I don't see these points as real disadvantages. I definitely like this
> approach much more than adding any kind of implicit conversion or another
> literal syntax, which would complicate the specification much more.
>
> With best regards,
> Tagir Valeev.
>
>>
>
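
The overload-resolution concern in the quoted text, format(StringTemplate)
versus format(String, Object...), can be reproduced in current Java with a
supertype that String already has. In the sketch below CharSequence is only a
stand-in for the hypothetical "String <: StringTemplate" relationship, and the
class and method names are invented for illustration. The behavior shown is
the standard three-phase method resolution, in which variable-arity
invocation is only considered in the last phase.

```java
public class OverloadSketch {
    // (1) Stand-in for format(StringTemplate): String is a CharSequence.
    public static String format(CharSequence template) {
        return "(1) single-argument overload";
    }

    // (2) The classic string/Object... shape.
    public static String format(String fmt, Object... args) {
        return "(2) varargs overload";
    }

    public static void main(String[] args) {
        // A lone string literal matches (1) by subtyping in the first phase,
        // so (2) is never considered, although it would accept the call.
        System.out.println(format("hello"));       // prints (1) single-argument overload
        // With extra arguments only (2) is applicable.
        System.out.println(format("hello %s", 1)); // prints (2) varargs overload
    }
}
```

If overload (1) did not exist, format("hello") would bind to (2). Adding (1)
silently redirects that call, which is harmless only when both overloads
behave the same for a plain string, as suggested in the reply above.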

From brian.goetz at oracle.com  Tue Mar 19 13:35:47 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 19 Mar 2024 09:35:47 -0400
Subject: Does String extend StringTemplate? (Was: Update on String
 Templates (JEP 459))
In-Reply-To: <CAE+3fja8m3aVzNy047Gf==nND23idc5oBWUyrt4niPu0bB2+wA@mail.gmail.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <CAE+3fjY5tFd--nsmqTKtf=p6jAToi21YO_7nmaXzu_TYw5rc+g@mail.gmail.com>
 <636B984E-A544-4155-81D1-8752037A973B@oracle.com>
 <58147E22-8667-40E5-BB94-92B8EF3DC2AA@oracle.com>
 <B6E205D1-C0BE-4506-A06C-7DF03A3910C1@oracle.com>
 <20e98df0-9dc2-4804-8c71-a329260cabc1@oracle.com>
 <CAE+3fjYP3xo4BPomSuvoG-whTQCNTKs6tK3SOUCuHT5YvZ0iWw@mail.gmail.com>
 <87dd98c8-e0ac-49e0-995a-5466a50219d3@oracle.com>
 <CAE+3fja8m3aVzNy047Gf==nND23idc5oBWUyrt4niPu0bB2+wA@mail.gmail.com>
Message-ID: <3b0e72bf-472b-4eb9-a4cc-b7276a77cb7c@oracle.com>

I feel your pain! We walked many of these steps, for many of the same
reasons. Each initially-promising approach (subtyping, poly
expressions, target typing) turned out to have more drawbacks than
benefits. But I agree it would be nice if one could just use a string
literal where a string template is needed -- no one would be confused
(when it works right.)

On 3/19/2024 8:55 AM, Tagir Valeev wrote:
> Hello!
>
> Thank you for splitting the thread. I think that String is a
> StringTemplate in the same sense as zero is also a number, identity is
> also a function, and empty set is also a set. A degenerate case is
> important for generalization, as you don't have to think about it when
> it actually appears.
>
> That said, now I have started to doubt this idea. So far I advocated
> for 'string should be string template', but what I really want is that 
> 'string literal should be string template', which, while similar, is 
> not the same. Indeed, for unification we don't actually need to 
> support non-literal strings.
> It would be interesting to create a subclass of String like 
> StringLiteral, which is constructed only from literals and implements 
> StringTemplate. However, it will be a huge compatibility disaster. 
> Now, my thought goes into some kind of implicit conversion from String 
> literal (and only from literal) to StringTemplate, which was already 
> discussed elsewhere, so the discussion is already far ahead of my 
> thoughts :-)
>
> To conclude, what I really wanted is a uniform way to specify
> StringTemplate with 0 embedded expressions and StringTemplate with 1+ 
> embedded expressions. E.g., if we require a prefix for all string 
> templates like ST"...", then this desire will be satisfied. If we 
> don't introduce the prefix but can use String literal in every context 
> where StringTemplate literal is possible (like it's in the current 
> preview), then my desire is also satisfied. I see the drawbacks in 
> every solution, so for now I don't have a strong preference.
>
> With best regards,
> Tagir Valeev.
>
>
> On Tue, Mar 12, 2024 at 6:32 PM Brian Goetz <brian.goetz at oracle.com>
> wrote:
>
>     Splitting off into a separate thread.
>
>     I would like to redirect this discussion from the mechanical
>     challenges and consequences to the goals and semantics.
>
>     If we are considering "String extends StringTemplate", we are
>     making a semantic statement that a String *is-a* StringTemplate.
>     While I can imagine convincing oneself that this is true "if you
>     look at it right", this sets off all my "self-justification"
>     detectors.
>
>     So, I recommend we step back and examine why we think this is a
>     good idea before we descend into the mechanics. My suspicion is
>     that this is motivated by "I want to be able to automatically use
>     String where a StringTemplate is desired", and that this seems a
>     clever-enough hack to get there. (I think we probably also need
>     to drill further, into "why do we think it is important to be able
>     to use String where StringTemplate is desired", and I suspect
>     further that part of it will be "but the APIs are not yet fully
>     equilibrated" (which would be a truly bad reason to give String a
>     new supertype.))
>
>
>
>
>     On 3/12/2024 1:24 PM, Tagir Valeev wrote:
>>     Hello, Maurizio!
>>
>>     Thank you for the detailed explanation!
>>
>>     On Mon, Mar 11, 2024 at 1:16 PM Maurizio Cimadamore
>>     <maurizio.cimadamore at oracle.com> wrote:
>>
>>         Hi all,
>>         we tried mainly three approaches to allow smoother interop
>>         between strings and string templates: (a) make String a
>>         subclass of StringTemplate. Or (b) make constant strings be
>>         /convertible/ to string templates. Or, (c) use target-typing.
>>         All these approaches have some issues, discussed below.
>>
>>         The first approach is slightly simpler, because it can be
>>         achieved entirely outside of the Java language.
>>         Unfortunately, adding "String implements StringTemplate" adds
>>         overload ambiguities in cases such as this:
>>
>>             format(StringTemplate)    // 1
>>             format(String, Object...) // 2
>>
>>         This is actually a very important case, as we predict that
>>         StringTemplate will serve as a great replacement for methods
>>         out there accepting a string/Object... pack.
>>
>>         Unfortunately, if String <: StringTemplate, this means that
>>         calling format with a string literal will resolve to (1), not
>>         (2) as before. The problem here is that (2) is not even
>>         applicable during the two overload resolution phases (which
>>         is only allowed to use subtyping and conversions,
>>         respectively), as it is a varargs method. Because of this,
>>         (1) will now take the precedence, as that's not varargs.
>>         While for String::format this is probably harmless, changing
>>         results of overload selection is something that should be
>>         done with care (esp. if different overloads have different
>>         return types), as it could lead to source compatibility issues.
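The overload-selection change described above can be reproduced today
without any preview feature. The sketch below is only an analogy:
CharSequence (an existing supertype of String) stands in for
StringTemplate under the "String <: StringTemplate" assumption, and the
classes Before, After and OverloadDemo are made-up names. It shows that
once a fixed-arity overload accepting a supertype of String exists, a
call with a single string argument no longer reaches the varargs method.

```java
// Before: only the classic string/Object... overload exists.
class Before {
    static String format(String fmt, Object... args) {
        return "varargs";
    }
}

// After: a fixed-arity overload taking a supertype of String is added.
// CharSequence is a stand-in for StringTemplate (assumption, see above).
class After {
    static String format(CharSequence template) {
        return "fixed-arity";
    }

    static String format(String fmt, Object... args) {
        return "varargs";
    }
}

class OverloadDemo {
    public static void main(String[] args) {
        // Only the varargs method exists, so it is selected in the
        // variable-arity phase of overload resolution.
        System.out.println(Before.format("hello"));      // varargs

        // The fixed-arity method is applicable by subtyping in an
        // earlier phase, so the varargs method is never considered.
        System.out.println(After.format("hello"));       // fixed-arity

        // With extra arguments only the varargs method is applicable.
        System.out.println(After.format("hello %s", 1)); // varargs
    }
}
```

The same source text, format("hello"), silently changes meaning between
the two classes, which is the source-compatibility concern raised above.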
>>
>>     I would still like to advocate for String <: StringTemplate
>>     solution. I think that the overloading is not a big problem.
>>     Simply making String implements StringTemplate will not break any
>>     of existing code because there are no APIs yet that accept the
>>     StringTemplate instance. The problem may appear only when an API
>>     author actually adds such an overload and does this in an
>>     incompatible way with an existing String overload. This would be
>>     an extremely bad design choice, and the blame goes to the API
>>     author. You've correctly mentioned that for String::format this
>>     is harmless because the API is well-designed. We may suggest in
>>     StringTemplate documentation that the API designers should
>>     provide the same behavior for foo(String) and foo(StringTemplate)
>>     when they add an overload.
>>
>>     I must say that we already had an experience of introducing new
>>     interfaces in the hierarchy of widely-used library classes.
>>     Closeable got an AutoCloseable parent, StringBuilder became
>>     Comparable, and so on. So far, the compatibility issues
>>     introduced were tolerable. Well, probably I'm missing something
>>     but we have preview rounds just for this purpose: to find out the
>>     disadvantages of the approach.
>>
>>         On top of these issues, making all strings be string
>>         templates has the disadvantage of also considering "messy"
>>         strings obtained via concatenation of non-constant values
>>         string templates too, which seems bad.
>>
>>     I think that most of the APIs will still provide String overload.
>>     E.g., for preparing an SQL statement, it's a perfectly reasonable
>>     scenario to have a constant string as the input. So
>>     prepareStatement(String) will stay along with
>>     prepareStatement(StringTemplate). And people will still be able
>>     to use concatenation. I don't think that the absence of String <:
>>     StringTemplate relation will protect anybody from using the
>>     concatenation. On the other hand, if String actually implements
>>     StringTemplate, it will be a very simple static analysis rule to
>>     warn if the concatenation occurs in this context. If the expected
>>     type for concatenation is StringTemplate, then something is
>>     definitely wrong. Without 'String implements StringTemplate', one
>>     will not be able to write a concatenation directly in
>>     StringTemplate context. Instead, String-accepting overload will
>>     be used, and the expected type will be String, so static analyzer
>>     will have to guess whether it's dangerous to use the
>>     concatenation here. In short, I think that it's actually an
>>     advantage: we have an additional hint here that concatenation is
>>     undesired. Even compilation warning could be possible to implement.
>>
>>     So, I don't see these points as real disadvantages. I definitely
>>     like this approach much more than adding any kind of implicit
>>     conversion or another literal syntax, which would complicate the
>>     specification much more.
>>
>>     With best regards,
>>     Tagir Valeev.
>>
>

From brian.goetz at oracle.com  Tue Mar 19 14:23:03 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 19 Mar 2024 10:23:03 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
Message-ID: <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>

Let's pull on this string some more.  Assuming we settled on disjoint
types and syntaxes, with no magic conversions, what library support do
we need directly for ST?  I am thinking (please, let's focus on the
functionality before we nitpick the names):

     // on String
     static join(StringTemplate)    // previously STR

     // on StringTemplate
     String join()                                 // STR, instance/suffix version
     static StringTemplate join(StringTemplate...) // + for string templates

This is a pleasantly short set; is anything missing?  (Not addressing
the "which things were previously processors, but now need API points" 
right now -- that's a separate discussion.)

On 3/18/2024 9:38 AM, Brian Goetz wrote:
> I think this has been a good discussion, and it looks like we're 
> starting to see some convergence.
>
> I think we keep trying to exploit ambiguity / implicitness, and it 
> doesn't go well:
>
>  - Many users want STR to be the "implicit processor", but that isn't
> good for security
>  - We tried reusing the String delimiters for string templates to
> reduce the perception of how many different things there are here, but
> that creates cognitive load (can't tell strings from templates without
> parsing the entire contents), among other problems
>  - We tried making String a poly expression (and other tricks) to
> reduce the number of explicit conversions, but that created problems too
>
> John's characterization captures the feeling and eventual conclusion 
> that I think many of us share:
>
>> I kind of like Guy's offensive-to-everyone suggestion that $ is required to make a true ST.
>
> Indeed, my first reaction to the $ sigil was "please no", but I am 
> grudgingly coming to the conclusion that we should stop trying to 
> implicitly "just figure out what the user wants" and acknowledge the 
> reality: templates are not strings, strings are not templates, and 
> they can be converted to each other with ... methods, just like any 
> other relatable types.  So string literals are as they always were;
> string templates are a new thing, whose syntax and type is disjoint 
> from that of strings, as Guy also seems to be converging on:
>
>> And now that I have that better understanding, I think I lean toward 
>> (a) abandoning string interpolation and (b) having a single, short, 
>> _non-optional_ prefix for templates ("$" would be a plausible
>> choice), on the grounds that I think it makes code more readable if
>> templates are always distinguished up front from strings -- and this is
>> especially helpful when the templates are rather long and any `\{`
>> present might be far from the beginning. It has a minimal number of
>> cases to explain:
>>
>> "..."      string literal, must not contain \{...}, type String
>> $"..."     template literal, may contain \{...}, type StringTemplate
>
> (concrete syntax TBB (to be bikeshod), along with the spellings of S 
> -> ST and ST -> S.)
>
> Some more useful observations:
>
>  - The toString behavior cannot be mere interpolation.  Besides the
> principled objections and inevitable propping-open-the-security-door
> that this would lead to, people will quickly learn to abuse "" + ST as
> the "fewest characters required" way to get interpolation, which is
> "clever" in the same way that John's "empty \{}" trick is clever, but
> not good for clarity.
>  - We need a story to tell for how to write good overloads, which
> seems to be more subtle than initially thought.
>  - If the only way to make a StringTemplate is the literal syntax,
> then STs gain a valuable security property: all fragments in the ST
> are strings that appeared literally in code, and therefore untainted.
> This is probably too restrictive but we should be aware of what we are
> giving up as we explore the API options.
>  - Processors should be encouraged to "flatten" embedded STs.
>
> A few people have implied that only the tainted parts of an ST (the 
> embedded expressions) need special processing, but I'll point out that 
> the untainted parts may often require domain-specific validation.  For
> example, a ST representing a SQL query wants balanced quotes, and 
> might want to require quotes around embedded expressions.
>
>
>
> On 3/8/2024 1:35 PM, Brian Goetz wrote:
>>
>> Time to check in with where we are with String Templates.  We've
>> gone through two rounds of preview, and have received some feedback.
>>
>> As a reminder, the primary goal of gathering feedback is to learn
>> things about the design or implementation that we don't already know.
>> This could be bug reports, experience reports, code review, careful
>> analysis, novel alternatives, etc.  And the best feedback usually
>> comes from using the feature "in anger" -- trying to actually write
>> code with it. ("Some people would prefer a different syntax" or "some
>> people would prefer we focused on string interpolation only" fall
>> squarely in the "things we already knew" camp.)
>>
>> In the course of using this feature in the `jextract` project, we did
>> learn quite a few things we didn't already know, and this was
>> conclusive enough that it has motivated us to adjust our approach in
>> this feature.  Specifically, the role of processors is "outsized" to
>> the value they offer, and, after further exploration, we now believe
>> it is possible to achieve the goals of the feature without an
>> explicit "processor" abstraction at all!  This is a very positive
>> development.
>>
>> First, I want to affirm that the goals of the project have not
>> changed.  From JEP 459:
>>
>> Goals
>>
>> - Simplify the writing of Java programs by making it easy to express
>> strings that include values computed at run time.
>> - Enhance the readability of expressions that mix text and
>> expressions, whether the text fits on a single source line (as with
>> string literals) or spans several source lines (as with text blocks).
>> - Improve the security of Java programs that compose strings from
>> user-provided values and pass them to other systems (e.g., building
>> queries for databases) by supporting validation and transformation of
>> both the template and the values of its embedded expressions.
>> - Retain flexibility by allowing Java libraries to define the
>> formatting syntax used in string templates.
>> - Simplify the use of APIs that accept strings written in non-Java
>> languages (e.g., SQL, XML, and JSON).
>> - Enable the creation of non-string values computed from literal text
>> and embedded expressions without having to transit through an 
>> intermediate string representation.
>>
>> Non-Goals
>> - It is not a goal to introduce syntactic sugar for Java's string
>> concatenation operator (+), since that would circumvent the goal of
>> validation.
>> - It is not a goal to deprecate or remove the StringBuilder and
>> StringBuffer classes, which have traditionally been used for complex 
>> or programmatic string composition.
>>
>> Another thing that has not changed is our view on the syntax for 
>> embedding expressions.  While many people did express the opinion of
>> "why not 'just' do what Kotlin/Scala does", this issue was more than
>> fully explored during the initial design round.  (In fact, while
>> syntax disagreements are often purely subjective, this one was far
>> more clear -- the $-syntax is objectively worse, and would be doubly
>> so if injected into an existing language where there were already
>> string literals in the wild.  This has all been more than adequately
>> covered elsewhere, so I won't rehash it here.)
>>
>>
>> Now, let's talk about what we do think should change: the role of
>> processors and the StringTemplate type.
>>
>> Processors were envisioned as a means to abstract the transformation 
>> of templates to their final form (whether string, or something else.)
>> However, Java already has a well established means of abstracting
>> behavior: methods.  (In fact, a processor application can be viewed
>> as merely a new syntax for a method call.)  Our experience using the
>> feature highlighted the question: When converting a SQL query 
>> expressed as a template to the form required by the database (such as 
>> PreparedStatement), why do we need to say:
>>
>>   DB."... template ..."
>>
>> When we could use an ordinary Java library:
>>
>>   Query q = Query.of("...template...")
>>
>> Indeed, one of the worst things about having processors in the 
>> language is that API designers are put in the difficult situation of 
>> not knowing whether to write a processor or an ordinary API, and 
>> often have to make that choice before the consequences are fully 
>> understood.  (To add to this, processors raise similar questions at
>> the use site.) But the real criticism here is that template capture
>> and processing are complected, when they should be separate, 
>> composable features.
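To make the "ordinary Java library" alternative concrete, here is a
minimal sketch of what a Query.of factory might look like. Everything
in it is an assumption made for illustration: SqlTemplate is a
hand-rolled stand-in for StringTemplate (n+1 literal fragments around n
values), and Query is a made-up record, not an API from the JDK or from
any database library. The point is only that the method receives the
fragments and the values separately and never interpolates.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate: n+1 fragments around n values.
record SqlTemplate(List<String> fragments, List<Object> values) {
    SqlTemplate {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
    }
}

// Hypothetical result type: SQL text with placeholders, plus parameters.
record Query(String sql, List<Object> parameters) {
    static Query of(SqlTemplate template) {
        // Every embedded value becomes a "?" placeholder; the values
        // travel separately and are never spliced into the SQL text.
        String sql = String.join("?", template.fragments());
        return new Query(sql, template.values());
    }
}

class QueryDemo {
    public static void main(String[] args) {
        String name = "O'Brien"; // would break naive concatenation
        // Corresponds to the template "SELECT * FROM users WHERE name = \{name}"
        SqlTemplate template = new SqlTemplate(
                List.of("SELECT * FROM users WHERE name = ", ""),
                List.<Object>of(name));

        Query q = Query.of(template);
        System.out.println(q.sql());        // SELECT * FROM users WHERE name = ?
        System.out.println(q.parameters()); // [O'Brien]
    }
}
```

A real factory would presumably also validate the fragments (for
example, checking for balanced quotes) before building the query.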
>>
>> This motivated us to revisit some of the reasons why processors were 
>> so central to the initial design in the first place.  And it turned
>> out, this choice had been influenced -- perhaps overly so -- by early
>> implementation experiments.  (One of the background design goals was
>> to enable expensive operations like `String::format` to be (much)
>> cheaper.  Without digressing too deeply on performance,
>> String::format can be more than an order of magnitude worse than the
>> equivalent concatenation operation, and this in turn sometimes
>> motivates developers to use worse idioms for formatting.  The FMT
>> processor brought that cost back in line with the equivalent
>> concatenation.)  These early experiments biased the design towards
>> needing to know the processor at the point of template capture, but
>> upon reexamination we realized that there are other ways to achieve
>> the desired performance goals without requiring processors to be
>> known at capture time.  This, in turn, enabled us to revisit a point
>> in the design space we had transited through earlier, where string
>> templates were "just a new kind of literal" and the job performed by
>> processors could instead be performed by ordinary APIs.
>>
>> At this point, a simpler design and implementation emerged that met 
>> the semantic, correctness, and performance goals: template literals 
>> ("Hello \{name}") are simply the literal form of StringTemplate:
>>
>>   StringTemplate st = "Hello \{name}";
>>
>> String and StringTemplate remain unrelated types.  (We explored a
>> number of ways to interconvert them, but they caused more trouble
>> than they solved.)  Processing of string templates, including
>> interpolation, is done by ordinary APIs that deal in StringTemplate, 
>> aided by some clever implementation tricks to ensure good performance.
>>
>> For APIs where interpolation is known to be safe in the domain, such 
>> as PrintWriter, APIs can make that choice on behalf of the domain, by 
>> providing overloads to embody this design choice:
>>
>>    void println(String) { ... }
>>    void println(StringTemplate) { ... interpolate and delegate to println(String) ... }
>>
>> The upshot is that for interpolation-safe APIs like println, we can 
>> use a template directly without giving up any safety:
>>
>>    System.out.println("Hello \{name}");
>>
>> In this example, the string template evaluates to StringTemplate, not 
>> String (no implicit interpolation), and chooses the StringTemplate 
>> overload of println, which in turn chooses how to process the 
>> template. This stays true to the design principle that interpolation 
>> is dangerous enough that it should be an explicit choice in the code 
>> -- but it allows that choice to be made by libraries when the library
>> is comfortable doing so.
>>
>> Similarly, the FMT processor is replaced by an overload of 
>> String::format that interprets templates with embedded format 
>> specifiers (e.g., "%d"):
>>
>>   String format(String formatString, Object... parameters) { ... same as today ... }
>>   String format(StringTemplate template) { ... equivalent of FMT ... }
>>
>> And users can call this as:
>>
>>   String s = String.format("Hello %12s\{name}");
>>
>> Here, the String::format API has chosen to interpret string templates 
>> according to the rules previously specified in the FMT processor (not 
>> ordinary interpolation), but that choice is embedded in the library 
>> semantics so no further explicit choice at the use site is required.
>> The user already chose to pass it to String::format; that's all the
>> processing selection that is needed.
>>
>> Where APIs do not express a choice of what template expansion means, 
>> users continue to be free to process them explicitly before passing 
>> them, using APIs that do (such as String::format or ordinary 
>> interpolation).
>>
>> The result is:
>>
>> - The need for use-site "goop" (previously, the processor name; now, 
>> static or instance methods to process a template) goes away entirely 
>> when dealing with libraries that are already template-friendly.
>> - Even with libraries that require use-site goop, it is no more 
>> intrusive than before, and can be reduced over time as APIs get with 
>> the program.
>> - StringTemplate is just another type that APIs can support if they 
>> want.  The "DB" processor becomes an ordinary factory method that
>> accepts a string template or an ordinary builder API.
>> - APIs now can have _more_ control over the timing and meaning of 
>> template processing, because we are not biasing so strongly towards 
>> early processing.
>> - It becomes easier to abstract over template processing (i.e., 
>> combine or manipulate templates as templates before processing)
>> - Interpolation remains an explicit choice, but ST-aware libraries 
>> can make this choice on behalf of the user.
>> - The language feature and API surface get considerably smaller, 
>> which is good.  Core JDK APIs (e.g., println, format, exception
>> constructors) get upgraded to work with string templates.
>>
>> The remaining question that everyone is probably asking is: "so how
>> do we do interpolation?"  The answer there is "ordinary library
>> methods".  This might be a static method
>> (String.join(StringTemplate)) or an instance method
>> (template.join()), shed to be painted (but please, not right now).
>>
>> This is a sketch of direction, so feel free to pose
>> questions/comments on the direction.  We'll discuss the details as we
>> go.
>>
>>
>

From asviraspossible at gmail.com  Tue Mar 19 14:33:57 2024
From: asviraspossible at gmail.com (Victor Nazarov)
Date: Tue, 19 Mar 2024 15:33:57 +0100
Subject: Update on String Templates (JEP 459)
In-Reply-To: <40a9ecc1-f9e7-4fc5-a81e-33a468d568d8@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <2f3725be-a2a9-48dc-bfd4-4b4a87f2fea8@oracle.com>
 <4AACBB71-AF69-4425-8841-4E6AE8A04518@oracle.com>
 <B52FF02B-5379-438A-9A23-05519246AB1F@oracle.com>
 <bf368957-e3c2-4d79-b462-06d20e38a387@oracle.com>
 <E3598C9D-26BA-4AC0-AE01-EE7E7C42CA44@oracle.com>
 <3F8C64A7-BEB8-4BA2-A9B1-E00C14578B28@oracle.com>
 <c9374ab8-1e0e-4746-9f6b-b77135ddd809@oracle.com>
 <F217B131-C5FD-4587-B251-06760F08DD36@oracle.com>
 <f0f270b1-d186-4b6b-a451-d5f01d1bcd42@oracle.com>
 <E65EA9D5-312B-4A39-99F7-013D70C6E62C@oracle.com>
 <b221590a-8082-4543-a95f-a870e9dffb8a@oracle.com>
 <330FB2A7-3154-4CC5-AA34-D4ECBFBC713C@oracle.com>
 <7873F0D5-E053-4541-B2A5-2E41B536DD8D@oracle.com>
 <794eaa0d-f244-43a3-af5b-7ecf11ac8a33@oracle.com>
 <CAFOkWZbF5m-LB6zCqYrEYugd3GNyouVTXCpLCqn9uwx5vQhLag@mail.gmail.com>
 <40a9ecc1-f9e7-4fc5-a81e-33a468d568d8@oracle.com>
Message-ID: <CAFOkWZaGWVsa-dSuh8J8Bd3O5-fkSFLgY8z+xt6_7jASnqHzLw@mail.gmail.com>

Below are some more comments, just to close the loop, but what I really
wanted is to highlight once more that we can treat all Strings as
StringTemplates: not just "constant" String literals, but really all Strings.
Each string is a degenerate StringTemplate without holes, the same way as
each integer-number is a degenerate double-number without a non-integer
part.

On Fri, Mar 15, 2024 at 3:54 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> Hi
> Is not all that rosy :-) Comments inline
>
> On 15/03/2024 13:48, Victor Nazarov wrote:
>
> I think the above can be translated almost word for word to
> StringTemplates world:
>
> * stringy-literal that doesn't have holes-with-values can be both String
> and StringTemplate
> * stringy-literal that has holes-with-values can only be StringTemplate
> * m(String),m(StringTemplate) with stringy-literal that can be both String
> and StringTemplate selects String-overload
> * t = s works as when t is StringTemplate and s is String
>
> I assume you mean "s is a *constant* String" here.
>

No, what I mean is that we can always assign a string to a string template.

StringTemplate t;
String s = "hello";
t = s; // no error, implicit conversion

the same way as we can do

long x;
int y = 5;
x = y;

but we can not do this in backward direction:

> * s = t is compile-time error
>
> > Incompatible types: possible lossy conversion from StringTemplate to
String

the same way as

int x;
long y = 5;
x = y;

results in

> Incompatible types: possible lossy conversion from long to int

> * s = (String) t succeeds, when s is String and t is StringTemplate (and
> does string concatenation)
>
>
Yes, cast is an interpolation, so

int x = 5;
int y = 10;
StringTemplate t = "\{x} + \{y}";
String s = (String) t;
assert s.equals("5 + 10");

Cast is some operation that causes loss of information and is generally
considered unsafe, this is similar to:

long x = 0x10_7F_FF_FF_FFL;
int y = (int) x;
assert y == 0x7F_FF_FF_FFL;
>
> Ok, here I note that you are defining cast conversion from StringTemplate
> to String as always successful (via interpolation).
>
> * t instanceof String succeeds on StringTemplate variable t as long as t
> doesn't have any holes-with-values
>
> This is inconsistent. You now have cases where "t instanceof String"
> returns false, but where (String)t succeeds.
>

I'm not sure this is inconsistent, I think this is exactly what is proposed
in JEP 455. I think the confusion here is the definition of "succeeds", I
think what you mean is throwing ClassCastException, but in JEP 455
"succeeds" means without the loss of information. One of the examples is

int i = 42;
i instanceof byte;        // true (exact)

int i = 1000;
i instanceof byte;        // false (not exact)

So this can be the same with StringTemplates and Strings

StringTemplate t = "hello";
t instanceof String;        // true (exact)

int i = 1000;
StringTemplate t = "i = \{i}";
t instanceof String;        // false (not exact)

* s instanceof StringTemplate succeeds when s is String
>
> Again, probably you mean ?constant String? here.
>

I mean any String succeeds here, the same way as

b instanceof int;         // true (unconditionally exact)


is always true, because there is never loss of any information.
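The notion of "exact" used above can be sketched in plain Java without
the JEP 455 preview syntax. This is only an illustration of the
analogy: isExactByte spells out by hand the round-trip test that
`i instanceof byte` would perform, and SimpleTemplate is a made-up
stand-in for StringTemplate whose isExactString method is hypothetical,
not a proposed API.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate: fragments around values ("holes").
record SimpleTemplate(List<String> fragments, List<Object> values) {
    // A template converts to String without loss only if it has no holes.
    boolean isExactString() {
        return values.isEmpty();
    }
}

class Exactness {
    // An int converts exactly to byte when the round trip preserves the value.
    static boolean isExactByte(int i) {
        return (byte) i == i;
    }

    public static void main(String[] args) {
        System.out.println(isExactByte(42));   // true  (exact)
        System.out.println(isExactByte(1000)); // false (not exact)

        // Corresponds to the hole-free template "hello"
        SimpleTemplate plain = new SimpleTemplate(List.of("hello"), List.of());
        // Corresponds to the template "i = \{i}" with i = 1000
        SimpleTemplate holey = new SimpleTemplate(List.of("i = ", ""), List.<Object>of(1000));

        System.out.println(plain.isExactString()); // true  (exact)
        System.out.println(holey.isExactString()); // false (not exact)
    }
}
```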

* additionally stringy-literal can use "t" or "T" *suffix* to denote that
> it is really a template, this can be used to tweak overload-selection and
> to certify, that some processing of values is expected
>
> Overall, while I agree this is not completely terrible, we are signing up
> for a lot of work here. There's new conversion, relationship with pattern
> matching and instanceof to figure out, possible issues with overload
> resolution and inference. For instance, yesterday I mentioned this example:
>
> List<StringTemplate> ls = List.of("Hello");
>
> Which won't work. One way to look at it, is that it's as broken as:
>
>  List<Long> ls = List.of(1);
>
> But another way to look at it is that we're adding more complexity to a
> part of the language that already is shaky. To me that feels like a big
> risk, especially given that the "payoff" is to leave an extra "t" out at
> the beginning of the template. In other words, we should be careful about
> right-sizing complexity.
>
First of all, I think I'm in no position to talk about
implementation-complexity, but even from the specification complexity, I
think I mostly agree that the cost of leaving out some extra "t" or "$"
seems too big.
I think I was always against StringTemplates that look exactly like
Strings. As a matter of fact I was one of the initial set of people who
proposed a RAW-processor to replace implicit conversion from
StringTemplates to Strings.

What I wanted to state here is that there is still a way to make the
implicit-conversion story mostly consistent and there is already an
inspiration in the language to rely upon.
I think the thing that mostly defeats the proposed scheme to me, is that
the cast operation for reference types was always about subtyping, but here
we create a special case for two reference-types: String and StringTemplate
that are not properly subtypes of one-another, but have
conversion-relationship.

Also, regarding:
>
> 2) avoiding proliferation of String-literal sublanguages as advocated by
> Brian Goetz
>
> I don't read that in the same way as you do. I think what Brian meant is
> that anything inside quotes should be uniform. We would not like to have
> different kinds of rules for escaping etc. depending on what kind of
> literal you use. In that sense, sticking a "t" in front is no different
> from using """ to denote that what's coming is a text block.
>
I think there is a spectrum of things that a language designer can do to
underline or undermine similarities; surely having the same delimiters
communicates much better that the rules inside are the same. John Rose's
proposal about some special thing *inside* the quotes also communicates
this idea better than some prefix/suffix right before/after the quotes.

--
Victor Nazarov

>

From guy.steele at oracle.com  Tue Mar 19 17:41:37 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 19 Mar 2024 17:41:37 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
Message-ID: <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>

On Mar 19, 2024, at 10:23 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

Let's pull on this string some more.  Assuming we settled on disjoint types and syntaxes, with no magic conversions, what library support do we need directly for ST?  I am thinking (please, let's focus on the functionality before we nitpick the names):

    // on String
    static join(StringTemplate)    // previously STR

I assume the return type for the preceding method should be `String`.

    // on StringTemplate
    String join()                                 // STR, instance/suffix version
    static StringTemplate join(StringTemplate...) // + for string templates

This is a pleasantly short set; is anything missing?  (Not addressing the "which things were previously processors, but now need API points" right now -- that's a separate discussion.)

Actually, Brian, I _am_ going to nitpick your use of the name "join" here for all three of those methods, because, given the comments, they do very different things; the first two do "string interpolation" on a single template (and in the process convert the values in the template to strings) whereas the last combines multiple templates into a single template (but does not convert any of the values to strings).

Moreover, the existing `join` method of String does yet a different operation: concatenate a sequence of strings, using a given delimiter string (repeatedly, if necessary) as a separator. So I think "join" was a particularly infelicitous choice of name for these three examples.

Here I set forth your three examples with new names that are related to those already used in the existing preview implementation of StringTemplate in JDK 21 (and JDK 22 -- I just checked). I do this not to suggest that these other names should be used, but only in the hopes of reducing confusion as we begin this discussion. Later we can decide whether the names "process" and "interpolate" and "combine" should be changed (possibly all into the same single name).

    // on String
    static String process(StringTemplate)    // previously STR

    // on StringTemplate
    String interpolate()                             // STR, instance/suffix version
    static StringTemplate combine(StringTemplate...) // + for string templates
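To illustrate how different these operations are, here is a minimal
sketch using a hand-rolled Template record. It is an assumption-laden
stand-in, not the preview StringTemplate class: the record, its
fragments/values layout, and the method bodies are all made up for this
example. The static String-side method would presumably just delegate
to interpolate(), so it is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for StringTemplate: n+1 fragments around n values.
record Template(List<String> fragments, List<Object> values) {
    Template {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
    }

    // "interpolate": converts every value to a string and splices it in.
    String interpolate() {
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(values.get(i)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    // "combine": concatenates templates; values are kept as they are.
    // The last fragment of one template merges with the first of the next.
    static Template combine(Template... templates) {
        List<String> frags = new ArrayList<>();
        List<Object> vals = new ArrayList<>();
        frags.add("");
        for (Template t : templates) {
            int last = frags.size() - 1;
            frags.set(last, frags.get(last) + t.fragments().get(0));
            for (int i = 0; i < t.values().size(); i++) {
                vals.add(t.values().get(i));
                frags.add(t.fragments().get(i + 1));
            }
        }
        return new Template(frags, vals);
    }
}

class TemplateDemo {
    public static void main(String[] args) {
        // Correspond to "Hello \{name}" and " you are \{age} years old"
        Template a = new Template(List.of("Hello ", ""), List.<Object>of("Ada"));
        Template b = new Template(List.of(" you are ", " years old"), List.<Object>of(36));

        Template c = Template.combine(a, b);
        System.out.println(c.values());      // [Ada, 36] (36 is still an Integer)
        System.out.println(c.interpolate()); // Hello Ada you are 36 years old
    }
}
```

Note that combine() returns something that can still be validated or
processed differently later, whereas interpolate() is a one-way step.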

--Guy


From mark.reinhold at oracle.com  Tue Mar 19 18:19:47 2024
From: mark.reinhold at oracle.com (Mark Reinhold)
Date: Tue, 19 Mar 2024 18:19:47 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
Message-ID: <20240319141945.795026959@eggemoggin.niobe.net>

2024/3/19 13:41:37 -0400, guy.steele at oracle.com:
> Actually, Brian, I _am_ going to nitpick your use of the name "join"
> here for all three of those methods, because, given the comments, they
> do very different things; the first two do "string interpolation" on a
> single template (and in the process convert the values in the template
> to strings) whereas the last combines multiple templates into a single
> template (but does not convert any of the values to strings).
> 
> Moreover, the existing `join` method of String does yet a different
> operation: concatenate a sequence of strings, using a given delimiter
> string (repeatedly, if necessary) as a separator. So I think "join"
> was a particularly infelicitous choice of name for these three
> examples.

Agreed.
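
For reference, a minimal sketch of what the existing `String.join` does today, which is why reusing the name for interpolation or template combination would be confusing. It uses only the standard JDK; the class and method names (`JoinDemo`, `joinWithComma`) are illustrative and not part of any proposal.

```java
import java.util.List;

class JoinDemo {
    // String.join concatenates the elements and places the delimiter
    // between each adjacent pair; it does no interpolation at all.
    static String joinWithComma(List<String> parts) {
        return String.join(", ", parts);
    }

    public static void main(String[] args) {
        System.out.println(joinWithComma(List.of("a", "b", "c"))); // prints "a, b, c"
    }
}
```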

> Here I set forth your three examples with new names that are related
> to those already used in the existing preview implementation of
> StringTemplate in JDK 21 (and JDK 22; I just checked). I do this not to
> suggest that these other names should be used, but only in the hopes
> of reducing confusion as we begin this discussion. Later we can decide
> whether the names "process" and "interpolate" and "combine" should be
> changed (possibly all into the same single name).
> 
>     // on String
>     static String process(StringTemplate)    // previously STR
> 
>     // on StringTemplate
>     String interpolate()                             // STR, instance/suffix version
>     static StringTemplate combine(StringTemplate...) // + for string templates

Maybe I'm missing something, but: Why do we need both `String::process`
and `StringTemplate::interpolate`?  What are the use cases?

- Mark

From brian.goetz at oracle.com  Tue Mar 19 18:33:20 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 19 Mar 2024 14:33:20 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <20240319141945.795026959@eggemoggin.niobe.net>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
Message-ID: <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>

>> Here I set forth your three examples with new names that are related
>> to those already used in the existing preview implementation of
>> StringTemplate in JDK 21 (and JDK 22; I just checked). I do this not to
>> suggest that these other names should be used, but only in the hopes
>> of reducing confusion as we begin this discussion. Later we can decide
>> whether the names "process" and "interpolate" and "combine" should be
>> changed (possibly all into the same single name).
>>
>>      // on String
>>      static String process(StringTemplate)    // previously STR
>>
>>      // on StringTemplate
>>      String interpolate()                             // STR, instance/suffix version
>>      static StringTemplate combine(StringTemplate...) // + for string templates
> Maybe I'm missing something, but: Why do we need both `String::process`
> and `StringTemplate::interpolate`?  What are the use cases?

For a similar reason we currently have String::valueOf(int) and 
Integer::toString(int). In some use cases, the "prefix" usage (static 
method) feels more natural, whereas in others, the "suffix" usage 
(instance method) feels more natural.
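
A small sketch of the existing precedent, using only standard JDK methods. Note that both methods named above are static; the closest instance ("suffix") form is `Integer#toString()` on a boxed value. The class and method names here (`PrefixSuffixDemo` and friends) are illustrative only.

```java
class PrefixSuffixDemo {
    // "Prefix" style: a static method on the target type.
    static String viaStringValueOf(int n) {
        return String.valueOf(n);
    }

    // Also static, but declared on the source type.
    static String viaIntegerToString(int n) {
        return Integer.toString(n);
    }

    // "Suffix" style: an instance method invoked on the (boxed) value.
    static String viaInstanceToString(Integer n) {
        return n.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaStringValueOf(42));    // prints "42"
        System.out.println(viaIntegerToString(42));  // prints "42"
        System.out.println(viaInstanceToString(42)); // prints "42"
    }
}
```

All three produce the same string; the only difference is where the call site puts the operation relative to its operand.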

Even if we end up with only one, I would rather not bias towards "of 
course it is the static version" at this early point; I am trying to 
sketch out scope right now.


From mark.reinhold at oracle.com  Tue Mar 19 18:49:45 2024
From: mark.reinhold at oracle.com (Mark Reinhold)
Date: Tue, 19 Mar 2024 18:49:45 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
Message-ID: <20240319144942.761083307@eggemoggin.niobe.net>

2024/3/19 14:33:20 -0400, brian.goetz at oracle.com:
>>> Here I set forth your three examples with new names that are related
>>> to those already used in the existing preview implementation of
>>> StringTemplate in JDK 21 (and JDK 22; I just checked). I do this not to
>>> suggest that these other names should be used, but only in the hopes
>>> of reducing confusion as we begin this discussion. Later we can decide
>>> whether the names "process" and "interpolate" and "combine" should be
>>> changed (possibly all into the same single name).
>>> 
>>>     // on String
>>>     static String process(StringTemplate)    // previously STR
>>> 
>>>     // on StringTemplate
>>>     String interpolate()                             // STR, instance/suffix version
>>>     static StringTemplate combine(StringTemplate...) // + for string templates
>> 
>> Maybe I'm missing something, but: Why do we need both `String::process`
>> and `StringTemplate::interpolate`?  What are the use cases?
> 
> For a similar reason we currently have String::valueOf(int) and 
> Integer::toString(int). In some use cases, the "prefix" usage (static 
> method) feels more natural, whereas in others, the "suffix" usage 
> (instance method) feels more natural.
> 
> Even if we end up with only one, I would rather not bias towards "of 
> course it is the static version" at this early point; I am trying to 
> sketch out scope right now.

Ah, so you're really asking not about these three methods (modulo
naming) but about just two things, a way to interpolate a template and
a way to concatenate templates, and you consider the (not necessarily
bijective) mapping from those things to API points to be the bikeshed.

- Mark

From maurizio.cimadamore at oracle.com  Tue Mar 19 18:53:22 2024
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 19 Mar 2024 18:53:22 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
Message-ID: <3f96120b-b901-4f91-9270-742e8e61404e@oracle.com>

Of course a static method is enough (I find the name "process" not very 
clear on what that method does though).

One case where the instance syntax comes out on top is in a method-argument context:

    foo($"Hello, \{world}".xyz())

Which reads better than:

    foo(xyz($"Hello, \{world}"))

at least IMHO (no doubly nested parens).

Even though, at the end of the day, interpolation is just a processor, 
just a method which takes a StringTemplate and returns a String, it is 
also a very common one, so allowing for an instance method could be a 
possible way to offset the loss of the STR syntax.

Maurizio


From guy.steele at oracle.com  Tue Mar 19 18:59:58 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 19 Mar 2024 18:59:58 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
Message-ID: <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>


On Mar 19, 2024, at 2:33 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

>>> Here I set forth your three examples with new names that are related
>>> to those already used in the existing preview implementation of
>>> StringTemplate in JDK 21 (and JDK 22; I just checked). I do this not to
>>> suggest that these other names should be used, but only in the hopes
>>> of reducing confusion as we begin this discussion. Later we can decide
>>> whether the names "process" and "interpolate" and "combine" should be
>>> changed (possibly all into the same single name).
>>>
>>>     // on String
>>>     static String process(StringTemplate)    // previously STR
>>>
>>>     // on StringTemplate
>>>     String interpolate()                             // STR, instance/suffix version
>>>     static StringTemplate combine(StringTemplate...) // + for string templates
>>
>> Maybe I'm missing something, but: Why do we need both `String::process`
>> and `StringTemplate::interpolate`?  What are the use cases?
>
> For a similar reason we currently have String::valueOf(int) and
> Integer::toString(int). In some use cases, the "prefix" usage (static
> method) feels more natural, whereas in others, the "suffix" usage
> (instance method) feels more natural.
>
> Even if we end up with only one, I would rather not bias towards "of
> course it is the static version" at this early point; I am trying to
> sketch out scope right now.

Well, we can start by first examining the methods of the StringTemplate interface in JDK 22:

Factory methods:
    static StringTemplate of(String)
    static StringTemplate of(List<String>, List<?>)
Accessors:
    List<String> fragments()
    List<Object> values()
Combining:
  * static StringTemplate combine(StringTemplate... stringTemplates)
    static StringTemplate combine(List<StringTemplate> stringTemplates)
String interpolation:
  * default String interpolate()
    static String interpolate(List<String> fragments, List<?> values)
Conversion to diagnostic string:
    static String toString(StringTemplate stringTemplate)
Processing:
    default <R, E extends Throwable> R process(StringTemplate.Processor<? extends R,? extends E> processor)

I reckon we still need the factories and accessors.

As for combining, the first one (indicated by *) is on Brian's list, and I am not sure the second one is needed, but it is not a big deal to include it, I suppose.
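
To illustrate what "combining" means as opposed to interpolating, here is a rough sketch using a hypothetical stand-in record, `CombineTemplate`, rather than the JDK's preview `StringTemplate`. The merging rule shown (the last fragment of one template is concatenated with the first fragment of the next, and no value is converted to a string) is my understanding of the preview behavior and should be read as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for StringTemplate; not the JDK type.
record CombineTemplate(List<String> fragments, List<Object> values) {
    CombineTemplate {
        // A template always has exactly one more fragment than values.
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected exactly one more fragment than values");
        }
    }

    // Combine templates end to end; values are carried over untouched.
    static CombineTemplate combine(CombineTemplate... templates) {
        List<String> frags = new ArrayList<>();
        List<Object> vals = new ArrayList<>();
        frags.add(""); // start from the empty template
        for (CombineTemplate t : templates) {
            int last = frags.size() - 1;
            // Merge the boundary fragments of adjacent templates.
            frags.set(last, frags.get(last) + t.fragments().get(0));
            frags.addAll(t.fragments().subList(1, t.fragments().size()));
            vals.addAll(t.values());
        }
        return new CombineTemplate(frags, vals);
    }

    public static void main(String[] args) {
        CombineTemplate a = new CombineTemplate(List.of("Hello, ", "!"), List.of("world"));
        CombineTemplate b = new CombineTemplate(List.of(" Bye, ", "."), List.of("moon"));
        CombineTemplate c = combine(a, b);
        System.out.println(c.fragments()); // prints [Hello, , ! Bye, , .]
        System.out.println(c.values());    // prints [world, moon]
    }
}
```

The result is still a template: the values remain objects, which is what distinguishes this operation from interpolation.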

As for string interpolation, the first one (indicated by *) is on Brian's list, and I am not sure we need the second, static one: is it not equivalent to `StringTemplate.of(frags, vals).interpolate()`? Is it there just to avoid allocating a StringTemplate, or perhaps to support the implementation of the instance method `interpolate()`?
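
The equivalence being asked about can be sketched with a hypothetical stand-in record, `InterpTemplate` (not the JDK type). The sketch assumes the instance method simply delegates to the static one; whether the preview implementation is structured that way is not confirmed here.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate; not the JDK type.
record InterpTemplate(List<String> fragments, List<?> values) {
    static InterpTemplate of(List<String> fragments, List<?> values) {
        return new InterpTemplate(fragments, values);
    }

    // Static form: needs no template object, only the two lists.
    static String interpolate(List<String> fragments, List<?> values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            sb.append(fragments.get(i)).append(values.get(i));
        }
        return sb.append(fragments.get(fragments.size() - 1)).toString();
    }

    // Instance form: delegates to the static form.
    String interpolate() {
        return interpolate(fragments, values);
    }

    public static void main(String[] args) {
        List<String> frags = List.of("x = ", ", y = ", "");
        List<Integer> vals = List.of(1, 2);
        System.out.println(interpolate(frags, vals));      // prints "x = 1, y = 2"
        System.out.println(of(frags, vals).interpolate()); // prints "x = 1, y = 2"
    }
}
```

Under this assumption the static method saves one allocation and gives the instance method something to delegate to, but adds no expressive power.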

I am not sure why this static `toString` method is there, rather than relying on the `toString()` instance method that every object must support. Is it present to support the implementation of the instance method? I guess I don't see why this is not just a `default` implementation of the instance method.

I reckon we don't need this particular `process` method in a design that does not have the `StringTemplate.Processor` interface.

So of these ten methods, I guess we surely need 6, don't need 1, and I am uncertain about 3 for various reasons.


From james.laskey at oracle.com  Tue Mar 19 19:07:49 2024
From: james.laskey at oracle.com (Jim Laskey)
Date: Tue, 19 Mar 2024 19:07:49 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
 <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>
Message-ID: <F7C66C2C-59AC-403C-9E62-192BDE32ABFD@oracle.com>

Note: the static implementations are there because you can't provide a `default` implementation for an overridden method; in this case, for methods inherited from Object.
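
A minimal sketch of the pattern, with hypothetical names (`Described`, `Widget`): the interface cannot declare `default String toString()`, because javac rejects a default method that overrides a member of `java.lang.Object`, so the shared logic lives in a static helper that each implementing class calls from its own override.

```java
interface Described {
    String name();

    // Not allowed in an interface (compile-time error):
    //     default String toString() { return "Described[" + name() + "]"; }
    // So the shared implementation is exposed as a static helper instead.
    static String toString(Described d) {
        return "Described[" + d.name() + "]";
    }
}

final class Widget implements Described {
    private final String name;

    Widget(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    // Each implementing class overrides Object.toString() and delegates.
    @Override
    public String toString() {
        return Described.toString(this);
    }

    public static void main(String[] args) {
        System.out.println(new Widget("w")); // prints "Described[w]"
    }
}
```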




From guy.steele at oracle.com  Tue Mar 19 19:13:14 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 19 Mar 2024 19:13:14 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <3f96120b-b901-4f91-9270-742e8e61404e@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
 <3f96120b-b901-4f91-9270-742e8e61404e@oracle.com>
Message-ID: <4B132C51-36B2-48AB-9B16-C4ADFE4B78F7@oracle.com>

On general principle, I would agree about instance methods often being better than static methods for various reasons, other things being equal.

But in this specific discussion we have heard concerns about confusing templates with string interpolation. When it comes to safety of (say) SQL processing, which of the following would we prefer:

SQL($"SELECT * FROM \{tableName} blah blah blah".xyzpdq());

SQL(xyzpdq($"SELECT * FROM \{tableName} blah blah blah"));

The first one avoids nested parentheses. The second one says right up front, "Danger, Will Robinson, danger! The template will not be processed in the way you might think!"


From guy.steele at oracle.com  Tue Mar 19 19:17:16 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 19 Mar 2024 19:17:16 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <F7C66C2C-59AC-403C-9E62-192BDE32ABFD@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
 <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>
 <F7C66C2C-59AC-403C-9E62-192BDE32ABFD@oracle.com>
Message-ID: <ABB26557-8B7F-45C9-9976-62E2183521C3@oracle.com>

Oh, right, duh. Therefore it really is there, at least in part, to support the (overriding) implementation of the `toString()` instance method that should be in every class that implements StringTemplate?

In this simplified world, do we still really contemplate wanting to support multiple implementations of StringTemplate, as opposed to just making it a class?



From james.laskey at oracle.com  Tue Mar 19 19:24:55 2024
From: james.laskey at oracle.com (Jim Laskey)
Date: Tue, 19 Mar 2024 19:24:55 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <ABB26557-8B7F-45C9-9976-62E2183521C3@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
 <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>
 <F7C66C2C-59AC-403C-9E62-192BDE32ABFD@oracle.com>
 <ABB26557-8B7F-45C9-9976-62E2183521C3@oracle.com>
Message-ID: <91D3289C-2EA4-4C22-95AF-874186F71865@oracle.com>

The current prototype is a single final class, so the statics get folded in.




From ccherlin at gmail.com  Tue Mar 19 20:01:28 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Tue, 19 Mar 2024 15:01:28 -0500
Subject: Update on String Templates (JEP 459)
In-Reply-To: <4B132C51-36B2-48AB-9B16-C4ADFE4B78F7@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
 <3f96120b-b901-4f91-9270-742e8e61404e@oracle.com>
 <4B132C51-36B2-48AB-9B16-C4ADFE4B78F7@oracle.com>
Message-ID: <CALEU8=wAe5GDijjSE2DddzubSiz8zGD=2DybMyFW3rk28o_h8g@mail.gmail.com>

On Tue, Mar 19, 2024 at 2:13 PM Guy Steele <guy.steele at oracle.com> wrote:

> On general principle, I would agree about instance methods often being
> better than static methods for various reasons, other things being equal.
>
> But in this specific discussion we have heard concerns about confusing
> templates with string interpolation. When it comes to safety of (say) SQL
> processing, which of the following would we prefer:
>
> SQL($"SELECT * FROM \{tableName} blah blah blah".xyzpdq());
>
> SQL(xyzpdq($"SELECT * FROM \{tableName} blah blah blah"));
>
> The first one avoids nested parentheses. The second one says right up
> front, "Danger, Will Robinson, danger! The template will not be processed
> in the way you might think!"
>

If xyzpdq() is STR/interpolate/join, and SQL() only accepts StringTemplate
(as it should), then either one will fail to compile.
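
A rough sketch of that point, using hypothetical stand-ins (`QueryTemplate`, `PreparedQuery`, `SqlSketch`) rather than any real JDBC or StringTemplate API. The entry point accepts only the template type, so a call that interpolates first would not type-check. The example puts the embedded value in a parameter position (a column value rather than a table name), since that is what placeholder binding can express.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate; not the JDK type.
record QueryTemplate(List<String> fragments, List<Object> values) {
    // Plain interpolation: splices values into the text, losing the structure.
    String interpolate() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            sb.append(fragments.get(i)).append(values.get(i));
        }
        return sb.append(fragments.get(fragments.size() - 1)).toString();
    }
}

// Hypothetical result type: statement text plus separately bound parameters.
record PreparedQuery(String sql, List<Object> params) {}

final class SqlSketch {
    // Only a template is accepted; there is deliberately no String overload,
    // so SqlSketch.prepare(template.interpolate()) does not compile.
    static PreparedQuery prepare(QueryTemplate t) {
        // Each embedded value becomes a "?" placeholder between fragments.
        return new PreparedQuery(String.join("?", t.fragments()), t.values());
    }

    public static void main(String[] args) {
        QueryTemplate t = new QueryTemplate(
                List.of("SELECT * FROM users WHERE name = ", ""),
                List.of("x' OR '1'='1"));
        System.out.println(prepare(t).sql()); // prints "SELECT * FROM users WHERE name = ?"
        System.out.println(t.interpolate());  // prints "SELECT * FROM users WHERE name = x' OR '1'='1"
    }
}
```

The compile error on the interpolated form is the safety feature: the hostile value never becomes part of the statement text unless the caller goes out of their way.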

I am increasingly convinced that the right framing of String vs
StringTemplate is not a family relationship, like String vs CharSequence,
but an arms-length relationship like String vs byte[]. I would extremely
rarely write two same-named overloads, one accepting byte[] and the other
String.

While String and StringTemplate (and byte[]) are strongly related, and you
can readily create any one from any other, they serve distinct purposes and
have very different security properties. Like byte[], a StringTemplate can
contain mutable data, and also like byte[], you need additional context to
safely and correctly process one into a String or other object. Unlike both
byte[] and String, a StringTemplate contains arbitrary objects with
arbitrary behaviors, which gives StringTemplate a completely different
threat model.

For all of these reasons, I would extremely rarely write same-named
overloads, one accepting String and the other StringTemplate. A client
having to deal with a compiler error if they call the wrong one is a
feature.

I think the release notes and documentation for StringTemplate should
explicitly recommend different names for methods accepting String vs.
StringTemplate, rather than same-name overloads, for both safety and
clarity.

Cheers,
Clement Cherlin

From brian.goetz at oracle.com  Tue Mar 19 20:05:32 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 19 Mar 2024 16:05:32 -0400
Subject: Update on String Templates (JEP 459)
In-Reply-To: <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
 <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>
Message-ID: <90b61ab1-c443-40d0-b1e4-38b3ad3e1177@oracle.com>


> Well, we can start by first examining the methods of the 
> StringTemplate interface in JDK 22:
>
> I reckon we still need the factories and accessors.

I would like to question the "factories" part of that.  As Ron has
pointed out, having string / string template literals be the sole source
of fragments in STs preserves a valuable non-tainting property.  (Remi
has expressed reasonable concerns that we will not be able to preserve
this property in the end, but it is worth trying.)  This is part of the
motivation for `combine`; preserving this safety invariant as well as
providing a convenient operation.

> As for string interpolation, the first one (indicated by *) is on 
> Brian's list, and I am not sure we need the second, static one: is it
> not equivalent to `StringTemplate.of(frags, vals).interpolate()`? Is 
> it there just to avoid allocating a StringTemplate, or perhaps to 
> support the implementation of the instance method `interpolate()`?

There is a performance aspect to this.  The details vary from the
initial version that we previewed previously to the latest proposal, but
the essence is the same: if we can tie a string template instance to its
point of capture in the source code, then we can cache a method handle
that has full knowledge of the template fragments and the embedded
expression types to optimize the conversion.  If the same capture site
captures a template with different embedded expressions (such as code in
a loop, or the toString method of some object that uses templates), we
can reuse that MH.  There are many possible optimizations here (and even
more for a complex processor like FMT, that has to do a lot of work when
scanning the format string.)

If a ST has its origin in an actual capture site, as a ST literal does,
then we have a place to cache this.  The current design uses the
invokedynamic linkage state, but there are other possible places too.
Calling `StringTemplate.of(frags, vals).interpolate()` has no such place
to cache the analysis of the constant parts (fragments and expression
types.)

> I am not sure why this static `toString` method is there, rather than 
> relying on the `toString()` instance method that every object must 
> support. Is it present to support the implementation of the instance 
> method? I guess I don't see why this is not just a `default`
> implementation of the instance method.

I think that's right (Jim might have some context here.)


From guy.steele at oracle.com  Tue Mar 19 20:21:47 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 19 Mar 2024 20:21:47 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <90b61ab1-c443-40d0-b1e4-38b3ad3e1177@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
 <4ad249e1-49de-4e83-bf5a-3577e2aaa5df@oracle.com>
 <55c7a57f-fd1f-409c-8393-32f522b27ee5@oracle.com>
 <EE3E29D0-8722-4BC5-B2B2-244173E42282@oracle.com>
 <20240319141945.795026959@eggemoggin.niobe.net>
 <5e4ebcef-9773-42e4-838c-3506e0236616@oracle.com>
 <17350387-551E-4749-8773-EF73D68CBD9F@oracle.com>
 <90b61ab1-c443-40d0-b1e4-38b3ad3e1177@oracle.com>
Message-ID: <20380106-D3B9-473A-BAAC-58DA9B9BBD34@oracle.com>

Good; thanks for explaining the motivation. This all sounds right to me.

Other than that, I cannot think of any other clearly needed operations on templates. They are essentially a specialized sequenced immutable aggregate structure, but a lot of the generic operations one might want on such structures (map, reduce, sort) don't seem terribly relevant to templates.

> On Mar 19, 2024, at 4:05 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> 
>> Well, we can start by first examining the methods of the StringTemplate interface in JDK 22:
>> 
>> I reckon we still need the factories and accessors.
> 
> I would like to question the "factories" part of that.  As Ron has pointed out, having string / string template literals be the sole source of fragments in STs preserves a valuable non-tainting property.  (Remi has expressed reasonable concerns that we will not be able to preserve this property in the end, but it is worth trying.)   This is part of the motivation for `combine`; preserving this safety invariant as well as providing a convenient operation.
> 
>> As for string interpolation, the first one (indicated by *) is on Brian's list, and I am not sure we need the second, static one: is it not equivalent to `StringTemplate.of(frags, vals).interpolate()`? Is it there just to avoid allocating a StringTemplate, or perhaps to support the implementation of the instance method `interpolate()`?
> 
> There is a performance aspect to this.  The details vary from the initial version that we previewed previously to the latest proposal, but the essence is the same: if we can tie a string template instance to its point of capture in the source code, then we can cache a method handle that has full knowledge of the template fragments and the embedded expression types to optimize the conversion.  If the same capture site captures a template with different embedded expressions (such as code in a loop, or the toString method of some object that uses templates), we can reuse that MH.  There are many possible optimizations here (and even more for a complex processor like FMT, that has to do a lot of work when scanning the format string.)
> 
> If a ST has its origin in an actual capture site, as a ST literal does, then we have a place to cache this.  The current design uses the invokedynamic linkage state, but there are other possible places too.  Calling `StringTemplate.of(frags, vals).interpolate()` has no such place to cache the analysis of the constant parts (fragments and expression types.)
> 
>> I am not sure why this static `toString` method is there, rather than relying on the `toString()` instance method that every object must support. Is it present to support the implementation of the instance method? I guess I don't see why this is not just a `default` implementation of the instance method.
> 
> I think that's right (Jim might have some context here.)
> 
> 


From ccherlin at gmail.com  Tue Mar 19 20:42:38 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Tue, 19 Mar 2024 15:42:38 -0500
Subject: String Template processors vs Code Reflection?
In-Reply-To: <c38147d9-3928-4de5-8933-edc534fc1222@app.fastmail.com>
References: <8b09da5b-5224-4428-8344-2816e01e15bf@app.fastmail.com>
 <CALEU8=xWvQtKHPbrtHLeEQGyGtfWZb8UObDofVCJqRxrcRWzzQ@mail.gmail.com>
 <CALEU8=w_x950xqfWj7ZZryzCHt4EfxZtgKS40rV_QX6gRoBPAA@mail.gmail.com>
 <c38147d9-3928-4de5-8933-edc534fc1222@app.fastmail.com>
Message-ID: <CALEU8=yPnKRQou3Lvw=E7OrLfovj75RLdZU_ggpY2c_rUwBFkA@mail.gmail.com>

String Template syntax is actually already much more convenient. You close
an embedded expression with '}', not '\}', and embedded expressions are
regular Java code, no escaping required. $"foo = \{ "bar" }" is valid, as
are line breaks and comments inside an embedded expression.

Template syntax deliberately does not use '+' for several reasons. You can
already use '+' to join Strings and other objects, implicitly calling
.toString() on the other objects. StringTemplate syntax is intended to be
easier to read and write; $"literal \{object} literal \{object}" is much
nicer than "literal " + object + " literal " + object. Finally, simple
interpolation is far from the only use case of StringTemplates: they are
intended to be passed to methods that convert them into more complex
things, like correctly-escaped SQL, JSON, HTML/XML, etc, and other
arbitrary non-character-based data types. Using '+' would imply eager
direct interpolation, which is not how string templates work.
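
To make the last point concrete, here is a minimal sketch. It assumes a
hypothetical `Template` record standing in for StringTemplate (it is not
the JDK API, and the names `interpolate`, `toQuery` and `Query` are made up
for illustration). The same captured fragments and values are handed to two
different methods: one interpolates them into a String, the other turns them
into a parameterized query without ever splicing the value into the text.

```java
import java.util.List;

class TemplateSketch {
    // Hypothetical stand-in for StringTemplate: N values and N+1 fragments.
    record Template(List<String> fragments, List<Object> values) {
        Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("expected N+1 fragments for N values");
            }
        }
    }

    // Plain interpolation: alternate fragments and values.
    static String interpolate(Template t) {
        StringBuilder sb = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            sb.append(t.values().get(i)).append(t.fragments().get(i + 1));
        }
        return sb.toString();
    }

    // Toy query type: SQL text with '?' placeholders plus the bound parameters.
    record Query(String sql, List<Object> parameters) {}

    // A different consumer of the same template: values become parameters,
    // so they are never concatenated into the SQL text.
    static Query toQuery(Template t) {
        return new Query(String.join("?", t.fragments()), t.values());
    }

    public static void main(String[] args) {
        String name = "x' OR '1'='1";
        Template t = new Template(
                List.of("SELECT * FROM users WHERE name = ", ""),
                List.of(name));
        System.out.println(interpolate(t));
        System.out.println(toQuery(t));
    }
}
```

Because the template keeps fragments and values apart until a consumer
decides what to do with them, the choice between the two results is made by
the receiving method, not by the syntax at the capture site.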

Cheers,
Clement Cherlin

On Mon, Mar 18, 2024 at 4:30 PM Mikael Sterner <msterner at openjdk.mxy.se>
wrote:

> Thanks for the response, and yes it was code reflection from Babylon I
> referred to, sorry for not being clearer.
>
> Indeed full code reflection would be more powerful. My curiosity about how
> the concepts relate was triggered by the fact that if the code reflection is
> limited to snippets with only aggregation (e.g. operator +) and two types
> (fragments and values) it seems to become very similar to string template
> processing:
>
> // Code reflection processor
> processor.apply(@CodeReflection () -> $"foo = " + bar);
>
> // String template processor
> processor.apply($"foo = \{bar\}");
>
> Where $"..." is shorthand for a fragment literal in the code reflection
> case, and (obviously) a string template in the string template case.
>
> For the code reflection case you could in principle offer a shorthand
> $"foo = \{bar\}" to mean the same kind of aggregation as $"foo = " + bar.
>
> And similarly for the string template case it seems you could offer $"foo
> = " + bar as an alternative aggregation syntax, i.e. translate it to the
> same string template as $"foo = \{bar\}".
>
> (I guess in such a world it would be a matter of personal preference which
> aggregation style you would use: the one with fragments and values
> delimited by + or the one with everything inline in one string template
> literal. One advantage of the former would be that string literal values
> wouldn't need escaping, i.e. $"foo = " + "bar" vs $"foo = \{\"bar\"\}", and
> also line breaks could be more naturally inserted between the parts without
> having to use multiline text blocks.)
>
> Yours,
> Mikael Sterner
>
>
> On Mon, Mar 18, 2024, at 19:04, Clement Cherlin wrote:
>
> Oh, I see, you were talking about code reflection from Project Babylon
> https://openjdk.org/projects/babylon/
>
> You asked, "Is the need to process string templates, seen as code snippets
> aggregating static and dynamic strings, just a special case of a more
> general pattern of processing code snippets semi-lazily using custom rules?
> (Such as safe handling of dynamic strings, or contextual operator
> overloading.)"
>
> No, because String Templates are not code snippets, they're simple
> aggregations of strings and other values. Project Babylon deals in code
> snippets. They are related in the sense of making Java more expressive and
> able to deal with various DSLs and embedded expressions of various types,
> but the way they go about it is very different.
>
> Cheers,
> Clement Cherlin
>
> On Mon, Mar 18, 2024 at 8:50 AM Clement Cherlin <ccherlin at gmail.com>
> wrote:
>
> Hi Mikael,
>
> It looks to me like you're talking about a generic macro processor, not a
> string template processor. While that's an interesting idea, I think the
> scope is so much greater than string templates, it would make more sense as
> its own proposal (see https://openjdk.org/jeps/1 and
> https://cr.openjdk.org/~mr/jep/jep-2.0-02.html for details on the JEP
> process).
>
> String templates, as currently designed, do not capture code snippets, but
> values. The value arguments to a template expression are evaluated to
> ordinary objects.
>
> As stated elsewhere, a template object is simply a wrapper for the N
> values and N+1 string fragments of the template expression. The values are
> evaluated just like the parameters of any other Java constructor/method
> call. The source code or bytecode of the expressions that created the
> values is simply not available.
>
> That said, if you want to pass Java code as strings to the template and do
> some magic with the Classfile API (now in preview) at runtime to generate
> code, you can. You can also use an annotation processor (like Lombok) or
> Java compiler plugin (like Manifold) to do all sorts of advanced
> manipulation of source/bytecode at compile time.
>
> Lombok: https://projectlombok.org/
> Manifold: https://github.com/manifold-systems/manifold
>
> Cheers,
> Clement Cherlin
>
> On Fri, Mar 15, 2024 at 11:32 PM Mikael Sterner <msterner at openjdk.mxy.se>
> wrote:
>
> Hi Experts!
>
> What is the relationship between string template processors and code
> reflection, and would it influence the design of string templates and their
> literals?
>
> Is the need to process string templates, seen as code snippets aggregating
> static and dynamic strings, just a special case of a more general pattern
> of processing code snippets semi-lazily using custom rules? (Such as safe
> handling of dynamic strings, or contextual operator overloading.)
>
> Examples:
>
> // String template processor
>
> String table = "foo bar";
> ResultSet r = s.executeQuery("SELECT * FROM \{table\}"); // Escape dynamic
> table names
>
> // Code reflection processors
>
> String table = "foo bar";
> ResultSet r = s.executeQuery(@CodeReflection () -> "SELECT * FROM " +
> table); // Escape dynamic table names
>
> String value = "bar";
> Pattern p = Pattern.compile(@CodeReflection () -> "foo = " + value); //
> Quote dynamic strings
>
> Matrix m = Matrix.eval(@CodeReflection () -> Matrix.diag(1, 2, 3) *
> Matrix.col(2, 3, 4)); // Matrix multiplication
>
> Document d = HTML.compile(@CodeReflection () -> Body.of(Div.of("Hello
> World!"))); // Escape strings
>
> Each such code reflecting processor API defining its own rules for how to
> handle code snippets, including processing of any raw string templates
> when/if they are added to the language. Other types than String and
> StringTemplate being handled as fits each API, and to the level of type
> safety wanted by the API user. Evaluation being done lazily or eagerly as
> appropriate.
>
> (In the above "@CodeReflection () ->" is short for some syntax allowing
> inline code reflecting snippets, passed to the API processors in a way that
> allows them to reflect on the snippet code.)
>
> Yours,
> Mikael Sterner
>
>

From gavin.bierman at oracle.com  Tue Mar 26 17:48:21 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Tue, 26 Mar 2024 17:48:21 +0000
Subject: Draft Spec for Preview of Derived Record Creation (JEP 468)
In-Reply-To: <20240228200401.D42EB6C2F78@eggemoggin.niobe.net>
References: <20240228200401.D42EB6C2F78@eggemoggin.niobe.net>
Message-ID: <034746C7-6C34-4759-A359-874372BD876A@oracle.com>

Dear experts:

The first draft of a spec covering JEP 468 (Derived Record Creation (Preview)) is available at:

https://cr.openjdk.org/~gbierman/jep468/latest/

Feel free to contact me directly or on this list with any comments.

Thanks
Gavin

On 28 Feb 2024, at 20:04, Mark Reinhold <mark.reinhold at oracle.com> wrote:

https://openjdk.org/jeps/468

 Summary: Enhance the Java language with derived creation for
 records. Records are immutable objects, so developers frequently create
 new records from old records to model new data. Derived creation
 streamlines code by deriving a new record from an existing record,
 specifying only the components that are different. This is a preview
 language feature.

- Mark


From forax at univ-mlv.fr  Thu Mar 28 09:05:47 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 28 Mar 2024 10:05:47 +0100 (CET)
Subject: String template interpolation as a two steps process
Message-ID: <237706846.41203165.1711616747142.JavaMail.zimbra@univ-eiffel.fr>

Hello,
over the last weekend, I implemented an XML template processor using the Java 22 state of the spec (using the old template processor syntax) and I would like to propose that we see the processing of a string template as a two-step process.

I will use the XML template processor I've developed as an example,
  https://github.com/forax/html-component/blob/master/src/test/java/Demo.java

Here is how it works: the idea is that if I want to generate the XML of a product, I will write something like this.

record Product(String name, int price) implements Component {
  public Renderer render() {
    return $."""
          <tr class=".product">
            <td>\{name}</td><td>\{price * 1.20}</td>
          </tr>
          """;
  }
}

Component is an interface with only one method render() that returns a Renderer and a Renderer is also an interface that is able to send XML events.
And "$" is the name of the template processor defined in Component as a static field.

The code of the template processor is here
  https://github.com/forax/html-component/blob/master/src/main/java/com/github/forax/htmlcomponent/ComponentTemplateProcessor.java#L193

Conceptually, what a template processor should do is a two-step process: first validate the template (in my case, validate that the template is a valid XML fragment), then interpolate the result of the validation using the arguments of the template.

So processing a string template is currently
  process(StringTemplate) <=> { validate(StringTemplate); interpolate(StringTemplate); }

There are two main shortcomings of the idea that processing a string template is equivalent to calling a method that takes a StringTemplate.
- (notypes) the types of the holes are not propagated to the StringTemplate, so the validation part cannot verify that the template is correctly typed.
- (cache) the validation part has to be re-executed each time.

To illustrate the issue (notypes), I can have an XML fragment that depends on another class, but I have no way to test whether the referenced Product is a record that takes a name of type String and a price of type int, because while those types are known by the compiler, they are not available in the StringTemplate.

record Cart() implements Component {
  public Renderer render() {
    return $."""
          <table>
            <Product name="wood" price="\{10}"/>
            <Product name="cristal" price="\{300}"/>
          </table>
          """;
  }
}  

To illustrate the issue (cache), in the code above, I have two calls to render a Product with different attributes, but for each call to Product::render(), the validation step will be re-executed. As an implementer, I can try to cache the result of the validation but that's far from easy, very bug-prone and ultimately not very efficient.

Given that a string template literal is a literal, I propose that the Java runtime help by caching the validation step.

The simplest way I see to do that is to split a string template in two: a constant template part, composed of the fragments (List<String>) and the types (List<Class<?>>), and a non-constant part, the arguments of the template (List<Object>).

For that, we need a user-defined intermediary object that corresponds to the result of the validation; the creation of this object is the proof that the string template is validated, and this object can be cached by the JDK runtime.

In that case, processing a string template is equivalent to
  var cached userDefinedValidatedTemplate = validateAndCreate(List<String> fragment, List<Class<?>> types);
  process(userDefinedValidatedTemplate, arguments);


So
- I propose that StringTemplate is the tuple List<String> fragment, List<Class<?>> types.

- Users can create a special template validated class, with a factory method that takes a StringTemplate and is tagged as being a template validator
   
  for example
    __template_validated__ class ValidatedXMLDOM {
        ...
        public static __template__validator__ ValidatedXMLDOM of(StringTemplate stringTemplate) { ... }
    }
   
- a processor method is a method that takes a __template_validated__ object followed by parameters storing the template string arguments
  For example
    processXML(ValidatedXMLDOM dom, Object... arguments)

At compile time, either processXML is called using an invokedynamic or the __template_validated__ instance is computed with a constant dynamic or both.
But the idea is that the generated bytecode ensures that the __template_validated__ instance is created once and cached.
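
The two steps can be sketched in today's Java as follows. This is only an illustration under stated assumptions: `Shape`, `Validated`, `validateAndCreate` and `process` are hypothetical names, and the map-based cache is a stand-in for the caching that the generated invokedynamic or constant dynamic would provide at the capture site.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class TwoStepSketch {
    // Hypothetical constant part of a template: fragments plus hole types.
    record Shape(List<String> fragments, List<Class<?>> types) {}

    // Hypothetical user-defined proof that a Shape passed validation.
    record Validated(Shape shape) {}

    // Counts how often the validation step actually runs.
    static final AtomicInteger VALIDATIONS = new AtomicInteger();

    // Stand-in for the cache the runtime would keep per capture site.
    static final Map<Shape, Validated> CACHE = new ConcurrentHashMap<>();

    // Step 1: validate the constant part; runs once per distinct Shape.
    static Validated validateAndCreate(Shape shape) {
        return CACHE.computeIfAbsent(shape, s -> {
            VALIDATIONS.incrementAndGet();
            if (s.fragments().size() != s.types().size() + 1) {
                throw new IllegalArgumentException("expected N+1 fragments for N holes");
            }
            return new Validated(s);
        });
    }

    // Step 2: combine the validated constant part with the arguments.
    static String process(Validated v, Object... arguments) {
        List<String> fragments = v.shape().fragments();
        if (arguments.length != v.shape().types().size()) {
            throw new IllegalArgumentException("wrong number of arguments");
        }
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < arguments.length; i++) {
            sb.append(arguments[i]).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Shape shape = new Shape(
                List.of("<td>", "</td><td>", "</td>"),
                List.of(String.class, int.class));
        System.out.println(process(validateAndCreate(shape), "wood", 10));
        System.out.println(process(validateAndCreate(shape), "cristal", 300));
        System.out.println("validations: " + VALIDATIONS.get());
    }
}
```

A real validator would also check the hole types against what the fragments require (the notypes issue); the sketch only checks the fragment count to stay short.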

This is a rough sketch, and a lot of details are up for debate, but I think we should start to think of template processing as a two-step process.

Rémi

From brian.goetz at oracle.com  Fri Mar 29 21:58:54 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 29 Mar 2024 17:58:54 -0400
Subject: Member Patterns -- the bikeshed
Message-ID: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>

We now come to the long-awaited bikeshed discussion on what member
patterns should look like.

Bikeshed disclaimer for EG:
 - This is likely to evoke strong opinions, so please take pains to be
   especially constructive
 - Long reply-to-reply threads should be avoided even more than usual
 - Holistic, considered replies preferred
 - Please change subject line if commenting on a sub-topic or tangential
   concern

Special reminders for Remi:
 - Use of words like "should", "must", "shouldn't", "mistake", "wrong",
   "broken" are strictly forbidden.
 - If in doubt, ask questions first.

Notes for external observers:
 - This is a working document for the EG; the discussion may continue for a
   while before there is an official proposal.  Please be patient.


# Pattern declaration: the bikeshed

We've largely identified the model for what kinds of patterns we need to
express, but there are still several degrees of freedom in the syntax.

As the model has simplified during the design process, the space of syntax
choices has been pruned back, which is a good thing.  However, there are
still quite a few smaller decisions to be made.  Not all of the
considerations are orthogonal, so while they are presented individually,
this is not a "pick one from each column" menu.

Some of these simplifications include:

 - Patterns with "input arguments" have been removed; another way to get to
   what this gave us may come back in another form.
 - I have grown increasingly skeptical of the value of the imperative
   `match` statement.  With better totality analysis, I think it can be
   eliminated.

We can discuss these separately but I would like to sync first on the broad
strokes for how patterns are expressed.

## Object model requirements

As outlined in "Towards Member Patterns", the basic model is that patterns
are the dual of other executable members (constructors, static methods,
instance methods.)  While they are like methods in that they have inputs,
outputs, names, and an imperative body, they have additional degrees of
freedom that constructors and methods lack:

 - Patterns are, in general, _conditional_ (they can succeed or fail), and
   only produce bindings (outputs) when they succeed.  This conditionality
   is understood by the language's flow analysis, and is used for computing
   scoping and definite assignment.
 - Methods can return at most one value; when a pattern completes
   successfully, it may bind multiple values.
 - All patterns have a _match candidate_, which is a distinguished,
   possibly-implicit parameter.  Some patterns also have a receiver, which
   is also a distinguished, possibly-implicit parameter.  In some such cases
   the receiver and match candidate are aliased, but in others these may
   refer to different objects.

So a pattern is a named executable member that takes a _match candidate_ as
a possibly-implicit parameter, maybe takes a receiver as an implicit
parameter, and has zero or more conditional _bindings_.  Its body can
perform imperative computation, and can terminate either with match failure
or success.  In the success case, it must provide a value for each binding.

Deconstruction patterns are special in many of the same ways constructors
are: they are constrained in their name, inheritance, and probably their
conditionality (they should probably always succeed).  Just as the syntax
for constructors differs slightly from that of instance methods, the syntax
for deconstructors may differ slightly from that of instance patterns.
Static patterns, like static methods, have no receiver and do not have
access to the type parameters of the enclosing class.

Like constructors and methods, patterns can be overloaded, but in accordance
with their duality to constructors and methods, the overloading happens on
the _bindings_, not the inputs.

## Use-site syntax

There are several kinds of type-driven patterns built into the language:
type patterns and record patterns.  A type pattern in a `switch` looks like:

    case String s: ...

And a record pattern looks like:

    case MyRecord(P1, P2, ...): ...

where `P1..Pn` are nested patterns that are recursively matched to the
components of the record.  This use-site syntax for record patterns was
chosen for its similarity to the construction syntax, to highlight that a
record pattern is the dual of record construction.
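
For readers who want to try the existing use-site forms, here is a small
sketch that should compile on Java 21 or later.  The `Point` and `Line`
records and the `describe` method are made up for illustration; only the
type pattern and (nested) record pattern syntax is taken from the language
as it stands today.

```java
class RecordPatternDemo {
    record Point(int x, int y) {}
    record Line(Point from, Point to) {}

    static String describe(Object o) {
        return switch (o) {
            // Type pattern: matches any String and binds it to s.
            case String s -> "string of length " + s.length();
            // Record pattern with nested record patterns, mirroring
            // new Line(new Point(x1, y1), new Point(x2, y2)).
            case Line(Point(var x1, var y1), Point(var x2, var y2)) ->
                "line from (" + x1 + "," + y1 + ") to (" + x2 + "," + y2 + ")";
            // Record pattern with explicitly typed nested type patterns.
            case Point(int x, int y) -> "point (" + x + "," + y + ")";
            default -> "something else";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe("hello"));
        System.out.println(describe(new Point(1, 2)));
        System.out.println(describe(new Line(new Point(0, 0), new Point(3, 4))));
    }
}
```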

**Deconstruction patterns.**  The simplest kind of member pattern, a
deconstruction pattern, will have the same use-site syntax as a record
pattern; record patterns can be thought of as a deconstruction pattern
"acquired for free" by records, just as records do with constructors,
accessors, object methods, etc.  So the use of a deconstruction pattern for
`Point` looks like:

    case Point(var x, var y): ...

whether `Point` is a record or an ordinary class equipped with a suitable
deconstruction pattern.

**Static patterns.**  Continuing with the idea that the destructuring syntax
should evoke the aggregation syntax, there is an obvious candidate for the
use-site syntax for static patterns:

    case Optional.of(var e): ...
    case Optional.empty(): ...

**Instance patterns.**  Uses of instance patterns will likely come in two
forms, analogous to bound and unbound instance method references, depending
on whether the receiver and the match candidate are the same object.  In the
unbound form, used when the receiver is the same object as the match
candidate, the pattern name is qualified by a _type_:

```
Class<?> k = ...
switch (k) {
    // Qualified by type
    case Class.arrayClass(var componentType): ...
}
```

This means that we _resolve_ the pattern `arrayClass` starting at `Class`
and _select_ the pattern using the receiver, `k`.  We may also be able to
omit the class qualifier if the static type of the match candidate is
sufficient to resolve the desired pattern.

In the bound form, used when the receiver is distinct from the match
candidate, the pattern name is qualified with an explicit _receiver
expression_.  As an example, consider an interface that captures primitive
widening and narrowing conversions, such as those between `int` and `long`.
In the widening direction, conversion is unconditional, so this can be
modeled as a method from `int` to `long`.  In the other direction,
conversion is conditional, so this is better modeled as a _pattern_ whose
match candidate is `long` and which binds an `int` on success.  Since these
are instance methods of some class (say, `NumericConversion<T,U>`), we need
to provide the receiver instance in order to resolve the pattern:

```
NumericConversion<int, long> nc = ...

switch (aLong) {
    case nc.narrowed(int i):
    ...
}
```
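
For comparison, the same method/pattern pairing can be approximated in
today's Java, where the conditional direction has to be encoded in the
return type.  This is a sketch only: `widen`, `narrowed` and `describe` are
hypothetical static helpers, not part of any proposal, and `OptionalInt`
plays the role of "match failed or bound one `int`".

```java
import java.util.OptionalInt;

class NarrowingSketch {
    // Widening is unconditional, so a plain method suffices.
    static long widen(int i) {
        return i;
    }

    // Narrowing is conditional: it succeeds only when the long value
    // survives a round trip through int.
    static OptionalInt narrowed(long l) {
        return (l == (int) l) ? OptionalInt.of((int) l) : OptionalInt.empty();
    }

    // The caller has to test for success and extract the binding by hand.
    static String describe(long l) {
        OptionalInt i = narrowed(l);
        return i.isPresent() ? "fits in int: " + i.getAsInt() : "needs long: " + l;
    }

    public static void main(String[] args) {
        System.out.println(describe(42L));
        System.out.println(describe(1L << 40));
    }
}
```

Here `narrowed` answers the question "could this `long` have come from
`widen`, and if so, from which `int`", which is the duality the pattern form
would express directly.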

The explicit receiver syntax would also be used if we exposed regular
expression matching as a pattern on the `j.u.r.Pattern` object (the name
collision on `Pattern` is unfortunate).  Imagine we added a `matching`
instance pattern to `j.u.r.Pattern`; then we could use it in `instanceof` as
follows:

```
static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)");
...
if (aString instanceof P.matching(String as, String bs)) { ... }
```
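
For reference, this is roughly what the same match looks like with the
existing `java.util.regex` API, where the test and the extraction of the
groups are separate steps on a `Matcher`.  The class and method names
(`RegexToday`, `split`) are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexToday {
    static final Pattern P = Pattern.compile("(a*)(b*)");

    static String split(String aString) {
        Matcher m = P.matcher(aString);
        // matches() is the conditional part; group(n) extracts the "bindings".
        if (m.matches()) {
            String as = m.group(1);
            String bs = m.group(2);
            return as.length() + " a's, " + bs.length() + " b's";
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(split("aaabb"));
        System.out.println(split("abc"));
    }
}
```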

Each of these use-site syntaxes is modeled after the use-site syntax for a
method invocation or method reference.

## Declaration-site syntax

To avoid being biased by the simpler cases, we're going to work all the
cases concurrently rather than starting with the simpler cases and working
up.  (It might seem sensible to start with deconstructors, since they are
the "easy" case, but if we did that, we would likely be biased by their
simplicity and then find ourselves painted into a corner.)  As our example
gallery, we will consider:

 - Deconstruction pattern for `Point`;
 - Static patterns for `Optional::of` and `Optional::empty`;
 - Static pattern for "power of two" (illustrating a computation where
   success or failure, and computation of bindings, cannot easily be
   separated);
 - Instance pattern for `Class::arrayClass` (used unbound);
 - Instance pattern for `Pattern::matching` on regular expressions (used
   bound).
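
To see why the "power of two" case is interesting, here is one way it might
be written in today's Java, as a sketch only.  The `powerOfTwo` helper is
hypothetical and returns the exponent in an `OptionalInt`; the point is that
the loop that decides whether the match succeeds is the same loop that
computes the would-be binding.

```java
import java.util.OptionalInt;

class PowerOfTwoSketch {
    // Returns the exponent n when value == 2^n, or empty when value is not
    // a positive power of two.  Success and the binding fall out of one loop.
    static OptionalInt powerOfTwo(int value) {
        if (value <= 0) {
            return OptionalInt.empty();
        }
        int exponent = 0;
        while (value % 2 == 0) {
            value /= 2;
            exponent++;
        }
        return value == 1 ? OptionalInt.of(exponent) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(powerOfTwo(8));
        System.out.println(powerOfTwo(12));
    }
}
```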

Member patterns, like methods, have _names_.  (We can think of constructors
as being named for their enclosing classes, and the same for
deconstructors.)  All member patterns have a (possibly empty) ordered list
of _bindings_, which are the dual of constructor or method parameters.
Bindings, in turn, have names and types.  And like constructors and methods,
member patterns have a _body_ which is a block statement.  Member patterns
also have a _match candidate_, which is a likely-implicit method parameter.

### Member patterns as inverse methods and constructors

Regardless of syntax, let us remind ourselves that deconstructors are the
categorical dual to constructors (coconstructors), and pattern methods are
the categorical dual to methods (comethods).  They are dual in their
structure: a constructor or method takes N arguments and produces a result;
the corresponding member pattern consumes a match candidate and
(conditionally) produces N bindings.

Moreover, they are semantically dual: the return value produced by
construction or factory invocation is the match candidate for the
corresponding member pattern, and the bindings produced by a member pattern
are the answers to the _Pattern Question_ -- "could this object have come
from an invocation of my dual, and if so, with what arguments."

### What do we call them?

Given the significant overlap between methods and patterns, the first
question about the declaration we need to settle is how to identify a member
pattern declaration as distinct from a method or constructor declaration.
_Towards Member Patterns_ tried out a syntax that recognized these as
_inverse_ methods and constructors:

    public Point(int x, int y) { ... }
    public inverse Point(int x, int y) { ... }

While this is a principled choice which clearly highlights the duality, and
one that might be good for specification and verbal description, it is
questionable whether this would be a great syntax for reading and writing
programs.

A more traditional option is to choose a "noun" (conditional) keyword, such
as `pattern`, `matcher`, `extractor`, `view`, etc:

    public pattern Point(int x, int y) { ... }

If we are using a noun keyword to identify pattern declarations, we could
use the same noun for all of them, or we could choose a different one for
deconstruction patterns:

    public deconstructor Point(int x, int y) { ... }

Alternately, we could reach for a symbol to indicate that we are talking
about an inverted member.  C++ fans might suggest

    public ~Point(int x, int y) { ... }

but this is too cryptic (it's evocative once you see it, but then it becomes
less evocative as we move away from deconstructors towards instance
patterns.)

If we wish to offer finer-grained control over conditionality, we might
additionally need a `total` / `partial` modifier, though I would prefer to
avoid that.

Of the keyword candidates, there is one that stands out (for good and bad)
because it connects to something that is already in the language: `pattern`. On
the one hand, using the term `pattern` for the declaration is a slight abuse; on
the other, users will immediately connect it with "ah, so that's how I make a
new pattern" or "so that's what happens when I match against this pattern."
(Lisps would resolve this tension by calling it `defpattern`.)

The others (`matcher`, `view`, `extractor`, etc) are all made-up terms that
don't connect to anything else in the language, for better or worse. If we pick
one of these, we are asking users to sort out _three_ separate new things in
their heads: (use-site) patterns, (declaration-site) matchers, and the rules of
how patterns and matchers are connected. Calling them both "patterns", despite
the mild abuse of terminology, ties them together in a way that recognizes their
connection.

My personal position: `pattern` is the strongest candidate here, despite some
flaws.

### Binding lists and match candidates

There are two obvious alternatives for describing the binding list and match
candidate of a pattern declaration, both with their roots in the constructor and
method syntax:

 - Pretend that a pattern declaration is like a method with multiple return, and
   put the binding list in the "return position", and make the match candidate
   an ordinary parameter;
 - Lean into the inverse relationship between constructors and methods (and
   consistency with the use-site syntax), and put the binding list in the
   "parameter list position". For static patterns and some instance patterns,
   which need to explicitly identify the match candidate type, there are several
   sub-options:
   - Lean further into the duality, putting the match candidate type in the
     "return position";
   - Put the match candidate type somewhere else, where it is less likely to be
     confused for a method return.

The "method-like" approach might look like this:

```
class Point {
    // Constructor and deconstructor
    public Point(int x, int y) { ... }
    public pattern (int x, int y) Point(Point target) { ... }
    ...
}

class Optional<T> {
    // Static factory and pattern
    public static<T> Optional<T> of(T t) { ... }
    public static<T> pattern (T t) of(Optional<T> target) { ... }
    ...
}
```

The "inverse" approach might look like:

```
class Point {
    // Constructor and deconstructor
    public Point(int x, int y) { ... }
    public pattern Point(int x, int y) { ... }
    ...
}

class Optional<T> {
    // Static factory and pattern (using the first sub-option)
    public static<T> Optional<T> of(T t) { ... }
    public static<T> pattern Optional<T> of(T t) { ... }
    ...
}
```

With the "method-like" approach, the match candidate gets an explicit name
selected by the author; with the inverse approach, we can go with a predefined
name such as `that`. (Because deconstructors do not have receivers, we could by
abuse of notation arrange for the keyword `this` to refer instead to the match
candidate within the body of a deconstructor. While this might seem to lead to
a more familiar notation for writing deconstructors, it would create a
gratuitous asymmetry between the bodies of deconstruction patterns and those of
other patterns.)

Between these choices, nearly all the considerations favor the "inverse"
approach:

 - The "inverse" approach makes the declaration look like the use site. This
   highlights that `pattern Point(int x, int y)` is what gets invoked when you
   match against the pattern use `Point(int x, int y)`. (This point is so
   strong that we should probably just stop here.)
 - The "inverse" members also look like their duals; the only difference is the
   `pattern` keyword (and possibly the placement of the match candidate type).
   This makes matched pairs much more obvious, and such matched pairs will be
   critical both for future language features and for library idioms.
 - The method-like approach is suggestive of multiple return or tuples, which is
   probably helpful for the first few minutes but actually harmful in the long
   term. This feature is _not_ (much as some people would like to believe) about
   multiple return or tuples, and playing into this misperception will only make
   it harder to truly understand. So this suggestion ends up propping up the
   wrong mental model.

The main downside of the "inverse" approach is the one-time speed bump of the
unfamiliarity of the inverted syntax. (The "method-like" syntax also has its
own speed bumps, it is just unfamiliar in different ways.) But unlike the
advantages of the inverse approach, which continue to add value forever, this
speed bump is a one-time hurdle to get over.

To smooth out the speed bumps of the inverse approach, we can consider moving
the position of the match candidate for static and (suitable) instance pattern
declarations, such as:

```
class Optional<T> {
    // the usual static factory
    public static<T> Optional<T> of(T t) { ... }

    // Various ways of writing the corresponding pattern
    public static<T> pattern of(T t) for Optional<T> { ... }
    // or ...
    public static<T> pattern(Optional<T>) of(T t) { ... }
    // or ...
    public static<T> pattern(Optional<T> that) of(T t) { ... }
    // or ...
    public static<T> pattern<Optional<T>> of(T t) { ... }
    ...
}
```

(The deconstructor example looks the same with either variant.) Of these,
treating the match candidate like a "parameter" of "pattern" is probably the
most evocative:

```
public static<T> pattern(Optional<T> that) of(T t) { ... }
```

as it can be read as "pattern taking the parameter `Optional<T> that`, called
`of`, binding `T`", and is a short departure from the inverse syntax.

The main value of the various rearrangements is that users don't need to think
about things operating in reverse to parse the syntax. This trades some of the
secondary point (patterns looking almost exactly like their inverses) for a
certain reduction in cognitive load, while maintaining the most important
consideration: that the declaration site look like the use site.

For instance pattern declarations, if the match candidate type is the same as
the receiver type, the match candidate type can be elided as it is with
deconstructors.

My personal position: the "multiple return" version is terrible; all the
sub-variants of the inverse version are probably workable.

### Naming the match candidate

We've been assuming so far that the match candidate always has a fixed name,
such as `that`; this is an entirely workable approach. Some of the variants are
also amenable to allowing authors to explicitly select a name for the match
candidate. For example, if we put the match candidate as a "parameter" to the
`pattern` keyword, there is an obvious place to put the name:

```
static<T> pattern(Optional<T> target) of(T t) { ... }
```

My personal opinion: I don't think this degree of freedom buys us much, and in
the long run readability probably benefits by picking a fixed name like `that`
and sticking with it. Even with a fixed name, if there is a sensible position
for the name, allowing users to type `that` for explicitness is fine (as we do
with instance methods, though many people don't know this.) We may even want to
require it.

## Body types

Just as there are two obvious approaches for the declaration, there are two
obvious approaches we could take for the body (though there is some coupling
between them.) We'll call the two body approaches _imperative_ and
_functional_.

The imperative approach treats bindings as initially definitely unassigned (DU)
variables that must be definitely assigned (DA) on successful completion,
getting their value through ordinary assignment; the functional approach sets
all the bindings at once, positionally. Either way, member patterns (except
maybe deconstructors) also need a way to differentiate a successful match from
a failed match.

Here is the `Point` deconstructor with both imperative and functional style. The
functional style uses a placeholder `match` statement to indicate a successful
match and provision of bindings:

```
class Point {
    int x, y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Imperative style, deconstructor always succeeds
    pattern Point(int x, int y) {
        x = that.x;
        y = that.y;
    }

    // Functional style
    pattern Point(int x, int y) {
        match(that.x, that.y);
    }
}
```

There are some obvious differences here. In the imperative style, the dtor body
looks much more like the reverse of the ctor body. The functional style is more
concise (and amenable to further concision via the "concise method bodies"
mechanism in the future). There are also a number of less obvious differences.
For deconstructors, the imperative approach is likely to feel more natural
because of the obvious symmetry with constructors.

In reality, it is _premature at this point to have an opinion_, because we
haven't yet seen the full scope of the problem; deconstructors are a special
case in many ways, which almost surely is distorting our initial opinion. As we
move towards conditional patterns (and pattern lambdas), our opinions may flip.

Regardless of which we pick, there are some additional syntactic choices to be
made -- what syntax to use to indicate success (we used `match` in the above
example) or failure. (We should be especially careful around trying to reuse
words like `return`, `break`, or `yield` because, in the case where there are
zero bindings (which is allowable), it becomes unclear whether they mean "fail"
or "succeed with zero bindings".)

### Success and failure

Except for possibly deconstructors, which we may require to be total, a pattern
declaration needs a way to indicate success and failure. In the examples above,
we posited a `match` statement to indicate success in the functional approach,
and in both examples leaned on the "implicit success" of deconstructors (under
the assumption they always succeed). Now let's look at the more general case to
figure out what else is needed.

For a static pattern like `Optional::of`, success is conditional. Using
`match-fail` as a placeholder for "the match failed", this might look like
(functional version):

```
public static<T> pattern(Optional<T> that) of(T t) {
    if (that.isPresent())
        match (that.get());
    else
        match-fail;
}
```

The imperative version is less pretty, though. Using `match-success` as a
placeholder:

```
public static<T> pattern(Optional<T> that) of(T t) {
    if (that.isPresent()) {
        t = that.get();
        match-success;
    }
    else
        match-fail;
}
```

Both arms of the `if` feel excessively ceremonial here. And if we chose to not
make all deconstruction patterns unconditional, deconstructors would likely need
some explicit success as well:

```
pattern Point(int x, int y) {
    x = that.x;
    y = that.y;
    match-success;
}
```

It might be tempting to try and eliminate the need for explicit success by
inferring it from whether or not the bindings are DA, but this is error-prone,
is less type-checkable, and falls apart completely for patterns with no
bindings.

### Implicit failure in the functional approach

One of the ceremonial-seeming aspects of `Optional::of` above is having to say
`else match-fail`, which doesn't feel like it adds a lot of value. Perhaps we
can be more concise without losing clarity.

Most conditional patterns will have a predicate to determine matching, and then
some conditional code to compute the bindings and claim success. Having to say
"and if the predicate didn't hold, then I fail" seems like ceremony for the
author and noise for the reader. Instead, if a conditional pattern falls off
the end without matching, we could treat that as simply not matching:

```
public static<T> pattern(Optional<T> that) of(T t) {
    if (that.isPresent())
        match (that.get());
}
```

This says what we mean: if the optional is present, then this pattern succeeds
and binds the contents of the `Optional`. As long as our "succeed" construct
strongly enough connotes that we are terminating abruptly and successfully, this
code is perfectly clear. And most conditional patterns will look a lot like
`Optional::of`: do some sort of test and if it succeeds, extract the state and
bind it.

At first glance, this "implicit fail" idiom may seem error-prone or sloppy. But
after writing a few dozen patterns, one quickly tires of saying "else
match-fail" -- and the reader doesn't necessarily appreciate reading it either.

Implicit failure also simplifies the selection of how we explicitly indicate
failure; using `return` in a pattern for "no match" becomes pretty much a forced
move. We observe that (in a void method), "return" and "falling off the end"
are equivalent; if "falling off the end" means "no match", then so should an
explicit `return`. So in those few cases where we need to explicitly signal "no
match", we can just use `return`. It won't come up that often, but here's an
example where it does:

```
static pattern(int that) powerOfTwo(int exp) {
    int exp = 0;

    if (that < 1)
        return; // explicit fail

    while (that > 1) {
        if (that % 2 == 0) {
            that /= 2;
            ++exp;
        }
        else
            return; // explicit fail
    }
    match (exp);
}
```
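
To experiment with the same control flow in today's Java, one rough emulation
is to let an `OptionalInt` stand in for "match with one binding" versus "no
match". This is only a sketch under that assumption (the class name
`PowerOfTwoSketch` is invented), and it cannot express implicit failure: a
regular method has to return something on every path.

```java
import java.util.OptionalInt;

// Sketch only: emulates the powerOfTwo pattern with an ordinary method.
class PowerOfTwoSketch {
    static OptionalInt powerOfTwo(int that) {
        int exp = 0;

        if (that < 1)
            return OptionalInt.empty();     // explicit fail

        while (that > 1) {
            if (that % 2 != 0)
                return OptionalInt.empty(); // explicit fail
            that /= 2;
            ++exp;
        }
        return OptionalInt.of(exp);         // corresponds to `match (exp)`
    }

    public static void main(String[] args) {
        System.out.println(powerOfTwo(8));
        System.out.println(powerOfTwo(12));
    }
}
```

The two `OptionalInt.empty()` returns line up with the two `return;`
statements in the pattern version, and the final return lines up with the
single success statement.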

As a bonus, if `return` as match failure is a forced move, we need only select a
term for "successful match" (which obviously can't be `return`). We could use
`match` as we have in the examples, or a variant like `matched` or `matches`.
But rather than just creating a new control operator, we have an opportunity to
lean into the duality a little harder, by including the pattern syntax in the
match:

```
matches of(that.get());
```

or the (optionally?) qualified form (inferring type arguments, as we do at the
use site):

```
matches Optional.of(that.get());
```

These "use the name" approaches trade a small amount of verbosity to gain a
higher degree of fidelity to the pattern use site (and to evoke the comethod
completion.)

If we don't choose "implicit fail", we would have to invent _two_ new control
flow statements to indicate "success" and "failure".

My personal position: for the functional approach, implicit failure both makes
the code simpler and clearer, and after you get used to it, you don't want to go
back. Whether we say `match` or `matches` or `matches <pattern-name>` are all
workable, though I like some variant that names the pattern.

### Implicit success in the imperative approach

In the imperative approach, we can be implicit as well, but it feels more
natural (at least, initially) to choose implicit success rather than failure.
This works great for unconditional patterns:

```
pattern Point(int x, int y) {
    x = that.x;
    y = that.y;
    // implicit success
}
```

but not quite as well for conditional patterns:

```
static<T> pattern(Optional<T> that) of(T t) {
    if (that.isPresent()) {
        t = that.get();
    }
    else
        match-fail;
    // implicit success
}
```

We can eliminate one of the arms of the if, with the more concise (but
convoluted) inversion:

```
static<T> pattern(Optional<T> that) of(T t) {
    if (!that.isPresent())
        match-fail;
    t = that.get();
    // implicit success
}
```

Just as with the functional approach, if we choose imperative and "implicit
success", using `return` to indicate success is pretty much a forced move.

### Imperative is a trap

If we assume that functional implies implicit failure, and imperative implies
implicit success, then our choices become:

```
class Optional<T> {
    public static<T> Optional<T> of(T t) { ... }

    // imperative, implicit success
    public static<T> pattern(Optional<T> that) of(T t) {
        if (that.isPresent()) {
            t = that.get();
        }
        else
            match-fail;
    }

    // functional, implicit failure
    public static<T> pattern(Optional<T> that) of(T t) {
        if (that.isPresent())
            matches of(that.get());
    }
}
```

Once we get past deconstructors, the imperative approach looks worse by
comparison because we need to assign all the bindings (which is _O(n)_
assignments) _and also_ indicate success or failure somehow, whereas in the
functional style all can be done together with a single `matches` statement.

Looking at the alternatives, except maybe for unconditional patterns, the
functional example above seems a lot more natural. The imperative approach
works with deconstructors (assuming they are not conditional), but does not
scale so well to conditionality -- which is the essence of patterns.

The method-comethod duality also gives us, from a theoretical perspective, a
forceful nudge towards the functional approach. In a method, the method
arguments are specified as a positional list of expressions at the use site:

    m(a, b, c)

and these values are invisibly copied into the parameter slots of the method
prior to frame activation. The dual to that is for a comethod to similarly
convey the bindings in a positional list of expressions (as they must either
all be produced or none), where they are copied into the slots provided at the
use site, as is indicated by `matches` in the above examples.

My personal position: the imperative style feels like a trap. It seems
"obvious" at first if we start with deconstructors, but becomes increasingly
difficult when we get past this case, and gets in the way of other
opportunities. The last gasp before acceptance is the discomfort that dtor and
ctor bodies are written in different styles, but in the rear-view mirror, this
feels like a non-issue.

### Derive imperative from functional?

If we start with "functional with implicit failure", we can possibly rescue
imperative by deriving a version of imperative from functional, by "overloading"
the match-success operator.

If we have a pattern whose binding names are `b1..bn` of types `B1..Bn`, then
the `matches` operator must take a list of expressions `e1..en` whose arity and
types are compatible with `B1..Bn`. But we could allow `matches` to also have a
nilary form, which would have the effect of being shorthand for

    matches <pattern-name>(b1, b2, ..., bn)

where each of `b1..bn` must be DA at the point of matching. This means that we
could express patterns in either form:

```
class Optional<T> {
    public static<T> Optional<T> of(T t) { ... }

    // imperative, derived from functional with implicit failure
    public static<T> pattern(Optional<T> that) of(T t) {
        if (that.isPresent()) {
            t = that.get();
            matches of;
        }
    }

    public static<T> pattern(Optional<T> that) of(T t) {
        if (that.isPresent())
            matches of(that.get());
    }
}
```

This flexibility allows users to select a more verbose expression in exchange
for a clearer association of expressions and bindings, though as we'll see, it
does come with some additional constraints.

### Wrapping an existing API

Nearly every library has methods (sometimes sets of methods) that are patterns
in disguise, such as the pair of methods `isArray` and `getComponentType` in
`Class`, or the `Matcher` helper type in `java.util.regex`. Library maintainers
will likely want to wrap (or replace) these with real patterns, so these can
participate more effectively in conditional contexts, and in some cases,
highlight their duality with factory methods.
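
To make the "pattern in disguise" point concrete, here is how the
`isArray` / `getComponentType` pair collapses into one conditional operation in
current Java. This is a sketch: `ArrayClassSketch` and its `arrayClass` helper
are hypothetical names, and `Optional` is only standing in for "matched, with
one binding".

```java
import java.util.Optional;

// Sketch only: a predicate plus an accessor folded into a single
// "did it match, and if so with what" operation.
class ArrayClassSketch {
    static Optional<Class<?>> arrayClass(Class<?> that) {
        if (that.isArray())
            return Optional.of(that.getComponentType()); // match, one binding
        return Optional.empty();                         // no match
    }

    public static void main(String[] args) {
        System.out.println(arrayClass(int[].class));
        System.out.println(arrayClass(String.class));
    }
}
```

Callers no longer have to remember to test `isArray` before asking for the
component type, which is the same guarantee a real pattern would give.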

Matching a string against a `j.u.r.Pattern` regular expression has all the same
elements as a pattern, just with an ad-hoc API (and one that I have to look up
every time). But we can fairly easily wrap a true pattern around the existing
API. To match against a `Pattern` today, we pass the match candidate to
`Pattern::matcher`, which returns a `Matcher` with accessors `Matcher::matches`
(did it match) and `Matcher::group` (conditionally extract a particular capture
group.) If we want to wrap this with a pattern called `regexMatch`:

```
pattern(String that) regexMatch(String... groups) {
    Matcher m = this.matcher(that);
    if (m.matches())
        // capture groups are numbered 1..groupCount(), inclusive
        matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
                                            .mapToObj(m::group)
                                            .toArray(String[]::new));
    // whole lotta matchin' goin' on
}
```

This says that a `j.u.r.Pattern` has an instance pattern called `regexMatch`,
whose match candidate is `String`, and which binds a varargs of `String`
corresponding to the capture groups. The implementation simply delegates to the
existing `j.u.r.Matcher` API. This means that `j.u.r.Pattern` becomes a sort of
"pattern object", and we can use it as a receiver at the use site:

```
static Pattern As = Pattern.compile("(a*)");
static Pattern Bs = Pattern.compile("(b*)");
...
switch (string) {
    case As.regexMatch(var as): ...
    case Bs.regexMatch(var bs): ...
    ...
}
```
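
Until something like this exists, the closest runnable approximation is a
helper that returns the capture groups when the whole input matches. The sketch
below assumes an invented class `RegexMatchSketch` and uses `Optional<String[]>`
in place of bindings. Note that capture groups are numbered from 1 through
`groupCount()` inclusive, hence `rangeClosed`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

// Sketch only: wraps the existing Matcher API in one conditional operation.
class RegexMatchSketch {
    static final Pattern AS = Pattern.compile("(a*)");
    static final Pattern BS = Pattern.compile("(b*)");

    static Optional<String[]> regexMatch(Pattern pattern, String that) {
        Matcher m = pattern.matcher(that);
        if (!m.matches())
            return Optional.empty();                      // no match
        return Optional.of(IntStream.rangeClosed(1, m.groupCount())
                                    .mapToObj(m::group)   // group(int)
                                    .toArray(String[]::new));
    }

    public static void main(String[] args) {
        String input = "bbb";
        // Stands in for `case As.regexMatch(var as)` / `case Bs.regexMatch(var bs)`
        regexMatch(AS, input).ifPresent(as -> System.out.println("As: " + as[0]));
        regexMatch(BS, input).ifPresent(bs -> System.out.println("Bs: " + bs[0]));
    }
}
```

The chain of `ifPresent` calls is what a `switch` over the two pattern objects
would replace.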

### Odds and ends

There are a number of loose ends here. We could choose other names for the
match-success and match-fail operations, including trying to reuse `break` or
`yield`. But, this reuse is tricky; it must be very clear whether a given form
of abrupt completion means "success" or "failure", because in the case of
patterns with no bindings, we will have no other syntactic cues to help
disambiguate. (I think having a single `matches`, with implicit failure and
`return` meaning failure, is the sweet spot here.)

Another question is whether the binding list introduces corresponding variables
into the scope of the body. For imperative, the answer is "surely yes"; for
functional, the answer is "maybe" (unless we want to do the trick where we
derive imperative from functional, in which case the answer is "yes" again.)

If the binding list does not correspond to variables in the body, this may be
initially discomforting; because they do not declare program elements, they may
feel that they are left "dangling". But even if they are not declaring
_program_ elements, they are still declaring _API_ elements (similar to the
return type of a method.) We will want to provide Javadoc on the bindings, just
like with parameters; we will want to match up binding names in deconstructors
with parameter names in constructors; we may even someday want to support
by-name binding at the use site (e.g., `case Foo(a: var a)`). The names are
needed for all of these, just not for the body. Names still matter. My take
here is that this is a transient "different is scary" reaction, one that we
would get over quickly.

A final question is whether we should consider unqualified names as implicitly
qualified by `that` (and also `this`, for instance patterns, with some conflict
resolution). Users will probably grow tired of typing `that.` all the time, and
most of the time, the unqualified use is perfectly readable.

## Exhaustiveness

There is one last syntax question in front of us: how to indicate that a set of
patterns are (claimed to be) exhaustive on a given match candidate type. We see
this with `Optional::of` and `Optional::empty`; it would be sad if the compiler
did not realize that these two patterns together were exhaustive on `Optional`.
This is not a feature that will be used often, but not having it at all will be
a repeated irritant.

The best I've come up with is to call these `case` patterns, where a set of
`case` patterns for a given match candidate type in a given class are asserted
to be an exhaustive set:

```
class Optional<T> {
    static<T> Optional<T> of(T t) { ... }
    static<T> Optional<T> empty() { ... }

    static<T> case pattern of(T t) for Optional<T> { ... }
    static<T> case pattern empty() for Optional<T> { ... }
}
```

Because they may not be truly exhaustive, `switch` constructs will have to back
up the static assumption of exhaustiveness with a dynamic check, as we do for
other sets of exhaustive patterns that may have remainder.

I've experimented with variants of `sealed` but it felt more forced, so this is
the best I've come up with.
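
For comparison, the exhaustiveness the compiler can already check today comes
from sealing. The sketch below assumes Java 21 and a hypothetical `Opt<T>` type
(deliberately not `java.util.Optional`, which is a final class with no
subtypes to seal over); it shows the kind of default-free `switch` that `case`
patterns would enable for `Optional::of` and `Optional::empty`.

```java
// Sketch only: exhaustiveness via a sealed hierarchy, which is what exists
// today; `case` patterns would give the same effect without subclassing.
class ExhaustiveSketch {
    sealed interface Opt<T> permits Some, None {}
    record Some<T>(T value) implements Opt<T> {}
    record None<T>() implements Opt<T> {}

    static <T> String show(Opt<T> opt) {
        // No default clause: the compiler accepts this because Some and None
        // are the only permitted subtypes of Opt.
        return switch (opt) {
            case Some<T>(var value) -> "of(" + value + ")";
            case None<T>() -> "empty()";
        };
    }

    public static void main(String[] args) {
        System.out.println(show(new Some<>("x")));
        System.out.println(show(new None<String>()));
    }
}
```

Adding a third permitted subtype would turn the `switch` into a compile error,
which is the behavior one would hope for from a set of `case` patterns too.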

## Example: patterns delegating to other patterns

Pattern implementations must compose. Just as a subclass constructor delegates
to a superclass constructor, the same should be true for deconstructors.
Here's a typical superclass-subclass pair:

```
class A {
    private final int a;

    public A(int a) { this.a = a; }
    public pattern A(int a) { matches A(that.a); }
}

class B extends A {
    private final int b;

    public B(int a, int b) {
        super(a);
        this.b = b;
    }

    // Imperative style
    public pattern B(int a, int b) {
        if (that instanceof super(var aa)) {
            a = aa;
            b = that.b;
            matches B;
        }
    }

    // Functional style
    public pattern B(int a, int b) {
        if (that instanceof super(var a))
            matches B(a, b);
    }
}
```

(Ignore the flow analysis and totality for the time being; we'll come back to
this in a separate document.)

The first thing that jumps out at us is that, in the imperative version, we had
to create a "garbage" variable `aa` to receive the binding, because `a` was
already in scope, and then we have to copy the garbage variable into the real
binding variable. Users will surely balk at this, and rightly so. In the
functional version (depending on the choices from "Odds and Ends") we are free
to use the more natural name and avoid the roundabout locution.

We might be tempted to fix the "garbage variable" problem by inventing another
sub-feature: the ability to use an existing variable as the target of a binding,
such as:

```
pattern B(int a, int b) {
    if (that instanceof A(__bind a))
        b = that.b;
}
```

But, I think the language is stronger without this feature, for two reasons.
First, having to reason about whether a pattern match introduces a new binding
or assigns to an existing variable is additional cognitive load for users, and
second, having assignment to locals happening through something other than
assignment introduces additional complexity in finding where a variable is
modified. While we can argue about the general utility of this feature,
bringing it in just to solve the garbage-variable problem is particularly
unattractive.
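
The delegation shape can be approximated in Java 21 by having each
"deconstructor" return a small record of its bindings. Everything named here
(`DelegationSketch`, `ABindings`, `BBindings`, `deconstructA`, `deconstructB`)
is invented for the sketch. Because the result is produced positionally, as in
the functional style, the subclass can bind the superclass component under its
natural name `a` with no intermediate variable.

```java
// Sketch only: bindings are modeled as records; the match candidate is an
// explicit parameter named `that`.
class DelegationSketch {
    record ABindings(int a) {}
    record BBindings(int a, int b) {}

    static class A {
        private final int a;
        A(int a) { this.a = a; }

        static ABindings deconstructA(A that) {
            return new ABindings(that.a);
        }
    }

    static class B extends A {
        private final int b;
        B(int a, int b) { super(a); this.b = b; }

        static BBindings deconstructB(B that) {
            // Delegate to the superclass "deconstructor", then bind its
            // component directly as `a` via a record pattern.
            if (A.deconstructA(that) instanceof ABindings(int a))
                return new BBindings(a, that.b);
            throw new AssertionError("unreachable: deconstructA never returns null");
        }
    }

    public static void main(String[] args) {
        System.out.println(B.deconstructB(new B(1, 2)));
    }
}
```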

## Pattern lambdas

One final consideration is that patterns may also have a lambda form. Given
a single-abstract-pattern (SAP) interface:

```
interface Converter<T,U> {
    pattern(T t) convert(U u);
}
```

one can implement such a pattern with a lambda. Such a lambda has one parameter
(the match candidate), and its body looks like the body of a declared pattern:

```
Converter<Integer, Short> c =
    i -> {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
            matches Converter.convert(i.shortValue());
    };
```

Because the bindings of the pattern lambda are defined in the interface, not in
the lambda, this is one more reason not to like the imperative version: it is
brittle, and alpha-renaming bindings in the interface would be a
source-incompatible change.
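
A present-day approximation of the SAP interface is an ordinary functional
interface whose result is an `Optional` carrying the single binding. The sketch
below makes that assumption; `ConverterSketch` and `NARROW` are made-up names.
As with the pattern lambda, the lambda body never mentions a binding name, so
renaming anything in the interface cannot break it.

```java
import java.util.Optional;

// Sketch only: Optional<U> stands in for "match with binding U" / "no match".
class ConverterSketch {
    @FunctionalInterface
    interface Converter<T, U> {
        Optional<U> convert(T that);
    }

    // Integer -> Short, matching only when the value fits in a short.
    static final Converter<Integer, Short> NARROW =
        i -> {
            if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
                return Optional.of(i.shortValue()); // match
            return Optional.empty();                // no match
        };

    public static void main(String[] args) {
        System.out.println(NARROW.convert(42));
        System.out.println(NARROW.convert(100_000));
    }
}
```

Note the use of `i.shortValue()`: a direct cast from `Integer` to `short` is
not a legal cast in Java.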

## Example gallery

Here are all the pattern examples so far, and a few more, using the suggested
style (functional, implicit fail, implicit `that`-qualification):

```
// Point dtor
pattern Point(int x, int y) {
    matches Point(x, y);
}

// Optional -- static patterns for Optional::of, Optional::empty
static<T> case pattern(Optional<T> that) of(T t) {
    if (isPresent())
        matches of(get());
}

static<T> case pattern(Optional<T> that) empty() {
    if (!isPresent())
        matches empty();
}

// Class -- instance pattern for arrayClass (match candidate type inferred)
pattern arrayClass(Class<?> componentType) {
    if (that.isArray())
        matches arrayClass(that.getComponentType());
}

// regular expression -- instance pattern in j.u.r.Pattern
pattern(String that) regexMatch(String... groups) {
    Matcher m = matcher(that);
    if (m.matches())
        matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
                                            .mapToObj(m::group)
                                            .toArray(String[]::new));
}

// power of two (somewhere)
static pattern(int that) powerOfTwo(int exp) {
    int exp = 0;

    if (that < 1)
        return;

    while (that > 1) {
        if (that % 2 == 0) {
            that /= 2;
            exp++;
        }
        else
            return;
    }
    matches powerOfTwo(exp);
}
```

## Closing thoughts

I came out of this exploration with very different conclusions than I expected
when going in. At first, the "inverse" syntax seemed stilted, but over time it
started to seem more obvious. Similarly, I went in expecting to prefer the
imperative approach for the body, but over time, started to warm to the
functional approach, and eventually concluded it was basically a forced move if
we want to support more than just deconstructors. And I started out skeptical
of "implicit fail", but after writing a few dozen patterns with it, going back
to fully explicit felt painful. All of this is to say, you should hold your
initial opinions at arm's length, and give the alternatives a chance to sink in.

For most _conditional_ patterns (and conditionality is at the heart of pattern
matching), the functional approach cleanly highlights both the match predicate
and the flow of values, and is considerably less fussy than the imperative
approach in the same situation; `Optional::of`, `Class::arrayClass`, and
`regexMatch` look great here, much better than they would with imperative. None
of these illustrate delegation, but in the presence of delegation, the gap gets
even wider.

From asviraspossible at gmail.com  Sat Mar 30 19:23:11 2024
From: asviraspossible at gmail.com (Victor Nazarov)
Date: Sat, 30 Mar 2024 20:23:11 +0100
Subject: Member Patterns -- the bikeshed
In-Reply-To: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
Message-ID: <CAFOkWZZbAq1mmmHiHEFdEQHbnDg_priA=z5ic1Wu4FX+CRopSg@mail.gmail.com>

I have two points that I think may be good to consider in the list of
options.

1. I'm not sure if this was considered, but I find explicit lists of covering
patterns rather natural and more flexible than using `case` as a
pattern modifier.

Explicit lists may look like:

````
    // The declaration "matches of|empty" states that
    // "of" and "empty" together cover the full set of Optional<T> values
    class Optional<T> matches of|empty {
    }
````

The important feature of explicit lists is that there may be more than one
covering set of patterns.

````
    // There can be multiple sets of patterns, where each set covers all
    // possibilities
    class List matches headAndTail|empty, initAndLast|empty {
        // ...
    }
    class Glass matches empty|nonEmpty, full|nonFull {
        // ...
    }
````

2. I think that there is a middle ground between the functional and imperative
styles of pattern body definition that may look cumbersome at first, but
nevertheless gives you the best of both worlds:

    * deconstructor patterns look dual to constructors
    * names from the list of pattern variables are actually used and
checked by the compiler
    * control flow is still functional, which is more natural

The downside retained from the imperative style is the need for
alpha-renaming, but I think we have to deal with shadowing anyway, and
renaming local variables seems natural and easy.

The middle ground could take the shape of a special form used in the pattern
body.  This form works mostly the same way as the `with` clause defined in the
"Derived Record Creation" JEP.

Here is the long list of examples to fully illustrate different
interactions:

````
    class Optional<T> matches (of|empty) {
        public static <T> pattern<Optional<T>> of(T value) {
            if (that.isPresent()) {
                match {
                    value = that.get();
                }
            }
        }

        public static <T> pattern<Optional<T>> empty() {
            if (that.isEmpty())
                match {}
        }
    }

    class Pattern {
        public pattern<String> regexMatch(String... groups) {
            Matcher m = this.matcher(that);
            if (m.matches()) {
                match {
                    groups =
                            IntStream.rangeClosed(1, m.groupCount())
                                    .mapToObj(m::group)
                                    .toArray(String[]::new);
                }
            }
        }
    }

    class A {
        private final int a;

        public A(int a) {
            this.a = a;
        }
        public pattern A(int a) {
            match {
                a = that.a;
            }
        }
    }

    class B extends A {
        private final int b;

        public B(int a, int b) {
            super(a);
            this.b = b;
        }

        public pattern B(int a, int b) {
            if (that instanceof super(var aa)) {
                match {
                    a = aa;
                    b = that.b;
                }
            }
        }
    }

    interface Converter<T,U> {
        pattern<T> convert(U u);
    }
    Converter<Integer, Short> c =
        pattern (s) -> {
            if (that >= Short.MIN_VALUE && that <= Short.MAX_VALUE)
                match {
                    s = (short) that;
                }
        };
````
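
For comparison, the `regexMatch` example above can be approximated in today's
Java.  This is only a sketch under my own assumptions: the class name
`RegexGroupsSketch` and the `Optional<String[]>` return type are made up for
illustration, with an empty `Optional` standing in for "no match" and a present
one carrying the single `groups` binding.  Capturing groups are numbered from 1
to `groupCount()` inclusive, hence `rangeClosed`.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

final class RegexGroupsSketch {
    // Hypothetical stand-in for a `regexMatch` pattern: the match candidate
    // is an explicit parameter, and the binding is the Optional's payload.
    static Optional<String[]> regexMatch(Pattern p, String candidate) {
        Matcher m = p.matcher(candidate);
        if (!m.matches())
            return Optional.empty();              // no match
        String[] groups = IntStream.rangeClosed(1, m.groupCount())
                .mapToObj(m::group)               // Matcher.group(int)
                .toArray(String[]::new);
        return Optional.of(groups);               // match, binding `groups`
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(a*)(b*)");
        regexMatch(p, "aabbb")
                .ifPresent(g -> System.out.println(Arrays.toString(g)));
        System.out.println(regexMatch(p, "abc").isPresent());
    }
}
```

The caller has to unpack the `Optional` by hand here, which is exactly the
ceremony a declared pattern would remove at the use site.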

--
Victor Nazarov


On Fri, Mar 29, 2024 at 10:59 PM Brian Goetz <brian.goetz at oracle.com> wrote:

> We now come to the long-awaited bikeshed discussion on what member
> patterns should look like.
>
> Bikeshed disclaimer for EG:
>   - This is likely to evoke strong opinions, so please take pains to be
> especially constructive
>   - Long reply-to-reply threads should be avoided even more than usual
>   - Holistic, considered replies preferred
>   - Please change subject line if commenting on a sub-topic or tangential
>     concern
>
> Special reminders for Remi:
>  - Use of words like "should", "must", "shouldn't", "mistake", "wrong",
> "broken"
>    are strictly forbidden.
>  - If in doubt, ask questions first.
>
> Notes for external observers:
>  - This is a working document for the EG; the discussion may continue for a
>    while before there is an official proposal.  Please be patient.
>
>
> # Pattern declaration: the bikeshed
>
> We've largely identified the model for what kinds of patterns we need to
> express, but there are still several degrees of freedom in the syntax.
>
> As the model has simplified during the design process, the space of syntax
> choices has been pruned back, which is a good thing.  However, there are
> still
> quite a few smaller decisions to be made.  Not all of the considerations
> are
> orthogonal, so while they are presented individually, this is not a "pick
> one
> from each column" menu.
>
> Some of these simplifications include:
>
>  - Patterns with "input arguments" have been removed; another way to get
> to what
>    this gave us may come back in another form.
>  - I have grown increasingly skeptical of the value of the imperative
> `match`
>    statement.  With better totality analysis, I think it can be eliminated.
>
> We can discuss these separately but I would like to sync first on the broad
> strokes for how patterns are expressed.
>
> ## Object model requirements
>
> As outlined in "Towards Member Patterns", the basic model is that patterns
> are
> the dual of other executable members (constructors, static methods,
> instance
> methods.)  While they are like methods in that they have inputs, outputs,
> names,
> and an imperative body, they have additional degrees of freedom that
> constructors and methods lack:
>
>  - Patterns are, in general, _conditional_ (they can succeed or fail), and
> only
>    produce bindings (outputs) when they succeed.  This conditionality is
>    understood by the language's flow analysis, and is used for computing
> scoping
>    and definite assignment.
>  - Methods can return at most one value; when a pattern completes
> successfully,
>    it may bind multiple values.
>  - All patterns have a _match candidate_, which is a distinguished,
>    possibly-implicit parameter.  Some patterns also have a receiver, which
> is
>    also a distinguished, possibly-implicit parameter.  In some such cases
> the
>    receiver and match candidate are aliased, but in others these may refer
> to
>    different objects.
>
> So a pattern is a named executable member that takes a _match candidate_
> as a
> possibly-implicit parameter, maybe takes a receiver as an implicit
> parameter,
> and has zero or more conditional _bindings_.  Its body can perform
> imperative
> computation, and can terminate either with match failure or success.  In
> the
> success case, it must provide a value for each binding.
>
> Deconstruction patterns are special in many of the same ways constructors
> are:
> they are constrained in their name, inheritance, and probably their
> conditionality (they should probably always succeed).  Just as the syntax
> for
> constructors differs slightly from that of instance methods, the syntax for
> deconstructors may differ slightly from that of instance patterns.  Static
> patterns, like static methods, have no receiver and do not have access to
> the
> type parameters of the enclosing class.
>
> Like constructors and methods, patterns can be overloaded, but in
> accordance
> with their duality to constructors and methods, the overloading happens on
> the
> _bindings_, not the inputs.
>
> ## Use-site syntax
>
> There are several kinds of type-driven patterns built into the language:
> type
> patterns and record patterns.  A type pattern in a `switch` looks like:
>
>     case String s: ...
>
> And a record pattern looks like:
>
>     case MyRecord(P1, P2, ...): ...
>
> where `P1..Pn` are nested patterns that are recursively matched to the
> components of the record.  This use-site syntax for record patterns was
> chosen
> for its similarity to the construction syntax, to highlight that a record
> pattern is the dual of record construction.
>
> **Deconstruction patterns.**  The simplest kind of member pattern, a
> deconstruction pattern, will have the same use-site syntax as a record
> pattern;
> record patterns can be thought of as a deconstruction pattern "acquired for
> free" by records, just as records do with constructors, accessors, object
> methods, etc.  So the use of a deconstruction pattern for `Point` looks
> like:
>
>     case Point(var x, var y): ...
>
> whether `Point` is a record or an ordinary class equipped with a suitable
> deconstruction pattern.
>
> **Static patterns.**  Continuing with the idea that the destructuring
> syntax
> should evoke the aggregation syntax, there is an obvious candidate for the
> use-site syntax for static patterns:
>
>     case Optional.of(var e): ...
>     case Optional.empty(): ...
>
> **Instance patterns.**  Uses of instance patterns will likely come in two
> forms,
> analogous to bound and unbound instance method references, depending on
> whether
> the receiver and the match candidate are the same object.  In the unbound
> form,
> used when the receiver is the same object as the match candidate, the
> pattern
> name is qualified by a _type_:
>
> ```
> Class<?> k = ...
> switch (k) {
>     // Qualified by type
>     case Class.arrayClass(var componentType): ...
> }
> ```
>
> This means that we _resolve_ the pattern `arrayClass` starting at `Class`
> and
> _select_ the pattern using the receiver, `k`.  We may also be able to omit
> the
> class qualifier if the static type of the match candidate is sufficient to
> resolve the desired pattern.
>
> In the bound form, used when the receiver is distinct from the match
> candidate,
> the pattern name is qualified with an explicit _receiver expression_.  As
> an
> example, consider an interface that captures primitive widening and
> narrowing
> conversions, such as those between `int` and `long`.  In the widening
> direction,
> conversion is unconditional, so this can be modeled as a method from `int`
> to
> `long`.  In the other direction, conversion is conditional, so this is
> better
> modeled as a _pattern_ whose match candidate is `long` and which binds an
> `int`
> on success.  Since these are instance methods of some class (say,
> `NumericConversion<T,U>`), we need to provide the receiver instance in
> order to
> resolve the pattern:
>
> ```
> NumericConversion<int, long> nc = ...
>
> switch (aLong) {
>     case nc.narrowed(int i):
>     ...
> }
> ```
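
A rough way to picture the widening/narrowing pair in today's Java, as a
sketch only: `NarrowingSketch`, `widened` and `narrowed` are hypothetical names
of mine, and `OptionalInt` stands in for the conditional binding (empty means
the pattern did not match).

```java
import java.util.OptionalInt;

final class NarrowingSketch {
    // Widening is unconditional, so an ordinary method models it.
    static long widened(int i) {
        return i;
    }

    // Narrowing is conditional: it "matches" only when the long fits in an
    // int, and then "binds" the int value.
    static OptionalInt narrowed(long candidate) {
        if (candidate >= Integer.MIN_VALUE && candidate <= Integer.MAX_VALUE)
            return OptionalInt.of((int) candidate);
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(narrowed(widened(42)).isPresent());
        System.out.println(narrowed(1L << 40).isPresent());
    }
}
```

Anything produced by `widened` is accepted by `narrowed` and gives back the
original `int`, which is the method/pattern duality in miniature.
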
>
> The explicit receiver syntax would also be used if we exposed regular
> expression
> matching as a pattern on the `j.u.r.Pattern` object (the name collision on
> `Pattern` is unfortunate).  Imagine we added a `matching` instance pattern
> to
> `j.u.r.Pattern`; then we could use it in `instanceof` as follows:
>
> ```
> static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)");
> ...
> if (aString instanceof P.matching(String as, String bs)) { ... }
> ```
>
> Each of these use-site syntaxes is modeled after the use-site syntax for a
> method invocation or method reference.
>
> ## Declaration-site syntax
>
> To avoid being biased by the simpler cases, we're going to work all the
> cases
> concurrently rather than starting with the simpler cases and working up.
> (It
> might seem sensible to start with deconstructors, since they are the "easy"
> case, but if we did that, we would likely be biased by their simplicity
> and then
> find ourselves painted into a corner.)  As our example gallery, we will
> consider:
>
>  - Deconstruction pattern for `Point`;
>  - Static patterns for `Optional::of` and `Optional::empty`;
>  - Static pattern for "power of two" (illustrating a computation where
>    success or failure, and computation of bindings, cannot easily be
>    separated);
>  - Instance pattern for `Class::arrayClass` (used unbound);
>  - Instance pattern for `Pattern::matching` on regular expressions (used
> bound).
>
> Member patterns, like methods, have _names_.  (We can think of
> constructors as
> being named for their enclosing classes, and the same for
> deconstructors.)  All
> member patterns have a (possibly empty) ordered list of _bindings_, which
> are
> the dual of constructor or method parameters.  Bindings, in turn, have
> names and
> types.  And like constructors and methods, member patterns have a _body_
> which
> is a block statement.  Member patterns also have a _match candidate_,
> which is a
> likely-implicit method parameter.
>
> ### Member patterns as inverse methods and constructors
>
> Regardless of syntax, let us remind ourselves that deconstructors are the
> categorical dual to constructors (coconstructors), and pattern methods are
> the
> categorical dual to methods (comethods).  They are dual in their
> structure: a
> constructor or method takes N arguments and produces a result, the
> corresponding
> member pattern consumes a match candidate and (conditionally) produces N
> bindings.
>
> Moreover, they are semantically dual: the return value produced by
> construction
> or factory invocation is the match candidate for the corresponding member
> pattern, and the bindings produced by a member pattern are the answers to
> the
> _Pattern Question_ -- "could this object have come from an invocation of my
> dual, and if so, with what arguments."
>
> ### What do we call them?
>
> Given the significant overlap between methods and patterns, the first
> question
> about the declaration we need to settle is how to identify a member pattern
> declaration as distinct from a method or constructor declaration.  _Towards
> Member Patterns_ tried out a syntax that recognized these as _inverse_
> methods
> and constructors:
>
>     public Point(int x, int y) { ... }
>     public inverse Point(int x, int y) { ... }
>
> While this is a principled choice which clearly highlights the duality,
> and one
> that might be good for specification and verbal description, it is
> questionable
> whether this would be a great syntax for reading and writing programs.
>
> A more traditional option is to choose a "noun" (conditional) keyword,
> such as
> `pattern`, `matcher`, `extractor`, `view`, etc:
>
>     public pattern Point(int x, int y) { ... }
>
> If we are using a noun keyword to identify pattern declarations, we could
> use
> the same noun for all of them, or we could choose a different one for
> deconstruction patterns:
>
>     public deconstructor Point(int x, int y) { ... }
>
> Alternately, we could reach for a symbol to indicate that we are talking
> about
> an inverted member.  C++ fans might suggest
>
>     public ~Point(int x, int y) { ... }
>
> but this is too cryptic (it's evocative once you see it, but then it
> becomes
> less evocative as we move away from deconstructors towards instance
> patterns.)
>
> If we wish to offer finer-grained control over conditionality, we might
> additionally need a `total` / `partial` modifier, though I would prefer to
> avoid
> that.
>
> Of the keyword candidates, there is one that stands out (for good and bad)
> because it connects to something that is already in the language:
> `pattern`.  On
> the one hand, using the term `pattern` for the declaration is a slight
> abuse; on
> the other, users will immediately connect it with "ah, so that's how I
> make a
> new pattern" or "so that's what happens when I match against this pattern."
> (Lisps would resolve this tension by calling it `defpattern`.)
>
> The others (`matcher`, `view`, `extractor`, etc) are all made-up terms that
> don't connect to anything else in the language, for better or worse.  If
> we pick
> one of these, we are asking users to sort out _three_ separate new things
> in
> their heads: (use-site) patterns, (declaration-site) matchers, and the
> rules of
> how patterns and matchers are connected.  Calling them both "patterns",
> despite
> the mild abuse of terminology, ties them together in a way that recognizes
> their
> connection.
>
> My personal position: `pattern` is the strongest candidate here, despite
> some
> flaws.
>
> ### Binding lists and match candidates
>
> There are two obvious alternatives for describing the binding list and
> match
> candidate of a pattern declaration, both with their roots in the
> constructor and
> method syntax:
>
>  - Pretend that a pattern declaration is like a method with multiple
> return, and
>    put the binding list in the "return position", and make the match
> candidate
>    an ordinary parameter;
>  - Lean into the inverse relationship between constructors and methods (and
>    consistency with the use-site syntax), and put the binding list in the
>    "parameter list position". For static patterns and some instance
> patterns,
>    which need to explicitly identify the match candidate type, there are
> several
>    sub-options:
>    - Lean further into the duality, putting the match candidate type in the
>      "return position";
>    - Put the match candidate type somewhere else, where it is less likely
> to be
>      confused for a method return.
>
> The "method-like" approach might look like this:
>
> ```
> class Point {
>     // Constructor and deconstructor
>     public Point(int x, int y) { ... }
>     public pattern (int x, int y) Point(Point target) { ... }
>     ...
> }
>
> class Optional<T> {
>     // Static factory and pattern
>     public static<T> Optional<T> of(T t) { ... }
>     public static<T> pattern (T t) of(Optional<T> target) { ... }
>     ...
> }
> ```
>
> The "inverse" approach might look like:
>
> ```
> class Point {
>     // Constructor and deconstructor
>     public Point(int x, int y) { ... }
>     public pattern Point(int x, int y) { ... }
>     ...
> }
>
> class Optional<T> {
>     // Static factory and pattern (using the first sub-option)
>     public static<T> Optional<T> of(T t) { ... }
>     public static<T> pattern Optional<T> of(T t) { ... }
>     ...
> }
> ```
>
> With the "method-like" approach, the match candidate gets an explicit name
> selected by the author; with the inverse approach, we can go with a
> predefined
> name such as `that`.  (Because deconstructors do not have receivers, we
> could by
> abuse of notation arrange for the keyword `this` to refer instead to the
> match
> candidate within the body of a deconstructor.  While this might seem to
> lead to
> a more familiar notation for writing deconstructors, it would create a
> gratuitous asymmetry between the bodies of deconstruction patterns and
> those of
> other patterns.)
>
> Between these choices, nearly all the considerations favor the "inverse"
> approach:
>
>  - The "inverse" approach makes the declaration look like the use site.
> This
>    highlights that `pattern Point(int x, int y)` is what gets invoked when
> you
>    match against the pattern use `Point(int x, int y)`.  (This point is so
>    strong that we should probably just stop here.)
>  - The "inverse" members also look like their duals; the only difference
> is the
>    `pattern` keyword (and possibly the placement of the match candidate
> type).
>    This makes matched pairs much more obvious, and such matched pairs will
> be
>    critical both for future language features and for library idioms.
>  - The method-like approach is suggestive of multiple return or tuples,
> which is
>    probably helpful for the first few minutes but actually harmful in the
> long
>    term. This feature is _not_ (much as some people would like to believe)
> about
>    multiple return or tuples, and playing into this misperception will
> only make
>    it harder to truly understand.  So this suggestion ends up propping up
> the
>    wrong mental model.
>
> The main downside of the "inverse" approach is the one-time speed bump of
> the
> unfamiliarity of the inverted syntax.  (The "method-like" syntax also has
> its
> own speed bumps, it is just unfamiliar in different ways.)  But unlike the
> advantages of the inverse approach, which continue to add value forever,
> this
> speed bump is a one-time hurdle to get over.
>
> To smooth out the speed bumps of the inverse approach, we can consider
> moving
> the position of the match candidate for static and (suitable) instance
> pattern
> declarations, such as:
>
> ```
> class Optional<T> {
>     // the usual static factory
>     public static<T> Optional<T> of(T t) { ... }
>
>     // Various ways of writing the corresponding pattern
>     public static<T> pattern of(T t) for Optional<T> { ... }
>     // or ...
>     public static<T> pattern(Optional<T>) of(T t) { ... }
>     // or ...
>     public static<T> pattern(Optional<T> that) of(T t) { ... }
>     // or ...
>     public static<T> pattern<Optional<T>> of(T t) { ... }
>     ...
> }
> ```
>
> (The deconstructor example looks the same with either variant.)  Of these,
> treating the match candidate like a "parameter" of "pattern" is probably
> the
> most evocative:
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) { ... }
> ```
>
> as it can be read as "pattern taking the parameter `Optional<T> that` called
> `of`, binding `T`", and is a short departure from the inverse syntax.
>
> The main value of the various rearrangements is that users don't need to think
> about things operating in reverse to parse the syntax.  This gives up some of
> the secondary point (patterns looking almost exactly like their inverses) in
> exchange for a lower cognitive load, while maintaining the most important
> consideration: that the declaration site look like the use site.
>
> For instance pattern declarations, if the match candidate type is the same
> as
> the receiver type, the match candidate type can be elided as it is with
> deconstructors.
>
> My personal position: the "multiple return" version is terrible; all the
> sub-variants of the inverse version are probably workable.
>
> ### Naming the match candidate
>
> We've been assuming so far that the match candidate always has a fixed
> name,
> such as `that`; this is an entirely workable approach.  Some of the
> variants are
> also amenable to allowing authors to explicitly select a name for the match
> candidate.  For example, if we put the match candidate as a "parameter" to
> the `pattern` keyword, there is an obvious place to put the name:
>
> ```
> static<T> pattern(Optional<T> target) of(T t) { ... }
> ```
>
> My personal opinion: I don't think this degree of freedom buys us much,
> and in
> the long run readability probably benefits by picking a fixed name like
> `that`
> and sticking with it.  Even with a fixed name, if there is a sensible
> position
> for the name, allowing users to type `that` for explicitness is fine (as
> we do
> with instance methods, though many people don't know this.)  We may even
> want to
> require it.
>
> ## Body types
>
> Just as there are two obvious approaches for the declaration, there are two
> obvious approaches we could take for the body (though there is some
> coupling
> between them.)  We'll call the two body approaches _imperative_ and
> _functional_.
>
> The imperative approach treats bindings as initially-DU variables that
> must be
> DA on successful completion, getting their value through ordinary
> assignment;
> the functional approach sets all the bindings at once, positionally.
> Either
> way, member patterns (except maybe deconstructors) also need a way to
> differentiate a successful match from a failed match.
>
> Here is the `Point` deconstructor with both imperative and functional
> style. The
> functional style uses a placeholder `match` statement to indicate a
> successful
> match and provision of bindings:
>
> ```
> class Point {
>     int x, y;
>
>     Point(int x, int y) {
>         this.x = x;
>         this.y = y;
>     }
>
>     // Imperative style, deconstructor always succeeds
>     pattern Point(int x, int y) {
>         x = that.x;
>         y = that.y;
>     }
>
>     // Functional style
>     pattern Point(int x, int y) {
>         match(that.x, that.y);
>     }
> }
> ```
>
> There are some obvious differences here.  In the imperative style, the dtor
> body looks much more like the reverse of the ctor body.  The functional style
> is more concise (and amenable to further concision via the "concise method
> bodies" mechanism in the future).  There are also a number of less obvious
> differences.  For deconstructors, the imperative approach is likely to feel
> more natural because of the obvious symmetry with constructors.
>
> In reality, it is _premature at this point to have an opinion_, because we
> haven't yet seen the full scope of the problem; deconstructors are a
> special
> case in many ways, which almost surely is distorting our initial opinion.
> As we
> move towards conditional patterns (and pattern lambdas), our opinions may
> flip.
>
> Regardless of which we pick, there are some additional syntactic choices
> to be
> made -- what syntax to use to indicate success (we used `match` in the
> above
> example) or failure.  (We should be especially careful around trying to
> reuse
> words like `return`, `break`, or `yield` because, in the case where there
> are
> zero bindings (which is allowable), it becomes unclear whether they mean
> "fail"
> or "succeed with zero bindings".)
>
> ### Success and failure
>
> Except for possibly deconstructors, which we may require to be total, a
> pattern
> declaration needs a way to indicate success and failure.  In the examples
> above,
> we posited a `match` statement to indicate success in the functional
> approach,
> and in both examples leaned on the "implicit success" of deconstructors
> (under
> the assumption they always succeed).  Now let's look at the more general
> case to
> figure out what else is needed.
>
> For a static pattern like `Optional::of`, success is conditional.  Using
> `match-fail` as a placeholder for "the match failed", this might look like
> (functional version):
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) {
>     if (that.isPresent())
>         match (that.get());
>     else
>         match-fail;
> }
> ```
>
> The imperative version is less pretty, though.  Using `match-success` as a
> placeholder:
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) {
>     if (that.isPresent()) {
>         t = that.get();
>         match-success;
>     }
>     else
>         match-fail;
> }
> ```
>
> Both arms of the `if` feel excessively ceremonial here.  And if we chose
> to not
> make all deconstruction patterns unconditional, deconstructors would
> likely need
> some explicit success as well:
>
> ```
> pattern Point(int x, int y) {
>     x = that.x;
>     y = that.y;
>     match-success;
> }
> ```
>
> It might be tempting to try to eliminate the need for explicit success by
> inferring it from whether or not the bindings are DA, but this is
> error-prone, is less type-checkable, and falls apart completely for patterns
> with no bindings.
>
> ### Implicit failure in the functional approach
>
> One of the ceremonial-seeming aspects of `Optional::of` above is having to
> say
> `else match-fail`, which doesn't feel like it adds a lot of value.
> Perhaps we
> can be more concise without losing clarity.
>
> Most conditional patterns will have a predicate to determine matching, and
> then
> some conditional code to compute the bindings and claim success.  Having
> to say
> "and if the predicate didn't hold, then I fail" seems like ceremony for the
> author and noise for the reader.  Instead, if a conditional pattern falls
> off
> the end without matching, we could treat that as simply not matching:
>
> ```
> public static<T> pattern(Optional<T> that) of(T t) {
>     if (that.isPresent())
>         match (that.get());
> }
> ```
>
> This says what we mean: if the optional is present, then this pattern
> succeeds and binds the contents of the `Optional`.  As long as our "succeed"
> construct
> strongly enough connotes that we are terminating abruptly and
> successfully, this
> code is perfectly clear.  And most conditional patterns will look a lot
> like
> `Optional::of`; do some sort of test and if it succeeds, extract the state
> and
> bind it.
>
> At first glance, this "implicit fail" idiom may seem error-prone or
> sloppy.  But
> after writing a few dozen patterns, one quickly tires of saying "else
> match-fail" -- and the reader doesn't necessarily appreciate reading it
> either.
>
> Implicit failure also simplifies the selection of how we explicitly
> indicate
> failure; using `return` in a pattern for "no match" becomes pretty much a
> forced
> move.  We observe that (in a void method), "return" and "falling off the
> end"
> are equivalent; if "falling off the end" means "no match", then so should
> an
> explicit `return`.  So in those few cases where we need to explicitly
> signal "no
> match", we can just use `return`.  It won't come up that often, but here's
> an
> example where it does:
>
> ```
> static pattern(int that) powerOfTwo(int exp) {
>     int exp = 0;
>
>     if (that < 1)
>         return; // explicit fail
>
>     while (that > 1) {
>         if (that % 2 == 0) {
>             that /= 2;
>             ++exp;
>         }
>         else
>             return; // explicit fail
>     }
>     match (exp);
> }
> ```
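
The same control flow can be tried out in today's Java, as a sketch only:
`PowerOfTwoSketch` is a hypothetical name, the match candidate becomes an
ordinary parameter, and `OptionalInt` stands in for the pattern result (empty
plays the role of `return`, i.e. "no match"; `OptionalInt.of(exp)` plays the
role of `match (exp)`).

```java
import java.util.OptionalInt;

final class PowerOfTwoSketch {
    static OptionalInt powerOfTwo(int candidate) {
        int exp = 0;

        if (candidate < 1)
            return OptionalInt.empty();       // explicit fail

        while (candidate > 1) {
            if (candidate % 2 != 0)
                return OptionalInt.empty();   // explicit fail
            candidate /= 2;
            ++exp;
        }
        return OptionalInt.of(exp);           // match (exp)
    }

    public static void main(String[] args) {
        System.out.println(powerOfTwo(8).isPresent());
        System.out.println(powerOfTwo(12).isPresent());
    }
}
```

Here success or failure and the value of `exp` come out of the same loop, so
the two cannot be computed separately.
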
>
> As a bonus, if `return` as match failure is a forced move, we need only
> select a
> term for "successful match" (which obviously can't be `return`).  We could
> use
> `match` as we have in the examples, or a variant like `matched` or
> `matches`.
> But rather than just creating a new control operator, we have an
> opportunity to
> lean into the duality a little harder, by including the pattern syntax in
> the
> match:
>
> ```
> matches of(that.get());
> ```
>
> or the (optionally?) qualified (inferring type arguments, as we do at the
> use
> site):
>
> ```
> matches Optional.of(that.get());
> ```
>
> These "use the name" approaches trade a small amount of verbosity to gain a
> higher degree of fidelity to the pattern use site (and to evoke the comethod
> completion.)
>
> If we don't choose "implicit fail", we would have to invent _two_ new
> control
> flow statements to indicate "success" and "failure".
>
> My personal position: for the functional approach, implicit failure makes
> the code both simpler and clearer, and after you get used to it, you don't
> want to go back.  `match`, `matches`, and `matches <pattern-name>` are all
> workable, though I like some variant that names the pattern.
>
> ### Implicit success in the imperative approach
>
> In the imperative approach, we can be implicit as well, but it feels more
> natural (at least, initially) to choose implicit success rather than
> failure.
> This works great for unconditional patterns:
>
> ```
> pattern Point(int x, int y) {
>     x = that.x;
>     y = that.y;
>     // implicit success
> }
> ```
>
> but not quite as well for conditional patterns:
>
> ```
> static<T> pattern(Optional<T> that) of(T t) {
>     if (that.isPresent()) {
>         t = that.get();
>     }
>     else
>         match-fail;
>     // implicit success
> }
> ```
>
> We can eliminate one of the arms of the if, with the more concise (but
> convoluted) inversion:
>
> ```
> static<T> pattern(Optional<T> that) of(T t) {
>     if (!that.isPresent())
>         match-fail;
>     t = that.get();
>     // implicit success
> }
> ```
>
> Just as with the functional approach, if we choose imperative and "implicit
> success", using `return` to indicate success is pretty much a forced move.
>
>
> ### Imperative is a trap
>
> If we assume that functional implies implicit failure, and imperative
> implies
> implicit success, then our choices become:
>
> ```
> class Optional<T> {
>     public static<T> Optional<T> of(T t) { ... }
>
>     // imperative, implicit success
>     public static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent()) {
>             t = that.get();
>         }
>         else
>             match-fail;
>     }
>
>     // functional, implicit failure
>     public static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent())
>             matches of(that.get());
>     }
> }
> ```
>
> Once we get past deconstructors, the imperative approach looks worse by
> comparison because we need to assign all the bindings (which is _O(n)_
> assignments) _and also_ indicate success or failure somehow, whereas in the
> functional style all can be done together with a single `matches`
> statement.
>
> Looking at the alternatives, except maybe for unconditional patterns, the
> functional example above seems a lot more natural.  The imperative approach
> works with deconstructors (assuming they are not conditional), but does not
> scale so well to conditionality -- which is the essence of patterns.
>
> From a theoretical perspective, the method-comethod duality also gives us a
> forceful nudge towards the functional approach.  In a method, the method
> arguments are specified as a positional list of expressions at the use
> site:
>
>     m(a, b, c)
>
> and these values are invisibly copied into the parameter slots of the
> method
> prior to frame activation.  The dual to that is for a comethod to similarly
> convey
> the bindings in a positional list of expressions (as they must either all
> be
> produced or none), where they are copied into the slots provided at the use
> site, as is indicated by `matches` in the above examples.
>
> My personal position: the imperative style feels like a trap.  It seems
> "obvious" at first if we start with deconstructors, but becomes
> increasingly
> difficult when we get past this case, and gets in the way of other
> opportunities.  The last gasp before acceptance is the discomfort that
> dtor and
> ctor bodies are written in different styles, but in the rear-view mirror,
> this
> feels like a non-issue.
>
> ### Derive imperative from functional?
>
> If we start with "functional with implicit failure", we can possibly rescue
> imperative by deriving a version of imperative from functional, by
> "overloading"
> the match-success operator.
>
> If we have a pattern whose binding names are `b1..bn` of types `B1..Bn`,
> then
> the `matches` operator must take a list of expressions `e1..en` whose
> arity and
> types are compatible with `B1..Bn`.  But we could allow `matches` to also
> have a
> nilary form, which would have the effect of being shorthand for
>
>     matches <pattern-name>(b1, b2, ..., bn)
>
> where each of `b1..bn` must be DA at the point of matching.  This means
> that we
> could express patterns in either form:
>
> ```
> class Optional<T> {
>     public static<T> Optional<T> of(T t) { ... }
>
>     // imperative, derived from functional with implicit failure
>     public static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent()) {
>             t = that.get();
>             matches of;
>         }
>     }
>
>     public static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent())
>             matches of(that.get());
>     }
> }
> ```
>
> This flexibility allows users to select a more verbose expression in
> exchange
> for a clearer association of expressions and bindings, though as we'll
> see, it
> does come with some additional constraints.
>
> ### Wrapping an existing API
>
> Nearly every library has methods (sometimes sets of methods) that are
> patterns
> in disguise, such as the pair of methods `isArray` and `getComponentType`
> in
> `Class`, or the `Matcher` helper type in `java.util.regex`.  Library
> maintainers
> will likely want to wrap (or replace) these with real patterns, so these
> can
> participate more effectively in conditional contexts, and in some cases,
> highlight their duality with factory methods.
>
> Matching a string against a `j.u.r.Pattern` regular expression has all the
> same
> elements as a pattern, just with an ad-hoc API (and one that I have to
> look up
> every time).  But we can fairly easily wrap a true pattern around the
> existing
> API.  To match against a `Pattern` today, we pass the match candidate to
> `Pattern::matcher`, which returns a `Matcher` with accessors
> `Matcher::matches`
> (did it match) and `Matcher::group` (conditionally extract a particular
> capture
> group.)  If we want to wrap this with a pattern called `regexMatch`:
>
> ```
> pattern(String that) regexMatch(String... groups) {
>     Matcher m = this.matcher(that);
>     if (m.matches())
>         matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
>                                             .mapToObj(m::group)
>                                             .toArray(String[]::new));
>     // whole lotta matchin' goin' on
> }
> ```
>
> This says that a `j.u.r.Pattern` has an instance pattern called `regexMatch`,
> whose
> match candidate is `String`, and which binds a varargs of `String`
> corresponding
> to the capture groups.  The implementation simply delegates to the existing
> `j.u.r.Matcher` API.  This means that `j.u.r.Pattern` becomes a sort of
> "pattern
> object", and we can use it as a receiver at the use site:
>
> ```
> static Pattern As = Pattern.compile("(a*)");
> static Pattern Bs = Pattern.compile("(b*)");
> ...
> switch (string) {
>     case As.regexMatch(var as): ...
>     case Bs.regexMatch(var bs): ...
>     ...
> }
> ```
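
For comparison, here is a minimal sketch of what the same extraction looks like in today's Java, without member patterns. The class `RegexGroups` and the method `groupsIfMatches` are hypothetical names chosen for illustration; only the `java.util.regex` and stream calls are existing API. An empty `Optional` stands in for match failure, and the list stands in for the bindings.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class RegexGroups {
    // Hypothetical helper: if the whole candidate matches, return capture
    // groups 1..groupCount(); otherwise return an empty Optional.
    static Optional<List<String>> groupsIfMatches(Pattern p, String candidate) {
        Matcher m = p.matcher(candidate);
        if (!m.matches())
            return Optional.empty();
        // Group 0 is the whole match, so capture groups start at 1 and the
        // upper bound groupCount() is inclusive.
        return Optional.of(IntStream.rangeClosed(1, m.groupCount())
                                    .mapToObj(m::group)
                                    .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(a*)(b*)");
        System.out.println(groupsIfMatches(p, "aaabb"));
        System.out.println(groupsIfMatches(p, "xyz"));
    }
}
```

The caller has to test the `Optional` and then unpack the list by index, which is the ad-hoc protocol a `regexMatch` pattern would fold into a single `case`.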
>
> ### Odds and ends
>
> There are a number of loose ends here.  We could choose other names for the
> match-success and match-fail operations, including trying to reuse `break`
> or
> `yield`.  But, this reuse is tricky; it must be very clear whether a given
> form
> of abrupt completion means "success" or "failure", because in the case of
> patterns with no bindings, we will have no other syntactic cues to help
> disambiguate.  (I think having a single `matches`, with implicit failure
> and
> `return` meaning failure, is the sweet spot here.)
>
> Another question is whether the binding list introduces corresponding
> variables
> into the scope of the body.  For imperative, the answer is "surely yes";
> for
> functional, the answer is "maybe" (unless we want to do the trick where we
> derive imperative from functional, in which case the answer is "yes"
> again.)
>
> If the binding list does not correspond to variables in the body, this may
> be
> initially discomforting; because they do not declare program elements,
> they may
> feel that they are left "dangling".  But even if they are not declaring
> _program_ elements, they are still declaring _API_ elements (similar to the
> return type of a method.)  We will want to provide Javadoc on the
> bindings, just
> like with parameters; we will want to match up binding names in
> deconstructors
> with parameter names in constructors; we may even someday want to support
> by-name binding at the use site (e.g., `case Foo(a: var a)`).  The names
> are
> needed for all of these, just not for the body. Names still matter.  My
> take
> here is that this is a transient "different is scary" reaction, one that we
> would get over quickly.
>
> A final question is whether we should consider unqualified names as
> implicitly
> qualified by `that` (and also `this`, for instance patterns, with some
> conflict
> resolution).  Users will probably grow tired of typing `that.` all the
> time, and most of the time, the unqualified use is perfectly readable.
>
> ## Exhaustiveness
>
> There is one last syntax question in front of us: how to indicate that a
> set of
> patterns are (claimed to be) exhaustive on a given match candidate type.
> We see
> this with `Optional::of` and `Optional::empty`; it would be sad if the
> compiler
> did not realize that these two patterns together were exhaustive on
> `Optional`.
> This is not a feature that will be used often, but not having it at all
> will be
> a repeated irritant.
>
> The best I've come up with is to call these `case` patterns, where a set of
> `case` patterns for a given match candidate type in a given class are
> asserted
> to be an exhaustive set:
>
> ```
> class Optional<T> {
>     static<T> Optional<T> of(T t) { ... }
>     static<T> Optional<T> empty() { ... }
>
>     static<T> case pattern of(T t) for Optional<T> { ... }
>     static<T> case pattern empty() for Optional<T> { ... }
> }
> ```
>
> Because they may not be truly exhaustive, `switch` constructs will have to
> back
> up the static assumption of exhaustiveness with a dynamic check, as we do
> for
> other sets of exhaustive patterns that may have remainder.
>
> I've experimented with variants of `sealed` but it felt more forced, so
> this is
> the best I've come up with.
>
> ## Example: patterns delegating to other patterns
>
> Pattern implementations must compose.  Just as a subclass constructor
> delegates
> to a superclass constructor, the same should be true for deconstructors.
> Here's a typical superclass-subclass pair:
>
> ```
> class A {
>     private final int a;
>
>     public A(int a) { this.a = a; }
>     public pattern A(int a) { matches A(that.a); }
> }
>
> class B extends A {
>     private final int b;
>
>     public B(int a, int b) {
>         super(a);
>         this.b = b;
>     }
>
>     // Imperative style
>     public pattern B(int a, int b) {
>         if (that instanceof super(var aa)) {
>             a = aa;
>             b = that.b;
>             matches B;
>         }
>     }
>
>     // Functional style
>     public pattern B(int a, int b) {
>         if (that instanceof super(var a))
>             matches B(a, b);
>     }
> }
> ```
>
> (Ignore the flow analysis and totality for the time being; we'll come back
> to
> this in a separate document.)
>
> The first thing that jumps out at us is that, in the imperative version,
> we had
> to create a "garbage" variable `aa` to receive the binding, because `a` was
> already in scope, and then we have to copy the garbage variable into the
> real
> binding variable. Users will surely balk at this, and rightly so.  In the
> functional version (depending on the choices from "Odds and Ends") we are
> free
> to use the more natural name and avoid the roundabout locution.
>
> We might be tempted to fix the "garbage variable" problem by inventing
> another
> sub-feature: the ability to use an existing variable as the target of a
> binding,
> such as:
>
> ```
> pattern Point(int a, int b) {
>     if (this instanceof A(__bind a))
>         b = this.b;
> }
> ```
>
> But, I think the language is stronger without this feature, for two
> reasons.
> First, having to reason about whether a pattern match introduces a new
> binding
> or assigns to an existing variable is additional cognitive load for users,
> and second, having assignment to locals happening through
> something other than assignment introduces additional complexity in finding
> where a variable is modified.  While we can argue about the general
> utility of
> this feature, bringing it in just to solve the garbage-variable problem is
> particularly unattractive.
>
> ## Pattern lambdas
>
> One final consideration is that patterns may also have a lambda form.
> Given
> a single-abstract-pattern (SAP) interface:
>
> ```
> interface Converter<T,U> {
>     pattern(T t) convert(U u);
> }
> ```
>
> one can implement such a pattern with a lambda. Such a lambda has one
> parameter
> (the match candidate), and its body looks like the body of a declared
> pattern:
>
> ```
> Converter<Integer, Short> c =
>     i -> {
>         if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
>             matches Converter.convert((short) i);
>     };
> ```
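
As a point of reference, the same conditional narrowing can be sketched in today's Java by returning an `Optional`. The names `Narrowing` and `toShort` are hypothetical and not part of the draft; an empty result plays the role of match failure.

```java
import java.util.Optional;

final class Narrowing {
    // Hypothetical stand-in for the pattern lambda: succeeds (with a value)
    // only when the int fits in a short, otherwise "fails" with empty.
    static Optional<Short> toShort(int i) {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
            return Optional.of((short) i);
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(toShort(42));
        System.out.println(toShort(40_000));
    }
}
```

Unlike the pattern form, the result here has to be unwrapped explicitly at the use site and cannot take part in `switch` or `instanceof` as a pattern.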
>
> Because the bindings of the pattern lambda are defined in the interface,
> not in
> the lambda, this is one more reason not to like the imperative version: it
> is
> brittle, and alpha-renaming bindings in the interface would be a
> source-incompatible change.
>
> ## Example gallery
>
> Here's all the pattern examples so far, and a few more, using the suggested
> style (functional, implicit fail, implicit `that`-qualification):
>
> ```
> // Point dtor
> pattern Point(int x, int y) {
>     matches Point(x, y);
> }
>
> // Optional -- static patterns for Optional::of, Optional::empty
> static<T> case pattern(Optional<T> that) of(T t) {
>     if (isPresent())
>         matches of(get());
> }
>
> static<T> case pattern(Optional<T> that) empty() {
>     if (!isPresent())
>         matches empty();
> }
>
> // Class -- instance pattern for arrayClass (match candidate type inferred)
> pattern arrayClass(Class<?> componentType) {
>     if (that.isArray())
>         matches arrayClass(that.getComponentType());
> }
>
> // regular expression -- instance pattern in j.u.r.Pattern
> pattern(String that) regexMatch(String... groups) {
>     Matcher m = matcher(that);
>     if (m.matches())
>         matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
>                                             .mapToObj(m::group)
>                                             .toArray(String[]::new));
> }
>
> // power of two (somewhere)
> static pattern(int that) powerOfTwo(int exp) {
>     int exp = 0;
>
>     if (that < 1)
>         return;
>
>     while (that > 1) {
>         if (that % 2 == 0) {
>             that /= 2;
>             exp++;
>         }
>         else
>             return;
>     }
>     matches powerOfTwo(exp);
> }
> ```
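
The power-of-two example is the one where success and the binding are computed together, so it may help to see the same control flow as ordinary Java. This is only a sketch under assumed names (`PowerOfTwo`, `exponentOf`); `OptionalInt.empty()` corresponds to the `return` (implicit failure) paths and `OptionalInt.of(exp)` to `matches powerOfTwo(exp)`.

```java
import java.util.OptionalInt;

final class PowerOfTwo {
    // Hypothetical method form of the powerOfTwo pattern: returns the
    // exponent when n is a positive power of two, otherwise empty.
    static OptionalInt exponentOf(int n) {
        if (n < 1)
            return OptionalInt.empty();      // failure: non-positive input
        int exp = 0;
        while (n > 1) {
            if (n % 2 != 0)
                return OptionalInt.empty();  // failure: odd factor found
            n /= 2;
            exp++;
        }
        return OptionalInt.of(exp);          // success: bind the exponent
    }

    public static void main(String[] args) {
        System.out.println(exponentOf(8));
        System.out.println(exponentOf(12));
    }
}
```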
>
> ## Closing thoughts
>
> I came out of this exploration with very different conclusions than I
> expected
> when going in.  At first, the "inverse" syntax seemed stilted, but over
> time it
> started to seem more obvious.  Similarly, I went in expecting to prefer the
> imperative approach for the body, but over time, started to warm to the
> functional approach, and eventually concluded it was basically a forced
> move if
> we want to support more than just deconstructors.  And I started out
> skeptical
> of "implicit fail", but after writing a few dozen patterns with it, going
> back
> to fully explicit felt painful.  All of this is to say, you should hold
> your
> initial opinions at arm's length, and give the alternatives a chance to
> sink in.
>
> For most _conditional_ patterns (and conditionality is at the heart of
> pattern
> matching), the functional approach cleanly highlights both the match
> predicate
> and the flow of values, and is considerably less fussy than the imperative
> approach in the same situation; `Optional::of`, `Class::arrayClass`, and
> `regexMatch`
> look great here, much better than they would with imperative.  None of these
> illustrate delegation, but in the presence of delegation, the gap gets even
> wider.
>
>

From brian.goetz at oracle.com  Sat Mar 30 20:53:14 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 30 Mar 2024 16:53:14 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: <CAFOkWZZbAq1mmmHiHEFdEQHbnDg_priA=z5ic1Wu4FX+CRopSg@mail.gmail.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
 <CAFOkWZZbAq1mmmHiHEFdEQHbnDg_priA=z5ic1Wu4FX+CRopSg@mail.gmail.com>
Message-ID: <6a243154-82fe-4a7b-88c7-a70c1605a3bf@oracle.com>


On 3/30/2024 3:23 PM, Victor Nazarov wrote:
> I have two points that I think may be good to consider in the list of 
> options.
>
> 1. I'm not sure if this was considered, but I find explicit lists of 
> covering patterns
> rather natural and more flexible than using case as a pattern-modifier.

Agreed (this is how F# does it), and we tried that, but it is so 
contrary to how members are done in Java.  (One might think that one 
could declare a "sealed" pattern, which "permits" a list of other 
patterns, and this sounds perfectly natural, but it looks pretty weird.)

> The important feature of explicit lists is that there may be more than 
> one covering set of patterns.

Yes, been down this road too, but the reality is that this is not likely 
to come up nearly as often as one might imagine.

> 2. I think that there is a middle ground between functional and 
> imperative pattern body definition style that may look cumbersome at 
> first, but nevertheless gives you the best of both worlds:

The `match` block is an interesting idea, will consider.


>
>     * deconstructor patterns look dual to constructors
>     * names from the list of pattern variables are actually used and 
> checked by the compiler
>     * control flow is still functional, which is more natural
>
> The downside that is retained from the imperative style is the need 
> for alpha-renaming,
> but I think we still have to deal with shadowing, and renaming a 
> local variable seems natural and easy.
>
> Middle ground may be used like a special form that can be used in the 
> pattern body.
> This form works mostly the same way as `with`-clause as defined in the 
> "Derived Record Instances" JEP.
>
> Here is the long list of examples to fully illustrate different 
> interactions:
>
> ````
>     class Optional<T> matches (of|empty) {
>         public static <T> pattern<Optional<T>> of(T value) {
>             if (that.isPresent()) {
>                 match {
>                     value = that.get();
>                 }
>             }
>         }
>
>         public static <T> pattern<Optional<T>> empty() {
>             if (that.isEmpty())
>                 match {}
>         }
>     }
>
>     class Pattern {
>         public pattern<String> regexMatch(String... groups) {
>             Matcher m = this.matcher(that);
>             if (m.matches()) {
>                 match {
>                     groups =
>                             IntStream.rangeClosed(1, m.groupCount())
>                                     .mapToObj(m::group)
>                                     .toArray(String[]::new);
>                 }
>             }
>         }
>     }
>
>     class A {
>         private final int a;
>
>         public A(int a) {
>             this.a = a;
>         }
>         public pattern A(int a) {
>             match {
>                 a = that.a;
>             }
>         }
>     }
>
>     class B extends A {
>         private final int b;
>
>         public B(int a, int b) {
>             super(a);
>             this.b = b;
>         }
>
>         public pattern B(int a, int b) {
>             if (that instanceof super(var aa)) {
>                 match {
>                     a = aa;
>                     b = that.b;
>                 }
>             }
>         }
>     }
>
>     interface Converter<T,U> {
>         pattern<T> convert(U u);
>     }
>     Converter<Integer, Short> c =
>         pattern (s) -> {
>             if (that >= Short.MIN_VALUE && that <= Short.MAX_VALUE)
>                 match {
>                     s = (short) that;
>                 }
>         };
> ````
>
> --
> Victor Nazarov
>
>
> On Fri, Mar 29, 2024 at 10:59 PM Brian Goetz <brian.goetz at oracle.com> 
> wrote:
>
>     We now come to the long-awaited bikeshed discussion on what member
>     patterns should look like.
>
>     Bikeshed disclaimer for EG:
>       - This is likely to evoke strong opinions, so please take pains
>     to be especially constructive
>       - Long reply-to-reply threads should be avoided even more than usual
>       - Holistic, considered replies preferred
>       - Please change subject line if commenting on a sub-topic or
>     tangential
>         concern
>
>     Special reminders for Remi:
>      - Use of words like "should", "must", "shouldn't", "mistake",
>     "wrong", "broken"
>        are strictly forbidden.
>      - If in doubt, ask questions first.
>
>     Notes for external observers:
>      - This is a working document for the EG; the discussion may
>     continue for a
>        while before there is an official proposal.  Please be patient.
>
>
>     # Pattern declaration: the bikeshed
>
>     We've largely identified the model for what kinds of patterns we
>     need to
>     express, but there are still several degrees of freedom in the syntax.
>
>     As the model has simplified during the design process, the space
>     of syntax
>     choices has been pruned back, which is a good thing. However,
>     there are still
>     quite a few smaller decisions to be made.  Not all of the
>     considerations are
>     orthogonal, so while they are presented individually, this is not
>     a "pick one
>     from each column" menu.
>
>     Some of these simplifications include:
>
>      - Patterns with "input arguments" have been removed; another way
>     to get to what
>        this gave us may come back in another form.
>      - I have grown increasingly skeptical of the value of the
>     imperative `match`
>        statement.  With better totality analysis, I think it can be
>     eliminated.
>
>     We can discuss these separately but I would like to sync first on
>     the broad
>     strokes for how patterns are expressed.
>
>     ## Object model requirements
>
>     As outlined in "Towards Member Patterns", the basic model is that
>     patterns are
>     the dual of other executable members (constructors, static
>     methods, instance
>     methods.)  While they are like methods in that they have inputs,
>     outputs, names,
>     and an imperative body, they have additional degrees of freedom that
>     constructors and methods lack:
>
>      - Patterns are, in general, _conditional_ (they can succeed or
>     fail), and only
>        produce bindings (outputs) when they succeed.  This
>     conditionality is
>        understood by the language's flow analysis, and is used for
>     computing scoping
>        and definite assignment.
>      - Methods can return at most one value; when a pattern completes
>     successfully,
>        it may bind multiple values.
>      - All patterns have a _match candidate_, which is a distinguished,
>        possibly-implicit parameter.  Some patterns also have a
>     receiver, which is
>        also a distinguished, possibly-implicit parameter.  In some
>     such cases the
>        receiver and match candidate are aliased, but in others these
>     may refer to
>        different objects.
>
>     So a pattern is a named executable member that takes a _match
>     candidate_ as a
>     possibly-implicit parameter, maybe takes a receiver as an implicit
>     parameter,
>     and has zero or more conditional _bindings_.  Its body can perform
>     imperative
>     computation, and can terminate either with match failure or
>     success.  In the
>     success case, it must provide a value for each binding.
>
>     Deconstruction patterns are special in many of the same ways
>     constructors are:
>     they are constrained in their name, inheritance, and probably their
>     conditionality (they should probably always succeed). Just as the
>     syntax for
>     constructors differs slightly from that of instance methods, the
>     syntax for
>     deconstructors may differ slightly from that of instance
>     patterns.  Static
>     patterns, like static methods, have no receiver and do not have
>     access to the
>     type parameters of the enclosing class.
>
>     Like constructors and methods, patterns can be overloaded, but in
>     accordance
>     with their duality to constructors and methods, the overloading
>     happens on the
>     _bindings_, not the inputs.
>
>     ## Use-site syntax
>
>     There are several kinds of type-driven patterns built into the
>     language: type
>     patterns and record patterns.  A type pattern in a `switch` looks
>     like:
>
>         case String s: ...
>
>     And a record pattern looks like:
>
>         case MyRecord(P1, P2, ...): ...
>
>     where `P1..Pn` are nested patterns that are recursively matched to the
>     components of the record.  This use-site syntax for record
>     patterns was chosen
>     for its similarity to the construction syntax, to highlight that a
>     record
>     pattern is the dual of record construction.
>
>     **Deconstruction patterns.**  The simplest kind of member pattern, a
>     deconstruction pattern, will have the same use-site syntax as a
>     record pattern;
>     record patterns can be thought of as a deconstruction pattern
>     "acquired for
>     free" by records, just as records do with constructors, accessors,
>     object
>     methods, etc.  So the use of a deconstruction pattern for `Point`
>     looks like:
>
>         case Point(var x, var y): ...
>
>     whether `Point` is a record or an ordinary class equipped with a
>     suitable
>     deconstruction pattern.
>
>     **Static patterns.**  Continuing with the idea that the
>     destructuring syntax
>     should evoke the aggregation syntax, there is an obvious candidate
>     for the
>     use-site syntax for static patterns:
>
>         case Optional.of(var e): ...
>         case Optional.empty(): ...
>
>     **Instance patterns.**  Uses of instance patterns will likely come
>     in two forms,
>     analogous to bound and unbound instance method references,
>     depending on whether
>     the receiver and the match candidate are the same object. In the
>     unbound form,
>     used when the receiver is the same object as the match candidate,
>     the pattern
>     name is qualified by a _type_:
>
>     ```
>     Class<?> k = ...
>     switch (k) {
>         // Qualified by type
>         case Class.arrayClass(var componentType): ...
>     }
>     ```
>
>     This means that we _resolve_ the pattern `arrayClass` starting at
>     `Class` and
>     _select_ the pattern using the receiver, `k`.  We may also be able
>     to omit the
>     class qualifier if the static type of the match candidate is
>     sufficient to
>     resolve the desired pattern.
>
>     In the bound form, used when the receiver is distinct from the
>     match candidate,
>     the pattern name is qualified with an explicit _receiver
>     expression_.  As an
>     example, consider an interface that captures primitive widening
>     and narrowing
>     conversions, such as those between `int` and `long`.  In the
>     widening direction,
>     conversion is unconditional, so this can be modeled as a method
>     from `int` to
>     `long`.  In the other direction, conversion is conditional, so
>     this is better
>     modeled as a _pattern_ whose match candidate is `long` and which
>     binds an `int`
>     on success.  Since these are instance methods of some class (say,
>     `NumericConversion<T,U>`), we need to provide the receiver
>     instance in order to
>     resolve the pattern:
>
>     ```
>     NumericConversion<int, long> nc = ...
>
>     switch (aLong) {
>         case nc.narrowed(int i):
>         ...
>     }
>     ```
>
>     The explicit receiver syntax would also be used if we exposed
>     regular expression
>     matching as a pattern on the `j.u.r.Pattern` object (the name
>     collision on
>     `Pattern` is unfortunate).  Imagine we added a `matching` instance
>     pattern to
>     `j.u.r.Pattern`; then we could use it in `instanceof` as follows:
>
>     ```
>     static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)");
>     ...
>     if (aString instanceof P.matching(String as, String bs)) { ... }
>     ```
>
>     Each of these use-site syntaxes is modeled after the use-site
>     syntax for a
>     method invocation or method reference.
>
>     ## Declaration-site syntax
>
>     To avoid being biased by the simpler cases, we're going to work
>     all the cases
>     concurrently rather than starting with the simpler cases and
>     working up.  (It
>     might seem sensible to start with deconstructors, since they are
>     the "easy"
>     case, but if we did that, we would likely be biased by their
>     simplicity and then
>     find ourselves painted into a corner.)  As our example gallery, we
>     will consider:
>
>      - Deconstruction pattern for `Point`;
>      - Static patterns for `Optional::of` and `Optional::empty`;
>      - Static pattern for "power of two" (illustrating a computation
>     where success
>        or failure, and computation of bindings, cannot easily be
>     separated);
>      - Instance pattern for `Class::arrayClass` (used unbound);
>      - Instance pattern for `Pattern::matching` on regular expressions
>     (used bound).
>
>     Member patterns, like methods, have _names_.  (We can think of
>     constructors as
>     being named for their enclosing classes, and the same for
>     deconstructors.)  All
>     member patterns have a (possibly empty) ordered list of
>     _bindings_, which are
>     the dual of constructor or method parameters.  Bindings, in turn,
>     have names and
>     types.  And like constructors and methods, member patterns have a
>     _body_ which
>     is a block statement.  Member patterns also have a _match
>     candidate_, which is a
>     likely-implicit method parameter.
>
>     ### Member patterns as inverse methods and constructors
>
>     Regardless of syntax, let us remind ourselves that
>     deconstructors are the
>     categorical dual to constructors (coconstructors), and pattern
>     methods are the
>     categorical dual to methods (comethods).  They are dual in their
>     structure: a
>     constructor or method takes N arguments and produces a result, the
>     corresponding
>     member pattern consumes a match candidate and (conditionally)
>     produces N
>     bindings.
>
>     Moreover, they are semantically dual: the return value produced by
>     construction
>     or factory invocation is the match candidate for the corresponding
>     member
>     pattern, and the bindings produced by a member pattern are the
>     answers to the
>     _Pattern Question_ -- "could this object have come from an
>     invocation of my
>     dual, and if so, with what arguments."
>
>     ### What do we call them?
>
>     Given the significant overlap between methods and patterns, the
>     first question
>     about the declaration we need to settle is how to identify a
>     member pattern
>     declaration as distinct from a method or constructor declaration.
>     _Towards
>     Member Patterns_ tried out a syntax that recognized these as
>     _inverse_ methods
>     and constructors:
>
>         public Point(int x, int y) { ... }
>         public inverse Point(int x, int y) { ... }
>
>     While this is a principled choice which clearly highlights the
>     duality, and one
>     that might be good for specification and verbal description, it is
>     questionable
>     whether this would be a great syntax for reading and writing
>     programs.
>
>     A more traditional option is to choose a "noun" (conditional)
>     keyword, such as
>     `pattern`, `matcher`, `extractor`, `view`, etc:
>
>         public pattern Point(int x, int y) { ... }
>
>     If we are using a noun keyword to identify pattern declarations,
>     we could use
>     the same noun for all of them, or we could choose a different one for
>     deconstruction patterns:
>
>         public deconstructor Point(int x, int y) { ... }
>
>     Alternately, we could reach for a symbol to indicate that we are
>     talking about
>     an inverted member.? C++ fans might suggest
>
>         public ~Point(int x, int y) { ... }
>
>     but this is too cryptic (it's evocative once you see it, but then
>     it becomes
>     less evocative as we move away from deconstructors towards
>     instance patterns.)
>
>     If we wish to offer finer-grained control over conditionality, we
>     might
>     additionally need a `total` / `partial` modifier, though I would
>     prefer to avoid
>     that.
>
>     Of the keyword candidates, there is one that stands out (for good
>     and bad)
>     because it connects to something that is already in the language:
>     `pattern`.  On
>     the one hand, using the term `pattern` for the declaration is a
>     slight abuse; on
>     the other, users will immediately connect it with "ah, so that's
>     how I make a
>     new pattern" or "so that's what happens when I match against this
>     pattern."
>     (Lisps would resolve this tension by calling it `defpattern`.)
>
>     The others (`matcher`, `view`, `extractor`, etc) are all made-up
>     terms that
>     don't connect to anything else in the language, for better or
>     worse.  If we pick
>     one of these, we are asking users to sort out _three_ separate new
>     things in
>     their heads: (use-site) patterns, (declaration-site) matchers, and
>     the rules of
>     how patterns and matchers are connected.  Calling them both
>     "patterns", despite
>     the mild abuse of terminology, ties them together in a way that
>     recognizes their
>     connection.
>
>     My personal position: `pattern` is the strongest candidate here,
>     despite some
>     flaws.
>
>     ### Binding lists and match candidates
>
>     There are two obvious alternatives for describing the binding list
>     and match
>     candidate of a pattern declaration, both with their roots in the
>     constructor and
>     method syntax:
>
>     - Pretend that a pattern declaration is like a method with
>       multiple return, and put the binding list in the "return
>       position", and make the match candidate an ordinary parameter;
>     - Lean into the inverse relationship between constructors and
>       methods (and consistency with the use-site syntax), and put the
>       binding list in the "parameter list position". For static
>       patterns and some instance patterns, which need to explicitly
>       identify the match candidate type, there are several
>       sub-options:
>       - Lean further into the duality, putting the match candidate
>         type in the "return position";
>       - Put the match candidate type somewhere else, where it is less
>         likely to be confused for a method return.
>
>     The "method-like" approach might look like this:
>
>     ```
>     class Point {
>         // Constructor and deconstructor
>         public Point(int x, int y) { ... }
>         public pattern (int x, int y) Point(Point target) { ... }
>         ...
>     }
>
>     class Optional<T> {
>         // Static factory and pattern
>         public static<T> Optional<T> of(T t) { ... }
>         public static<T> pattern (T t) of(Optional<T> target) { ... }
>         ...
>     }
>     ```
>
>     The "inverse" approach might look like:
>
>     ```
>     class Point {
>         // Constructor and deconstructor
>         public Point(int x, int y) { ... }
>         public pattern Point(int x, int y) { ... }
>         ...
>     }
>
>     class Optional<T> {
>         // Static factory and pattern (using the first sub-option)
>         public static<T> Optional<T> of(T t) { ... }
>         public static<T> pattern Optional<T> of(T t) { ... }
>         ...
>     }
>     ```
>
>     With the "method-like" approach, the match candidate gets an
>     explicit name
>     selected by the author; with the inverse approach, we can go with
>     a predefined
>     name such as `that`. (Because deconstructors do not have
>     receivers, we could by
>     abuse of notation arrange for the keyword `this` to refer instead
>     to the match
>     candidate within the body of a deconstructor. While this might
>     seem to lead to
>     a more familiar notation for writing deconstructors, it would create a
>     gratuitous asymmetry between the bodies of deconstruction patterns
>     and those of
>     other patterns.)
>
>     Between these choices, nearly all the considerations favor the
>     "inverse"
>     approach:
>
>     - The "inverse" approach makes the declaration look like the use
>       site. This highlights that `pattern Point(int x, int y)` is what
>       gets invoked when you match against the pattern use
>       `Point(int x, int y)`. (This point is so strong that we should
>       probably just stop here.)
>     - The "inverse" members also look like their duals; the only
>       difference is the `pattern` keyword (and possibly the placement
>       of the match candidate type). This makes matched pairs much more
>       obvious, and such matched pairs will be critical both for future
>       language features and for library idioms.
>     - The method-like approach is suggestive of multiple return or
>       tuples, which is probably helpful for the first few minutes but
>       actually harmful in the long term. This feature is _not_ (much
>       as some people would like to believe) about multiple return or
>       tuples, and playing into this misperception will only make it
>       harder to truly understand. So this suggestion ends up propping
>       up the wrong mental model.
>
>     The main downside of the "inverse" approach is the one-time speed
>     bump of the
>     unfamiliarity of the inverted syntax. (The "method-like" syntax
>     also has its
>     own speed bumps; it is just unfamiliar in different ways.) But
>     unlike the
>     advantages of the inverse approach, which continue to add value
>     forever, this
>     speed bump is a one-time hurdle to get over.
>
>     To smooth out the speed bumps of the inverse approach, we can
>     consider moving
>     the position of the match candidate for static and (suitable)
>     instance pattern
>     declarations, such as:
>
>     ```
>     class Optional<T> {
>         // the usual static factory
>         public static<T> Optional<T> of(T t) { ... }
>
>         // Various ways of writing the corresponding pattern
>         public static<T> pattern of(T t) for Optional<T> { ... }
>         // or ...
>         public static<T> pattern(Optional<T>) of(T t) { ... }
>         // or ...
>         public static<T> pattern(Optional<T> that) of(T t) { ... }
>         // or ...
>         public static<T> pattern<Optional<T>> of(T t) { ... }
>         ...
>     }
>     ```
>
>     (The deconstructor example looks the same with either variant.)
>     Of these,
>     treating the match candidate like a "parameter" of "pattern" is
>     probably the
>     most evocative:
>
>     ```
>     public static<T> pattern(Optional<T> that) of(T t) { ... }
>     ```
>
>     as it can be read as "pattern taking the parameter `Optional<T>
>     that`, called
>     `of`, binding `T t`", and is a short departure from the inverse syntax.
>
>     The main value of the various rearrangements is that users don't
>     need to think
>     about things operating in reverse to parse the syntax. This trades
>     away some of the
>     secondary benefit (patterns looking almost exactly like their
>     inverses) for a
>     reduction in cognitive load, while maintaining the most important
>     consideration: that the declaration site look like the use site.
>
>     For instance pattern declarations, if the match candidate type is
>     the same as
>     the receiver type, the match candidate type can be elided as it is
>     with
>     deconstructors.
>
>     My personal position: the "multiple return" version is terrible;
>     all the
>     sub-variants of the inverse version are probably workable.
>
>     ### Naming the match candidate
>
>     We've been assuming so far that the match candidate always has a
>     fixed name,
>     such as `that`; this is an entirely workable approach. Some of the
>     variants are
>     also amenable to allowing authors to explicitly select a name for
>     the match
>     candidate. For example, if we put the match candidate as a
>     "parameter" to the `pattern` keyword, there is an obvious place to
>     put the name:
>
>     ```
>     static<T> pattern(Optional<T> target) of(T t) { ... }
>     ```
>
>     My personal opinion: I don't think this degree of freedom buys us
>     much, and in
>     the long run readability probably benefits by picking a fixed name
>     like `that`
>     and sticking with it. Even with a fixed name, if there is a
>     sensible position
>     for the name, allowing users to type `that` for explicitness is
>     fine (as we do
>     with the `this` receiver parameter of instance methods, though many
>     people don't know this.) We
>     may even want to
>     require it.
>
>     ## Body types
>
>     Just as there are two obvious approaches for the declaration,
>     there are two
>     obvious approaches we could take for the body (though there is
>     some coupling
>     between them.) We'll call the two body approaches _imperative_ and
>     _functional_.
>
>     The imperative approach treats bindings as initially-DU variables
>     that must be
>     DA on successful completion, getting their value through ordinary
>     assignment;
>     the functional approach sets all the bindings at once,
>     positionally. Either
>     way, member patterns (except maybe deconstructors) also need a way to
>     differentiate a successful match from a failed match.
>
>     Here is the `Point` deconstructor with both imperative and
>     functional style. The
>     functional style uses a placeholder `match` statement to indicate
>     a successful
>     match and provision of bindings:
>
>     ```
>     class Point {
>         int x, y;
>
>         Point(int x, int y) {
>             this.x = x;
>             this.y = y;
>         }
>
>         // Imperative style, deconstructor always succeeds
>         pattern Point(int x, int y) {
>             x = that.x;
>             y = that.y;
>         }
>
>         // Functional style
>         pattern Point(int x, int y) {
>             match(that.x, that.y);
>         }
>     }
>     ```
>
>     There are some obvious differences here. In the imperative style,
>     the dtor body
>     looks much more like the reverse of the ctor body. The functional
>     style is more
>     concise (and amenable to further concision via the "concise method
>     bodies"
>     mechanism in the future). There are also a number of less obvious
>     differences. For
>     deconstructors, the imperative approach is likely to feel more
>     natural because
>     of the obvious symmetry with constructors.
>
>     In reality, it is _premature at this point to have an opinion_,
>     because we
>     haven't yet seen the full scope of the problem; deconstructors are
>     a special
>     case in many ways, which almost surely is distorting our initial
>     opinion. As we
>     move towards conditional patterns (and pattern lambdas), our
>     opinions may flip.
>
>     Regardless of which we pick, there are some additional syntactic
>     choices to be
>     made -- what syntax to use to indicate success (we used `match` in
>     the above
>     example) or failure. (We should be especially careful around
>     trying to reuse
>     words like `return`, `break`, or `yield` because, in the case
>     where there are
>     zero bindings (which is allowable), it becomes unclear whether
>     they mean "fail"
>     or "succeed with zero bindings".)
>
>     ### Success and failure
>
>     Except for possibly deconstructors, which we may require to be
>     total, a pattern
>     declaration needs a way to indicate success and failure. In the
>     examples above,
>     we posited a `match` statement to indicate success in the
>     functional approach,
>     and in both examples leaned on the "implicit success" of
>     deconstructors (under
>     the assumption they always succeed). Now let's look at the more
>     general case to
>     figure out what else is needed.
>
>     For a static pattern like `Optional::of`, success is conditional.
>     Using
>     `match-fail` as a placeholder for "the match failed", this might
>     look like
>     (functional version):
>
>     ```
>     public static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent())
>             match (that.get());
>         else
>             match-fail;
>     }
>     ```
>
>     The imperative version is less pretty, though. Using
>     `match-success` as a
>     placeholder:
>
>     ```
>     public static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent()) {
>             t = that.get();
>             match-success;
>         }
>         else
>             match-fail;
>     }
>     ```
>
>     Both arms of the `if` feel excessively ceremonial here. And if we
>     chose to not
>     make all deconstruction patterns unconditional, deconstructors
>     would likely need
>     some explicit success as well:
>
>     ```
>     pattern Point(int x, int y) {
>         x = that.x;
>         y = that.y;
>         match-success;
>     }
>     ```
>
>     It might be tempting to try to eliminate the need for explicit
>     success by
>     inferring it from whether the bindings are DA, but
>     this is
>     error-prone, is less type-checkable, and falls apart completely
>     for patterns
>     with no bindings.
>
>     ### Implicit failure in the functional approach
>
>     One of the ceremonial-seeming aspects of `Optional::of` above is
>     having to say
>     `else match-fail`, which doesn't feel like it adds a lot of
>     value. Perhaps we
>     can be more concise without losing clarity.
>
>     Most conditional patterns will have a predicate to determine
>     matching, and then
>     some conditional code to compute the bindings and claim success.
>     Having to say
>     "and if the predicate didn't hold, then I fail" seems like
>     ceremony for the
>     author and noise for the reader. Instead, if a conditional
>     pattern falls off
>     the end without matching, we could treat that as simply not matching:
>
>     ```
>     public static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent())
>             match (that.get());
>     }
>     ```
>
>     This says what we mean: if the optional is present, then this
>     pattern succeeds
>     and binds the contents of the `Optional`. As long as our "succeed"
>     construct
>     strongly enough connotes that we are terminating abruptly and
>     successfully, this
>     code is perfectly clear. And most conditional patterns will look
>     a lot like
>     `Optional::of`; do some sort of test and if it succeeds, extract
>     the state and
>     bind it.
>
>     At first glance, this "implicit fail" idiom may seem error-prone
>     or sloppy. But
>     after writing a few dozen patterns, one quickly tires of saying "else
>     match-fail" -- and the reader doesn't necessarily appreciate
>     reading it either.
>
>     Implicit failure also simplifies the selection of how we
>     explicitly indicate
>     failure; using `return` in a pattern for "no match" becomes pretty
>     much a forced
>     move. We observe that (in a void method), "return" and "falling
>     off the end"
>     are equivalent; if "falling off the end" means "no match", then so
>     should an
>     explicit `return`. So in those few cases where we need to
>     explicitly signal "no
>     match", we can just use `return`. It won't come up that often,
>     but here's an
>     example where it does:
>
>     ```
>     static pattern(int that) powerOfTwo(int exp) {
>         int exp = 0;
>
>         if (that < 1)
>             return; // explicit fail
>
>         while (that > 1) {
>             if (that % 2 == 0) {
>                 that /= 2;
>                 ++exp;
>             }
>             else
>                 return; // explicit fail
>         }
>         match (exp);
>     }
>     ```
>
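For comparison, here is a minimal sketch of the same control flow in today's Java, where the proposed `pattern` syntax does not exist. It assumes that "no match" is modelled as an empty `OptionalInt` and that a successful match returns the single binding; the class name `PowerOfTwoSketch` is made up for illustration and is not part of the proposal.

```java
import java.util.OptionalInt;

final class PowerOfTwoSketch {
    // Hypothetical stand-in for the powerOfTwo pattern: an empty result plays
    // the role of "no match"; a present value carries the single binding.
    static OptionalInt powerOfTwo(int candidate) {
        if (candidate < 1) {
            return OptionalInt.empty();      // explicit fail
        }
        int n = candidate;                   // work on a copy of the candidate
        int exp = 0;
        while (n > 1) {
            if (n % 2 != 0) {
                return OptionalInt.empty();  // explicit fail: odd factor found
            }
            n /= 2;
            exp++;
        }
        return OptionalInt.of(exp);          // success, "binding" exp
    }

    public static void main(String[] args) {
        System.out.println(powerOfTwo(8));
        System.out.println(powerOfTwo(12));
    }
}
```

Note that every exit has to spell out success or failure explicitly here, which is the ceremony the implicit-fail idiom is meant to remove.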
>     As a bonus, if `return` as match failure is a forced move, we need
>     only select a
>     term for "successful match" (which obviously can't be `return`).
>     We could use
>     `match` as we have in the examples, or a variant like `matched` or
>     `matches`.
>     But rather than just creating a new control operator, we have an
>     opportunity to
>     lean into the duality a little harder, by including the pattern
>     syntax in the
>     match:
>
>     ```
>     matches of(that.get());
>     ```
>
>     or the (optionally?) qualified (inferring type arguments, as we do
>     at the use
>     site):
>
>     ```
>     matches Optional.of(that.get());
>     ```
>
>     These "use the name" approaches trade a small amount of verbosity
>     to gain a
>     higher degree of fidelity to the pattern use site (and to evoke
>     the comethod
>     completion.)
>
>     If we don't choose "implicit fail", we would have to invent _two_
>     new control
>     flow statements to indicate "success" and "failure".
>
>     My personal position: for the functional approach, implicit
>     failure both makes
>     the code simpler and clearer, and after you get used to it, you
>     don't want to go
>     back. Whether we say `match`, `matches`, or `matches
>     <pattern-name>`, all are
>     workable, though I like some variant that names the pattern.
>
>     ### Implicit success in the imperative approach
>
>     In the imperative approach, we can be implicit as well, but it
>     feels more
>     natural (at least, initially) to choose implicit success rather
>     than failure.
>     This works great for unconditional patterns:
>
>     ```
>     pattern Point(int x, int y) {
>         x = that.x;
>         y = that.y;
>         // implicit success
>     }
>     ```
>
>     but not quite as well for conditional patterns:
>
>     ```
>     static<T> pattern(Optional<T> that) of(T t) {
>         if (that.isPresent()) {
>             t = that.get();
>         }
>         else
>             match-fail;
>         // implicit success
>     }
>     ```
>
>     We can eliminate one of the arms of the if, with the more concise (but
>     convoluted) inversion:
>
>     ```
>     static<T> pattern(Optional<T> that) of(T t) {
>         if (!that.isPresent())
>             match-fail;
>         t = that.get();
>         // implicit success
>     }
>     ```
>
>     Just as with the functional approach, if we choose imperative and
>     "implicit
>     success", using `return` to indicate success is pretty much a
>     forced move.
>
>     ### Imperative is a trap
>
>     If we assume that functional implies implicit failure, and
>     imperative implies
>     implicit success, then our choices become:
>
>     ```
>     class Optional<T> {
>         public static<T> Optional<T> of(T t) { ... }
>
>         // imperative, implicit success
>         public static<T> pattern(Optional<T> that) of(T t) {
>             if (that.isPresent()) {
>                 t = that.get();
>             }
>             else
>                 match-fail;
>         }
>
>         // functional, implicit failure
>         public static<T> pattern(Optional<T> that) of(T t) {
>             if (that.isPresent())
>                 matches of(that.get());
>         }
>     }
>     ```
>
>     Once we get past deconstructors, the imperative approach looks
>     worse by
>     comparison because we need to assign all the bindings (which is _O(n)_
>     assignments) _and also_ indicate success or failure somehow,
>     whereas in the
>     functional style all can be done together with a single `matches`
>     statement.
>
>     Looking at the alternatives, except maybe for unconditional
>     patterns, the
>     functional example above seems a lot more natural. The imperative
>     approach
>     works with deconstructors (assuming they are not conditional), but
>     does not
>     scale so well to conditionality -- which is the essence of patterns.
>
>     From a theoretical perspective, the method-comethod duality also
>     gives us a
>     forceful nudge towards the functional approach. In a method, the
>     method
>     arguments are specified as a positional list of expressions at the
>     use site:
>
>         m(a, b, c)
>
>     and these values are invisibly copied into the parameter slots of
>     the method
>     prior to frame activation. The dual to that is for a comethod to
>     similarly convey
>     the bindings in a positional list of expressions (as they must
>     either all be
>     produced or none), where they are copied into the slots provided
>     at the use
>     site, as is indicated by `matches` in the above examples.
>
>     My personal position: the imperative style feels like a trap. It
>     seems
>     "obvious" at first if we start with deconstructors, but becomes
>     increasingly
>     difficult when we get past this case, and gets in the way of other
>     opportunities. The last gasp before acceptance is the discomfort
>     that dtor and
>     ctor bodies are written in different styles, but in the rear-view
>     mirror, this
>     feels like a non-issue.
>
>     ### Derive imperative from functional?
>
>     If we start with "functional with implicit failure", we can
>     possibly rescue
>     imperative by deriving a version of imperative from functional, by
>     "overloading"
>     the match-success operator.
>
>     If we have a pattern whose binding names are `b1..bn` of types
>     `B1..Bn`, then
>     the `matches` operator must take a list of expressions `e1..en`
>     whose arity and
>     types are compatible with `B1..Bn`. But we could allow `matches`
>     to also have a
>     nilary form, which would have the effect of being shorthand for
>
>         matches <pattern-name>(b1, b2, ..., bn)
>
>     where each of `b1..bn` must be DA at the point of matching. This
>     means that we
>     could express patterns in either form:
>
>     ```
>     class Optional<T> {
>         public static<T> Optional<T> of(T t) { ... }
>
>         // imperative, derived from functional with implicit failure
>         public static<T> pattern(Optional<T> that) of(T t) {
>             if (that.isPresent()) {
>                 t = that.get();
>                 matches of;
>             }
>         }
>
>         public static<T> pattern(Optional<T> that) of(T t) {
>             if (that.isPresent())
>                 matches of(that.get());
>         }
>     }
>     ```
>
>     This flexibility allows users to select a more verbose expression
>     in exchange
>     for a clearer association of expressions and bindings, though as
>     we'll see, it
>     does come with some additional constraints.
>
>     ### Wrapping an existing API
>
>     Nearly every library has methods (sometimes sets of methods) that
>     are patterns
>     in disguise, such as the pair of methods `isArray` and
>     `getComponentType` in
>     `Class`, or the `Matcher` helper type in `java.util.regex`.
>     Library maintainers
>     will likely want to wrap (or replace) these with real patterns, so
>     these can
>     participate more effectively in conditional contexts, and in some
>     cases,
>     highlight their duality with factory methods.
>
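As a concrete illustration of the "pattern in disguise" point, the `isArray`/`getComponentType` pair can already be folded into a single conditional extraction in current Java. This is only a sketch: the helper name `arrayClass` mirrors the instance pattern discussed in this thread, and modelling "no match" as an empty `Optional` is an assumption made for the example, not something the proposal specifies.

```java
import java.util.Optional;

final class ArrayClassSketch {
    // Hypothetical helper: the predicate (isArray) and the conditional
    // extraction (getComponentType) are combined into one call.
    static Optional<Class<?>> arrayClass(Class<?> candidate) {
        if (candidate.isArray()) {
            return Optional.<Class<?>>of(candidate.getComponentType());
        }
        return Optional.empty();   // not an array class: no match
    }

    public static void main(String[] args) {
        System.out.println(arrayClass(int[].class).isPresent());
        System.out.println(arrayClass(String.class).isPresent());
    }
}
```

Unlike a real pattern, this helper cannot appear directly in a `case` label or `instanceof`, which is the motivation for wrapping such APIs with declared patterns.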
>     Matching a string against a `j.u.r.Pattern` regular expression has
>     all the same
>     elements as a pattern, just with an ad-hoc API (and one that I
>     have to look up
>     every time). But we can fairly easily wrap a true pattern around
>     the existing
>     API. To match against a `Pattern` today, we pass the match
>     candidate to
>     `Pattern::matcher`, which returns a `Matcher` with accessors
>     `Matcher::matches`
>     (did it match) and `Matcher::group` (conditionally extract a
>     particular capture
>     group). If we want to wrap this with a pattern called `regexMatch`:
>
>     ```
>     pattern(String that) regexMatch(String... groups) {
>         Matcher m = this.matcher(that);
>         if (m.matches())
>             // capture groups are numbered 1..groupCount(), inclusive
>             matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
>                                                 .mapToObj(m::group)
>                                                 .toArray(String[]::new));
>         // whole lotta matchin' goin' on
>     }
>     ```
>
>     This says that a `j.u.r.Pattern` has an instance pattern called
>     `regexMatch`, whose
>     match candidate is `String`, and which binds a varargs of `String`
>     corresponding
>     to the capture groups. The implementation simply delegates to the
>     existing
>     `j.u.r.Matcher` API. This means that `j.u.r.Pattern` becomes a
>     sort of "pattern
>     object", and we can use it as a receiver at the use site:
>
>     ```
>     static Pattern As = Pattern.compile("(a*)");
>     static Pattern Bs = Pattern.compile("(b*)");
>     ...
>     switch (string) {
>         case As.regexMatch(var as): ...
>         case Bs.regexMatch(var bs): ...
>         ...
>     }
>     ```
>
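For reference, the delegation to the existing `java.util.regex` API can be sketched in current Java as an ordinary method. The class `RegexMatchSketch` and the choice to return `Optional<List<String>>` (empty meaning "no match") are assumptions made for this example; only `Pattern::matcher`, `Matcher::matches`, `Matcher::groupCount`, and `Matcher::group(int)` are real API.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class RegexMatchSketch {
    // Hypothetical helper standing in for the regexMatch instance pattern.
    static Optional<List<String>> regexMatch(Pattern pattern, String candidate) {
        Matcher m = pattern.matcher(candidate);
        if (!m.matches()) {
            return Optional.empty();                 // whole input did not match
        }
        // Capture groups are numbered 1..groupCount(); group 0 is the full match.
        List<String> groups = IntStream.rangeClosed(1, m.groupCount())
                .mapToObj(m::group)
                .collect(Collectors.toList());
        return Optional.of(groups);
    }

    public static void main(String[] args) {
        Pattern asThenBs = Pattern.compile("(a*)(b*)");
        System.out.println(regexMatch(asThenBs, "aabbb"));
        System.out.println(regexMatch(asThenBs, "abc"));
    }
}
```

The caller still has to unpack the `Optional` and index into the list by hand, which is the ad-hoc step a declared pattern would replace with bindings at the use site.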
>     ### Odds and ends
>
>     There are a number of loose ends here. We could choose other
>     names for the
>     match-success and match-fail operations, including trying to reuse
>     `break` or
>     `yield`. But this reuse is tricky; it must be very clear whether
>     a given form
>     of abrupt completion means "success" or "failure", because in the
>     case of
>     patterns with no bindings, we will have no other syntactic cues to
>     help
>     disambiguate. (I think having a single `matches`, with implicit
>     failure and
>     `return` meaning failure, is the sweet spot here.)
>
>     Another question is whether the binding list introduces
>     corresponding variables
>     into the scope of the body. For imperative, the answer is "surely
>     yes"; for
>     functional, the answer is "maybe" (unless we want to do the trick
>     where we
>     derive imperative from functional, in which case the answer is
>     "yes" again.)
>
>     If the binding list does not correspond to variables in the body,
>     this may be
>     initially discomforting; because the bindings do not declare program
>     elements, they may
>     feel as though they are left "dangling". But even if they are not
>     declaring
>     _program_ elements, they are still declaring _API_ elements
>     (similar to the
>     return type of a method.) We will want to provide Javadoc on the
>     bindings, just
>     like with parameters; we will want to match up binding names in
>     deconstructors
>     with parameter names in constructors; we may even someday want to
>     support
>     by-name binding at the use site (e.g., `case Foo(a: var a)`). The
>     names are
>     needed for all of these, just not for the body. Names still
>     matter. My take
>     here is that this is a transient "different is scary" reaction,
>     one that we
>     would get over quickly.
>
>     A final question is whether we should consider unqualified names
>     as implicitly
>     qualified by `that` (and also `this`, for instance patterns, with
>     some conflict
>     resolution). Users will probably grow tired of typing `that.` all
>     the time, and most of the time, the unqualified use is perfectly
>     readable.
>
>     ## Exhaustiveness
>
>     There is one last syntax question in front of us: how to indicate
>     that a set of
>     patterns are (claimed to be) exhaustive on a given match candidate
>     type. We see
>     this with `Optional::of` and `Optional::empty`; it would be sad if
>     the compiler
>     did not realize that these two patterns together were exhaustive
>     on `Optional`.
>     This is not a feature that will be used often, but not having it
>     at all will be
>     a repeated irritant.
>
>     The best I've come up with is to call these `case` patterns, where
>     a set of
>     `case` patterns for a given match candidate type in a given class
>     are asserted
>     to be an exhaustive set:
>
>     ```
>     class Optional<T> {
>         static<T> Optional<T> of(T t) { ... }
>         static<T> Optional<T> empty() { ... }
>
>         static<T> case pattern of(T t) for Optional<T> { ... }
>         static<T> case pattern empty() for Optional<T> { ... }
>     }
>     ```
>
>     Because they may not be truly exhaustive, `switch` constructs will
>     have to back
>     up the static assumption of exhaustiveness with a dynamic check,
>     as we do for
>     other sets of exhaustive patterns that may have remainder.
>
>     I've experimented with variants of `sealed` but it felt more
>     forced, so this is
>     the best I've come up with.
>
>     ## Example: patterns delegating to other patterns
>
>     Pattern implementations must compose. Just as a subclass
>     constructor delegates
>     to a superclass constructor, the same should be true for
>     deconstructors.
>     Here's a typical superclass-subclass pair:
>
>     ```
>     class A {
>         private final int a;
>
>         public A(int a) { this.a = a; }
>         public pattern A(int a) { matches A(that.a); }
>     }
>
>     class B extends A {
>         private final int b;
>
>         public B(int a, int b) {
>             super(a);
>             this.b = b;
>         }
>
>         // Imperative style
>         public pattern B(int a, int b) {
>             if (that instanceof super(var aa)) {
>                 a = aa;
>                 b = that.b;
>                 matches B;
>             }
>         }
>
>         // Functional style
>         public pattern B(int a, int b) {
>             if (that instanceof super(var a))
>                 matches B(a, b);
>         }
>     }
>     ```
>
>     (Ignore the flow analysis and totality for the time being; we'll
>     come back to
>     this in a separate document.)
>
>     The first thing that jumps out at us is that, in the imperative
>     version, we had
>     to create a "garbage" variable `aa` to receive the binding,
>     because `a` was
>     already in scope, and then we have to copy the garbage variable
>     into the real
>     binding variable. Users will surely balk at this, and rightly so.
>     In the
>     functional version (depending on the choices from "Odds and Ends")
>     we are free
>     to use the more natural name and avoid the roundabout locution.
>
>     We might be tempted to fix the "garbage variable" problem by
>     inventing another
>     sub-feature: the ability to use an existing variable as the target
>     of a binding,
>     such as:
>
>     ```
>     pattern B(int a, int b) {
>         if (that instanceof A(__bind a))
>             b = that.b;
>     }
>     ```
>
>     But, I think the language is stronger without this feature, for
>     two reasons.
>     First, having to reason about whether a pattern match introduces a
>     new binding
>     or assigns to an existing variable is additional cognitive load
>     for users,
>     and second, having assignment to locals happening
>     through
>     something other than assignment introduces additional complexity
>     in finding
>     where a variable is modified. While we can argue about the
>     general utility of
>     this feature, bringing it in just to solve the garbage-variable
>     problem is
>     particularly unattractive.
>
>     ## Pattern lambdas
>
>     One final consideration is that patterns may also have a lambda
>     form. Given
>     a single-abstract-pattern (SAP) interface:
>
>     ```
>     interface Converter<T,U> {
>         pattern(T t) convert(U u);
>     }
>     ```
>
>     one can implement such a pattern with a lambda. Such a lambda has
>     one parameter
>     (the match candidate), and its body looks like the body of a
>     declared pattern:
>
>     ```
>     Converter<Integer, Short> c =
>         i -> {
>             if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
>                 matches Converter.convert((short) i);
>         };
>     ```
>
>     Because the bindings of the pattern lambda are defined in the
>     interface, not in
>     the lambda, this is one more reason not to like the imperative
>     version: it is
>     brittle, and alpha-renaming bindings in the interface would be a
>     source-incompatible change.
>
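To make the lambda shape concrete, here is a rough approximation in current Java, using an ordinary functional interface in place of a single-abstract-pattern interface. The names `PartialConverter` and `PatternLambdaSketch` are invented for this sketch, and returning `Optional<U>` (empty for "no match") is an assumed encoding rather than anything in the proposal.

```java
import java.util.Optional;

// Hypothetical functional interface approximating a single-abstract-pattern
// interface: one parameter (the match candidate), optional result (the binding).
interface PartialConverter<T, U> {
    Optional<U> convert(T candidate);
}

final class PatternLambdaSketch {
    // Matches only when the Integer fits in a short, then "binds" the narrowed value.
    static final PartialConverter<Integer, Short> INT_TO_SHORT = i -> {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE) {
            return Optional.of(i.shortValue());
        }
        return Optional.empty();   // out of range: no match
    };

    public static void main(String[] args) {
        System.out.println(INT_TO_SHORT.convert(100).isPresent());
        System.out.println(INT_TO_SHORT.convert(70000).isPresent());
    }
}
```

As in the pattern-lambda form, the lambda body never mentions the name of the result; the interface alone declares it, so renaming it there does not affect implementations.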
>     ## Example gallery
>
>     Here's all the pattern examples so far, and a few more, using the
>     suggested
>     style (functional, implicit fail, implicit `that`-qualification):
>
>     ```
>     // Point dtor
>     pattern Point(int x, int y) {
>         matches Point(x, y);
>     }
>
>     // Optional -- static patterns for Optional::of, Optional::empty
>     static<T> case pattern(Optional<T> that) of(T t) {
>         if (isPresent())
>             matches of(get());
>     }
>
>     static<T> case pattern(Optional<T> that) empty() {
>         if (!isPresent())
>             matches empty();
>     }
>
>     // Class -- instance pattern for arrayClass (match candidate type inferred)
>     pattern arrayClass(Class<?> componentType) {
>         if (that.isArray())
>             matches arrayClass(that.getComponentType());
>     }
>
>     // regular expression -- instance pattern in j.u.r.Pattern
>     pattern(String that) regexMatch(String... groups) {
>         Matcher m = matcher(that);
>         if (m.matches())
>             // capture groups are numbered 1..groupCount(), inclusive
>             matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount())
>                                                 .mapToObj(m::group)
>                                                 .toArray(String[]::new));
>     }
>
>     // power of two (somewhere)
>     static pattern(int that) powerOfTwo(int exp) {
>         int exp = 0;
>
>         if (that < 1)
>             return;
>
>         while (that > 1) {
>             if (that % 2 == 0) {
>                 that /= 2;
>                 exp++;
>             }
>             else
>                 return;
>         }
>         matches powerOfTwo(exp);
>     }
>     ```
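
To make the conditional examples concrete, here is a minimal sketch of what `regexMatch` and `powerOfTwo` compute, written as ordinary methods in today's Java. The `Optional`/`OptionalInt` return types stand in for "match or fail" and are an assumption of this sketch, as is the class name `GallerySketch`; none of this is the proposed syntax.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

public class GallerySketch {
    // Counterpart of regexMatch: the capture groups when the whole input matches.
    public static Optional<String[]> regexMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        if (!m.matches())
            return Optional.empty();
        // Groups are numbered 1..groupCount() inclusive; group 0 is the whole match.
        String[] groups = IntStream.rangeClosed(1, m.groupCount())
                                   .mapToObj(m::group)
                                   .toArray(String[]::new);
        return Optional.of(groups);
    }

    // Counterpart of powerOfTwo: the exponent when n is an exact power of two.
    public static OptionalInt powerOfTwo(int n) {
        if (n < 1)
            return OptionalInt.empty();
        int exp = 0;
        while (n > 1) {
            if (n % 2 != 0)
                return OptionalInt.empty();   // odd factor found: no match
            n /= 2;
            exp++;
        }
        return OptionalInt.of(exp);
    }

    public static void main(String[] args) {
        Pattern range = Pattern.compile("(\\d+)-(\\d+)");
        System.out.println(regexMatch(range, "10-20").map(Arrays::toString)); // Optional[[10, 20]]
        System.out.println(powerOfTwo(8));   // OptionalInt[3]
        System.out.println(powerOfTwo(12));  // OptionalInt.empty
    }
}
```

Every failure path needs its own explicit `return ...empty()` in this encoding, which is roughly the fussiness that "implicit fail" is meant to remove.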
>
>     ## Closing thoughts
>
>     I came out of this exploration with very different conclusions than I expected
>     when going in. At first, the "inverse" syntax seemed stilted, but over time it
>     started to seem more obvious. Similarly, I went in expecting to prefer the
>     imperative approach for the body, but over time, started to warm to the
>     functional approach, and eventually concluded it was basically a forced move if
>     we want to support more than just deconstructors. And I started out skeptical
>     of "implicit fail", but after writing a few dozen patterns with it, going back
>     to fully explicit felt painful. All of this is to say, you should hold your
>     initial opinions at arm's length, and give the alternatives a chance to sink in.
>
>     For most _conditional_ patterns (and conditionality is at the heart of pattern
>     matching), the functional approach cleanly highlights both the match predicate
>     and the flow of values, and is considerably less fussy than the imperative
>     approach in the same situation; `Optional::of`, `Class::arrayClass`, and `regex`
>     look great here, much better than they would with imperative. None of these
>     illustrate delegation, but in the presence of delegation, the gap gets even
>     wider.
>