Update on String Templates (JEP 459)
Guy Steele
guy.steele at oracle.com
Fri Mar 15 01:07:16 UTC 2024
On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:
Thanks for these derails,
Sorry: “details"
but they don’t quite answer my question: how does the compiler make the decision to require the prefix? Specifically, is it done purely by examining the types of the literals (in which case the existing story, about how method overloading decides which of several methods with the same name to call, is adequate), or are you imagining some additional ad-hoc mechanism that is somehow examining the syntax of method arguments (in which case some care will be needed to ensure that it interacts properly with the rest of the method overloading resolution mechanism)? I ask because, given your explanation below, I am not seeing how types alone can do the job, but maybe I am missing something.
—Guy
On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
On 14/03/2024 22:05, Guy Steele wrote:
Is your intent that a string interpolation literal would have a type other than String? If so, I agree that this is a third option—with the consequence that each API designer now needs to contemplate three-way overloading.
If that is not your intent, then I am not seeing how the prefix helps—so please explain?
Let's go back to the example I mentioned:
String.format("Hello, my name is %s\{name}"); // can you spot the bug?
There's a string with an embedded expression here. The compiler might require a prefix here (e.g. do you want a string, or a string template?). If no prefix is added (as in the above code) it might just be an error, and this won't compile.
This means that if I do:
String.format(INTERPOLATED"Hello, my name is %s\{name}");
I will select String.format(String, Object...) - but I will do so deliberately - it's not just what happens "by default" (as was the case before).
Or, if I want the template version, I do:
String.format(TEMPLATE"Hello, my name is %s\{name}");
Basically, requiring all literals that have embedded expressions to have a prefix removes the problem of defaulting on the String side of the fence. Then, personally I'd also prefer if the default was actually on the StringTemplate side of the fence, so that the above was actually identical to this:
String.format("Hello, my name is %s\{name}"); // ok, this is a template
Note that these two prefixes might also come in handy when disambiguating a literal with no embedded expressions. Only, in that case the default would point the other way.
To summarize:
* template literal with arguments -> defaults to StringTemplate. User can ask for interpolation explicitly, by adding a prefix
* template literal w/o arguments -> defaults to String. User can ask for a degenerate template explicitly, by adding a prefix
This doesn't sound too bad, and it feels like it has the defaults pointing the right way?
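The prefix idea can be modelled on a current JDK by making the two readings of the literal explicit values of two disjoint types. The sketch below is only an illustration: `Template`, `interpolate`, and `which` are hypothetical names, the hand-built `Template` stands in for `TEMPLATE"..."`, and the call to `interpolate()` stands in for `INTERPOLATED"..."`. It shows that once each reading has a definite static type, ordinary overload resolution picks the method. It does not settle how a compiler would decide to demand the prefix in the first place.

```java
import java.util.Arrays;
import java.util.List;

final class PrefixOverloadDemo {
    // Hypothetical stand-in for a string template: fragments and embedded values.
    static final class Template {
        final List<String> fragments;
        final List<Object> values;

        Template(List<String> fragments, List<Object> values) {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need one more fragment than values");
            }
            this.fragments = fragments;
            this.values = values;
        }

        // Plain interpolation: interleave fragments and values into one String.
        String interpolate() {
            StringBuilder sb = new StringBuilder(fragments.get(0));
            for (int i = 0; i < values.size(); i++) {
                sb.append(values.get(i)).append(fragments.get(i + 1));
            }
            return sb.toString();
        }
    }

    // Stand-in for String.format(String, Object...); reports which overload ran.
    static String which(String format, Object... args) {
        return "format(String, Object...)";
    }

    // Stand-in for a hypothetical template-accepting overload.
    static String which(Template template) {
        return "format(Template)";
    }

    public static void main(String[] args) {
        // Models the literal "Hello, my name is %s\{name}" with name = "Duke".
        Template t = new Template(
                Arrays.asList("Hello, my name is %s", ""),
                Arrays.<Object>asList("Duke"));

        System.out.println(which(t));               // template reading
        System.out.println(which(t.interpolate())); // interpolated reading
    }
}
```

Because `String` and the stand-in `Template` are unrelated types, neither call is ambiguous; the choice is made entirely by the static type of the argument.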
Maurizio
Thanks,
Guy
On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
On 14/03/2024 19:39, Guy Steele wrote:
This is a very important example to consider. I observe, however, that there are at least two possible ways to avoid the unpleasant surprise:
(1) Don't have string interpolation literals, because accidentally using a string interpolation literal instead of a string template literal can result in invoking the wrong overload of a method.
(2) Don’t overload methods so as to accept either a string or a string template.
I agree with your analysis, but note that there is also a third option:
(3) make it so that both string interpolation literal and string template literal have a prefix.
I believe that is enough to solve the issue (because the program I wrote would no longer compile: the compiler would require an explicit prefix).
Maurizio
If we were to take approach (2), then:
(a) We would keep `println` as is, and not allow it to accept a template, but that’s okay: if you thought you wanted a template, what you really want is plain old string interpolation, and the type checking will make sure you don't use the wrong one.
(b) A SQL processor would accept a template but not a string—if you thought you wanted string interpolation, what you really want is a template, and the type checking will make sure you don't use the wrong one.
(c) I think `format` is a special case that we tend to get hung up on, and I think that, in this particular branch of the design space we are exploring, perhaps a name other than `String.format` should be chosen for the method that does string formatting on templates. Possible names are `StringTemplate.format` and `String.format$`, but I will leave further bikeshedding on this to others. I do recognize that this move will not enable the type system per se to absolutely prevent programmers from writing
String.format("Hello, my name is %s\{name}"); // can you spot the bug?
but, as Clement has observed, such cases will probably provoke a warning about a mismatch between the number of arguments and the number of %-specifiers that require parameters, so maybe overloading would be okay anyway for `String.format`.
Anyway, my point is that whether to overload a method to accept either a string or a string template can be evaluated on a case-by-case basis according to a small number of principles that I think we could enumerate and explain pretty easily.
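Approach (2) can be sketched on a current JDK with a hand-rolled template type. Everything named here is hypothetical: `Template` is a stand-in, not `java.lang.StringTemplate`, and `toSql` and `describe` are illustrative methods. The point is only that when an API accepts exactly one of the two disjoint types, the type checker rejects the other.

```java
import java.util.Arrays;
import java.util.List;

final class TemplateOnlyApiDemo {
    // Hypothetical stand-in for a string template: fragments and embedded values.
    static final class Template {
        final List<String> fragments;
        final List<Object> values;

        Template(List<String> fragments, List<Object> values) {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need one more fragment than values");
            }
            this.fragments = fragments;
            this.values = values;
        }
    }

    // Case (b): a SQL-style processor accepts a template only. Each embedded
    // value becomes a "?" placeholder instead of being spliced into the text.
    static String toSql(Template t) {
        return String.join("?", t.fragments);
    }

    // Case (a): a println-style method accepts a String only.
    static String describe(String s) {
        return "text: " + s;
    }

    public static void main(String[] args) {
        String name = "O'Brien";
        Template t = new Template(
                Arrays.asList("SELECT * FROM users WHERE name = ", ""),
                Arrays.<Object>asList(name));

        System.out.println(toSql(t));
        System.out.println(describe("Hello, " + name));

        // Neither of these would compile, because the types are disjoint:
        // toSql("SELECT * FROM users WHERE name = " + name);
        // describe(t);
    }
}
```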
—Guy
On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
Not to pour too much cold water on the idea of having string interpolation literal, but I’d like to mention a few points here.
First, it was a deliberate design goal of the string template feature to make interpolation an explicit act. Note that, if we had the syntax you describe, we actually achieve the opposite effect: string interpolation is now the default, and implicit, and actually cheaper (to type) than the safer template alternative. This is a bit of a red herring, I think.
The second problem is that interpolation literals can sometimes be deceiving. Consider this example:
String.format("Hello, my name is %s\{name}"); // can you spot the bug?
Where String::format has a new overload which accepts a StringTemplate.
Basically, since here we forgot the leading “$” (or whatever char that is), the whole thing is just a big interpolation. Semantically equivalent to:
String.format("Hello, my name is %s" + name); // whoops!
This will fail, as String::format will be waiting for an argument (a string), but none is provided. So:
| Exception java.util.MissingFormatArgumentException: Format specifier '%s'
| at Formatter.format (Formatter.java:2672)
| at Formatter.format (Formatter.java:2609)
| at String.format (String.java:2897)
| at (#2:1)
This is a very odd (and new!) failure mode, that I’m sure is gonna surprise developers.
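The same failure can be reproduced on a released JDK without any template syntax, by writing the concatenation by hand. This is a minimal sketch; the class and method names are illustrative only.

```java
import java.util.MissingFormatArgumentException;

final class FormatPitfallDemo {
    // Hand-written equivalent of the accidental interpolation: the name is
    // concatenated into the format string, so "%s" has no argument to consume.
    static String buggy(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // What the programmer presumably intended: the name is a format argument.
    static String intended(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(intended("Duke"));
        try {
            buggy("Duke");
        } catch (MissingFormatArgumentException e) {
            // The format string still compiles; the error only shows up at run time.
            System.out.println("caught: " + e);
        }
    }
}
```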
Maurizio
On 14/03/2024 15:08, Guy Steele wrote:
Second thoughts about how to explain a string interpolation literal:
On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
. . .
—————————
String is not a subtype of StringTemplate; they are disjoint types.
$"foo" is a (trivial) string template literal
"foo" is a string literal
$"Hello, \{x}" is a (nontrivial) string template literal
"Hello, \{x}" is a shorthand (expanded by the compiler) for `String.of($"Hello, \{x}")`
—————————
Given that the intent is that String.of (or whatever we want to call it—possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of “+”; for example,
"Hello, \{x}."
(I have added a period to the example to make the point clearer) is expanded into
"Hello, " + x + "."
and in general
"c0\{e1}c1\{e2}c2…\{en}cn"
(where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
"c0" + (e1) + "c1" + (e2) + "c2" + … + (en) + "cn"
The point is that, with this definition, "c0\{e1}c1\{e2}c2…\{en}cn" is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
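The constant-expression property can be checked today on the hand-expanded `+` form, since the proposed literal syntax is not available in released JDKs. In this sketch (names are illustrative) the concatenation appears as a `case` label, a position where Java requires a constant expression, so the code compiles only because every operand is constant.

```java
final class ConstantConcatDemo {
    // A constant variable: final and initialized with a constant expression.
    static final String NAME = "World";

    // Hand-expanded form of the proposed literal "Hello, \{NAME}."
    static final String GREETING = "Hello, " + NAME + ".";

    static String classify(String s) {
        switch (s) {
            // A case label must be a constant expression; this one qualifies
            // because both string literals and NAME are constant.
            case "Hello, " + NAME + ".":
                return "greeting";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(GREETING);
        System.out.println(classify(GREETING));
        System.out.println(classify("Hello, Mars."));
    }
}
```

If `NAME` were not `final`, or were initialized from a method call, the `case` label would be rejected at compile time.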
—Guy