Update on String Templates (JEP 459)

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Fri Mar 15 09:56:42 UTC 2024


On 15/03/2024 02:10, Guy Steele wrote:
> Oh, I think I get it now; I misinterpreted “The compiler might require 
> a prefix here” to mean “The compiler might require a prefix on a 
> literal that is a method argument”, but I now see, from your later 
> sentence “Basically, requiring all literals that have embedded 
> expressions to have a prefix . . .” that maybe you just want to adjust 
> the syntax of literals to be roughly what Clement suggested:
>
> “…”                 plain string literal, cannot contain \{…}, type is String
> INTERPOLATION“…”    string interpolation, may contain \{…}, type is String
> TEMPLATE“…”         string template, may contain \{…}, type is StringTemplate
>
> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE 
> is to be determined. Do I understand your proposal correctly now?
Yes, with the further tweak that the prefix (with syntax TBD) might be 
omitted in the "obvious cases" (though it could still be written for clarity):

* "Hello" w/o prefix is just String
* "Hello \{world}" without prefix is just StringTemplate

Does this help? (I'm basically trying to get to a world where use of 
prefix will be relatively rare, as common cases have the right defaults).

Maurizio

>
> —Guy
>
>> On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:
>>
>> Thanks for these details, but they don’t quite answer my question: 
>> how does the compiler make the decision to require the prefix? 
>> Specifically, is it done purely by examining the types of the 
>> literals (in which case the existing story, about how method 
>> overloading decides which of several methods with the same name to 
>> call, is adequate), or are you imagining some additional ad-hoc 
>> mechanism that is somehow examining the syntax of method arguments 
>> (in which case some care will be needed to ensure that it interacts 
>> properly with the rest of the method overloading resolution 
>> mechanism)? I ask because, given your explanation below, I am not 
>> seeing how types alone can do the job—but maybe I am missing something.
>>
>> —Guy
>>
>>> On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore 
>>> <maurizio.cimadamore at oracle.com> wrote:
>>>
>>>
>>> On 14/03/2024 22:05, Guy Steele wrote:
>>>> Is your intent that a string interpolation literal would have a 
>>>> type other than String? If so, I agree that this is a third 
>>>> option—with the consequence that each API designer now needs to 
>>>> contemplate three-way overloading.
>>>>
>>>> If that is not your intent, then I am not seeing how the prefix 
>>>> helps—so please explain?
>>>
>>> Let's go back to the example I mentioned:
>>>
>>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>> There's a string with an embedded expression here. The compiler 
>>> might require a prefix here (e.g. do you want a string, or a string 
>>> template?). If no prefix is added (as in the above code) it might 
>>> just be an error, and this won't compile.
>>>
>>> This means that if I do:
>>> |String.format(INTERPOLATED"Hello, my name is %s\{name}"); |
>>>
>>> I will select String.format(String, Object...) - but I will do so 
>>> deliberately - it's not just what happens "by default" (as was the 
>>> case before).
>>>
>>> Or, if I want the template version, I do:
>>>
>>> |String.format(TEMPLATE"Hello, my name is %s\{name}");|
>>>
>>> Basically, requiring all literals that have embedded expressions to 
>>> have a prefix removes the problem of defaulting on the String side 
>>> of the fence. Then, personally I'd also prefer if the default was 
>>> actually on the StringTemplate side of the fence, so that the above 
>>> was actually identical to this:
>>>
>>> |String.format("Hello, my name is %s\{name}"); // ok, this is a template|
>>>
>>> Note that these two prefixes might also come in handy when 
>>> disambiguating a literal with no embedded expressions. Only, in that 
>>> case the default would point the other way.
>>>
>>> To summarize:
>>>
>>>   * template literal with arguments -> defaults to StringTemplate.
>>>     User can ask for interpolation explicitly, by adding a prefix
>>>   * template literal w/o arguments -> defaults to String. User can
>>>     ask for a degenerate template explicitly, by adding a prefix
>>>
>>> This doesn't sound too bad, and it feels like it has the defaults 
>>> pointing the right way?
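
To illustrate how the types alone would then drive overload selection, here is a minimal sketch in plain Java. `Template` is a hypothetical stand-in for StringTemplate (it is not the JEP 459 API), and the template literal is written out by hand, since the literal syntax and prefixes discussed here are still to be determined.

```java
import java.util.List;

final class OverloadSketch {
    // Hypothetical stand-in for StringTemplate (not the JEP 459 API):
    // literal fragments and embedded values are kept separate.
    record Template(List<String> fragments, List<Object> values) {}

    // String and Template are disjoint types, so ordinary overload
    // resolution picks a method from the static type of the argument.
    static String accept(String s)   { return "String overload"; }
    static String accept(Template t) { return "Template overload"; }

    public static void main(String[] args) {
        String name = "Duke";
        // "Hello" has no embedded expression: it would default to String.
        System.out.println(accept("Hello"));
        // "Hello \{name}" would default to a template; built by hand here.
        System.out.println(accept(new Template(List.of("Hello ", ""), List.of(name))));
        // A prefixed interpolation would yield a String again.
        System.out.println(accept("Hello " + name));
    }
}
```

Under this reading, the prefix (or its default) only fixes the type of the literal; which method is called then follows from the existing overloading rules.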
>>>
>>> Maurizio
>>>
>>>> Thanks,
>>>> Guy
>>>>
>>>>> On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore 
>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>
>>>>>
>>>>> On 14/03/2024 19:39, Guy Steele wrote:
>>>>>> This is a very important example to consider. I observe, however, 
>>>>>> that there are at least two possible ways to avoid the unpleasant 
>>>>>> surprise:
>>>>>>
>>>>>> (1) Don't have string interpolation literals, because 
>>>>>> accidentally using a string interpolation literal instead of a 
>>>>>> string template literal can result in invoking the wrong 
>>>>>> overload of a method.
>>>>>>
>>>>>> (2) Don’t overload methods so as to accept either a string or a 
>>>>>> string template.
>>>>>
>>>>> I agree with your analysis, but note that there is also a third 
>>>>> option:
>>>>>
>>>>> (3) make it so that both string interpolation literal and string 
>>>>> template literal have a prefix.
>>>>>
>>>>> I believe that is enough to solve the issue (because the program I 
>>>>> wrote would no longer compile: the compiler would require an 
>>>>> explicit prefix).
>>>>>
>>>>> Maurizio
>>>>>
>>>>>>
>>>>>> If we were to take approach (2), then:
>>>>>>
>>>>>> (a) We would keep `println` as is, and not allow it to accept a 
>>>>>> template, but that’s okay—if you thought you wanted a template, 
>>>>>> what you really want is plain old string interpolation, and the 
>>>>>> type checking will make sure you don't use the wrong one.
>>>>>>
>>>>>> (b) A SQL processor would accept a template but not a string—if 
>>>>>> you thought you wanted string interpolation, what you really want 
>>>>>> is a template, and the type checking will make sure you don't use 
>>>>>> the wrong one.
>>>>>>
>>>>>> (c) I think `format` is a special case that we tend to get hung 
>>>>>> up on, and I think that, in this particular branch of the design 
>>>>>> space we are exploring, perhaps a name other than `String.format` 
>>>>>> should be chosen for the method that does string formatting on 
>>>>>> templates. Possible names are `StringTemplate.format` and 
>>>>>> `String.format$`, but I will leave further bikeshedding on this 
>>>>>> to others. I do recognize that this move will not enable the type 
>>>>>> system per se to absolutely prevent programmers from writing
>>>>>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>>>>> but, as Clement has observed, such cases will probably provoke a 
>>>>>> warning about a mismatch between the number of arguments and the 
>>>>>> number of %-specifiers that require parameters, so maybe 
>>>>>> overloading would be okay anyway for `String.format`.
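
Point (b) can be sketched in plain Java. This is an illustration under stated assumptions, not the JEP 459 API: `Template`, `Query` and `prepare` are hypothetical names, and the template is constructed by hand. Because `prepare` has no String overload, an already-interpolated string is rejected by the type checker, and the embedded values can be bound as parameters rather than spliced into the SQL text.

```java
import java.util.ArrayList;
import java.util.List;

final class SqlSketch {
    // Hypothetical stand-in for StringTemplate: n values sit between n + 1 fragments.
    record Template(List<String> fragments, List<Object> values) {
        Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("expected one more fragment than values");
            }
        }
    }

    // Hypothetical result type: SQL text with placeholders plus the values to bind.
    record Query(String sql, List<Object> params) {}

    // Only a Template is accepted; calling prepare("...") with a String does not compile.
    static Query prepare(Template t) {
        StringBuilder sql = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            // Each embedded value becomes a '?' placeholder, never part of the SQL text.
            sql.append('?').append(t.fragments().get(i + 1));
        }
        return new Query(sql.toString(), new ArrayList<>(t.values()));
    }
}
```

A call such as `prepare(new Template(List.of("SELECT * FROM users WHERE name = ", ""), List.of(name)))` would yield the text `SELECT * FROM users WHERE name = ?` with `name` carried separately as a parameter.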
>>>>>>
>>>>>> Anyway, my point is that whether to overload a method to accept 
>>>>>> either a string or a string template can be evaluated on a 
>>>>>> case-by-case basis according to a small number of principles that 
>>>>>> I think we could enumerate and explain pretty easily.
>>>>>>
>>>>>> —Guy
>>>>>>
>>>>>>> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore 
>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>
>>>>>>> Not to pour too much cold water on the idea of having string 
>>>>>>> interpolation literal, but I’d like to mention a few points here.
>>>>>>>
>>>>>>> First, it was a deliberate design goal of the string template 
>>>>>>> feature to make interpolation an explicit act. Note that, if we 
>>>>>>> had the syntax you describe, we actually achieve the opposite 
>>>>>>> effect: string interpolation is now the default, and implicit, 
>>>>>>> and actually /cheaper/ (to type) than the safer template 
>>>>>>> alternative. This is a bit of a red herring, I think.
>>>>>>>
>>>>>>> The second problem is that interpolation literals can sometimes 
>>>>>>> be deceiving. Consider this example:
>>>>>>>
>>>>>>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>>>>>>
>>>>>>> Where |String::format| has a new overload which accepts a 
>>>>>>> StringTemplate.
>>>>>>>
>>>>>>> Basically, since here we forgot the leading “$” (or whatever 
>>>>>>> char that is), the whole thing is just a big interpolation. 
>>>>>>> Semantically equivalent to:
>>>>>>>
>>>>>>> |String.format("Hello, my name is %s" + name); // whoops! |
>>>>>>>
>>>>>>> This will fail, as |String::format| will be waiting for an 
>>>>>>> argument (a string), but none is provided. So:
>>>>>>>
>>>>>>> |  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
>>>>>>> |        at Formatter.format (Formatter.java:2672)
>>>>>>> |        at Formatter.format (Formatter.java:2609)
>>>>>>> |        at String.format (String.java:2897)
>>>>>>> |        at (#2:1)
>>>>>>>
>>>>>>> This is a very odd (and new!) failure mode, that I’m sure is 
>>>>>>> gonna surprise developers.
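
The failure can be reproduced today without any template syntax, using the concatenated form that the interpolation is said to be equivalent to. A small sketch follows; the class and method names are made up for illustration.

```java
import java.util.MissingFormatArgumentException;

final class FormatPitfall {
    // Mimics what an interpolated literal would hand to String.format:
    // the value is already concatenated into the format string, so the
    // "%s" specifier has no argument left to consume.
    static String interpolatedThenFormatted(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // What was presumably intended: the value is passed as a format argument.
    static String formatted(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(formatted("Duke"));
        try {
            interpolatedThenFormatted("Duke");
        } catch (MissingFormatArgumentException e) {
            // Reached at run time; the compiler accepts the call above.
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The mistake is only reported at run time, which is what makes the silent fallback to the String overload unpleasant.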
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 14/03/2024 15:08, Guy Steele wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Second thoughts about how to explain a string interpolation literal:
>>>>>>>>
>>>>>>>>> On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
>>>>>>>>> . . .
>>>>>>>>>
>>>>>>>>> —————————
>>>>>>>>> String is not a subtype of StringTemplate; they are disjoint types.
>>>>>>>>>
>>>>>>>>>     $“foo”            is a (trivial) string template literal
>>>>>>>>>     “foo”             is a string literal
>>>>>>>>>     $“Hello, \{x}”    is a (nontrivial) string template literal
>>>>>>>>>     “Hello, \{x}”     is a shorthand (expanded by the compiler) for `String.of($“Hello, \{x}”)`
>>>>>>>>> —————————
>>>>>>>> Given that the intent is that String.of (or whatever we want to call it—possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of “+”; for example,
>>>>>>>>
>>>>>>>>           “Hello, \{x}.”
>>>>>>>>
>>>>>>>> (I have added a period to the example to make the point clearer) is expanded into
>>>>>>>>
>>>>>>>>          “Hello, ” + x + “.”
>>>>>>>>
>>>>>>>> and in general
>>>>>>>>
>>>>>>>>          “c0\{e1}c1\{e2}c2…\{en}cn”
>>>>>>>>
>>>>>>>> (where each ck is a possibly empty sequence of string characters and each ek is an expression)  is expanded into
>>>>>>>>
>>>>>>>>          “c0” + (e1) + “c1” + (e2) + “c2” + … + (en) + “cn”
>>>>>>>>
>>>>>>>> The point is that, with this definition, “c0\{e1}c1\{e2}c2…\{en}cn” is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
>>>>>>>>
>>>>>>>> —Guy
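
The constant-expression point can be checked with the hand-written "+" form, which already compiles today. In this sketch the names are illustrative and the expansion is written manually, as the proposed literal syntax does not exist in released Java.

```java
final class ConstantConcat {
    // A constant variable: final and initialized with a constant expression.
    static final String NAME = "Duke";

    // Hand-written form of the proposed expansion of "Hello, \{NAME}."
    // Every operand is a constant expression, so the result is one as well.
    static final String GREETING = "Hello, " + NAME + ".";

    // Case labels must be constant expressions, so this method only
    // compiles because GREETING is a compile-time constant.
    static boolean isGreeting(String s) {
        switch (s) {
            case GREETING: return true;
            default:       return false;
        }
    }

    // With a non-constant operand the same shape is ordinary run-time concatenation.
    static String greet(String name) {
        return "Hello, " + name + ".";
    }
}
```

Replacing `NAME` with a non-final variable would make `GREETING` non-constant, and the `case GREETING:` label would no longer compile.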
>>>>>>>>