Update on String Templates (JEP 459)
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Mar 15 09:56:42 UTC 2024
On 15/03/2024 02:10, Guy Steele wrote:
> Oh, I think I get it now; I misinterpreted "The compiler might require
> a prefix here” to mean "The compiler might require a prefix on a
> literal that is a method argument”, but I now see, from your later
> sentence "Basically, requiring all literals that have embedded
> expression to have a prefix . . .” that maybe you just want to adjust
> the syntax of literals to be roughly what Clement suggested:
>
>     “…”                  plain string literal, cannot contain \{…}, type is String
>     INTERPOLATION“…”     string interpolation, may contain \{…}, type is String
>     TEMPLATE“…”          string template, may contain \{…}, type is StringTemplate
>
> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE
> is to be determined. Do I understand your proposal correctly now?
Yes, with the further tweak that the prefix (with syntax TBD) might be
omitted in the "obvious cases" (though it could still be written for clarity):
* "Hello" w/o prefix is just String
* "Hello \{world}" without prefix is just StringTemplate
Does this help? (I'm basically trying to get to a world where use of
prefix will be relatively rare, as common cases have the right defaults).
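
A rough sketch of how these defaults might play out against an overloaded method, written in plain Java because the prefix syntax is still TBD. The `Template` record and the `describe` overloads are hypothetical stand-ins for illustration only; they are not the StringTemplate API, and the comments only indicate which proposed literal form each call is meant to model.

```java
import java.util.List;

public class PrefixDefaultsSketch {
    // Hypothetical stand-in for StringTemplate: fragments plus embedded values.
    record Template(List<String> fragments, List<Object> values) {
        // Plain interpolation: fragments and values concatenated in order.
        String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < fragments.size(); i++) {
                sb.append(fragments.get(i));
                if (i < values.size()) {
                    sb.append(values.get(i));
                }
            }
            return sb.toString();
        }
    }

    // Overload selected when the argument has type String.
    static String describe(String s) {
        return "String: " + s;
    }

    // Overload selected when the argument has the template type.
    static String describe(Template t) {
        return "Template: " + t.interpolate();
    }

    public static void main(String[] args) {
        String world = "world";

        // Models "Hello" with no prefix: a plain String.
        System.out.println(describe("Hello"));

        // Models "Hello \{world}" with no prefix: defaults to a template.
        Template t = new Template(List.of("Hello ", ""), List.<Object>of(world));
        System.out.println(describe(t));

        // Models the explicitly prefixed interpolation form: a String again.
        System.out.println(describe(t.interpolate()));
    }
}
```

In this sketch the overload is picked purely by the static type of the argument; the prefix (or its absence) would only decide which type the literal has.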
Maurizio
>
> —Guy
>
>> On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:
>>
>> Thanks for these details, but they don’t quite answer my question:
>> how does the compiler make the decision to require the prefix?
>> Specifically, is it done purely by examining the types of the
>> literals (in which case the existing story, about how method
>> overloading decides which of several methods with the same name to
>> call, is adequate), or are you imagining some additional ad-hoc
>> mechanism that is somehow examining the syntax of method arguments
>> (in which case some care will be needed to ensure that it interacts
>> properly with the rest of the method overloading resolution
>> mechanism)? I ask because, given your explanation below, I am not
>> seeing how types alone can do the job—but maybe I am missing something.
>>
>> —Guy
>>
>>> On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore
>>> <maurizio.cimadamore at oracle.com> wrote:
>>>
>>>
>>> On 14/03/2024 22:05, Guy Steele wrote:
>>>> Is your intent that a string interpolation literal would have a
>>>> type other than String? If so, I agree that this is a third
>>>> option—with the consequence that each API designer now needs to
>>>> contemplate three-way overloading.
>>>>
>>>> If that is not your intent, then I am not seeing how the prefix
>>>> helps—so please explain?
>>>
>>> Let's go back to the example I mentioned:
>>>
>>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>> There's a string with an embedded expression here. The compiler
>>> might require a prefix here (e.g. do you want a string, or a string
>>> template?). If no prefix is added (as in the above code) it might
>>> just be an error, and this won't compile.
>>>
>>> This means that if I do:
>>> |String.format(INTERPOLATED"Hello, my name is %s\{name}"); |
>>>
>>> I will select String.format(String, Object...) - but I will do so
>>> deliberately - it's not just what happens "by default" (as was the
>>> case before).
>>>
>>> Or, if I want the template version, I do:
>>>
>>> |String.format(TEMPLATE"Hello, my name is %s\{name}");|
>>>
>>> Basically, requiring all literals that have embedded expression to
>>> have a prefix removes the problem of defaulting on the String side
>>> of the fence. Then, personally I'd also prefer if the default was
>>> actually on the StringTemplate side of the fence, so that the above
>>> was actually identical to this:
>>>
>>> |String.format("Hello, my name is %s\{name}"); // ok, this is a template|
>>>
>>> Note that these two prefixes might also come in handy when
>>> disambiguating a literal with no embedded expressions. Only, in that
>>> case the default would point the other way.
>>>
>>> To summarize:
>>>
>>> * template literal with arguments -> defaults to StringTemplate.
>>> User can ask for interpolation explicitly, by adding a prefix
>>> * template literal w/o arguments -> defaults to String. User can
>>> ask for a degenerate template explicitly, by adding a prefix
>>>
>>> This doesn't sound too bad, and it feels like it has the defaults
>>> pointing the right way?
>>>
>>> Maurizio
>>>
>>>> Thanks,
>>>> Guy
>>>>
>>>>> On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore
>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>
>>>>>
>>>>> On 14/03/2024 19:39, Guy Steele wrote:
>>>>>> This is a very important example to consider. I observe, however,
>>>>>> that there are at least two possible ways to avoid the unpleasant
>>>>>> surprise:
>>>>>>
>>>>>> (1) Don't have string interpolation literals, because
>>>>>> accidentally using a string interpolation literal instead of a
>>>>>> string template literal can result in invoking the wrong
>>>>>> overload of a method.
>>>>>>
>>>>>> (2) Don’t overload methods so as to accept either a string or a
>>>>>> string template.
>>>>>
>>>>> I agree with your analysis, but note that there is also a third
>>>>> option:
>>>>>
>>>>> (3) make it so that both string interpolation literal and string
>>>>> template literal have a prefix.
>>>>>
>>>>> I believe that is enough to solve the issue (because the program I
>>>>> wrote would no longer compile: the compiler would require an
>>>>> explicit prefix).
>>>>>
>>>>> Maurizio
>>>>>
>>>>>>
>>>>>> If we were to take approach (2), then:
>>>>>>
>>>>>> (a) We would keep `println` as is, and not allow it to accept a
>>>>>> template, but that’s okay—if you thought you wanted a template,
>>>>>> what you really want is plain old string interpolation, and the
>>>>>> type checking will make sure you don't use the wrong one.
>>>>>>
>>>>>> (b) A SQL processor would accept a template but not a string—if
>>>>>> you thought you wanted string interpolation, what you really want
>>>>>> is a template, and the type checking will make sure you don't use
>>>>>> the wrong one.
>>>>>>
>>>>>> (c) I think `format` is a special case that we tend to get hung
>>>>>> up on, and I think that, in this particular branch of the design
>>>>>> space we are exploring, perhaps a name other than `String.format`
>>>>>> should be chosen for the method that does string formatting on
>>>>>> templates. Possible names are `StringTemplate.format` and
>>>>>> `String.format$`, but I will leave further bikeshedding on this
>>>>>> to others. I do recognize that this move will not enable the type
>>>>>> system per se to absolutely prevent programmers from writing
>>>>>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>>>>> but, as Clement has observed, such cases will probably provoke a
>>>>>> warning about a mismatch between the number of arguments and the
>>>>>> number of %-specifiers that require parameters, so maybe
>>>>>> overloading would be okay anyway for `String.format`.
>>>>>>
>>>>>> Anyway, my point is that whether to overload a method to accept
>>>>>> either a string or a string template can be evaluated on a
>>>>>> case-by-case basis according to a small number of principles that
>>>>>> I think we could enumerate and explain pretty easily.
>>>>>>
>>>>>> —Guy
>>>>>>
>>>>>>> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore
>>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>>>
>>>>>>> Not to pour too much cold water on the idea of having string
>>>>>>> interpolation literal, but I’d like to mention a few points here.
>>>>>>>
>>>>>>> First, it was a deliberate design goal of the string template
>>>>>>> feature to make interpolation an explicit act. Note that, if we
>>>>>>> had the syntax you describe, we actually achieve the opposite
>>>>>>> effect: string interpolation is now the default, and implicit,
>>>>>>> and actually /cheaper/ (to type) than the safer template
>>>>>>> alternative. This is a bit of a red herring, I think.
>>>>>>>
>>>>>>> The second problem is that interpolation literals can sometimes
>>>>>>> be deceiving. Consider this example:
>>>>>>>
>>>>>>> |String.format("Hello, my name is %s\{name}"); // can you spot the bug? |
>>>>>>>
>>>>>>> Where |String::format| has a new overload which accepts a
>>>>>>> StringTemplate.
>>>>>>>
>>>>>>> Basically, since here we forgot the leading “$” (or whatever
>>>>>>> char that is), the whole thing is just a big interpolation.
>>>>>>> Semantically equivalent to:
>>>>>>>
>>>>>>> |String.format("Hello, my name is %s" + name); // whoops! |
>>>>>>>
>>>>>>> This will fail, as |String::format| will be waiting for an
>>>>>>> argument (a string), but none is provided. So:
>>>>>>>
>>>>>>> |  Exception java.util.MissingFormatArgumentException: Format specifier '%s'
>>>>>>> |        at Formatter.format (Formatter.java:2672)
>>>>>>> |        at Formatter.format (Formatter.java:2609)
>>>>>>> |        at String.format (String.java:2897)
>>>>>>> |        at (#2:1)
>>>>>>>
>>>>>>> This is a very odd (and new!) failure mode, that I’m sure is
>>>>>>> gonna surprise developers.
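
The failure described above can be reproduced today without any template syntax, by writing out the concatenation that the accidental interpolation would be equivalent to. This is a minimal sketch; the class and method names (`FormatPitfall`, `buggy`, `intended`) are made up for illustration.

```java
import java.util.MissingFormatArgumentException;

public class FormatPitfall {
    // Mimics the accidental interpolation: the name is concatenated into the
    // format string itself, so the "%s" specifier has no argument to consume.
    static String buggy(String name) {
        return String.format("Hello, my name is %s" + name);
    }

    // What the developer presumably meant: the name is passed as an argument.
    static String intended(String name) {
        return String.format("Hello, my name is %s", name);
    }

    public static void main(String[] args) {
        System.out.println(intended("Duke"));
        try {
            buggy("Duke");
        } catch (MissingFormatArgumentException e) {
            // Only surfaces at run time; the code compiles without complaint.
            System.out.println("Failed at run time: " + e.getMessage());
        }
    }
}
```

Both methods compile cleanly, which is the point: nothing in the types distinguishes the buggy call from the intended one.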
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 14/03/2024 15:08, Guy Steele wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Second thoughts about how to explain a string interpolation literal:
>>>>>>>>
>>>>>>>>> On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
>>>>>>>>> . . .
>>>>>>>>>
>>>>>>>>> —————————
>>>>>>>>> String is not a subtype of StringTemplate; they are disjoint types.
>>>>>>>>>
>>>>>>>>> $”foo” is a (trivial) string template literal
>>>>>>>>> “foo” is a string literal
>>>>>>>>> $”Hello, \{x}” is a (nontrivial) string template literal
>>>>>>>>> “Hello, \{x}” is a shorthand (expanded by the compiler) for `String.of($“Hello, \{x}”)`
>>>>>>>>> —————————
>>>>>>>> Given that the intent is that String.of (or whatever we want to call it—possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of “+”; for example,
>>>>>>>>
>>>>>>>> “Hello, \{x}.”
>>>>>>>>
>>>>>>>> (I have added a period to the example to make the point clearer) is expanded into
>>>>>>>>
>>>>>>>> “Hello, “ + x + “.”
>>>>>>>>
>>>>>>>> and in general
>>>>>>>>
>>>>>>>> “c0\{e1}c1\{e2}c2…\{en}cn”
>>>>>>>>
>>>>>>>> (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
>>>>>>>>
>>>>>>>> “c0” + (e1) + “c1” + (e2) + “c2” + … + (en) + “cn”
>>>>>>>>
>>>>>>>> The point is that, with this definition, “c0\{e1}c1\{e2}c2…\{en}cn” is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
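
The constant-expression property can be checked with today's Java by writing the proposed expansion out by hand. In this sketch, `GREETING` is what “Hello, \{X}.” would expand to under the rule above; the class name and the `classify` helper are assumptions made for the example.

```java
public class ConstantInterpolation {
    // A constant variable: final, and initialized with a constant expression.
    static final String X = "world";

    // Hand-written expansion of the interpolation literal “Hello, \{X}.”
    static final String GREETING = "Hello, " + (X) + ".";

    static String classify(String s) {
        // Case labels must be constant expressions, so this switch compiles
        // only because GREETING is itself a constant expression.
        switch (s) {
            case GREETING:
                return "greeting";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("Hello, world."));
        // Constant String expressions are interned, so this compares the
        // same object as the literal.
        System.out.println(GREETING == "Hello, world.");
    }
}
```

If `X` were not final, or were initialized from a method call, `GREETING` would stop being a constant expression and the `case GREETING` label would be rejected by the compiler.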
>>>>>>>>
>>>>>>>> —Guy
>>>>>>>>