Update on String Templates (JEP 459)
Victor Nazarov
asviraspossible at gmail.com
Fri Mar 15 13:48:56 UTC 2024
Hello experts,
I'm not sure if we need one more voice in this thread, but maybe my summary
can be a small contribution.
I've read the whole thread and saw only two goals named for the
StringTemplates feature:
1) safety, as explained very thoroughly by Maurizio Cimadamore, and
2) avoiding the proliferation of String-literal sublanguages, as advocated
by Brian Goetz.
As Maurizio Cimadamore explained in one of the messages, from the safety
point of view, the only solutions mentioned in the thread that fit the
bill are
a) either special syntax for string-templates that is distinct from
plain-strings, or
b) automatic promotion from string-*literal* (without any placeholders
inside) into StringTemplate.
If we take into account the goal stated by Brian Goetz, then we can see
that (b) looks better than (a), because we avoid language elements that
look different from each other.
The problem with (b), though, is overload selection, along with the other
problems that Maurizio Cimadamore already stated in the original message of
this thread.
My observation is that if all these problems were completely new, it would
probably be hard to pick our poison. But in my opinion the Java language
already had these problems before and solved them, so why not solve them
for StringTemplates in exactly the same way?
The already-solved case is numeric types. Let us look at the relationship
between int and long:
* numeric-literal that fit in 32-bits can be both int and long
* numeric-literal outside the 32-bit range can only be long
* m(int),m(long) with numeric-literal that can be both int and long selects
int-overload
* n = i works when n is long and i is int
* i = n is compile-time error
* i = (int) n succeeds, when i is int and n is long
* n instanceof int soon to succeed on long variable n as long as n fits
within 32-bits
* i instanceof long succeeds when i is int
* additionally numeric-literal can use "l" or "L" suffix to denote that it
is really long, this can be used to tweak overload-selection
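A minimal sketch of these rules in today's Java. The class and method names
are mine, for illustration only, and fitsInInt is a hand-written stand-in for
the proposed `n instanceof int`, which the sketch does not use:

```java
final class NumericLiteralRules {
    static String m(int x)  { return "int"; }
    static String m(long x) { return "long"; }

    // Stand-in for the proposed `n instanceof int`:
    // true when n survives a round trip through int.
    static boolean fitsInInt(long n) { return n == (int) n; }

    public static void main(String[] args) {
        System.out.println(m(42));             // literal fits in 32 bits: int overload wins
        System.out.println(m(42L));            // suffix forces the long overload
        System.out.println(m(3_000_000_000L)); // outside the int range: needs the suffix, so long

        int i = 7;
        long n = i;      // widening: allowed implicitly
        // i = n;        // does not compile: possible lossy conversion from long to int
        i = (int) n;     // narrowing: allowed with an explicit cast
        System.out.println(i + " " + fitsInInt(n) + " " + fitsInInt(3_000_000_000L));
    }
}
```

Note that in current Java a literal outside the int range must actually carry
the "L" suffix; without it the compiler rejects the literal.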
I think the above can be translated almost word for word to StringTemplates
world:
* stringy-literal that doesn't have holes-with-values can be both String
and StringTemplate
* stringy-literal that has holes-with-values can only be StringTemplate
* m(String),m(StringTemplate) with stringy-literal that can be both String
and StringTemplate selects String-overload
* t = s works when t is StringTemplate and s is String
* s = t is compile-time error
* s = (String) t succeeds, when s is String and t is StringTemplate (and
does string concatenation)
* t instanceof String succeeds on StringTemplate variable t as long as t
doesn't have any holes-with-values
* s instanceof StringTemplate succeeds when s is String
* additionally stringy-literal can use "t" or "T" *suffix* to denote that
it is really a template, this can be used to tweak overload-selection and
to certify, that some processing of values is expected
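To make the analogy concrete, here is a rough model of the conversions in
plain Java (16+ for records). Nothing in it is the real StringTemplate API:
Template, of, fitsInString and interpolate are hypothetical names that stand
in for the implicit promotion, for `t instanceof String`, and for the
`(String) t` cast respectively.

```java
import java.util.List;

final class TemplateConversions {
    /** Hypothetical stand-in for StringTemplate: one more fragment than values. */
    record Template(List<String> fragments, List<Object> values) {
        Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need exactly one more fragment than values");
            }
            fragments = List.copyOf(fragments);
            values = List.copyOf(values);
        }

        // Analogue of `t = s`: widening a String into a template without holes.
        static Template of(String s) { return new Template(List.of(s), List.of()); }

        // Analogue of `t instanceof String`: true when there are no holes-with-values.
        boolean fitsInString() { return values.isEmpty(); }

        // Analogue of `(String) t`: plain string concatenation.
        String interpolate() {
            StringBuilder sb = new StringBuilder(fragments.get(0));
            for (int k = 0; k < values.size(); k++) {
                sb.append(values.get(k)).append(fragments.get(k + 1));
            }
            return sb.toString();
        }
    }

    static String m(String s)   { return "String"; }
    static String m(Template t) { return "Template"; }

    public static void main(String[] args) {
        Template plain = Template.of("Hello");
        Template holes = new Template(List.of("Hello, ", "."), List.of("Ann"));
        // In this model a String literal is never a Template, so the choice below
        // is trivial; under the proposal both overloads would be applicable and
        // the String one would be preferred, as int is preferred over long.
        System.out.println(m("Hello"));
        System.out.println(m(plain));
        System.out.println(plain.fitsInString() + " " + holes.fitsInString());
        System.out.println(holes.interpolate());
    }
}
```

The explicit Template.of call plays the role that the "t" suffix would play
on a literal: it asks for the template overload on purpose.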
For me, the String/StringTemplate table satisfies both goals (1) and (2)
and feels natural for the Java language, because most of these rules have
been present in the language for more than 20 years already.
--
Victor Nazarov
On Fri, Mar 15, 2024 at 12:59 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:
>
> On 15/03/2024 02:10, Guy Steele wrote:
>
> Oh, I think I get it now; I misinterpreted "The compiler might require a
> prefix here” to mean "The compiler might require a prefix on a literal that
> is a method argument”, but I now see, from your later sentence "Basically,
> requiring all literals that have embedded expression to have a prefix . .
> .” that maybe you just want to adjust the syntax of literals to be roughly
> what Clement suggested:
>
> “…” plain string literal, cannot contain \{…},
> type is String
> INTERPOLATION”…” string interpolation, may contain \{…}, type is String
> TEMPLATE”…” string template, may contain \{…}, type is
> StringTemplate
>
> where the precise syntax for the prefixed INTERPOLATION and TEMPLATE is to
> be determined. Do I understand your proposal correctly now?
>
> Yes, with the further tweak that the prefix (with syntax TBD) might be
> omitted in the "obvious cases" (but kept for clarity):
>
> * "Hello" w/o prefix is just String
> * "Hello \{world}" without prefix is just StringTemplate
>
> Does this help? (I'm basically trying to get to a world where use of
> prefix will be relatively rare, as common cases have the right defaults).
>
> Maurizio
>
>
> —Guy
>
> On Mar 14, 2024, at 9:05 PM, Guy Steele <guy.steele at oracle.com> wrote:
>
> Thanks for these details, but they don’t quite answer my question: how
> does the compiler make the decision to require the prefix? Specifically,
> is it done purely by examining the types of the literals (in which case the
> existing story, about how method overloading decides which of several
> methods with the same name to call, is adequate), or are you imagining some
> additional ad-hoc mechanism that is somehow examining the syntax of method
> arguments (in which case some care will be needed to ensure that it
> interacts properly with the rest of the method overloading resolution
> mechanism)? I ask because, given your explanation below, I am not seeing
> how types alone can do the job—but maybe I am missing something.
>
> —Guy
>
> On Mar 14, 2024, at 6:15 PM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
>
> On 14/03/2024 22:05, Guy Steele wrote:
>
> Is your intent that a string interpolation literal would have a type other
> than String? If so, I agree that this is a third option—with the
> consequence that each API designer now needs to contemplate three-way
> overloading.
>
> If that is not your intent, then I am not seeing how the prefix helps—so
> please explain?
>
> Let's go back to the example I mentioned:
>
> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>
> There's a string with an embedded expression here. The compiler might
> require a prefix here (e.g. do you want a string, or a string template?).
> If no prefix is added (as in the above code) it might just be an error, and
> this won't compile.
>
> This means that if I do:
>
> String.format(INTERPOLATED"Hello, my name is %s\{name}");
>
>
> I will select String.format(String, Object...) - but I will do so
> deliberately - it's not just what happens "by default" (as was the case
> before).
>
> Or, if I want the template version, I do:
>
> String.format(TEMPLATE"Hello, my name is %s\{name}");
>
>
> Basically, requiring all literals that have embedded expression to have a
> prefix removes the problem of defaulting on the String side of the fence.
> Then, personally I'd also prefer if the default was actually on the
> StringTemplate side of the fence, so that the above was actually identical
> to this:
>
> String.format("Hello, my name is %s\{name}"); // ok, this is a template
>
>
> Note that these two prefixes might also come in handy when disambiguating
> a literal with no embedded expressions. Only, in that case the default
> would point the other way.
>
> To summarize:
>
> - template literal with arguments -> defaults to StringTemplate. User
> can ask interpolation explicitly, by adding a prefix
> - template literal w/o arguments -> defaults to String. User can ask a
> degenerate template explicitly, by adding a prefix
>
> This doesn't sound too bad, and it feels like it has the defaults pointing
> the right way?
>
> Maurizio
>
> Thanks,
> Guy
>
> On Mar 14, 2024, at 6:00 PM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
>
> On 14/03/2024 19:39, Guy Steele wrote:
>
> This is a very important example to consider. I observe, however, that
> there are at least two possible ways to avoid the unpleasant surprise:
>
> (1) Don't have string interpolation literals, because accidentally using a
> string interpolation literal instead of a string template literal can
> result in invoking the wrong overload of a method.
>
> (2) Don’t overload methods so as to accept either a string or a string
> template.
>
> I agree with your analysis, but note that there is also a third option:
>
> (3) make it so that both string interpolation literal and string template
> literal have a prefix.
>
> I believe that is enough to solve the issue (because the program I wrote
> would no longer compile: the compiler would require an explicit prefix).
>
> Maurizio
>
>
> If we were to take approach (2), then:
>
> (a) We would keep `println` as is, and not allow it to accept a template,
> but that’s okay—if you thought you wanted a template, what you really want
> is plain old string interpolation, and the type checking will make sure you
> don't use the wrong one.
>
> (b) A SQL processor would accept a template but not a string—if you
> thought you wanted string interpolation, what you really want is a
> template, and the type checking will make sure you don't use the wrong one.
>
> (c) I think `format` is a special case that we tend to get hung up on, and
> I think that, in this particular branch of the design space we are
> exploring, perhaps a name other than `String.format` should be chosen for
> the method that does string formatting on templates. Possible names are
> `StringTemplate.format` and `String.format$`, but I will leave further
> bikeshedding on this to others. I do recognize that this move will not
> enable the type system per se to absolutely prevent programmers from writing
>
> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>
> but, as Clement has observed, such cases will probably provoke a warning
> about a mismatch between the number of arguments and the number of
> %-specifiers that require parameters, so maybe overloading would be okay
> anyway for `String.format`.
>
> Anyway, my point is that whether to overload a method to accept either a
> string or a string template can be evaluated on a case-by-case basis
> according to a small number of principles that I think we could enumerate
> and explain pretty easily.
>
> —Guy
>
> On Mar 14, 2024, at 1:40 PM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
> Not to pour too much cold water on the idea of having string interpolation
> literal, but I’d like to mention a few points here.
>
> First, it was a deliberate design goal of the string template feature to
> make interpolation an explicit act. Note that, if we had the syntax you
> describe, we actually achieve the opposite effect: string interpolation is
> now the default, and implicit, and actually *cheaper* (to type) than the
> safer template alternative. This is a bit of a red herring, I think.
>
> The second problem is that interpolation literals can sometimes be
> deceiving. Consider this example:
>
> String.format("Hello, my name is %s\{name}"); // can you spot the bug?
>
> Where String::format has a new overload which accepts a StringTemplate.
>
> Basically, since here we forgot the leading “$” (or whatever char that
> is), the whole thing is just a big interpolation. Semantically equivalent
> to:
>
> String.format("Hello, my name is %s" + name); // whoops!
>
> This will fail, as String::format will be waiting for an argument (a
> string), but none is provided. So:
>
> | Exception java.util.MissingFormatArgumentException: Format specifier '%s'
> | at Formatter.format (Formatter.java:2672)
> | at Formatter.format (Formatter.java:2609)
> | at String.format (String.java:2897)
> | at (#2:1)
>
> This is a very odd (and new!) failure mode, that I’m sure is gonna
> surprise developers.
>
> Maurizio
>
> On 14/03/2024 15:08, Guy Steele wrote:
>
>
> Second thoughts about how to explain a string interpolation literal:
>
>
> On Mar 13, 2024, at 2:02 PM, Guy Steele <guy.steele at oracle.com> wrote:
> . . .
>
> —————————
> String is not a subtype of StringTemplate; they are disjoint types.
>
> $”foo” is a (trivial) string template literal
> “foo” is a string literal
> $”Hello, \{x}” is a (nontrivial) string template literal
> “Hello, \{x}” is a shorthand (expanded by the compiler) for `String.of($“Hello, \{x}”)`
> —————————
>
> Given that the intent is that String.of (or whatever we want to call it—possibly the `interpolation` instance method of class `StringTemplate` rather than a static method `String.of`) should just do standard string concatenation, we might be better off just saying that a string interpolation literal is expanded by the compiler into uses of “+”; for example,
>
> “Hello, \{x}.”
>
> (I have added a period to the example to make the point clearer) is expanded into
>
> “Hello, “ + x + “.”
>
> and in general
>
> “c0\{e1}c1\{e2}c2…\{en}cn”
>
> (where each ck is a possibly empty sequence of string characters and each ek is an expression) is expanded into
>
> “c0” + (e1) + “c1” + (e2) + “c2” + … + (en) + “cn”
>
> The point is that, with this definition, “c0\{e1}c1\{e2}c2…\{en}cn” is a constant expression iff every ek is a constant expression. This is handy for interpolating constant variables into a string that is itself intended to be constant.
>
> —Guy