Update on String Templates (JEP 459)

Remi Forax forax at univ-mlv.fr
Mon Mar 11 13:01:29 UTC 2024


Hello, 

> Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (i.e. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only “one string literal”, which is something we have worked quite hard to achieve. 

I vote for making string templates explicit. 
Yes, it can look complex at first because not everything is a String, but at the same time, I believe it makes the conversion rules far easier to understand. 

For me, we already have a lot of methods that take a String as parameter, so seeing String as a StringTemplate is not really a solution because it means adding a lot of overloads with all of the incompatibilities you describe. 
I see the conversion of a StringTemplate to a String as a boxing conversion: if the current type is a StringTemplate and the target type is a String, the compiler generates code equivalent to an implicit call to ".interpolate()". 
It's not a real boxing conversion, because it is a one-way conversion, i.e. there is a conversion from StringTemplate to String but none from String to StringTemplate. We could add one, but I do not think it's necessary, given that a String s can always be converted to a StringTemplate using t"\{s}". 
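
To make the one-way conversion concrete, here is a minimal sketch. MiniTemplate is a hypothetical stand-in written for this example, not the JEP 459 StringTemplate API, and the "compiler-inserted" call is written out by hand because the proposed conversion does not exist in the language today.

```java
import java.util.List;

// Hypothetical stand-in for StringTemplate (an assumption, not the real API).
record MiniTemplate(List<String> fragments, List<Object> values) {

    // Joins fragments and values in order: f0 v0 f1 v1 ... fn.
    String interpolate() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            sb.append(fragments.get(i)).append(values.get(i));
        }
        return sb.append(fragments.get(values.size())).toString();
    }

    // Models t"\{s}": two empty fragments around the single value s.
    static MiniTemplate wrap(String s) {
        return new MiniTemplate(List.of("", ""), List.of(s));
    }
}

class BoxingSketch {
    // An existing API that only knows about String.
    static String takesString(String s) {
        return s;
    }

    static String demo() {
        MiniTemplate t = new MiniTemplate(List.of("x = ", ""), List.of(42));
        // Under the proposal one would write takesString(t);
        // the compiler would insert the interpolate() call below.
        return takesString(t.interpolate());
    }

    public static void main(String[] args) {
        System.out.println(demo());                            // x = 42
        System.out.println(MiniTemplate.wrap("hi").interpolate()); // hi
    }
}
```

The reverse direction stays explicit in this model: a String only becomes a template through wrap, which mirrors writing t"\{s}".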

One question is whether we prefer a call-site conversion like the boxing conversion described above or a declaration-site conversion, i.e. asking every developer who wants to support StringTemplate to add a new overload. 
Apart from the fact that adding overloads in a lot of existing projects looks like a Sisyphean task, doing the conversion at use site also has the advantage of letting the compiler generate an invokedynamic at use site, so the boxing from a StringTemplate to a String will be as fast as string concatenation using '+' (see Duncan's email on amber-dev). 
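
As a rough illustration of that mechanism, the sketch below calls by hand the bootstrap method that javac already targets for '+' concatenation, java.lang.invoke.StringConcatFactory.makeConcatWithConstants. Whether a StringTemplate-to-String conversion would reuse this exact bootstrap is an assumption; the names IndyConcatSketch and greet are made up for the example.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class IndyConcatSketch {
    // Builds "Hello <name>, you are <age> years old" through the same
    // bootstrap that an invokedynamic for '+' would be linked with.
    static String greet(String name, int age) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "concat", // arbitrary name, not used for lookup
                    MethodType.methodType(String.class, String.class, int.class),
                    // Recipe: each U+0001 tag marks where a dynamic argument goes.
                    "Hello \u0001, you are \u0001 years old");
            return (String) site.dynamicInvoker().invokeExact(name, age);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet("Bob", 42)); // Hello Bob, you are 42 years old
    }
}
```

A real call site is linked once and then reused; this sketch relinks on every call only to stay short.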

regards, 
Rémi 

> From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>, "Guy Steele" <guy.steele at oracle.com>
> Cc: "Tagir Valeev" <amaembo at gmail.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.org>
> Sent: Monday, March 11, 2024 1:15:51 PM
> Subject: Re: Update on String Templates (JEP 459)

> Hi all,
> we tried mainly three approaches to allow smoother interop between strings and
> string templates: (a) make String a subtype of StringTemplate, (b) make
> constant strings be convertible to string templates, or (c) use target-typing.
> All these approaches have some issues, discussed below.

> The first approach is slightly simpler, because it can be achieved entirely
> outside of the Java language. Unfortunately, adding “String implements
> StringTemplate” adds overload ambiguities in cases such as this:
> format(StringTemplate) // 1
> format(String, Object...) // 2

> This is actually a very important case, as we predict that StringTemplate will
> serve as a great replacement for methods out there accepting a string/Object…
> pack.

> Unfortunately, if String <: StringTemplate, this means that calling format with a
> string literal will resolve to (1), not (2) as before. The problem here is that
> (2) is not even applicable during the first two overload resolution phases (which
> are only allowed to use subtyping and conversions, respectively), as it is a
> varargs method. Because of this, (1) will now take precedence, as that’s
> not varargs. While for String::format this is probably harmless, changing
> results of overload selection is something that should be done with care (esp.
> if different overloads have different return types), as it could lead to source
> compatibility issues.
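
The overload shift described above can be reproduced today with an existing supertype of String. In this sketch CharSequence plays the role of StringTemplate (an assumption made purely for illustration, since String is already a subtype of CharSequence); the class name OverloadShiftDemo is made up.

```java
class OverloadShiftDemo {
    // (1) stands in for format(StringTemplate)
    static String format(CharSequence template) {
        return "(1)";
    }

    // (2) the existing varargs overload
    static String format(String fmt, Object... args) {
        return "(2)";
    }

    static String oneArg() {
        // (1) is applicable by subtyping in the first phase; (2) is only
        // applicable in the later variable-arity phase, so (1) is chosen.
        return format("hello");
    }

    static String twoArgs() {
        // Only (2) can accept two arguments.
        return format("hello %s", "world");
    }

    public static void main(String[] args) {
        System.out.println(oneArg());  // (1)
        System.out.println(twoArgs()); // (2)
    }
}
```

Removing overload (1) makes oneArg() resolve to (2), which is the "before" behavior the message refers to.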

> On top of these issues, making all strings be string templates has the
> disadvantage that “messy” strings obtained via concatenation of non-constant
> values would be considered string templates too, which seems bad.

> To overcome these issues, we attempted to add an implicit conversion from
> constant strings to StringTemplate. As it was observed by Guy, in case of
> ambiguities, the non-converting variants (e.g. m(String)) would be preferred.
> That said, in the above example (with varargs) we would still get a potentially
> incompatible change - as a string literal would be applicable in (1) before (2)
> is even considered, so the same concerns surrounding overload resolution
> changes would remain.

> Another thing that came up is that conversions automatically bring in casting
> conversions. E.g. if you can go from A to B using assignment conversion, you
> can typically go the same direction using casting conversion. This raises two
> issues. The first is that casting conversion is generally a symmetric type
> relationship (e.g. if you can cast from A to B, then you can cast from B to A),
> while here we’re mostly discussing one direction. But this is, perhaps,
> not a big deal - after all, “constant strings” don’t have a denotable type, so
> perhaps it should come as no surprise that you can’t use them as a target type
> for a cast.

> The second “issue” is that casting conversion brings about patterns, as that’s
> how pattern applicability is defined. For instance:
> switch("Hello") {
>     case StringTemplate st ...
> }

> To make this work we would need at least to tweak exhaustiveness (otherwise
> javac would think the above switch is not exhaustive, and ask you to add a
> default). Secondly, some tweaks to the runtime tests would be required also.
> Not impossible, but would require some more work to make sure we’re ok with
> this direction.

> Another issue with the conversion is that it would expose a sharp edge in the
> current overload resolution and inference machinery. For instance, this program
> doesn’t compile correctly:
> List<Integer> li = List.of(1, 1L);

> Similarly, this program would also not compile correctly:
> List<StringTemplate> li = List.of("Hello", "Hello \{world}");
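
For the numeric case, the sharp edge can be observed with today's compiler. The sketch below only contains the variant that compiles and keeps a rejected variant as a comment; List<Long> is used in that comment as an assumed illustration of a conversion that assignment allows but inference does not. The class name InferenceEdgeDemo is made up.

```java
import java.util.List;

class InferenceEdgeDemo {
    static List<Number> mixed() {
        // Compiles: each argument is boxed on its own (1 to Integer, 1L to
        // Long) and the type variable of List.of is inferred as Number.
        return List.of(1, 1L);
    }

    public static void main(String[] args) {
        long widened = 1; // fine: assignment context allows int -> long

        // Rejected by javac: 1 is boxed to Integer before the element type
        // is inferred, and Integer is not a subtype of Long.
        // List<Long> bad = List.of(1, 1L);

        List<Number> ok = mixed();
        System.out.println(ok.get(0).getClass().getSimpleName()); // Integer
        System.out.println(ok.get(1).getClass().getSimpleName()); // Long
        System.out.println(widened);                              // 1
    }
}
```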

> The last possibility would be to say that a string literal is a poly expression.
> As such, a string literal can be typed to either String or StringTemplate
> depending on the target type (for instance, this is close to how int literals
> also work).

> This approach would still suffer from the same incompatible overload changes
> with varargs methods as the other approaches. But, by not adding a
> conversion, it makes things a little easier: for instance, in the case of
> pattern matching, nothing needs to be done, as the string literal will be
> turned into a string template before the switch even takes place (meaning that
> existing exhaustiveness and runtime checks would still work). But, there are
> still dragons and irregularities when it comes to inference - for instance:
> List<StringTemplate> lst = List.of("hello", "world");

> This would not type-check: we need a target-type to know which way the literal
> is going (List::of just accepts a type-variable X). Note that overload
> resolution happens at a time where the target-type is not known, so here we’d
> probably pick X = String, which will then fail to type-check against the
> target.

> Another issue with target-typing is that if you have two overloads:
> m(String)
> m(StringTemplate)

> And you call this with a string literal, you get an ambiguity: you can go both
> ways, but String and StringTemplate are unrelated types, so we can’t pick one
> as “most specific”. This issue could be addressed, in principle, by adding an
> ad-hoc most specific rule that, in case of an ambiguity, always gave precedence
> to String over StringTemplate. We do a similar trick for lambda expressions,
> where if two methods accept similar-looking functional interfaces, we give
> precedence to the non-boxing one.
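
The lambda precedent mentioned above can be seen in a small sketch. The two overloads take unrelated functional interfaces, yet the call is not ambiguous because the most-specific rule prefers the one whose return type does not require boxing the lambda's result. The class and method names are made up for the example.

```java
import java.util.function.IntSupplier;
import java.util.function.Supplier;

class MostSpecificDemo {
    static String m(Supplier<Integer> s) {
        return "m(Supplier<Integer>)";
    }

    static String m(IntSupplier s) {
        return "m(IntSupplier)";
    }

    static String pick() {
        // Both overloads are applicable. The lambda's result expression 42
        // is a primitive int, so the overload returning int is preferred.
        return m(() -> 42);
    }

    public static void main(String[] args) {
        System.out.println(pick()); // m(IntSupplier)
    }
}
```

An ad-hoc rule preferring String over StringTemplate for a literal argument would be a tie-breaker of the same flavor.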

> Anyway, the general message here is that it’s a bit of a “pick your poison”
> situation. Adding a more fluid relationship between string and templates is
> definitely possible, but there are risks that this will impact negatively
> other areas of the language, risks that would need to be assessed very
> carefully.

> Another, simpler, option we considered was to use some kind of prefix to mark a
> string template literal (i.e. make that explicit, instead of resorting to
> language wizardry). That works, but has the disadvantage of breaking the spell
> that there is only “one string literal”, which is something we have worked
> quite hard to achieve.

> Cheers
> Maurizio

> On 09/03/2024 23:52, Brian Goetz wrote:

>> I’ll let Maurizio give the details, because I’m sure I will have forgotten one
>> or two.
