Update on String Templates (JEP 459)
Clement Cherlin
ccherlin at gmail.com
Mon Mar 11 15:50:52 UTC 2024
On Mon, Mar 11, 2024 at 9:37 AM Remi Forax <forax at univ-mlv.fr> wrote:
>
> Hello,
>
> > Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only “one string literal”, which is something we have worked quite hard to achieve.
>
> I vote for making string templates explicit.
> Yes, it can be seen as complex at first because not everything is a String, but at the same time, I believe it makes the conversion rules far easier to understand.
I agree, and suggest `backquotes` (and ```triple backquotes``` for
Template Blocks) to denote template strings. They were already
considered for Raw String Literals, which implies they're a viable
option. Raw String Literals didn't get adopted, so the character
remains available for use.
>
> For me, we already have a lot of methods that take a String as a parameter, so seeing String as a StringTemplate is not really a solution because it means adding a lot of overloads with all of the incompatibilities you describe.
> I see the conversion of a StringTemplate to a String as a boxing conversion: if the current type is a StringTemplate and the target type is a String, the compiler will generate code that is equivalent to calling ".interpolate()" implicitly.
> It's not a real boxing conversion, because it's a one-way conversion, i.e. there is a boxing conversion from StringTemplate to String but no boxing conversion from String to StringTemplate. We can add it, but I do not think it's necessary given that a String s can always be converted to a StringTemplate using t"\{s}".
>
> One question is whether we prefer a call-site conversion like the boxing conversion described above or a declaration-site conversion, i.e. ask all developers who want to support StringTemplate to add a new overload.
> Apart from the fact that adding overloads in a lot of existing projects looks like a Sisyphean task, doing the conversion at the use site also has the advantage of allowing the compiler to generate an invokedynamic at the use site, so the boxing from a StringTemplate to a String will be as fast as string concatenation using '+' (see Duncan's email on amber-dev).
>
> regards,
> Rémi
>
> ________________________________
I do not think implicit conversion from either String to String
Template or String Template to String is wise, given the complications
with overload resolution and potential for surprising undesired
effects cited below. When a method doesn't accept a String Template,
you simply process it to String first (or another type, remember that
a StringTemplate doesn't have to be converted to a String!). This is
no different than what you would do with a StringBuilder or other
String-like-but-not-actually-String object.
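As a rough sketch of what that explicit processing looks like at a call
site: Template below is a hypothetical stand-in type with an
interpolate() method, not the JEP 459 API, and log is an invented
helper that only accepts String.

```java
import java.util.List;

class ExplicitProcessingDemo {
    // Hypothetical stand-in for a template: literal fragments plus embedded values.
    record Template(List<String> fragments, List<Object> values) {
        // Joins fragments and values in order, like a plain interpolation would.
        String interpolate() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < fragments.size(); i++) {
                sb.append(fragments.get(i));
                if (i < values.size()) {
                    sb.append(values.get(i));
                }
            }
            return sb.toString();
        }
    }

    // An existing API that only accepts String (invented for this example).
    static String log(String message) {
        return "[log] " + message;
    }

    public static void main(String[] args) {
        Template t = new Template(List.of("Hello, ", ""), List.of("world"));
        // The caller converts explicitly; no implicit conversion is involved.
        System.out.println(log(t.interpolate()));

        // Same pattern as with any other String-like object.
        StringBuilder sb = new StringBuilder("Hello, ").append("world");
        System.out.println(log(sb.toString()));
    }
}
```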
Cheers,
Clement
> From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>, "Guy Steele" <guy.steele at oracle.com>
> Cc: "Tagir Valeev" <amaembo at gmail.com>, "amber-spec-experts" <amber-spec-experts at openjdk.org>
> Sent: Monday, March 11, 2024 1:15:51 PM
> Subject: Re: Update on String Templates (JEP 459)
>
> Hi all,
> we mainly tried three approaches to allow smoother interop between strings and string templates: (a) make String a subclass of StringTemplate, (b) make constant strings convertible to string templates, or (c) use target-typing. All these approaches have some issues, discussed below.
>
> The first approach is slightly simpler, because it can be achieved entirely outside of the Java language. Unfortunately, adding “String implements StringTemplate” adds overload ambiguities in cases such as this:
>
> format(StringTemplate) // 1
> format(String, Object...) // 2
>
> This is actually a very important case, as we predict that StringTemplate will serve as a great replacement for methods out there accepting a string/Object… pack.
>
> Unfortunately, if String <: StringTemplate, this means that calling format with a string literal will resolve to (1), not (2) as before. The problem here is that (2) is not even applicable during the first two overload resolution phases (which are only allowed to use subtyping and conversions, respectively), as it is a varargs method. Because of this, (1) will now take precedence, as it is not varargs. While for String::format this is probably harmless, changing the results of overload selection is something that should be done with care (esp. if different overloads have different return types), as it could lead to source compatibility issues.
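The effect described here can be reproduced in today's Java with a
stand-in type. The sketch below assumes CharSequence plays the role of
StringTemplate (String already implements it); the class and method
names are invented for illustration.

```java
// Stand-in demo: CharSequence takes the place of StringTemplate, because
// String is already a subtype of CharSequence in today's Java.
class OverloadDemo {
    // (1) analogous to format(StringTemplate)
    static String format(CharSequence template) {
        return "(1) single-argument overload";
    }

    // (2) the classic format(String, Object...)
    static String format(String pattern, Object... args) {
        return "(2) varargs overload";
    }

    public static void main(String[] args) {
        // The first two resolution phases do not use variable-arity
        // invocation, so (1) is applicable by subtyping and is selected.
        System.out.println(format("hello"));

        // With an extra argument, only (2) is applicable (in the third phase).
        System.out.println(format("hello %s", "world"));
    }
}
```

If overload (1) is deleted, the first call compiles against (2)
instead, which is the silent change in overload selection at issue.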
>
> On top of these issues, making all strings be string templates has the disadvantage of also treating “messy” strings obtained via concatenation of non-constant values as string templates, which seems bad.
>
> To overcome these issues, we attempted to add an implicit conversion from constant strings to StringTemplate. As was observed by Guy, in case of ambiguities, the non-converting variants (e.g. m(String)) would be preferred. That said, in the above example (with varargs) we would still get a potentially incompatible change, as a string literal would be applicable in (1) before (2) is even considered, so the same concerns surrounding overload resolution changes would remain.
>
> Another thing that came up is that conversions automatically bring in casting conversions. E.g. if you can go from A to B using assignment conversion, you can typically go the same direction using casting conversion. This raises two issues. The first is that casting conversion is generally a symmetric type relationship (e.g. if you can cast from A to B, then you can cast from B to A), while here we’re mostly discussing one direction. But this is, perhaps, not a big deal: after all, “constant strings” don’t have a denotable type, so perhaps it should come as no surprise that you can’t use them as a target type for a cast.
>
> The second “issue” is that casting conversion brings about patterns, as that’s how pattern applicability is defined. For instance:
>
> switch("Hello") {
> case StringTemplate st ...
> }
>
> To make this work we would need at least to tweak exhaustiveness (otherwise javac would think the above switch is not exhaustive, and ask you to add a default). Secondly, some tweaks to the runtime tests would also be required. Not impossible, but it would require some more work to make sure we’re ok with this direction.
>
> Another issue with the conversion is that it would expose a sharp edge in the current overload resolution and inference machinery. For instance, this program doesn’t compile:
>
> List<Integer> li = List.of(1, 1L);
>
> Similarly, this program would also not compile:
>
> List<StringTemplate> li = List.of("Hello", "Hello \{world}");
>
> The last possibility would be to say that a string literal is a poly expression. As such, a string literal can be typed to either String or StringTemplate depending on the target type (for instance, this is close to how int literals also work).
>
> This approach would still suffer from the same incompatible overload changes with varargs methods as the other approaches. But, by not adding a conversion, it makes things a little easier: for instance, in the case of pattern matching, nothing needs to be done, as the string literal will be turned into a string template before the switch even takes place (meaning that existing exhaustiveness and runtime checks would still work). But there are still dragons and irregularities when it comes to inference. For instance:
>
> List<StringTemplate> lst = List.of("hello", "world");
>
> This would not type-check: we need a target-type to know which way the literal is going (List::of just accepts a type-variable X). Note that overload resolution happens at a time where the target-type is not known, so here we’d probably pick X = String, which will then fail to type-check against the target.
>
> Another issue with target-typing is that if you have two overloads:
>
> m(String)
> m(StringTemplate)
>
> And you call this with a string literal, you get an ambiguity: you can go both ways, but String and StringTemplate are unrelated types, so we can’t pick one as “most specific”. This issue could be addressed, in principle, by adding an ad-hoc most specific rule that, in case of an ambiguity, always gave precedence to String over StringTemplate. We do a similar trick for lambda expressions, where if two methods accept similar-looking functional interfaces, we give precedence to the non-boxing one.
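For reference, the existing lambda tie-break mentioned above can be
seen in a small sketch. The class and method names are invented, and
the lambda is explicitly typed because, as far as I understand the
rule, that is the case it covers.

```java
import java.util.function.Function;
import java.util.function.ToIntFunction;

class MostSpecificDemo {
    // Boxing variant: the function type returns Integer.
    static String m(Function<String, Integer> f) {
        return "boxing: Function<String, Integer>";
    }

    // Non-boxing variant: the function type returns int.
    static String m(ToIntFunction<String> f) {
        return "non-boxing: ToIntFunction<String>";
    }

    public static void main(String[] args) {
        // Both overloads are applicable. The lambda body s.length() has the
        // primitive type int, so the overload whose function type returns a
        // primitive is treated as more specific and the call is not ambiguous.
        System.out.println(m((String s) -> s.length()));
    }
}
```

An ad-hoc rule preferring String over StringTemplate would be a
tie-break in the same spirit.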
>
> Anyway, the general message here is that it’s a bit of a “pick your poison” situation. Adding a more fluid relationship between strings and templates is definitely possible, but there are risks that this will negatively impact other areas of the language, risks that would need to be assessed very carefully.
>
> Another, simpler, option we considered was to use some kind of prefix to mark a string template literal (e.g. make that explicit, instead of resorting to language wizardry). That works, but has the disadvantage of breaking the spell that there is only “one string literal”, which is something we have worked quite hard to achieve.
>
> Cheers
> Maurizio
>
> On 09/03/2024 23:52, Brian Goetz wrote:
>
> I’ll let Maurizio give the details, because I’m sure I will have forgotten one or two.
>
>