<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
I think this has been a good discussion, and it looks like we're
starting to see some convergence. <br>
<br>
I think we keep trying to exploit ambiguity / implicitness, and it
doesn't go well:<br>
<br>
- Many users want STR to be the "implicit processor", but that
isn't good for security<br>
- We tried reusing the String delimiters for string templates to
reduce the perception of how many different things there are here,
but that creates cognitive load (can't tell strings from templates
without parsing the entire contents), among other problems<br>
- We tried making String a poly expression (and other tricks) to
reduce the number of explicit conversions, but that created problems
too<br>
<br>
John's characterization captures the feeling and eventual conclusion
that I think many of us share: <br>
<br>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">I kind of like Guy’s offensive-to-everyone suggestion that $ is required to make a true ST. </pre>
</blockquote>
<br>
Indeed, my first reaction to the $ sigil was "please no", but I am
grudgingly coming to the conclusion that we should stop trying to
implicitly "just figure out what the user wants" and acknowledge the
reality: templates are not strings, strings are not templates, and
they can be converted to each other with ... methods, just like any
other relatable types. So string literals are as they always were;
string templates are a new thing, whose syntax and type are disjoint
from that of strings, as Guy also seems to be converging on:<br>
<br>
<blockquote type="cite">
<div>And now that I have that better understanding, I think I lean
toward (a) abandoning string interpolation and (b) having a
single, short, _non-optional_ prefix for templates (“$” would be
a plausible choice), on the grounds that I think it makes code
more readable if templates are always distinguished up front
from strings—and this is especially helpful when the templates
are rather long and any `\{` present might be far from the
beginning. It has a minimal number of cases to explain:</div>
<div><br>
</div>
<div><span class="Apple-tab-span" style="white-space:pre"></span>“…”
string literal, must not contain \{…}, type String</div>
<div><span class="Apple-tab-span" style="white-space:pre"></span>$”…”
template literal, may contain \{…}, type StringTemplate</div>
</blockquote>
<br>
(concrete syntax TBB (to be bikeshod), along with the spellings of S
-> ST and ST -> S.) <br>
<br>
Some more useful observations: <br>
<br>
- The toString behavior cannot be mere interpolation. Besides the
principled objections and inevitable propping-open-the-security-door
that this would lead to, people will quickly learn to abuse "" + ST
as the "fewest characters required" way to get interpolation, which
is "clever" in the same way that John's "empty \{}" trick is clever,
but not good for clarity. <br>
- We need a story to tell for how to write good overloads, which
seems to be more subtle than initially thought.<br>
- If the only way to make a StringTemplate is the literal syntax,
then STs gain a valuable security property: all fragments in the ST
are strings that appeared literally in code, and therefore
untainted. This is probably too restrictive but we should be aware
of what we are giving up as we explore the API options.<br>
- Processors should be encouraged to "flatten" embedded STs.<br>
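<br>
To make the "flatten" point concrete, here is a minimal sketch, not a proposed API. It assumes a hypothetical Template class (a stand-in for StringTemplate) that holds n + 1 literal fragments around n embedded values; the names Template, literal, concat, and flatten are illustrative only. Flattening splices an embedded template's fragments and values into its parent, so that a processor only ever sees non-template values:<br>
<pre>
```java
import java.util.Arrays;

/** Hypothetical stand-in for StringTemplate: n + 1 literal fragments around n values. */
final class Template {
    final String[] fragments;
    final Object[] values;

    Template(String[] fragments, Object[] values) {
        if (fragments.length != values.length + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        this.fragments = fragments.clone();
        this.values = values.clone();
    }

    /** A template with a single literal fragment and no values. */
    static Template literal(String text) {
        return new Template(new String[] { text }, new Object[0]);
    }

    /** Joins two templates; the last fragment of a merges with the first fragment of b. */
    static Template concat(Template a, Template b) {
        int af = a.fragments.length;
        String[] frags = Arrays.copyOf(a.fragments, af + b.fragments.length - 1);
        frags[af - 1] = a.fragments[af - 1] + b.fragments[0];
        System.arraycopy(b.fragments, 1, frags, af, b.fragments.length - 1);
        Object[] vals = Arrays.copyOf(a.values, a.values.length + b.values.length);
        System.arraycopy(b.values, 0, vals, a.values.length, b.values.length);
        return new Template(frags, vals);
    }

    /** Returns an equivalent template in which no value is itself a Template. */
    Template flatten() {
        Template acc = literal(fragments[0]);
        for (int i = 0; i != values.length; i++) {
            Template piece = (values[i] instanceof Template)
                    ? ((Template) values[i]).flatten()
                    : new Template(new String[] { "", "" }, new Object[] { values[i] });
            acc = concat(concat(acc, piece), literal(fragments[i + 1]));
        }
        return acc;
    }

    public static void main(String[] args) {
        // Models an outer template "a={1}, {inner}!" whose second value is the template "b={2}".
        Template inner = new Template(new String[] { "b=", "" }, new Object[] { 2 });
        Template outer = new Template(new String[] { "a=", ", ", "!" }, new Object[] { 1, inner });
        Template flat = outer.flatten();
        System.out.println(Arrays.toString(flat.fragments) + " " + Arrays.toString(flat.values));
    }
}
```
</pre>
After flattening, the literal text of the inner template has become part of the outer template's fragments, and the values of both sit side by side in one array.<br>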
<br>
A few people have implied that only the tainted parts of an ST (the
embedded expressions) need special processing, but I'll point out
that the untainted parts may often require domain-specific
validation. For example, an ST representing a SQL query wants
balanced quotes, and might want to require quotes around embedded
expressions. <br>
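<br>
As a rough illustration of validating the untainted parts, the sketch below checks only the literal fragments of a template. It is an assumption-laden toy, not a real SQL validator: the class SqlFragmentCheck and its methods are hypothetical, a template is modelled as a plain array of fragments, "balanced" is approximated by an even count of single quotes, and "quoted" means every embedded expression sits directly between two single quotes:<br>
<pre>
```java
/** Hypothetical checks over the literal fragments of a SQL-like template. */
final class SqlFragmentCheck {

    /** Counts the single-quote characters in s. */
    static int countQuotes(String s) {
        int n = 0;
        for (int i = 0; i != s.length(); i++) {
            if (s.charAt(i) == '\'') {
                n++;
            }
        }
        return n;
    }

    /** Crude balance check: the fragments together contain an even number of quotes. */
    static boolean quotesBalanced(String[] fragments) {
        int total = 0;
        for (String fragment : fragments) {
            total += countQuotes(fragment);
        }
        return total % 2 == 0;
    }

    /** True if each hole between two fragments is wrapped in quotes on both sides. */
    static boolean holesQuoted(String[] fragments) {
        if (fragments.length == 0) {
            return true;
        }
        for (int i = 1; i != fragments.length; i++) {
            if (!fragments[i - 1].endsWith("'") || !fragments[i].startsWith("'")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Fragments of a template with one embedded expression between the quotes.
        String[] quoted = { "SELECT * FROM t WHERE name = '", "'" };
        // Same query shape, but the embedded expression is not quoted.
        String[] bare = { "SELECT * FROM t WHERE id = ", "" };
        System.out.println(quotesBalanced(quoted) + " " + holesQuoted(quoted));
        System.out.println(quotesBalanced(bare) + " " + holesQuoted(bare));
    }
}
```
</pre>
Neither check looks at the embedded values at all; both are policies about the text the programmer wrote.<br>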
<br>
<br>
<br>
<div class="moz-cite-prefix">On 3/8/2024 1:35 PM, Brian Goetz wrote:<br>
</div>
<blockquote type="cite" cite="mid:8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">Time to check in with where we are with String
Templates. We’ve gone through two rounds of preview, and have
received some feedback. </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000">As a reminder, the primary goal of
gathering feedback is to learn things about the design or
implementation that we don’t already know. This could be bug
reports, experience reports, code review, careful analysis,
novel alternatives, etc. And the best feedback usually comes
from using the feature “in anger” — trying to actually write
code with it. </font><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">(“Some
people would prefer a different syntax” or “some people would
prefer we focused on string interpolation only” fall squarely in
the “things we already knew” camp.) </span>
<div class=""><font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class=""><br class="">
</span></font>
<div class="">
<div class=""><font class="" color="#000000">In the course
of using this feature in the `jextract` project, we did
learn quite a few things we didn’t already know, and this
was conclusive enough that it has motivated us to adjust
our approach in this feature. Specifically, the role of
processors is “outsized” to the value they offer, and,
after further exploration, we now believe it is possible
to achieve the goals of the feature without an explicit
“processor” abstraction at all! This is a very positive
development. </font><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">First, I want to affirm that the goals of
the project have not changed. From JEP 459:</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">Goals</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• Simplify the writing of Java programs by making
it easy to express strings that include values computed at
run time.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• Enhance the readability of expressions that mix
text and expressions, whether the text fits on a single
source line (as with string literals) or spans several
source lines (as with text blocks).</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• Improve the security of Java programs that
compose strings from user-provided values and pass them to
other systems (e.g., building queries for databases) by
supporting validation and transformation of both the
template and the values of its embedded expressions.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• Retain flexibility by allowing Java libraries
to define the formatting syntax used in string templates.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• Simplify the use of APIs that accept strings
written in non-Java languages (e.g., SQL, XML, and JSON).</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• Enable the creation of non-string values
computed from literal text and embedded expressions
without having to transit through an intermediate string
representation.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">Non-Goals</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• It is not a goal to introduce syntactic sugar
for Java's string concatenation operator (+), since that
would circumvent the goal of validation.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span class="Apple-tab-span" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); white-space: pre;"></span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">• It is not a goal to deprecate or remove the
StringBuilder and StringBuffer classes, which have
traditionally been used for complex or programmatic string
composition.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">Another thing that has not changed is our view on
the syntax for embedding expressions. While many people
did express the opinion of “why not ‘just’ do what
Kotlin/Scala does”, this issue was more than fully
explored during the initial design round. (In fact, while
syntax disagreements are often purely subjective, this one
was far more clear — the $-syntax is objectively worse,
and would be doubly so if injected into an existing
language where there were already string literals in the
wild. This has all been more than adequately covered
elsewhere, so I won’t rehash it here.)</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">Now, let’s talk about what we do think should
change: the role of processors and the StringTemplate
type. </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000">Processors were envisioned as
a means to abstract the transformation of templates to
their final form (whether string, or something else.)
However, Java already has a well established means of
abstracting behavior: methods. (In fact, a processor
application can be viewed as merely a new syntax for a
method call.) Our experience using the feature
highlighted the question: When converting a SQL query
expressed as a template to the form required by the
database (such as PreparedStatement), why do we need to
say:</font><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class=""> DB.”… template …”</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">When we could use an ordinary Java library:</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class=""> Query q = Query.of(“…template…”)</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000">Indeed, one of the worst
things about having processors in the language is that API
designers are put in the difficult situation of not
knowing whether to write a processor or an ordinary API,
and often have to make that choice before the consequences
are fully understood. (To add to this, processors raise
similar questions at the use site.) But the real criticism
here is that template capture and processing are
complected, when they should be separate, composable
features. </font><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">This motivated us to revisit some of the reasons
why processors were so central to the initial design in
the first place. And it turned out, this choice had been
influenced — perhaps overly so — by early implementation
experiments. (One of the background design goals was to
enable expensive operations like `String::format` to be
(much) cheaper. Without digressing too deeply on
performance, String::format can be more than an order of
magnitude worse than the equivalent concatenation
operation, and this in turn sometimes motivates developers
to use worse idioms for formatting. The FMT processor
brought that cost back in line with the equivalent
concatenation.) These early experiments biased the design
towards needing to know the processor at the point of
template capture, but upon reexamination we realized that
there are other ways to achieve the desired performance
goals without requiring processors to be known at capture
time. This, in turn, enabled us to revisit a point in the
design space we had transited through earlier, where
string templates were “just a new kind of literal” and the
job performed by processors could instead be performed by
ordinary APIs.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">At this point, a simpler design and
implementation emerged that met the semantic, correctness,
and performance goals: template literals (“Hello \{name}”)
are simply the literal form of StringTemplate:</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class=""> StringTemplate
st = “Hello \{name}”;</span></font><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class="">String and
StringTemplate remain unrelated types. (We explored a
number of ways to interconvert them, but they caused
more trouble than they solved.) Processing of string
templates, including interpolation, is done by ordinary
APIs that deal in StringTemplate, aided by some clever
implementation tricks to ensure good performance. </span></font></div>
<div class=""><font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class=""><br class="">
</span></font></div>
<div class=""><font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class="">For APIs
where interpolation is known to be safe in the domain,
such as PrintWriter, APIs can make that choice on behalf
of the domain, by providing overloads to embody this
design choice: </span></font><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class=""> void println(String) { … }</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000"> void
println(StringTemplate) { … interpolate and delegate to
println(String) …. }</font><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">The upshot is that for interpolation-safe APIs
like println, we can use a template directly without
giving up any safety:</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class=""> System.out.println(“Hello \{name}”);</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br class="">
In this example, the string template evaluates to
StringTemplate, not String (no implicit interpolation), and
chooses the StringTemplate overload of println, which in
turn chooses how to process the template. <span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">This stays true to the design principle that
interpolation is dangerous enough that it should be an
explicit choice in the code — but it allows that choice to
be made by libraries when the library is comfortable doing
so. </span></div>
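<div class=""><br class="">
As a rough illustration of the overload pattern described above, here is a minimal sketch. It assumes a hypothetical Template class standing in for StringTemplate and a hypothetical PrinterSketch class standing in for an interpolation-safe API; none of these names come from the JDK. The template argument selects the Template overload, which interpolates and then delegates to the String overload:<br class="">
<pre>
```java
/** Sketch of an API that accepts both strings and (hypothetical) templates. */
final class PrinterSketch {

    /** Hypothetical stand-in for StringTemplate: n + 1 fragments around n values. */
    static final class Template {
        final String[] fragments;
        final Object[] values;

        Template(String[] fragments, Object[] values) {
            if (fragments.length != values.length + 1) {
                throw new IllegalArgumentException("need exactly one more fragment than values");
            }
            this.fragments = fragments.clone();
            this.values = values.clone();
        }

        /** Plain interpolation: fragments and values concatenated in order. */
        String interpolate() {
            StringBuilder sb = new StringBuilder(fragments[0]);
            for (int i = 0; i != values.length; i++) {
                sb.append(values[i]).append(fragments[i + 1]);
            }
            return sb.toString();
        }
    }

    /** Collects output so the effect of each overload can be inspected. */
    final StringBuilder sink = new StringBuilder();

    void println(String s) {
        sink.append(s).append('\n');
    }

    /** The API, not the caller, decides that interpolation is safe in this domain. */
    void println(Template t) {
        println(t.interpolate());
    }

    public static void main(String[] args) {
        PrinterSketch out = new PrinterSketch();
        String name = "Duke";
        out.println("Hello, world"); // resolves to the String overload
        out.println(new Template(new String[] { "Hello ", "" }, new Object[] { name })); // Template overload
        System.out.print(out.sink);
    }
}
```
</pre>
Because String and Template are unrelated types in this sketch, overload selection is unambiguous; no conversion between the two is ever attempted implicitly.</div>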
<div class=""><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">Similarly, the FMT processor is replaced by an
overload of String::format that interprets templates with
embedded format specifiers (e.g., “%d”):</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class=""> String format(String formatString, Object…
parameters) { … same as today … }</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class=""> String format(StringTemplate template) {...
equivalent of FMT ...}</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">And users can call this as:</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class=""> String s = String.format(“Hello %12s\{name}”);</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class="">Here, the
String::format API has chosen to interpret string
templates according to the rules previously specified in
the FMT processor (not ordinary interpolation), but that
choice is embedded in the library semantics so no
further explicit choice at the use site is required.
The user already chose to pass it to String::format;
that’s all the processing selection that is needed. </span></font><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
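<br class="">
A minimal sketch of how such a format-style method might interpret a template follows. It is an assumption, not the FMT rules: the FormatSketch class is hypothetical, the template is passed as separate fragment and value arrays, and a fragment that contains '%' is assumed to end with a java.util.Formatter specifier that applies to the value after it (so a literal percent sign in a fragment is not supported here):<br class="">
<pre>
```java
import java.util.Locale;

/** Sketch of a format-style method over a (hypothetical) template shape. */
final class FormatSketch {

    /**
     * Formats a template given as n + 1 fragments around n values.
     * Simplifying assumption: if a fragment contains '%', everything from its
     * last '%' to its end is a Formatter specifier for the value that follows.
     */
    static String format(String[] fragments, Object[] values) {
        if (fragments.length != values.length + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i != values.length; i++) {
            String fragment = fragments[i];
            int spec = fragment.lastIndexOf('%');
            if (spec == -1) {
                // No specifier: fall back to plain interpolation of this value.
                sb.append(fragment).append(values[i]);
            } else {
                // Literal text before the specifier, then the formatted value.
                sb.append(fragment, 0, spec);
                sb.append(String.format(Locale.ROOT, fragment.substring(spec), values[i]));
            }
        }
        return sb.append(fragments[values.length]).toString();
    }

    public static void main(String[] args) {
        // Models the template "Hello %12s{name}" with name = "Duke".
        System.out.println(format(new String[] { "Hello %12s", "" }, new Object[] { "Duke" }));
    }
}
```
</pre>
The caller picks the processing simply by choosing which method to call; the method itself carries the interpretation of the specifiers.<br class="">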
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">Where APIs do not express a choice of what
template expansion means, users continue to be free to
process them explicitly before passing them, using APIs
that do (such as String::format or ordinary
interpolation). </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">The result is:</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">- The need for use-site "goop" (previously, the
processor name; now, static or instance methods to process
a template) goes away entirely when dealing with libraries
that are already template-friendly. </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">- Even with libraries that require use-site goop,
it is no more intrusive than before, and can be reduced
over time as APIs get with the program. </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">- StringTemplate is just another type that APIs
can support if they want. The "DB" processor becomes an
ordinary factory method that accepts a string template or
an ordinary builder API. </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">- APIs now can have _more_ control over the
timing and meaning of template processing, because we are
not biasing so strongly towards early processing.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">- It becomes easier to abstract over template
processing (i.e., combine or manipulate templates as
templates before processing)</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">- Interpolation remains an explicit choice, but
ST-aware libraries can make this choice on behalf of the
user.</span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">- The language feature and API surface get
considerably smaller, which is good. Core JDK APIs (e.g.,
println, format, exception constructors) get upgraded to
work with string templates. </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">The remaining question that everyone is probably
asking is: “So how do we do interpolation?” The answer
there is “ordinary library methods”. This might be a
static method (String.join(StringTemplate)) or an instance
method (template.join()), shed to be painted (but please,
not right now). </span><br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<br style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);" class="">
<font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class="">This is a
sketch of direction, so feel free to pose
questions/comments on the direction. We’ll discuss the
details as we go. </span></font></div>
</div>
</div>
<div class=""><font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class=""><br class="">
</span></font></div>
<div class=""><font class="" color="#000000"><span style="caret-color: rgb(0, 0, 0);" class=""><br class="">
</span></font></div>
</blockquote>
<br>
</body>
</html>