Feedback: String Templates (JEP 430)
John Rose
john.r.rose at oracle.com
Fri Mar 31 20:23:20 UTC 2023
On 31 Mar 2023, at 12:30, Remi Forax wrote:
> …
> I agree that interpolate() is too easy to misuse but at the same time,
> it's a useful primitive.
+1
> I wonder if the solution is to add an escape function, a function that
> takes an Object and returns an Object that should escape the values to
> interpolate.
>
> Something like
> public String StringTemplate.interpolate(UnaryOperator<Object>
> escapeFunction) { ... }
>
> By asking for an escape function, we are making the API safer to use.
But the workaround is saying `interpolate(x->x)` and grumbling about
ceremony. That workaround doesn’t get much closer to exposing the
root problem. Also, the unary operator, if we were to do this
functionally as suggested, needs to “see” more context about each
value that would be interpolated.
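
To make the workaround concrete, here is a minimal sketch in plain Java. It does not use the JEP 430 preview API: `EscapeFunctionSketch.interpolate` is a hypothetical stand-in for the suggested method, and passing fragments and values as lists is an assumption made for illustration. It only shows that an identity function satisfies the signature while escaping nothing.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the suggested API; not the JEP 430 StringTemplate class.
final class EscapeFunctionSketch {

    // Joins fragments and values, passing each value through the caller's escape function.
    static String interpolate(List<String> fragments, List<Object> values,
                              UnaryOperator<Object> escape) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            // The function sees only the value, never the place it lands in.
            sb.append(escape.apply(values.get(i)));
            sb.append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("SELECT * FROM t WHERE name = '", "'");
        List<Object> values = List.of("x' OR '1'='1");
        // The identity function type-checks and lets the injection straight through.
        System.out.println(interpolate(fragments, values, x -> x));
        // prints: SELECT * FROM t WHERE name = 'x' OR '1'='1'
    }
}
```

The required argument adds ceremony, but nothing in the signature stops a caller from supplying `x -> x`.
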
It seems to me that part of the problem here is that the responsibility
for escaping is on the wrong side of the fence, with `interpolate` as
presently discussed. When offered a function whose contract knows only
about string joining, the party producing stringy bits to join into a
correct DSL statement is burdened with the responsibility of figuring
out how and when to escape them in their various DSL-specific contexts.
But surely that knowledge belongs more exactly to the ST processor, not
to the supplier of interpolation values. Some languages have
context-dependent quoting rules. Note that Remi’s suggested unary
function doesn’t see the context. It could be given context as
`interpolate((x,c)->…)`, and that raises a very good question about the
type of c. What I’m saying is that c is not something the supplier of
interpolation values should be forced to worry about.
For example, in SQL values are quoted with a single quote but names are
quoted with a different kind of quote, often vendor-specific; yuck.
Fighting against code-injection might require getting the correct
contextual flavor of quote in each case, if not for SQL then for more
complex templated notations. It’s lucky for SQL that textually
doubling ` ' ` to ` '' ` will cover most use cases, regardless of
context, but that’s just luck. JSON has distinct kinds of values,
which would require distinct tactics for validation and/or quoting; you
need quote-escapes for string bodies and field names but not for
numbers. If you were trying to do Java templates you’d want to know
the contextual difference between char and string literals.
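
The SQL case can be sketched as follows, with the processor rather than the value supplier choosing the flavor of quote. Everything here is hypothetical: the class name, the `Context` enum, and especially `contextOf`, which is a toy heuristic (a hole directly after FROM, INTO, UPDATE or JOIN is treated as a name). Double-quoted identifiers follow the standard-SQL style and vendors differ, so this is an illustration of the idea, not a safe SQL builder; real code should prefer bind parameters.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: the processor decides the quoting context for each hole.
final class ContextAwareSqlSketch {

    enum Context { VALUE, IDENTIFIER }

    // Toy heuristic (an assumption for illustration): look at the text before the hole.
    static Context contextOf(String precedingFragment) {
        String s = precedingFragment.stripTrailing().toUpperCase(Locale.ROOT);
        for (String keyword : List.of("FROM", "INTO", "UPDATE", "JOIN")) {
            if (s.endsWith(keyword)) {
                return Context.IDENTIFIER;
            }
        }
        return Context.VALUE;
    }

    // Values get single quotes, names get double quotes; each doubles its own quote character.
    static String quote(Object value, Context context) {
        String s = String.valueOf(value);
        return switch (context) {
            case VALUE -> "'" + s.replace("'", "''") + "'";
            case IDENTIFIER -> "\"" + s.replace("\"", "\"\"") + "\"";
        };
    }

    static String process(List<String> fragments, List<Object> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            Context context = contextOf(fragments.get(i));
            sb.append(quote(values.get(i), context)).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("SELECT * FROM ", " WHERE name = ", "");
        List<Object> values = List.of("users", "O'Brien");
        System.out.println(process(fragments, values));
        // prints: SELECT * FROM "users" WHERE name = 'O''Brien'
    }
}
```

The caller hands over raw values only; which quote applies at each position is decided inside `process`.
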
Generally speaking, getting the quoting right is not the direct
responsibility of people supplying values to interpolate, but
rather the responsibility of the party weaving together a (correct)
template (SQL or JSON or …). Asking the value-supplier to shoulder
the burden of correct quoting requires a mix of two kinds of expertise
(business logic and query syntax), which is how bugs happen.
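
As a rough illustration of the template weaver owning the quoting, here is a small JSON-flavored sketch. The class and method names are hypothetical, and the escaping covers only quotes, backslashes and control characters, so it is a sketch under those assumptions rather than a complete JSON writer. The supplier passes plain Java values; the processor decides that strings are quoted and escaped while numbers and booleans are emitted bare.

```java
import java.util.List;

// Hypothetical sketch: the JSON "envelope" decides how each kind of value is written.
final class JsonEnvelopeSketch {

    // Quote a string body, escaping quotes, backslashes and control characters.
    static String quoteString(CharSequence s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '"' || ch == '\\') {
                sb.append('\\').append(ch);
            } else if (ch < 0x20) {
                sb.append(String.format("\\u%04x", (int) ch));
            } else {
                sb.append(ch);
            }
        }
        return sb.append('"').toString();
    }

    // Numbers and booleans are validated and emitted bare; everything else becomes a string.
    static String render(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Boolean || value instanceof Integer || value instanceof Long) {
            return value.toString();
        }
        if (value instanceof Double d) {
            if (!Double.isFinite(d)) {
                throw new IllegalArgumentException("JSON has no NaN or infinity");
            }
            return d.toString();
        }
        return quoteString(String.valueOf(value));
    }

    static String process(List<String> fragments, List<Object> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("need exactly one more fragment than values");
        }
        StringBuilder sb = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sb.append(render(values.get(i))).append(fragments.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("{\"name\": ", ", \"age\": ", "}");
        List<Object> values = List.of("Ann \"A\"", 42);
        System.out.println(process(fragments, values));
        // prints: {"name": "Ann \"A\"", "age": 42}
    }
}
```

The business-logic side never mentions a quote character, which is the division of labor argued for above.
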
I think I would prefer to see a formulation of interpolate which would
require users to take apart the ST processor, lower it into a
plain-string-cat template processor, and then run a natively
string-cat-ing format operation on it; after that it can be lifted back
to its DSL, with fingers crossed that we avoided bad injections.
But I admit I haven’t figured out the details, so that’s just a
vague suggestion…
What I hope is clear is my point about separating concerns, between
knowing how and when to escape a value *in a particular place*, and
coming up with a set of interpolation values for those places. It’s
rooted in the distinction between an envelope and its contents. Quoting
(and validation) is something envelope-specific. Contents are usually
specific to some completely unrelated domain of business logic. Unless
API users are helped to separate those concerns, there will be
confusion, exploitable in attacks.
— John
More information about the amber-dev
mailing list