Feedback: String Templates (JEP 430)

John Rose john.r.rose at oracle.com
Fri Mar 31 20:23:20 UTC 2023


On 31 Mar 2023, at 12:30, Remi Forax wrote:

> I agree that interpolate() is too easy to misuse but at the same time, 
> it's a useful primitive.

+1

> I wonder if the solution is to add an escape function, a function that 
> takes an Object and returns an Object that should escape the values to 
> interpolate.
>
> Something like
> public String StringTemplate.interpolate(UnaryOperator<Object> 
> escapeFunction) { ... }
>
> By asking for an escape function, we are making the API safer to use.

But the workaround is saying `interpolate(x->x)` and grumbling about 
ceremony.  That workaround doesn’t get much closer to exposing the 
root problem.  Also, the unary operator, if we were to do this 
functionally as suggested, needs to “see” more context about each 
value that would be interpolated.
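
To make the workaround concrete, here is a minimal sketch of the suggested overload. It does not use the preview `StringTemplate` API: the `Template` record, the `EscapeFunctionSketch` class and its method names are hypothetical stand-ins, and the template is modeled as plain lists of fragments and values.

```java
import java.util.List;
import java.util.function.UnaryOperator;

class EscapeFunctionSketch {
    // Hypothetical stand-in for a string template:
    // fragments.size() == values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {

        // Models the suggested interpolate(UnaryOperator<Object>): each value
        // passes through the caller's function, then is joined with the fragments.
        String interpolate(UnaryOperator<Object> escapeFunction) {
            StringBuilder out = new StringBuilder(fragments.get(0));
            for (int i = 0; i < values.size(); i++) {
                out.append(escapeFunction.apply(values.get(i)));
                out.append(fragments.get(i + 1));
            }
            return out.toString();
        }
    }

    static final Template QUERY = new Template(
            List.of("SELECT * FROM users WHERE name = '", "'"),
            List.of("x' OR '1'='1"));

    // A caller who happens to know the right rule for SQL string bodies.
    static String escaped() {
        return QUERY.interpolate(v -> String.valueOf(v).replace("'", "''"));
    }

    // The low-ceremony workaround: the identity function type-checks equally well.
    static String workaround() {
        return QUERY.interpolate(x -> x);
    }
}
```

In this sketch both calls compile, and only the first resists the injected quote. The signature asks for a function but cannot tell whether that function is the right one for the place where the value lands.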

It seems to me that part of the problem here is the responsibility for 
escaping is on the wrong side of the fence, with `interpolate` as 
presently discussed.  When offered a function whose contract knows only 
about string joining, the party producing stringy bits to join into a 
correct DSL statement is burdened with the responsibility of figuring out 
how and when to escape them in their various DSL-specific contexts.

But surely that knowledge belongs more exactly to the ST processor, not 
to the supplier of interpolation values.  Some languages have 
context-dependent quoting rules.  Note that Remi’s suggested unary 
function doesn’t see the context.  It could be given context as 
`interpolate((x,c)->…)`, and that begs a very good question about the 
type of c.  What I’m saying is that c is not something the supplier of 
interpolation values should be forced to worry about.
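
One way to picture a processor that keeps `c` to itself is the sketch below. Everything in it is an assumption made for illustration: the `Context` enum, the `ContextSketch` class, and a toy SQL-like dialect in which `'` delimits strings and `"` delimits names. It is not a proposal for the real API.

```java
import java.util.List;

class ContextSketch {
    // Hypothetical type for c: where a hole sits in the surrounding text.
    enum Context { BARE, STRING_BODY, QUOTED_NAME }

    // Processor-side scan: which quote, if any, is open when the fragment ends.
    // A doubled quote inside a literal closes and reopens it, so it nets out.
    static Context scan(Context start, String fragment) {
        Context c = start;
        for (char ch : fragment.toCharArray()) {
            if (ch == '\'' && c != Context.QUOTED_NAME) {
                c = (c == Context.STRING_BODY) ? Context.BARE : Context.STRING_BODY;
            } else if (ch == '"' && c != Context.STRING_BODY) {
                c = (c == Context.QUOTED_NAME) ? Context.BARE : Context.QUOTED_NAME;
            }
        }
        return c;
    }

    // Processor-side escaping: the (x, c) function lives here, not with the caller.
    static String escape(Object value, Context c) {
        String s = String.valueOf(value);
        switch (c) {
            case STRING_BODY: return s.replace("'", "''");
            case QUOTED_NAME: return s.replace("\"", "\"\"");
            default:
                // Outside any quotes: integers go in bare, anything else is
                // wrapped as a string literal.
                if (value instanceof Integer || value instanceof Long) return s;
                return "'" + s.replace("'", "''") + "'";
        }
    }

    // The caller supplies only fragments and values; context never leaks out.
    static String process(List<String> fragments, List<Object> values) {
        StringBuilder out = new StringBuilder();
        Context c = Context.BARE;
        for (int i = 0; i < values.size(); i++) {
            out.append(fragments.get(i));
            c = scan(c, fragments.get(i));
            out.append(escape(values.get(i), c));
        }
        return out.append(fragments.get(values.size())).toString();
    }
}
```

With fragments `SELECT "`, `" FROM t WHERE name = '`, `' AND id = ` and an empty tail, and values `na"me`, `O'Brien` and `42`, this sketch produces `SELECT "na""me" FROM t WHERE name = 'O''Brien' AND id = 42`. The supplier of the three values never sees a `Context`.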

For example, in SQL, values are quoted with single quotes but names are 
quoted with a different kind of quote, often vendor-specific; yuck.  
Fighting against code-injection might require getting the correct 
contextual flavor of quote in each case, if not for SQL then for more 
complex templated notations.  It’s lucky for SQL that textually 
doubling ` ' ` to ` '' ` will cover most use cases, regardless of 
context, but that’s just luck.  JSON has distinct kinds of values, 
which would require distinct tactics for validation and/or quoting; you 
need quote-escapes for string bodies and field names but not for 
numbers.  If you were trying to do Java templates you’d want to know 
the contextual difference between char and string literals.
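
The JSON case can be sketched the same way. The following is an illustration only, with a hypothetical `JsonValueSketch` class that handles a handful of Java types: strings are quoted and escaped, while numbers are validated and emitted bare.

```java
class JsonValueSketch {
    // Renders one interpolated value as a JSON value; the tactic depends on its kind.
    static String render(Object value) {
        if (value == null) return "null";
        if (value instanceof Boolean) return value.toString();
        if (value instanceof Integer || value instanceof Long) return value.toString();
        if (value instanceof Double) {
            double d = (Double) value;
            // JSON has no NaN or Infinity, so numbers are validated rather than quoted.
            if (Double.isNaN(d) || Double.isInfinite(d)) {
                throw new IllegalArgumentException("not representable in JSON: " + d);
            }
            return value.toString();
        }
        if (value instanceof CharSequence) return quote(value.toString());
        throw new IllegalArgumentException("unsupported kind: " + value.getClass());
    }

    // String bodies (and field names) need quote-escapes and control-character escapes.
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    // Remaining control characters use the four-hex-digit escape form.
                    if (ch < 0x20) out.append(String.format("\\u%04x", (int) ch));
                    else out.append(ch);
            }
        }
        return out.append('"').toString();
    }
}
```

Neither branch is something a value supplier should have to choose. Which one applies follows from the JSON grammar, which is the processor's business.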

Generally speaking, getting the quoting right is not the direct 
responsibility of people supplying values to interpolate, but 
rather the responsibility of the party weaving together a (correct) 
template (SQL or JSON or …).  Asking the value-supplier to shoulder 
the burden of correct quoting requires a mix of two kinds of expertise 
(business logic and query syntax), which is how bugs happen.

I think I would prefer to see a formulation of interpolate which would 
require users to take apart the ST processor, lower it into a 
plain-string-cat template processor, and then run a natively 
string-cat-ing format operation on it; after that it can be lifted back 
to its DSL, with fingers crossed that we avoided bad injections.  
But I admit I haven’t figured out the details, so that’s just a 
vague suggestion…

What I hope is clear is my point about separating concerns, between 
knowing how and when to escape a value *in a particular place*, and 
coming up with a set of interpolation values for those places.  It’s 
rooted in the distinction between an envelope and its contents.  Quoting 
(and validation) is something envelope-specific.  Contents are usually 
specific to some completely unrelated domain of business logic.  Unless 
API users are helped to separate those concerns, there will be 
confusion, exploitable in attacks.

— John


More information about the amber-dev mailing list