Presentation at JCrete.org

Robbe Pincket robbepincket at live.be
Mon Jul 24 08:54:15 UTC 2023


Hi David

I feel like I need to disagree on some points. The reason we had the log4j incident was not because people couldn’t validate their inputs, but a misuse of the api (imho due to a bad design decision).
The line `logger.info(value);` did not mean ‘log this value’ like most people expected it to mean, but instead ‘use this value as a format string’.
In my mind this is not unlike passing a user provided untrusted to `printf` in C.

Having a `LOG.”value: \{value}”` would solve this issue. You can’t provide a user provided untrusted string to the logger, only as one of the interpolation arguments, which in any reasonable design wouldn’t be interpreted as a template, only the string the programmer hardcoded into their code gets any template expansion. This means invoking a logger in the way Rémi wants to be able to invoke a logger, would have been more safe, and not less safe then we did (/do) with log4j.

This does not mean I think having side effects to string interpolation is a good idea, and it also doesn’t mean that you can’t get this same safety with side effect free interpolation. If the logger’s log methods have an overload for a `Templated` object that would be returned by the `LOG.”...”` interpolation, one could just not template expand any `Strings` passed to the logger.
Having `logger.info(LOG.”value: \{value}”)` is a bit more unwieldly to some than just `LOG.”value: \{value}”`, so I think it’s gonna be inevitable that people are gonna add sideeffects somewhere to interpolations.

Kind regards
Robbe Pincket


________________________________
Van: amber-dev <amber-dev-retn at openjdk.org> namens David Alayachew <davidalayachew at gmail.com>
Verzonden: Monday, July 24, 2023 12:32:16 AM
Aan: forax at univ-mlv.fr <forax at univ-mlv.fr>
CC: Brian Goetz <brian.goetz at oracle.com>; amber-dev <amber-dev at openjdk.org>
Onderwerp: Re: Presentation at JCrete.org

Hello Rémi,

> May i ask why ?
>
> A logger process interpolated strings, combined with the
> new shiny syntax of the template processor, make sense to
> me.

1. Isn't the entire purpose of String interpolation to (safely) go from raw, unsanitized inputs into a fully formed String? Java allows us to go one step further and create output values other than a String, since doing so aligns perfectly with the broad goal of interpolation -- take raw inputs, sanitize them, and produce a clean, validated output. The JEP even says so in the final bullet of the goals section -- "Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation." -- https://openjdk.org/jeps/430

2. And isn't the entire point why the Java team thought this was worth doing was not (just) because it was a simple boilerplate reduction, but because String interpolation is pretty much the front door for injection attacks? So, by attaching the "safe way" to the new shiny syntax, they have incentivized the right way for all developers.

With both of those in mind, it becomes clear that mixing side effects with String interpolation makes String interpolation inherently unsafe.

For example, let's think through your Logger example.

We just got finished (?) dealing with log4j. Somebody clever exploited our loggers, allowing them to download and execute scripts from the outside world.

If I were using your logger, where exactly would I perform validations to address this new threat? Yes, I know most of us just upgraded to the version that turned that feature off by default, but assume that you still need that feature. Because of the way your logger is designed, you have actually made it HARDER for the developer to validate the incoming request, not easier.

Maybe I could stick the validations in the logger itself? But what if I don't need those validations at every call site, only the service edges? Must I slow down all other log requests to defend myself in situations where it is not possible to be exploited?

Should I validate my String before putting it into the Logger? But now the design of your Logger forces me to do low-level validation on raw objects. String Templates allow me to output a "richer" object, which can enable richer validations. For example, if the logger takes in String a, b, and c, then a, b, and c on their own might not create a url, but together they do. I know a url on its own doesn't execute code, but you see my point. The point is, because your Logger won't let me check the outputs before using them, I must perform my validation on the "more raw" data inputs instead. This makes things harder to validate in some cases, and may require that I duplicate work that could be done by the Logger String Template (creating the "rich" output object) in order to make sure I am safe.

Alternatively, I could create different loggers that do different validations, or just forgoe validations entirely, but that's just moving the problem and complicating your solution. What if some call sites are vulnerable to ExploitA, others are vulnerable to ExploitB, and others are vulnerable to both? Do you plan to permutate through them all as individual types?

Mixing side-effects into String interpolation compromises safety.

String interpolation is meant to increase safety, but it cannot guarantee it on its own. The idea is that you take the created String (or T), and then perform further validations upon the output that could not be done within the context of a String interpolator, AND THEN, once you have proven that the output is actually safe, then execute the output, whether that's logging it, handing it to a DB, etc. Sometimes, that won't be necessary, but sometimes it will be, so the 2 need to be separate to facilitate that flexibility.

Trying to bundle it all into a String template that executes its inputs takes away this exact opportunity from you. This is why your Logger example, and any other example that does side-effects in a String template, is a bad idea.

> And this is a slipery slope, if we have good template
> processor and bad template processor, how to recognize a
> good template processor from a bad one ?

There will be many things that make template processors better or worse than others. But as demonstrated above, state-changing side-effects are firmly in the bad category.

And as for good, it's best if you try to make your String templates "pure functions" whenever possible. The rule can be a little flexible (for example, you may have a counter that counts the number of times a Template processor was called), but you should try to stick to this unless absolutely necessary.

And since the only real benefit that your logger gives us is removing one line of code, that would definitely not be necessary.

LOGGER."\{validate(a)}-\{validate(b)}-\{validate(c)}"

Versus doing this.

final String output = contextSpecificValidations(SOME_STR."\{a}-\{b}-\{c}");
actualLogger.info(output);

You save one line of code and lose out on all of the flexibility. Does it make sense why the LOGGER Template Processor is problematic now?

Thank you for your time!
David Alayachew


On Sun, Jul 23, 2023 at 4:09 PM <forax at univ-mlv.fr<mailto:forax at univ-mlv.fr>> wrote:
----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com<mailto:brian.goetz at oracle.com>>
> To: "Remi Forax" <forax at univ-mlv.fr<mailto:forax at univ-mlv.fr>>, "amber-dev" <amber-dev at openjdk.org<mailto:amber-dev at openjdk.org>>
> Sent: Sunday, July 23, 2023 3:12:22 AM
> Subject: Re: Presentation at JCrete.org

>> So I don't believe a logger based on a template processor makes sense at least
>> not until StringTemplate.Processor.Linkage becomes non-sealed.
>
> I'll take this further: I don't think a logger based on a template
> processor makes sense *at all*.

May i ask why ?

A logger process interpolated strings, combined with the new shiny syntax of the template processor, make sense to me.

And this is a slipery slope, if we have good template processor and bad template processor, how to recognize a good template processor from a bad one ?

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-dev/attachments/20230724/534445d6/attachment-0001.htm>


More information about the amber-dev mailing list