RFR: 8291641: Optimize StackTraceElement.toString()

Mon Aug 1 17:31:43 UTC 2022

On Thu, 28 Jul 2022 15:42:15 GMT, Quan Anh Mai <duke at openjdk.org> wrote:

>> I would like to contribute an optimized version of `StackTraceElement#toString()` that uses a single StringBuilder throughout creation to avoid intermediate `String` allocations. `StackTraceElement#toString()` is used in a number of JDK code paths including `Throwable#printStackTrace()`, as well as many JDK consumers may transform `StackTraceElement` `toString()` in logging frameworks capturing throwables and exceptions, and diagnostics performing dumps.
>> 
>> Given this usage and some observed JFR profiles from production services, I'd like to reduce the intermediate allocations to reduce CPU pressure in these circumstances. I have added a couple benchmarks for a sample `Throwable#printStackTrace()` converted to String via `StringWriter` and individual `StackTraceElement` `toString`. The former shows ~15% improvement, while the latter shows ~40% improvement.
>> 
>> Before
>> 
>> Benchmark                               Mode  Cnt       Score      Error  Units
>> StackTraceElementBench.printStackTrace  avgt   15  167147.066 ± 4260.521  ns/op
>> StackTraceElementBench.toString         avgt   15     132.781 ±    2.095  ns/op
>> 
>> 
>> After
>> 
>> Benchmark                               Mode  Cnt       Score      Error  Units
>> StackTraceElementBench.printStackTrace  avgt   15  142909.133 ± 2290.720  ns/op
>> StackTraceElementBench.toString         avgt   15      78.939 ±    0.469  ns/op
>
> If the objective of this patch is to optimise `StackTraceElement::toString()`, then I would suggest the most efficient method of calculating the length and encoding of the result, filling the backing buffer then using the package-private constructor of `java.lang.String` to construct the final result.
> 
> And if you want to propose exposing `StackTraceElement::appendTo(Appendable)`, then I think it should be a separate patch instead.
> 
> Thanks.

Thanks for the reviews and comments! To @merykitty 's point, I am now focusing this PR on optimizing `StackTraceElement.toString()` only -- additional changes can follow on in subsequent PRs.

I tested out a couple alternative implementations, and have currently landed on precomputing the StringBuilder capacity as seen in current state of this PR (commit https://github.com/openjdk/jdk/pull/9665/commits/aca442839c82688566d94cb5cb43e34b5b029642 ). This shaves a bit more off the `StackTraceElementBench.toString` microbenchmark compared to not precomputing the StringBuilder capacity. The trade off is more readability and more complex code as there are 2 passes -- one to precompute length and one to build the resulting String.

Benchmark                               Mode  Cnt       Score     Error  Units
StackTraceElementBench.printStackTrace  avgt   15  143218.862 ± 840.141  ns/op
StackTraceElementBench.toString         avgt   15      66.566 ±   0.288  ns/op

Alternately, I computed the length and collected into a `char[]` then converted to String as seen in commit https://github.com/openjdk/jdk/commit/de0a337e7f18f9a1525ec222ca22ced93166bf2e . This ends up similar to the presized StringBuilder, with even more complex code, so doesn't seem worth going that route. 

Benchmark                               Mode  Cnt       Score      Error  Units
StackTraceElementBench.printStackTrace  avgt   15  133324.365 ± 1133.849  ns/op
StackTraceElementBench.toString         avgt   15      68.701 ±    1.574  ns/op

I did not write up an implementation to buffer & encode to a `byte[]` and create a resulting string from that via the String constructors or `newStringUTF8NoRepl`. I'm not sure that paying the complexity of UTF-16 handling directly would be worth improvement removing the `char[]` -> `byte[]` conversion & copy, especially in most common case of Latin 1 `StackTraceElement` contents.

-------------

PR: https://git.openjdk.org/jdk/pull/9665