<i18n dev> RFR: 8335366: Improve String.format performance with fastpath [v9]
Chen Liang
liach at openjdk.org
Mon Jul 1 06:21:22 UTC 2024
On Sun, 30 Jun 2024 18:21:52 GMT, Shaojin Wen <duke at openjdk.org> wrote:
>> We need a String format solution with good performance. String Template was once expected, but it has been removed. j.u.Formatter is powerful, but its performance is not good enough.
>>
>> This PR implements a subset of j.u.Formatter capabilities. It performs well enough to serve as a fastpath for commonly used formats. When a format string goes beyond the supported subset, it falls back to j.u.Formatter.
>>
>> The fastpath has a low detection cost, so there is no noticeable performance degradation when a call falls back to j.u.Formatter.
>>
>> Below is a comparison of String.format with string concatenation and StringBuilder:
>>
>> * benchmark java code
>>
>> public class StringFormat {
>>     @Benchmark
>>     public String stringIntFormat() {
>>         return "%s %d".formatted(s, i);
>>     }
>>
>>     @Benchmark
>>     public String stringIntConcat() {
>>         return s + " " + i;
>>     }
>>
>>     @Benchmark
>>     public String stringIntStringBuilder() {
>>         return new StringBuilder(s).append(" ").append(i).toString();
>>     }
>> }
>>
>>
>> * benchmark number on macbook m1 pro
>>
>> Benchmark                            Mode  Cnt   Score   Error  Units
>> StringFormat.stringIntConcat         avgt   15   6.541 ± 0.056  ns/op
>> StringFormat.stringIntFormat         avgt   15  17.399 ± 0.133  ns/op
>> StringFormat.stringIntStringBuilder  avgt   15   8.004 ± 0.063  ns/op
>>
>>
>> The data above shows that the fastpath narrows the performance gap between String.format and StringBuilder from about 10x to 2-3x.
>>
>> The implementation of fastpath supports the following four specifiers, which can appear at most twice and support a width of 1 to 9.
>>
>> d
>> x
>> X
>> s
>>
>> If necessary, we can add a few more.
>>
>>
>> Below is a comparison of performance numbers running on a MacBook M1, showing a significant performance improvement.
>>
>> -Benchmark                          Mode  Cnt    Score    Error  Units (baseline)
>> -StringFormat.complexFormat         avgt   15  895.954 ± 52.541  ns/op
>> -StringFormat.decimalFormat         avgt   15  277.420 ± 18.254  ns/op
>> -StringFormat.stringFormat          avgt   15   66.787 ±  2.715  ns/op
>> -StringFormat.stringIntFormat       avgt   15   81.046 ±  1.879  ns/op
>> -StringFormat.widthStringFormat     avgt   15   38.897 ±  0.114  ns/op
>> -StringFormat.widthStringIntFormat  avgt   15  109.841 ±  1.028  ns/op
>>
>> +Benchmark ...
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
>
> code style
As promising as the performance numbers are, I think we need to ensure two things:
1. Correctness: this patch adds a lot of special cases, and I am not sure whether the current test cases already cover all of them. In addition, formatting is an i18n concern, so it will need extra review beyond the review for performance gains.
2. Validity: the existing benchmarks don't have profile pollution, so the JIT can conclude that only one format shape is ever used in your benchmark, which is unlikely in production. See `ReflectionSpeedBenchmark` https://github.com/openjdk/jdk/blob/d9bcf061450ebfb7fe02b5a50c855db1d9178e5d/test/micro/org/openjdk/bench/java/lang/reflect/ReflectionSpeedBenchmark.java#L291 where `Method.invoke` and `Constructor.newInstance` are called to pollute the JIT's profiling. You should call `String.format` and `String.formatted` with varied format strings and arguments in setup for profile pollution.
An extreme example is at https://github.com/openjdk/jdk/pull/14944#issuecomment-1644050455, where `Arrays.hashCode(Object[])` is extremely fast when every element is an `Integer`, but slows down drastically once different arrays are passed.
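
To make the profile pollution suggestion concrete, here is a minimal sketch in plain Java with no JMH dependency. The class `FormatProfilePollution`, the method `polluteFormatProfile`, and every format string and argument in it are hypothetical and chosen only for illustration; none of them come from the patch or from the existing benchmarks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the PR: calls String.format and
// String.formatted with several format shapes and argument types so that
// the JIT does not observe a single call pattern before measurement starts.
final class FormatProfilePollution {
    // Illustrative format strings (assumptions, not taken from the patch).
    private static final String[] FORMATS = {
        "%s", "%d", "%x", "%X", "%s %d", "%5d|%-8s", "%08.3f", "%,d", "%b %c"
    };

    // One argument array per format string above, with mixed argument types.
    private static final Object[][] ARGS = {
        {"abc"},
        {42},
        {255},
        {255L},
        {"abc", 7},
        {12, "xy"},
        {3.14159},
        {1234567},
        {true, 'c'}
    };

    // Runs every format/argument pair `rounds` times and returns the
    // results of the last round, so the work cannot be optimized away.
    static List<String> polluteFormatProfile(int rounds) {
        List<String> last = new ArrayList<>();
        for (int r = 0; r < rounds; r++) {
            last.clear();
            for (int i = 0; i < FORMATS.length; i++) {
                // Alternate between the static and the instance entry point.
                String out = (i % 2 == 0)
                        ? String.format(FORMATS[i], ARGS[i])
                        : FORMATS[i].formatted(ARGS[i]);
                last.add(out);
            }
        }
        return last;
    }
}
```

In a JMH benchmark, a `@Setup` method could call `polluteFormatProfile` with a few thousand rounds before measurement begins. The round count is a guess and would need tuning against the JIT compilation thresholds.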
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19956#issuecomment-2199327844