<i18n dev> RFR: 8335366: Improve String.format performance with fastpath [v9]

Chen Liang liach at openjdk.org
Mon Jul 1 06:21:22 UTC 2024


On Sun, 30 Jun 2024 18:21:52 GMT, Shaojin Wen <duke at openjdk.org> wrote:

>> We need a String format solution with good performance. String Template was once expected, but it has been removed. j.u.Formatter is powerful, but its performance is not good enough.
>> 
>> This PR implements a subset of j.u.Formatter capabilities. The performance is good enough that it serves as a fastpath for commonly used functions. When a format string goes beyond the supported subset, it falls back to j.u.Formatter.
>> 
>> The fastpath has a low detection cost, and there is no noticeable performance degradation when it falls back to j.u.Formatter.
>> 
>> Below is a comparison of String.format against string concatenation and StringBuilder:
>> 
>> * benchmark java code
>> 
>> public class StringFormat {
>>     @Benchmark
>>     public String stringIntFormat() {
>>         return "%s %d".formatted(s, i);
>>     }
>> 
>>     @Benchmark
>>     public String stringIntConcat() {
>>         return s + " " + i;
>>     }
>> 
>>     @Benchmark
>>     public String stringIntStringBuilder() {
>>         return new StringBuilder(s).append(" ").append(i).toString();
>>     }
>> }
>> 
>> 
>> * benchmark number on macbook m1 pro
>> 
>> Benchmark                            Mode  Cnt   Score   Error  Units
>> StringFormat.stringIntConcat         avgt   15   6.541 ± 0.056  ns/op
>> StringFormat.stringIntFormat         avgt   15  17.399 ± 0.133  ns/op
>> StringFormat.stringIntStringBuilder  avgt   15   8.004 ± 0.063  ns/op
>> 
>> 
>> From the above data, we can see that the implementation of fastpath reduces the performance difference between String.format and StringBuilder from 10 times to 2~3 times.
>> 
>> The implementation of fastpath supports the following four specifiers, which can appear at most twice and support a width of 1 to 9.
>> 
>> d
>> x
>> X
>> s
>> 
>> If necessary, we can add a few more.
>> 
>> 
>> Below is a comparison of performance numbers running on a MacBook M1, showing a significant performance improvement.
>> 
>> -Benchmark                          Mode  Cnt    Score    Error  Units (baseline)
>> -StringFormat.complexFormat         avgt   15  895.954 ± 52.541  ns/op
>> -StringFormat.decimalFormat         avgt   15  277.420 ± 18.254  ns/op
>> -StringFormat.stringFormat          avgt   15   66.787 ±  2.715  ns/op
>> -StringFormat.stringIntFormat       avgt   15   81.046 ±  1.879  ns/op
>> -StringFormat.widthStringFormat     avgt   15   38.897 ±  0.114  ns/op
>> -StringFormat.widthStringIntFormat  avgt   15  109.841 ±  1.028  ns/op
>> 
>> +Benchmark                ...
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   code style

As promising as the performance number is, I think we need to ensure two things:
1. Correctness: this patch adds a lot of special cases; not sure if the current test cases already cover all of them. In addition, format is i18n stuff, which will need extra review besides the review for performance gains.
2. Validity: the existing benchmarks don't have profile pollution. See `ReflectionSpeedBenchmark` https://github.com/openjdk/jdk/blob/d9bcf061450ebfb7fe02b5a50c855db1d9178e5d/test/micro/org/openjdk/bench/java/lang/reflect/ReflectionSpeedBenchmark.java#L291 where `Method.invoke` and `Constructor.newInstance` are called to tamper with the JIT's profiling. Without such pollution, the JIT can conclude that only one format shape is ever used in your benchmark, which is unlikely in production. You should call `String.format` and `String.formatted` with varied format strings and arguments in setup for profile pollution.
  An extreme example is at https://github.com/openjdk/jdk/pull/14944#issuecomment-1644050455, where `Arrays.hashCode(Object[])` is extremely fast when every element is an `Integer`, but slows down drastically once different arrays are passed.
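
To make the profile-pollution suggestion concrete, here is a rough sketch of the kind of warm-up I have in mind. The class name, the `pollute` helper, and the particular format strings and arguments are all made up for illustration and are not taken from the patch or from any existing benchmark. In a JMH benchmark a loop like this would presumably go in a `@Setup` method; it is written as plain Java here so it can be run on its own.

```java
// Hypothetical helper: exercises String.format and String.formatted with
// several format shapes and argument types, so the JIT's profile is not
// specialized to a single shape before the measured code runs.
class FormatProfilePollution {
    // Illustrative mix of format strings (an assumption, not the patch's test set).
    static final String[] FORMATS = {
        "%s %d", "%s", "%5s|%x", "%X-%d", "%.2f and %s", "%08d %s"
    };

    // One argument list per format above, with varied argument types.
    static final Object[][] ARGS = {
        {"abc", 42}, {"abc"}, {"ab", 255}, {255, 7L}, {1.5, "x"}, {42, "y"}
    };

    // Formats every shape through both entry points, `rounds` times.
    // The summed lengths are returned so the calls cannot be optimized away.
    static int pollute(int rounds) {
        int sink = 0;
        for (int r = 0; r < rounds; r++) {
            for (int k = 0; k < FORMATS.length; k++) {
                sink += String.format(FORMATS[k], ARGS[k]).length();
                sink += FORMATS[k].formatted(ARGS[k]).length();
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(pollute(20_000));
    }
}
```

The round count is arbitrary; the point is only that the measured format shape should not be the sole shape the JIT has seen by the time measurement starts.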

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19956#issuecomment-2199327844

