<i18n dev> RFR: 8335366: Improve String.format performance with fastpath [v12]
Shaojin Wen
duke at openjdk.org
Tue Jul 2 00:57:35 UTC 2024
On Mon, 1 Jul 2024 19:17:50 GMT, Shaojin Wen <duke at openjdk.org> wrote:
>> We need a String format solution with good performance. String Template was once expected, but it has been removed. j.u.Formatter is powerful, but its performance is not good enough.
>>
>> This PR implements a subset of j.u.Formatter capabilities as a fastpath for commonly used format specifiers. When a format string goes beyond the supported subset, it falls back to j.u.Formatter.
>>
>> The fastpath has a low detection cost, and there is no noticeable performance degradation when a call goes through the fastpath check and then falls back to j.u.Formatter.
>>
>> Below is a comparison of String.format with string concatenation and StringBuilder:
>>
>> * benchmark Java code
>>
>> public class StringFormat {
>>     @Benchmark
>>     public String stringIntFormat() {
>>         return "%s %d".formatted(s, i);
>>     }
>>
>>     @Benchmark
>>     public String stringIntConcat() {
>>         return s + " " + i;
>>     }
>>
>>     @Benchmark
>>     public String stringIntStringBuilder() {
>>         return new StringBuilder(s).append(" ").append(i).toString();
>>     }
>> }
>>
>>
>> * benchmark numbers on a MacBook M1 Pro
>>
>> Benchmark Mode Cnt Score Error Units
>> StringFormat.stringIntConcat avgt 15 6.541 ± 0.056 ns/op
>> StringFormat.stringIntFormat avgt 15 17.399 ± 0.133 ns/op
>> StringFormat.stringIntStringBuilder avgt 15 8.004 ± 0.063 ns/op
>>
>>
>> From the above data, we can see that the fastpath narrows the performance gap between String.format and StringBuilder from about 10 times to 2 to 3 times.
>>
>> The implementation of fastpath supports the following four specifiers, which can appear at most twice and support a width of 1 to 9.
>>
>> d
>> x
>> X
>> s
>>
>> If necessary, we can add a few more.
>>
>>
>> Below is a comparison of performance numbers running on a MacBook M1, showing a significant performance improvement.
>>
>> -Benchmark Mode Cnt Score Error Units (baseline)
>> -StringFormat.complexFormat avgt 15 895.954 ± 52.541 ns/op
>> -StringFormat.decimalFormat avgt 15 277.420 ± 18.254 ns/op
>> -StringFormat.stringFormat avgt 15 66.787 ± 2.715 ns/op
>> -StringFormat.stringIntFormat avgt 15 81.046 ± 1.879 ns/op
>> -StringFormat.widthStringFormat avgt 15 38.897 ± 0.114 ns/op
>> -StringFormat.widthStringIntFormat avgt 15 109.841 ± 1.028 ns/op
>>
>> +Benchmark ...
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
>
> improve StringFormat benchmark
I updated the StringFormat benchmark. Here are the performance numbers on a MacBook M1 Pro:
# baseline 5d866bf17d96bd0f0e4545d7eee5912eda2e3a94
-Benchmark Mode Cnt Score Error Units
-StringFormat.complexFormat avgt 15 980.323 ± 77.227 ns/op
-StringFormat.decimalFormat avgt 15 277.042 ± 17.070 ns/op
-StringFormat.intFormat avgt 15 61.400 ± 0.269 ns/op
-StringFormat.intFormatUtf16 avgt 15 126.581 ± 2.270 ns/op
-StringFormat.intHexFormat avgt 15 232.125 ± 79.431 ns/op
-StringFormat.intHexFormatUtf16 avgt 15 113.980 ± 1.537 ns/op
-StringFormat.intHexUFormat avgt 15 202.426 ± 10.029 ns/op
-StringFormat.intHexUFormatUtf16 avgt 15 132.994 ± 3.183 ns/op
-StringFormat.intIntFormat avgt 15 138.301 ± 7.932 ns/op
-StringFormat.intIntFormatUtf16 avgt 15 131.910 ± 7.142 ns/op
-StringFormat.intOctalFormat avgt 15 249.411 ± 89.041 ns/op
-StringFormat.intOctalFormatUtf16 avgt 15 117.244 ± 1.426 ns/op
-StringFormat.stringFormat avgt 15 63.645 ± 3.791 ns/op
-StringFormat.stringFormatUtf16 avgt 15 150.495 ± 9.363 ns/op
-StringFormat.stringIntFormat avgt 15 123.888 ± 4.177 ns/op
-StringFormat.stringIntFormatUtf16 avgt 15 174.825 ± 18.766 ns/op
-StringFormat.stringIntHexFormat avgt 15 150.750 ± 4.177 ns/op
-StringFormat.stringIntHexUFormat avgt 15 162.287 ± 5.530 ns/op
-StringFormat.stringIntOctalFormat avgt 15 157.474 ± 6.569 ns/op
-StringFormat.stringIntOctalFormatUtf16 avgt 15 151.507 ± 20.590 ns/op
-StringFormat.stringIntRFormat avgt 15 135.326 ± 1.884 ns/op
-StringFormat.stringIntRFormatUtf16 avgt 15 196.023 ± 6.874 ns/op
-StringFormat.stringWidthIntFormat avgt 15 151.324 ± 2.308 ns/op
-StringFormat.stringWidthIntFormatUtf16 avgt 15 136.286 ± 2.770 ns/op
-StringFormat.widthStringFormat avgt 15 93.960 ± 22.499 ns/op
-StringFormat.widthStringFormatUtf16 avgt 15 127.997 ± 18.408 ns/op
-StringFormat.widthStringIntFormat avgt 15 150.131 ± 12.496 ns/op
-StringFormat.widthStringIntFormatUtf16 avgt 15 159.734 ± 3.821 ns/op
# current 392898703ce9907b84d51098ef7b8f536d355742
+Benchmark Mode Cnt Score Error Units
+StringFormat.complexFormat avgt 15 909.352 ± 46.681 ns/op
+StringFormat.decimalFormat avgt 15 290.069 ± 20.778 ns/op
+StringFormat.intFormat avgt 15 11.211 ± 0.047 ns/op
+StringFormat.intFormatUtf16 avgt 15 11.799 ± 0.032 ns/op
+StringFormat.intHexFormat avgt 15 35.550 ± 7.420 ns/op
+StringFormat.intHexFormatUtf16 avgt 15 35.680 ± 1.608 ns/op
+StringFormat.intHexUFormat avgt 15 31.229 ± 3.043 ns/op
+StringFormat.intHexUFormatUtf16 avgt 15 46.915 ± 7.014 ns/op
+StringFormat.intIntFormat avgt 15 31.573 ± 0.539 ns/op
+StringFormat.intIntFormatUtf16 avgt 15 31.595 ± 0.345 ns/op
+StringFormat.intOctalFormat avgt 15 265.869 ± 60.233 ns/op
+StringFormat.intOctalFormatUtf16 avgt 15 153.843 ± 18.901 ns/op
+StringFormat.stringFormat avgt 15 10.852 ± 0.009 ns/op
+StringFormat.stringFormatUtf16 avgt 15 13.422 ± 0.203 ns/op
+StringFormat.stringIntFormat avgt 15 36.833 ± 5.495 ns/op
+StringFormat.stringIntFormatUtf16 avgt 15 21.482 ± 0.043 ns/op
+StringFormat.stringIntHexFormat avgt 15 19.926 ± 0.041 ns/op
+StringFormat.stringIntHexUFormat avgt 15 19.954 ± 0.055 ns/op
+StringFormat.stringIntOctalFormat avgt 15 137.748 ± 1.796 ns/op
+StringFormat.stringIntOctalFormatUtf16 avgt 15 126.615 ± 13.933 ns/op
+StringFormat.stringIntRFormat avgt 15 37.231 ± 2.449 ns/op
+StringFormat.stringIntRFormatUtf16 avgt 15 38.406 ± 0.577 ns/op
+StringFormat.stringWidthIntFormat avgt 15 43.897 ± 6.464 ns/op
+StringFormat.stringWidthIntFormatUtf16 avgt 15 28.683 ± 2.147 ns/op
+StringFormat.widthStringFormat avgt 15 11.421 ± 0.021 ns/op
+StringFormat.widthStringFormatUtf16 avgt 15 12.228 ± 0.023 ns/op
+StringFormat.widthStringIntFormat avgt 15 38.051 ± 6.635 ns/op
+StringFormat.widthStringIntFormatUtf16 avgt 15 23.347 ± 1.049 ns/op
In the scenarios that go through the fastpath check and then fall back to j.u.Formatter, the results are mixed: some get faster and some get slower, which is confusing.
| | baseline (ns/op) | current (ns/op) | delta |
| --- | --- | --- | --- |
| StringFormat.complexFormat | 980.323 | 909.352 | 7.80% |
| StringFormat.decimalFormat | 277.042 | 290.069 | -4.49% |
| StringFormat.intOctalFormat | 249.411 | 265.869 | -6.19% |
| StringFormat.intOctalFormatUtf16 | 117.244 | 153.843 | -23.79% |
| StringFormat.stringIntOctalFormat | 157.474 | 137.748 | 14.32% |
| StringFormat.stringIntOctalFormatUtf16 | 151.507 | 126.615 | 19.66% |
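
For reference, the PR description limits the fastpath to the specifiers d, x, X and s, appearing at most twice, with a width of 1 to 9, so a format string such as `%o` is outside that subset. The sketch below illustrates one way such an eligibility check could look. It is not the PR's implementation: the class `FastPathCheck`, the method `isFastPathCandidate`, the regex, and the reading of "at most twice" as one or two specifiers in total are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FastPathCheck {
    // Assumed shape of a supported specifier: '%', an optional
    // single-digit width 1-9, then one of d, x, X, s.
    private static final Pattern SPEC = Pattern.compile("%([1-9])?[dxXs]");

    // Returns true when every '%' in the format starts a supported
    // specifier and there are one or two specifiers in total (assumption).
    public static boolean isFastPathCandidate(String format) {
        Matcher m = SPEC.matcher(format);
        int count = 0;
        int i = 0;
        while ((i = format.indexOf('%', i)) >= 0) {
            // "%o", "%%", "%.2f" or "%10d" fail here and would fall back.
            if (!m.find(i) || m.start() != i) {
                return false;
            }
            count++;
            i = m.end();
        }
        return count >= 1 && count <= 2;
    }

    public static void main(String[] args) {
        System.out.println(isFastPathCandidate("%s %d"));  // true
        System.out.println(isFastPathCandidate("%5s %x")); // true
        System.out.println(isFastPathCandidate("%s %o"));  // false
    }
}
```

A caller that gets `false` from a check like this would simply take the existing j.u.Formatter path, which is the fallback the table above measures.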
In the other scenarios, which hit the fastpath, performance improves significantly.
| | baseline (ns/op) | current (ns/op) | delta |
| --- | --- | --- | --- |
| StringFormat.intFormat | 61.400 | 11.211 | 447.68% |
| StringFormat.intFormatUtf16 | 126.581 | 11.799 | 972.81% |
| StringFormat.intHexFormat | 232.125 | 35.550 | 552.95% |
| StringFormat.intHexFormatUtf16 | 113.980 | 35.680 | 219.45% |
| StringFormat.intHexUFormat | 202.426 | 31.229 | 548.20% |
| StringFormat.intHexUFormatUtf16 | 132.994 | 46.915 | 183.48% |
| StringFormat.intIntFormat | 138.301 | 31.573 | 338.04% |
| StringFormat.intIntFormatUtf16 | 131.910 | 31.595 | 317.50% |
| StringFormat.stringFormat | 63.645 | 10.852 | 486.48% |
| StringFormat.stringFormatUtf16 | 150.495 | 13.422 | 1021.26% |
| StringFormat.stringIntFormat | 123.888 | 36.833 | 236.35% |
| StringFormat.stringIntFormatUtf16 | 174.825 | 21.482 | 713.82% |
| StringFormat.stringIntHexFormat | 150.750 | 19.926 | 656.55% |
| StringFormat.stringIntHexUFormat | 162.287 | 19.954 | 713.31% |
| StringFormat.stringIntRFormat | 135.326 | 37.231 | 263.48% |
| StringFormat.stringIntRFormatUtf16 | 196.023 | 38.406 | 410.40% |
| StringFormat.stringWidthIntFormat | 151.324 | 43.897 | 244.73% |
| StringFormat.stringWidthIntFormatUtf16 | 136.286 | 28.683 | 375.15% |
| StringFormat.widthStringFormat | 93.960 | 11.421 | 722.70% |
| StringFormat.widthStringFormatUtf16 | 127.997 | 12.228 | 946.75% |
| StringFormat.widthStringIntFormat | 150.131 | 38.051 | 294.55% |
| StringFormat.widthStringIntFormatUtf16 | 159.734 | 23.347 | 584.17% |
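
The delta column in both tables is consistent with `baseline / current - 1`, expressed as a percentage, so a positive value means the current build is faster and a negative value means it is slower. A small sketch that reproduces three of the rows (the class and method names are illustrative, not part of the PR):

```java
import java.util.Locale;

public class DeltaColumn {
    // Relative speedup of `currentNs` over `baselineNs`, in percent.
    // Both arguments are average times in ns/op, so lower is better.
    public static double deltaPercent(double baselineNs, double currentNs) {
        return (baselineNs / currentNs - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // complexFormat: prints 7.80%
        System.out.println(String.format(Locale.ROOT, "%.2f%%", deltaPercent(980.323, 909.352)));
        // intOctalFormatUtf16: prints -23.79%
        System.out.println(String.format(Locale.ROOT, "%.2f%%", deltaPercent(117.244, 153.843)));
        // intFormat: prints 447.68%
        System.out.println(String.format(Locale.ROOT, "%.2f%%", deltaPercent(61.400, 11.211)));
    }
}
```

Read this way, a delta of 447.68% for intFormat corresponds to the current build taking roughly 5.5 times less time per operation than the baseline.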
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19956#issuecomment-2201575948