RFR: 8335366: Improve String.format performance with fastpath
David Schlosnagle
duke at openjdk.org
Sat Jun 29 16:49:20 UTC 2024
On Sat, 29 Jun 2024 11:53:31 GMT, Shaojin Wen <duke at openjdk.org> wrote:
> We need a String format solution with good performance. String Template was once expected, but it has been removed. j.u.Formatter is powerful, but its performance is not good enough.
>
> This PR implements a subset of j.u.Formatter capabilities. The performance is good enough that it is a fastpath for commonly used functions. When the supported functions are exceeded, it will fall back to using j.u.Formatter.
>
> The performance of this implementation is good enough, the fastpath has low detection cost, There is no noticeable performance degradation when falling back to j.u.Formatter via fastpath.
>
> Below is a comparison of String.format and concat-based and StringBuilder:
>
> * benchmark java code
>
> public class StringFormat {
> @Benchmark
> public String stringIntFormat() {
> return "%s %d".formatted(s, i);
> }
>
> @Benchmark
> public String stringIntConcat() {
> return s + " " + i;
> }
>
> @Benchmark
> public String stringIntStringBuilder() {
> return new StringBuilder(s).append(" ").append(i).toString();
> }
> }
>
>
> * benchmark number on macbook m1 pro
>
> Benchmark Mode Cnt Score Error Units
> StringFormat.stringIntConcat avgt 15 6.541 ? 0.056 ns/op
> StringFormat.stringIntFormat avgt 15 17.399 ? 0.133 ns/op
> StringFormat.stringIntStringBuilder avgt 15 8.004 ? 0.063 ns/op
>
>
> From the above data, we can see that the implementation of fastpath reduces the performance difference between String.format and StringBuilder from 10 times to 2~3 times.
>
> The implementation of fastpath supports the following four specifiers, which can appear at most twice and support a width of 1 to 9.
>
> d
> x
> X
> s
>
> If necessary, we can add a few more.
>
>
> Below is a comparison of performance numbers running on a MacBook M1, showing a significant performance improvement.
>
> -Benchmark Mode Cnt Score Error Units (baseline)
> -StringFormat.complexFormat avgt 15 895.954 ? 52.541 ns/op
> -StringFormat.decimalFormat avgt 15 277.420 ? 18.254 ns/op
> -StringFormat.stringFormat avgt 15 66.787 ? 2.715 ns/op
> -StringFormat.stringIntFormat avgt 15 81.046 ? 1.879 ns/op
> -StringFormat.widthStringFormat avgt 15 38.897 ? 0.114 ns/op
> -StringFormat.widthStringIntFormat avgt 15 109.841 ? 1.028 ns/op
>
> +Benchmark Mode Cnt Score Error Units (current f925339e93fdf7a281462554ce5d73139bd0f0cd)
> +StringFormat.complexF...
src/java.base/share/classes/java/lang/StringFormat.java line 65:
> 63: int max = format.length();
> 64: int off = format.indexOf('%');
> 65: if (off == -1 || off + 1 == max) {
If there is no `%` in the format string, should this just return the format string as result?
Suggestion:
int off = format.indexOf('%');
if (off == -1) {
// no formatting to be done
return format;
}
int max = format.length();
if (off + 1 == max) {
src/java.base/share/classes/java/lang/StringFormat.java line 82:
> 80: conv = format.charAt(off + 2);
> 81: }
> 82: }
is it worth handling `width > 9`?
Suggestion:
for (int i = 2; conv >= '1' && conv <= '9' && off + i < max; i++) {
width = (width * 10) + (conv - '0');
if (off + i < max) {
conv = format.charAt(off + i);
}
}
src/java.base/share/classes/java/lang/StringFormat.java line 90:
> 88: if (conv == STRING) {
> 89: arg = String.valueOf(arg);
> 90: }
should these conditions be combined?
Suggestion:
if (conv == STRING) {
if (isBigInt(arg)) {
conv = DECIMAL_INTEGER;
} else {
arg = String.valueOf(arg);
}
}
src/java.base/share/classes/java/lang/StringFormat.java line 102:
> 100: int max = format.length();
> 101: int off0 = format.indexOf('%');
> 102: if (off0 == -1 || off0 + 1 == max) {
similar to above, if `%` is not found, return `format`
Suggestion:
int off0 = format.indexOf('%');
if (off0 == -1) {
// no formatting to be done
return format;
}
int max = format.length();
if (off0 + 1 == max) {
src/java.base/share/classes/java/lang/StringFormat.java line 110:
> 108: width0 = conv0 - '0';
> 109: if (off0 + 2 < max) {
> 110: conv0 = format.charAt(off0 + 2);
is it worth handling `width0 > 9`?
Suggestion:
for (int i = 2; conv0 >= '1' && conv0 <= '9' && off0 + i < max; i++) {
width0 = (width0 * 10) + (conv0 - '0');
if (off0 + i < max) {
conv0 = format.charAt(off0 + i);
src/java.base/share/classes/java/lang/StringFormat.java line 123:
> 121: width1 = conv1 - '0';
> 122: if (off1 + 2 < max) {
> 123: conv1 = format.charAt(off1 + 2);
is it worth handling `width1 > 9`?
Suggestion:
for (int i = 2; conv1 >= '1' && conv1 <= '9' && off1 + i < max; i++) {
width1 = (width1 * 10) + (conv1 - '0');
if (off1 + i < max) {
conv1 = format.charAt(off1 + i);
src/java.base/share/classes/java/lang/StringFormat.java line 151:
> 149: coder |= str.coder();
> 150: arg1 = String.valueOf(str);
> 151: }
similar to above, should these conditions be consolidated?
Suggestion:
String str;
byte coder = format.coder();
if (conv0 == STRING) {
if (isBigInt(arg0)) {
conv0 = DECIMAL_INTEGER;
} else {
str = String.valueOf(arg0);
coder |= str.coder();
arg0 = str;
}
}
if (conv1 == STRING) {
if (isBigInt(arg1)) {
conv1 = DECIMAL_INTEGER;
} else {
str = String.valueOf(arg1);
coder |= str.coder();
arg1 = String.valueOf(str);
}
}
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659880719
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659888420
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659881435
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659882380
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659888524
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659888863
PR Review Comment: https://git.openjdk.org/jdk/pull/19956#discussion_r1659882487
More information about the core-libs-dev
mailing list