[PATCH] Performance improvement of StringBuilder.append(CharSerquence, int, int)
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Mon Jan 14 21:22:43 UTC 2019
Hello Tagir,
I've reworked the benchmark [1] to test also the case when non-latin characters are appended (e.g. Russian).
Measurements for both average time and memory allocation per benchmark method invocation are done for C2 and C1. [2]
As for other instances of CharSequence in JDK (excluding StringBu***er and String) we have those:
- jdk.internal.jline.extra.EditingHistory$PersistentLine
- jdk.internal.jline.extra.EditingHistory$NarrowingHistoryLine
- jdk.nashorn.internal.runtime.ConsString
- org.graalvm.compiler.phases.ClassTypeSequence
- org.graalvm.compiler.phases.LazyName and MethodDebugValueName
- com.sun.tools.javac.util.Name
All of them delegate the method, affected in my patch, either to String or another CharSequence which I'm sure is an instance of java.lang.String in 99 cases out of a hundred.
As for other questions of yours:
- indeed, there can be a case when a single char of a long String is appended to SB using SB::append(CharSequence, int, int).
Proposed patch will create a new String consisted of that character, so no memory would leak.
In this case I agree that performance effect of the patch would be negligible, I guess this is what you wanted to point out.
This case however seems to be less probable than the opposite case when we have more than 1 character.
- there also can be a case when many String instances are appended to the same SB.
Proposed patch won't affect other append calls, just SB::append(CharSequence, int, int).
I think this is useful optimisation as it provides better performance without changing lots of code and seems to be not breaking anything.
P. S. I've tried to modify the patch to make it call SB::append(String), see [3]. It's results seems to be even more interesting:
First let's look into C2 [4]. Here patched version again loses, but not dramatically.
But on C1 [5] patched version wins at least in avgt mode! The reason for memory loss is missing scalar replacement of allocated sub-string.
And again on C2 intrinsification doesn't work in patced version event when it calls SB::append(String) from StringBuilder itself.
Appendix
1. Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
public class StringBuilderAppendBenchmark {
@Benchmark
@SuppressWarnings("StringBufferReplaceableByString")
public String appendSubString(Data data) {
String englishStr = data.latinStr;
String nonLatinStr = data.nonLatinStr;
int beginIndex = data.beginIndex;
int endIndex = data.endIndex;
String substring = data.appendNonLatin ?
nonLatinStr.substring(beginIndex, endIndex) :
englishStr.substring(beginIndex, endIndex);
return new StringBuilder().append('L').append(substring).append(';').toString();
}
@Benchmark
public String appendBounds(Data data) {
String englishStr = data.latinStr;
String nonLatinStr = data.nonLatinStr;
int beginIndex = data.beginIndex;
int endIndex = data.endIndex;
String appended = data.appendNonLatin ? nonLatinStr : englishStr;
return new StringBuilder().append('L').append(appended, beginIndex, endIndex).append(';').toString();
}
@State(Scope.Thread)
public static class Data {
String latinStr;
String nonLatinStr;
@Param({"true", "false"})
boolean appendNonLatin;
@Param({"10", "100", "1000"})
private int length;
private int beginIndex;
private int endIndex;
private ThreadLocalRandom random = ThreadLocalRandom.current();
@Setup
public void setup() {
latinStr = randomString("abcdefghijklmnopqrstuvwxyz");
nonLatinStr = randomString("абвгдеёжзиклмнопрстуфхцчшщыьъэюя");
beginIndex = length / 4;
endIndex = length / 4 * 3;
}
private String randomString(String alphabet) {
char[] chars = alphabet.toCharArray();
StringBuilder sb = new StringBuilder(length);
for (int i = 0; i < length; i++) {
char c = chars[random.nextInt(chars.length)];
sb.append(c);
}
return sb.toString();
}
}
}
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2. Results for StringBuilder:
JDK 11.0.1, OpenJDK 64-Bit Server VM, 11.0.1+13-Ubuntu-3ubuntu116.04ppa1
Benchmark (appendNonLatin) (length) Mode Cnt Score Error Units
StringBuilderAppendBenchmark.appendBounds true 10 avgt 100 39,092 ± 0,558 ns/op
StringBuilderAppendBenchmark.appendBounds true 100 avgt 100 96,113 ± 0,739 ns/op
StringBuilderAppendBenchmark.appendBounds true 1000 avgt 100 653,716 ± 82,838 ns/op
StringBuilderAppendBenchmark.appendSubString true 10 avgt 100 27,052 ± 0,513 ns/op
StringBuilderAppendBenchmark.appendSubString true 100 avgt 100 46,154 ± 6,511 ns/op
StringBuilderAppendBenchmark.appendSubString true 1000 avgt 100 226,889 ± 5,857 ns/op
StringBuilderAppendBenchmark.appendBounds false 10 avgt 100 26,044 ± 0,291 ns/op
StringBuilderAppendBenchmark.appendBounds false 100 avgt 100 64,630 ± 0,565 ns/op
StringBuilderAppendBenchmark.appendBounds false 1000 avgt 100 314,778 ± 11,625 ns/op
StringBuilderAppendBenchmark.appendSubString false 10 avgt 100 22,899 ± 0,365 ns/op
StringBuilderAppendBenchmark.appendSubString false 100 avgt 100 25,075 ± 0,166 ns/op
StringBuilderAppendBenchmark.appendSubString false 1000 avgt 100 88,880 ± 0,258 ns/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 10 avgt 100 184,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 100 avgt 100 688,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 1000 avgt 100 5192,001 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm true 10 avgt 100 128,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm true 100 avgt 100 360,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm true 1000 avgt 100 2608,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 10 avgt 100 104,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 100 avgt 100 344,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 1000 avgt 100 2144,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm false 10 avgt 100 96,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm false 100 avgt 100 192,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm false 1000 avgt 100 1088,000 ± 0,001 B/op
-XX:TieredStopAtLevel=1
Benchmark (appendNonLatin) (length) Mode Cnt Score Error Units
StringBuilderAppendBenchmark.appendBounds true 10 avgt 100 99,823 ± 1,956 ns/op
StringBuilderAppendBenchmark.appendBounds true 100 avgt 100 395,615 ± 4,098 ns/op
StringBuilderAppendBenchmark.appendBounds true 1000 avgt 100 3198,072 ± 39,497 ns/op
StringBuilderAppendBenchmark.appendSubString true 10 avgt 100 94,530 ± 4,256 ns/op
StringBuilderAppendBenchmark.appendSubString true 100 avgt 100 148,147 ± 0,853 ns/op
StringBuilderAppendBenchmark.appendSubString true 1000 avgt 100 640,169 ± 1,821 ns/op
StringBuilderAppendBenchmark.appendBounds false 10 avgt 100 79,023 ± 14,429 ns/op
StringBuilderAppendBenchmark.appendBounds false 100 avgt 100 298,557 ± 9,347 ns/op
StringBuilderAppendBenchmark.appendBounds false 1000 avgt 100 2433,623 ± 32,966 ns/op
StringBuilderAppendBenchmark.appendSubString false 10 avgt 100 54,906 ± 0,112 ns/op
StringBuilderAppendBenchmark.appendSubString false 100 avgt 100 86,471 ± 0,453 ns/op
StringBuilderAppendBenchmark.appendSubString false 1000 avgt 100 251,550 ± 0,998 ns/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 10 avgt 100 184,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 100 avgt 100 688,001 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 1000 avgt 100 5192,005 ± 0,003 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm true 10 avgt 100 256,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm true 100 avgt 100 904,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm true 1000 avgt 100 6752,001 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 10 avgt 100 104,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 100 avgt 100 344,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 1000 avgt 100 2144,004 ± 0,003 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm false 10 avgt 100 152,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm false 100 avgt 100 440,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm false 1000 avgt 100 2688,000 ± 0,001 B/op
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
3. New patch
diff --git a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
--- a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
+++ b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
@@ -626,10 +626,7 @@
s = "null";
}
checkRange(start, end, s.length());
- int len = end - start;
- ensureCapacityInternal(count + len);
- appendChars(s, start, end);
- return this;
+ return append(s.subSequence(start, end).toString());
}
/**
@@ -1592,7 +1589,7 @@
* @param dstBegin the char index, not offset of byte[]
* @param coder the coder of dst[]
*/
- void getBytes(byte dst[], int dstBegin, byte coder) {
+ void getBytes(byte[] dst, int dstBegin, byte coder) {
if (this.coder == coder) {
System.arraycopy(value, 0, dst, dstBegin << coder, count << coder);
} else { // this.coder == LATIN && coder == UTF16
@@ -1686,29 +1683,8 @@
this.count = count + end - off;
}
- private final void appendChars(CharSequence s, int off, int end) {
- if (isLatin1()) {
- byte[] val = this.value;
- for (int i = off, j = count; i < end; i++) {
- char c = s.charAt(i);
- if (StringLatin1.canEncode(c)) {
- val[j++] = (byte)c;
- } else {
- count = j;
- inflate();
- StringUTF16.putCharsSB(this.value, j, s, i, end);
- count += end - i;
- return;
- }
- }
- } else {
- StringUTF16.putCharsSB(this.value, count, s, off, end);
- }
- count += end - off;
- }
-
/* IndexOutOfBoundsException, if out of bounds */
- private static void checkRange(int start, int end, int len) {
+ protected static void checkRange(int start, int end, int len) {
if (start < 0 || start > end || end > len) {
throw new IndexOutOfBoundsException(
"start " + start + ", end " + end + ", length " + len);
diff --git a/src/java.base/share/classes/java/lang/StringBuilder.java b/src/java.base/share/classes/java/lang/StringBuilder.java
--- a/src/java.base/share/classes/java/lang/StringBuilder.java
+++ b/src/java.base/share/classes/java/lang/StringBuilder.java
@@ -210,8 +210,11 @@
*/
@Override
public StringBuilder append(CharSequence s, int start, int end) {
- super.append(s, start, end);
- return this;
+ if (s == null) {
+ s = "null";
+ }
+ checkRange(start, end, s.length());
+ return append(s.subSequence(start, end).toString());
}
@Override
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
4. Comparison of original and patched versions for C2
Original JDK 11.0.1, OpenJDK 64-Bit Server VM, 11.0.1+13-Ubuntu-3ubuntu116.04ppa1
Benchmark (appendNonLatin) (length) Mode Cnt Score Error Units
StringBuilderAppendBenchmark.appendBounds true 10 avgt 50 48,047 ± 1,123 ns/op
StringBuilderAppendBenchmark.appendBounds true 100 avgt 50 147,212 ± 7,504 ns/op
StringBuilderAppendBenchmark.appendBounds true 1000 avgt 50 1042,751 ± 68,417 ns/op
StringBuilderAppendBenchmark.appendBounds false 10 avgt 50 22,323 ± 0,513 ns/op
StringBuilderAppendBenchmark.appendBounds false 100 avgt 50 74,883 ± 3,260 ns/op
StringBuilderAppendBenchmark.appendBounds false 1000 avgt 50 466,711 ± 25,859 ns/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 10 avgt 50 184,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 100 avgt 50 688,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 1000 avgt 50 5192,002 ± 0,002 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 10 avgt 50 80,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 100 avgt 50 320,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 1000 avgt 50 2144,001 ± 0,001 B/op
Patched JDK 11-internal, OpenJDK 64-Bit Server VM, 11-internal+0-adhoc.sergei.jdk11
Benchmark (appendNonLatin) (length) Mode Cnt Score Error Units
StringBuilderAppendBenchmark.appendBounds true 10 avgt 50 62,194 ± 1,955 ns/op
StringBuilderAppendBenchmark.appendBounds true 100 avgt 50 172,281 ± 1,865 ns/op
StringBuilderAppendBenchmark.appendBounds true 1000 avgt 50 1132,282 ± 8,697 ns/op
StringBuilderAppendBenchmark.appendBounds false 10 avgt 50 38,363 ± 0,815 ns/op
StringBuilderAppendBenchmark.appendBounds false 100 avgt 50 85,648 ± 1,087 ns/op
StringBuilderAppendBenchmark.appendBounds false 1000 avgt 50 505,787 ± 6,477 ns/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 10 avgt 50 256,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 100 avgt 50 904,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 1000 avgt 50 6752,002 ± 0,002 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 10 avgt 50 152,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 100 avgt 50 440,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 1000 avgt 50 2688,001 ± 0,001 B/op
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
5. Comparison of original and patched versions for C1
Original
Benchmark (appendNonLatin) (length) Mode Cnt Score Error Units
StringBuilderAppendBenchmark.appendBounds true 10 avgt 50 117,387 ± 1,886 ns/op
StringBuilderAppendBenchmark.appendBounds true 100 avgt 50 506,864 ± 10,523 ns/op
StringBuilderAppendBenchmark.appendBounds true 1000 avgt 50 4172,215 ± 62,144 ns/op
StringBuilderAppendBenchmark.appendBounds false 10 avgt 50 65,084 ± 1,547 ns/op
StringBuilderAppendBenchmark.appendBounds false 100 avgt 50 350,656 ± 5,061 ns/op
StringBuilderAppendBenchmark.appendBounds false 1000 avgt 50 2944,915 ± 43,500 ns/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 10 avgt 50 184,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 100 avgt 50 688,001 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 1000 avgt 50 5192,006 ± 0,007 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 10 avgt 50 104,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 100 avgt 50 344,001 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 1000 avgt 50 2144,004 ± 0,005 B/op
Patched
Benchmark (appendNonLatin) (length) Mode Cnt Score Error Units
StringBuilderAppendBenchmark.appendBounds true 10 avgt 50 109,806 ± 1,013 ns/op
StringBuilderAppendBenchmark.appendBounds true 100 avgt 50 212,775 ± 8,061 ns/op
StringBuilderAppendBenchmark.appendBounds true 1000 avgt 50 1308,307 ± 17,675 ns/op
StringBuilderAppendBenchmark.appendBounds false 10 avgt 50 69,315 ± 1,788 ns/op
StringBuilderAppendBenchmark.appendBounds false 100 avgt 50 122,495 ± 5,311 ns/op
StringBuilderAppendBenchmark.appendBounds false 1000 avgt 50 665,867 ± 31,839 ns/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 10 avgt 50 256,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 100 avgt 50 904,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm true 1000 avgt 50 6752,002 ± 0,002 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 10 avgt 50 152,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 100 avgt 50 440,000 ± 0,001 B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm false 1000 avgt 50 2688,001 ± 0,001 B/op
More information about the core-libs-dev
mailing list