[PATCH] Performance improvement of StringBuilder.append(CharSerquence, int, int)

Сергей Цыпанов sergei.tsypanov at yandex.ru
Mon Jan 14 21:22:43 UTC 2019


Hello Tagir,

I've reworked the benchmark [1] to test also the case when non-latin characters are appended (e.g. Russian).

Measurements for both average time and memory allocation per benchmark method invocation are done for C2 and C1. [2]

As for other instances of CharSequence in JDK (excluding StringBu***er and String) we have those: 

- jdk.internal.jline.extra.EditingHistory$PersistentLine
- jdk.internal.jline.extra.EditingHistory$NarrowingHistoryLine
- jdk.nashorn.internal.runtime.ConsString
- org.graalvm.compiler.phases.ClassTypeSequence
- org.graalvm.compiler.phases.LazyName and MethodDebugValueName
- com.sun.tools.javac.util.Name

All of them delegate the method, affected in my patch, either to String or another CharSequence which I'm sure is an instance of java.lang.String in 99 cases out of a hundred.

As for other questions of yours:

- indeed, there can be a case when a single char of a long String is appended to SB using SB::append(CharSequence, int, int).
Proposed patch will create a new String consisted of that character, so no memory would leak.
In this case I agree that performance effect of the patch would be negligible, I guess this is what you wanted to point out.
This case however seems to be less probable than the opposite case when we have more than 1 character.

- there also can be a case when many String instances are appended to the same SB.
Proposed patch won't affect other append calls, just SB::append(CharSequence, int, int).

I think this is useful optimisation as it provides better performance without changing lots of code and seems to be not breaking anything.

P. S. I've tried to modify the patch to make it call SB::append(String), see [3]. It's results seems to be even more interesting:

First let's look into C2 [4]. Here patched version again loses, but not dramatically.
But on C1 [5] patched version wins at least in avgt mode! The reason for memory loss is missing scalar replacement of allocated sub-string.

And again on C2 intrinsification doesn't work in patced version event when it calls SB::append(String) from StringBuilder itself.

Appendix

1. Benchmark

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
public class StringBuilderAppendBenchmark {

  @Benchmark
  @SuppressWarnings("StringBufferReplaceableByString")
  public String appendSubString(Data data) {
    String englishStr = data.latinStr;
    String nonLatinStr = data.nonLatinStr;
    int beginIndex = data.beginIndex;
    int endIndex = data.endIndex;

    String substring = data.appendNonLatin ?
            nonLatinStr.substring(beginIndex, endIndex) :
            englishStr.substring(beginIndex, endIndex);

    return new StringBuilder().append('L').append(substring).append(';').toString();
  }

  @Benchmark
  public String appendBounds(Data data) {
    String englishStr = data.latinStr;
    String nonLatinStr = data.nonLatinStr;
    int beginIndex = data.beginIndex;
    int endIndex = data.endIndex;

    String appended = data.appendNonLatin ? nonLatinStr : englishStr;

    return new StringBuilder().append('L').append(appended, beginIndex, endIndex).append(';').toString();
  }

  @State(Scope.Thread)
  public static class Data {
    String latinStr;
    String nonLatinStr;

    @Param({"true", "false"})
    boolean appendNonLatin;

    @Param({"10", "100", "1000"})
    private int length;

    private int beginIndex;
    private int endIndex;

    private ThreadLocalRandom random = ThreadLocalRandom.current();

    @Setup
    public void setup() {
      latinStr = randomString("abcdefghijklmnopqrstuvwxyz");
      nonLatinStr = randomString("абвгдеёжзиклмнопрстуфхцчшщыьъэюя");
      beginIndex = length / 4;
      endIndex = length / 4 * 3;
    }

    private String randomString(String alphabet) {
      char[] chars = alphabet.toCharArray();

      StringBuilder sb = new StringBuilder(length);
      for (int i = 0; i < length; i++) {
        char c = chars[random.nextInt(chars.length)];
        sb.append(c);
      }

      return sb.toString();
    }
  }

}

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

2. Results for StringBuilder:

JDK 11.0.1, OpenJDK 64-Bit Server VM, 11.0.1+13-Ubuntu-3ubuntu116.04ppa1

Benchmark                                                          (appendNonLatin)  (length)  Mode  Cnt     Score     Error   Units
StringBuilderAppendBenchmark.appendBounds                                      true        10  avgt  100    39,092 ±   0,558   ns/op
StringBuilderAppendBenchmark.appendBounds                                      true       100  avgt  100    96,113 ±   0,739   ns/op
StringBuilderAppendBenchmark.appendBounds                                      true      1000  avgt  100   653,716 ±  82,838   ns/op

StringBuilderAppendBenchmark.appendSubString                                   true        10  avgt  100    27,052 ±   0,513   ns/op
StringBuilderAppendBenchmark.appendSubString                                   true       100  avgt  100    46,154 ±   6,511   ns/op
StringBuilderAppendBenchmark.appendSubString                                   true      1000  avgt  100   226,889 ±   5,857   ns/op

StringBuilderAppendBenchmark.appendBounds                                     false        10  avgt  100    26,044 ±   0,291   ns/op
StringBuilderAppendBenchmark.appendBounds                                     false       100  avgt  100    64,630 ±   0,565   ns/op
StringBuilderAppendBenchmark.appendBounds                                     false      1000  avgt  100   314,778 ±  11,625   ns/op

StringBuilderAppendBenchmark.appendSubString                                  false        10  avgt  100    22,899 ±   0,365   ns/op
StringBuilderAppendBenchmark.appendSubString                                  false       100  avgt  100    25,075 ±   0,166   ns/op
StringBuilderAppendBenchmark.appendSubString                                  false      1000  avgt  100    88,880 ±   0,258   ns/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                  true        10  avgt  100   184,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                  true       100  avgt  100   688,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                  true      1000  avgt  100  5192,001 ±   0,001    B/op

StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm               true        10  avgt  100   128,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm               true       100  avgt  100   360,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm               true      1000  avgt  100  2608,000 ±   0,001    B/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                 false        10  avgt  100   104,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                 false       100  avgt  100   344,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                 false      1000  avgt  100  2144,000 ±   0,001    B/op

StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm              false        10  avgt  100    96,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm              false       100  avgt  100   192,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm              false      1000  avgt  100  1088,000 ±   0,001    B/op


-XX:TieredStopAtLevel=1

Benchmark                                                          (appendNonLatin)  (length)  Mode  Cnt     Score     Error   Units
StringBuilderAppendBenchmark.appendBounds                                      true        10  avgt  100    99,823 ±   1,956   ns/op
StringBuilderAppendBenchmark.appendBounds                                      true       100  avgt  100   395,615 ±   4,098   ns/op
StringBuilderAppendBenchmark.appendBounds                                      true      1000  avgt  100  3198,072 ±  39,497   ns/op

StringBuilderAppendBenchmark.appendSubString                                   true        10  avgt  100    94,530 ±   4,256   ns/op
StringBuilderAppendBenchmark.appendSubString                                   true       100  avgt  100   148,147 ±   0,853   ns/op
StringBuilderAppendBenchmark.appendSubString                                   true      1000  avgt  100   640,169 ±   1,821   ns/op

StringBuilderAppendBenchmark.appendBounds                                     false        10  avgt  100    79,023 ±  14,429   ns/op
StringBuilderAppendBenchmark.appendBounds                                     false       100  avgt  100   298,557 ±   9,347   ns/op
StringBuilderAppendBenchmark.appendBounds                                     false      1000  avgt  100  2433,623 ±  32,966   ns/op

StringBuilderAppendBenchmark.appendSubString                                  false        10  avgt  100    54,906 ±   0,112   ns/op
StringBuilderAppendBenchmark.appendSubString                                  false       100  avgt  100    86,471 ±   0,453   ns/op
StringBuilderAppendBenchmark.appendSubString                                  false      1000  avgt  100   251,550 ±   0,998   ns/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                  true        10  avgt  100   184,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                  true       100  avgt  100   688,001 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                  true      1000  avgt  100  5192,005 ±   0,003    B/op

StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm               true        10  avgt  100   256,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm               true       100  avgt  100   904,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm               true      1000  avgt  100  6752,001 ±   0,001    B/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                 false        10  avgt  100   104,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                 false       100  avgt  100   344,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                 false      1000  avgt  100  2144,004 ±   0,003    B/op

StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm              false        10  avgt  100   152,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm              false       100  avgt  100   440,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm              false      1000  avgt  100  2688,000 ±   0,001    B/op

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

3. New patch

diff --git a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
--- a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
+++ b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
@@ -626,10 +626,7 @@
             s = "null";
         }
         checkRange(start, end, s.length());
-        int len = end - start;
-        ensureCapacityInternal(count + len);
-        appendChars(s, start, end);
-        return this;
+        return append(s.subSequence(start, end).toString());
     }
 
     /**
@@ -1592,7 +1589,7 @@
      * @param dstBegin  the char index, not offset of byte[]
      * @param coder     the coder of dst[]
      */
-    void getBytes(byte dst[], int dstBegin, byte coder) {
+    void getBytes(byte[] dst, int dstBegin, byte coder) {
         if (this.coder == coder) {
             System.arraycopy(value, 0, dst, dstBegin << coder, count << coder);
         } else {        // this.coder == LATIN && coder == UTF16
@@ -1686,29 +1683,8 @@
         this.count = count + end - off;
     }
 
-    private final void appendChars(CharSequence s, int off, int end) {
-        if (isLatin1()) {
-            byte[] val = this.value;
-            for (int i = off, j = count; i < end; i++) {
-                char c = s.charAt(i);
-                if (StringLatin1.canEncode(c)) {
-                    val[j++] = (byte)c;
-                } else {
-                    count = j;
-                    inflate();
-                    StringUTF16.putCharsSB(this.value, j, s, i, end);
-                    count += end - i;
-                    return;
-                }
-            }
-        } else {
-            StringUTF16.putCharsSB(this.value, count, s, off, end);
-        }
-        count += end - off;
-    }
-
     /* IndexOutOfBoundsException, if out of bounds */
-    private static void checkRange(int start, int end, int len) {
+    protected static void checkRange(int start, int end, int len) {
         if (start < 0 || start > end || end > len) {
             throw new IndexOutOfBoundsException(
                 "start " + start + ", end " + end + ", length " + len);
diff --git a/src/java.base/share/classes/java/lang/StringBuilder.java b/src/java.base/share/classes/java/lang/StringBuilder.java
--- a/src/java.base/share/classes/java/lang/StringBuilder.java
+++ b/src/java.base/share/classes/java/lang/StringBuilder.java
@@ -210,8 +210,11 @@
      */
     @Override
     public StringBuilder append(CharSequence s, int start, int end) {
-        super.append(s, start, end);
-        return this;
+        if (s == null) {
+            s = "null";
+        }
+        checkRange(start, end, s.length());
+        return append(s.subSequence(start, end).toString());
     }
 
     @Override

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

4. Comparison of original and patched versions for C2

Original JDK 11.0.1, OpenJDK 64-Bit Server VM, 11.0.1+13-Ubuntu-3ubuntu116.04ppa1

Benchmark                                                               (appendNonLatin)  (length)  Mode  Cnt     Score     Error   Units
StringBuilderAppendBenchmark.appendBounds                                           true        10  avgt   50    48,047 ±   1,123   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true       100  avgt   50   147,212 ±   7,504   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true      1000  avgt   50  1042,751 ±  68,417   ns/op

StringBuilderAppendBenchmark.appendBounds                                          false        10  avgt   50    22,323 ±   0,513   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false       100  avgt   50    74,883 ±   3,260   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false      1000  avgt   50   466,711 ±  25,859   ns/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true        10  avgt   50   184,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true       100  avgt   50   688,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true      1000  avgt   50  5192,002 ±   0,002    B/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false        10  avgt   50    80,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false       100  avgt   50   320,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false      1000  avgt   50  2144,001 ±   0,001    B/op


Patched JDK 11-internal, OpenJDK 64-Bit Server VM, 11-internal+0-adhoc.sergei.jdk11

Benchmark                                                               (appendNonLatin)  (length)  Mode  Cnt     Score     Error   Units
StringBuilderAppendBenchmark.appendBounds                                           true        10  avgt   50    62,194 ±   1,955   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true       100  avgt   50   172,281 ±   1,865   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true      1000  avgt   50  1132,282 ±   8,697   ns/op

StringBuilderAppendBenchmark.appendBounds                                          false        10  avgt   50    38,363 ±   0,815   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false       100  avgt   50    85,648 ±   1,087   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false      1000  avgt   50   505,787 ±   6,477   ns/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true        10  avgt   50   256,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true       100  avgt   50   904,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true      1000  avgt   50  6752,002 ±   0,002    B/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false        10  avgt   50   152,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false       100  avgt   50   440,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false      1000  avgt   50  2688,001 ±   0,001    B/op

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

5. Comparison of original and patched versions for C1

Original

Benchmark                                                               (appendNonLatin)  (length)  Mode  Cnt     Score     Error   Units
StringBuilderAppendBenchmark.appendBounds                                           true        10  avgt   50   117,387 ±   1,886   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true       100  avgt   50   506,864 ±  10,523   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true      1000  avgt   50  4172,215 ±  62,144   ns/op

StringBuilderAppendBenchmark.appendBounds                                          false        10  avgt   50    65,084 ±   1,547   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false       100  avgt   50   350,656 ±   5,061   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false      1000  avgt   50  2944,915 ±  43,500   ns/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true        10  avgt   50   184,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true       100  avgt   50   688,001 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true      1000  avgt   50  5192,006 ±   0,007    B/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false        10  avgt   50   104,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false       100  avgt   50   344,001 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false      1000  avgt   50  2144,004 ±   0,005    B/op

Patched

Benchmark                                                               (appendNonLatin)  (length)  Mode  Cnt     Score     Error   Units
StringBuilderAppendBenchmark.appendBounds                                           true        10  avgt   50   109,806 ±   1,013   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true       100  avgt   50   212,775 ±   8,061   ns/op
StringBuilderAppendBenchmark.appendBounds                                           true      1000  avgt   50  1308,307 ±  17,675   ns/op

StringBuilderAppendBenchmark.appendBounds                                          false        10  avgt   50    69,315 ±   1,788   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false       100  avgt   50   122,495 ±   5,311   ns/op
StringBuilderAppendBenchmark.appendBounds                                          false      1000  avgt   50   665,867 ±  31,839   ns/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true        10  avgt   50   256,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true       100  avgt   50   904,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                       true      1000  avgt   50  6752,002 ±   0,002    B/op

StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false        10  avgt   50   152,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false       100  avgt   50   440,000 ±   0,001    B/op
StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm                      false      1000  avgt   50  2688,001 ±   0,001    B/op


More information about the core-libs-dev mailing list