[PATCH] Performance improvement of StringBuilder.append(CharSerquence, int, int)

Sergey Ponomarev stokito at gmail.com
Mon Jan 14 14:57:50 UTC 2019


Hi,

I recently used the append(str, beginIndex, endIndex) method for an
optimization an code simplification and I was wondered that performance
instead decreased. It turned out that the usual append(String) method is
intrinsic. I'm not a JVM expert but I guess that it should be easy to
implement such intrinsic for the  append(str, beginIndex, endIndex) and it
gives a good tool to make performance improvements.
But if even with the patch proposed by Sergey we'll get an improvement then
I'll ask you to accept the patch even if it allocates an additional object.
For those cases which you Tagir mentioned I'm pretty sure that performance
will be anyway better than now.


On Fri, 11 Jan 2019 at 17:06, Tagir Valeev <amaembo at gmail.com> wrote:

> Hello!
>
> Your benchmark tests only one of many possible cases. What if string
> contains non-latin1 characters? What if many strings are appended to the
> same string builder? What if you append only one character from the long
> string? Performance fixes in such widely used class could be introduced
> only after investigating all possible use scenarios. Also not only cpu
> time, but also memory allocations should be measured.
>
> With best regards,
> Tagir Valeev.
>
> пт, 11 янв. 2019 г., 2:35 Сергей Цыпанов sergei.tsypanov at yandex.ru:
>
> > Hi,
> >
> > while investigating an issue I've found out that performance of
> expression
> > (1) can be significantly improved by turning it into (2)
> >
> >
> >
> ----------------------------------------------------------------------------------------------------
> >
> >      //str is an instance of java.lang.String
> >
> > (1) return new StringBuilder().append(str, beginIndex,
> > endIndex).toString();
> >
> > ---->
> >
> > (2) String substring = str.substring(beginIndex, endIndex);
> >
> >      return new StringBuilder().append(substring).toString();
> >
> >
> >
> ----------------------------------------------------------------------------------------------------
> >
> > Using benchmark [1] I've got the following results:
> >
> > Benchmark
> > (length)  Mode  Cnt     Score     Error   Units
> > StringBuilderAppendBenchmark.appendBounds
> >       10  avgt   20    21,161 ±   0,545   ns/op
> > StringBuilderAppendBenchmark.appendBounds
> >      100  avgt   20    72,230 ±   3,452   ns/op
> > StringBuilderAppendBenchmark.appendBounds
> >     1000  avgt   20   476,256 ±  12,044   ns/op
> >
> > StringBuilderAppendBenchmark.appendSubString
> >        10  avgt   20    26,023 ±   0,871   ns/op
> > StringBuilderAppendBenchmark.appendSubString
> >       100  avgt   20    38,824 ±   1,185   ns/op
> > StringBuilderAppendBenchmark.appendSubString
> >      1000  avgt   20   214,072 ±   3,372   ns/op
> >
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >       10  avgt   20    80,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >      100  avgt   20   320,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >     1000  avgt   20  2120,001 ±   0,001    B/op
> >
> > StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm
> >        10  avgt   20    96,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm
> >       100  avgt   20   192,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendSubString:·gc.alloc.rate.norm
> >      1000  avgt   20  1088,000 ±   0,001    B/op
> >
> >
> > The reason for such a difference seems to be @HotSpotIntrinsicCandidate
> > over StringBuilder.append(String),
> > while StringBuilder.append(CharSerquence, int, int) copies chars/bytes
> one
> > by one.
> >
> > I've prepared a patch [2] and applied it to fresh sources of JDK 11.
> > After rebuild the same benchmark demonstrated severe performance
> > degradation:
> >
> > - before patch
> >
> > Benchmark
> >  (length)  Mode  Cnt     Score     Error   Units
> > StringBuilderAppendBenchmark.appendBounds
> >    10  avgt   20    21,592 ±   0,763   ns/op
> > StringBuilderAppendBenchmark.appendBounds
> >   100  avgt   20    72,241 ±   2,799   ns/op
> > StringBuilderAppendBenchmark.appendBounds
> >  1000  avgt   20   488,106 ±  43,042   ns/op
> >
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >    10  avgt   20    80,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >   100  avgt   20   320,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >  1000  avgt   20  2132,001 ±  10,691    B/op
> >
> > - after patch
> >
> > Benchmark
> >  (length)  Mode  Cnt     Score     Error   Units
> > StringBuilderAppendBenchmark.appendBounds
> >    10  avgt   20    40,186 ±   1,828   ns/op
> > StringBuilderAppendBenchmark.appendBounds
> >   100  avgt   20    89,031 ±   6,664   ns/op
> > StringBuilderAppendBenchmark.appendBounds
> >  1000  avgt   20   669,934 ±  76,888   ns/op
> >
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >    10  avgt   20   152,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >   100  avgt   20   440,000 ±   0,001    B/op
> > StringBuilderAppendBenchmark.appendBounds:·gc.alloc.rate.norm
> >  1000  avgt   20  2688,001 ±   0,002    B/op
> >
> > As I see the reason of degradation is that patched code now dodges
> > intrinsified StringBuilder.append(String) and calls directly
> > AbstractStringBuilder.append(String) which in spite of utilizing
> > System::arraycopy performes worse than intrinsic.
> >
> > So my question is should I rework the patch somehow to make it perform
> > better or I need to apply the transformation mentioned above ad hoc?
> >
> >
> > Appendix:
> >
> > 1) benchmark
> >
> > @BenchmarkMode(Mode.AverageTime)
> > @OutputTimeUnit(TimeUnit.NANOSECONDS)
> > @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
> > public class StringBuilderAppendBenchmark {
> >
> >   @Benchmark
> >   @SuppressWarnings("StringBufferReplaceableByString")
> >   public String appendSubString(Data data) {
> >     String str = data.str;
> >     int beginIndex = data.beginIndex;
> >     int endIndex = data.endIndex;
> >
> >     String substring = str.substring(beginIndex, endIndex);
> >
> >     return new
> > StringBuilder().append('L').append(substring).append(';').toString();
> >   }
> >
> >   @Benchmark
> >   public String appendBounds(Data data) {
> >     String str = data.str;
> >     int beginIndex = data.beginIndex;
> >     int endIndex = data.endIndex;
> >
> >     return new StringBuilder().append('L').append(str, beginIndex,
> > endIndex).append(';').toString();
> >   }
> >
> >   @State(Scope.Thread)
> >   public static class Data {
> >     String str;
> >
> >     @Param({"10", "100", "1000"})
> >     private int length;
> >
> >     private int beginIndex;
> >     private int endIndex;
> >
> >     @Setup
> >     public void setup() {
> >       str = randomString();
> >       beginIndex = length / 4;
> >       endIndex = length / 4* 3;
> >     }
> >
> >     private String randomString() {
> >       ThreadLocalRandom random = ThreadLocalRandom.current();
> >       char[] chars = "abcdefghijklmnopqrstuvwxyz".toCharArray();
> >
> >       StringBuilder sb = new StringBuilder(length);
> >       for (int i = 0; i < length; i++) {
> >         char c = chars[random.nextInt(chars.length)];
> >         sb.append(c);
> >       }
> >       return sb.toString();
> >     }
> >   }
> >
> > }
> >
> >
> > 2) patch
> >
> > diff --git
> > a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
> > b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
> > --- a/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
> > +++ b/src/java.base/share/classes/java/lang/AbstractStringBuilder.java
> > @@ -626,10 +626,7 @@
> >              s = "null";
> >          }
> >          checkRange(start, end, s.length());
> > -        int len = end - start;
> > -        ensureCapacityInternal(count + len);
> > -        appendChars(s, start, end);
> > -        return this;
> > +        return append(s.subSequence(start, end).toString());
> >      }
> >
> >      /**
> > @@ -1686,27 +1683,6 @@
> >          this.count = count + end - off;
> >      }
> >
> > -    private final void appendChars(CharSequence s, int off, int end) {
> > -        if (isLatin1()) {
> > -            byte[] val = this.value;
> > -            for (int i = off, j = count; i < end; i++) {
> > -                char c = s.charAt(i);
> > -                if (StringLatin1.canEncode(c)) {
> > -                    val[j++] = (byte)c;
> > -                } else {
> > -                    count = j;
> > -                    inflate();
> > -                    StringUTF16.putCharsSB(this.value, j, s, i, end);
> > -                    count += end - i;
> > -                    return;
> > -                }
> > -            }
> > -        } else {
> > -            StringUTF16.putCharsSB(this.value, count, s, off, end);
> > -        }
> > -        count += end - off;
> > -    }
> > -
> >      /* IndexOutOfBoundsException, if out of bounds */
> >      private static void checkRange(int start, int end, int len) {
> >          if (start < 0 || start > end || end > len) {
> >
> >
> >
> >
>


-- 
Sergey Ponomarev <https://linkedin.com/in/stokito>, skype:stokito


More information about the core-libs-dev mailing list