RFR: 8224986: (str) optimize StringBuilder.append(CharSequence, int, int) for String arguments

Wed May 29 20:20:22 UTC 2019

Hi Ivan,

interesting suggestion. I should try it out.

Also keep in mind that inlining decisions are sensitive to call depth,
and that my current patch peels off String in such a way that
appendChars(CharSequence, int, int) will never see a String argument, 
which... Argh... :-)

/Claes

On 2019-05-29 21:26, Ivan Gerasimov wrote:
> Hi Claes!
> 
> It looks like for the cases when this.isLatin1() != s.isLatin1() the 
> code is essentially the same as in appendChars(CharSequence, int, int).
> So, to save some byte codes (which can actually help with inlining), it 
> might have been written as:
> 
> private final void appendChars(String s, int off, int end) {
>      if (isLatin1() ^ s.isLatin1()) {
>          appendChars((CharSequence)s, off, end);
>      } else {
>          System.arraycopy(s.value(), off << coder, this.value,
>                  this.count << coder, (end - off) << coder);
>          count += end - off;
>      }
> }
> 
> With kind regards,
> Ivan
> 
> On 5/29/19 8:36 AM, Claes Redestad wrote:
>> Hi,
>>
>> it's been pointed out[1] that append(string.substring(start, end)) can
>> outperform append(string, start, end) in some circumstances. In part due
>> to the former being intrinsified, but also, I think, due to it tickling some
>> string concat optimizations that allow for eliding the StringBuilder
>> itself completely when there's only one non-constant argument.
>>
>> While not ruling out intrinsification of append(String, int, int), I was
>> able to achieve a reasonable speed-up by peeling off String arguments
>> to append(CharSequence, int, int) and dropping into a routine that uses
>> System.arraycopy when appropriate.
>>
>> This seems like a reasonable improvement in the interim, and an
>> opportunity to incorporate the relevant microbenchmarks.
>>
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8224986
>> Webrev: http://cr.openjdk.java.net/~redestad/8224986/open.00/
>>
>> I extended and incorporated the microbenchmark provided in [1] to also 
>> include variants with UTF16 Strings and mixes of arguments, ensuring
>> speedups on all appendBounds* variants[2].
>>
>> Thanks!
>>
>> /Claes
>>
>> [1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/057834.html
>>
>> [2]
>> Baseline:
>>
>> Benchmark            (length)  Mode  Cnt     Score     Error Units
>> appendBounds             1000  avgt   10   576.939 ±  21.125 ns/op
>>  ·gc.alloc.rate.norm     1000  avgt   10  2144.000 ±   0.001 B/op
>> appendBoundsMix          1000  avgt   10   855.308 ±  27.071 ns/op
>>  ·gc.alloc.rate.norm     1000  avgt   10  3160.000 ±   0.001 B/op
>> appendBoundsUtf16        1000  avgt   10  1424.518 ±  33.319 ns/op
>>  ·gc.alloc.rate.norm     1000  avgt   10  5192.000 ±   0.001 B/op
>>
>> Patch:
>>
>> Benchmark            (length)  Mode  Cnt     Score     Error Units
>> appendBounds             1000  avgt   10   466.640 ±  15.069 ns/op
>>  ·gc.alloc.rate.norm     1000  avgt   10  2120.000 ±   0.001 B/op
>> appendBoundsMix          1000  avgt   10   758.886 ±  22.983 ns/op
>>  ·gc.alloc.rate.norm     1000  avgt   10  3160.000 ±   0.001 B/op
>> appendBoundsUtf16        1000  avgt   10  1320.360 ±  49.097 ns/op
>>  ·gc.alloc.rate.norm     1000  avgt   10  5192.000 ±   0.001 B/op
>>
>
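
For readers following the thread, the "peel off String arguments" idea can be sketched outside the JDK roughly as follows. This is only an illustration under stated assumptions, not the code in the webrev: PeelingBuilder is a hypothetical stand-in for AbstractStringBuilder, it stores its contents in a plain char[], and it ignores the Latin-1/UTF-16 coder distinction that the real patch and Ivan's suggestion have to handle. The String branch uses the public String.getChars bulk copy, because String.value() and coder are not accessible outside java.lang.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical, simplified stand-in for AbstractStringBuilder (not JDK code).
// Backed by a char[]; the Latin-1/UTF-16 coder handling is deliberately omitted.
final class PeelingBuilder {
    private char[] value = new char[16];
    private int count;

    PeelingBuilder append(CharSequence s, int start, int end) {
        if (s == null) {
            s = "null"; // same convention as StringBuilder.append
        }
        // Throws IndexOutOfBoundsException for an invalid [start, end) range.
        Objects.checkFromToIndex(start, end, s.length());
        int len = end - start;
        ensureCapacity(count + len);
        if (s instanceof String) {
            // Peeled-off case: one bulk copy instead of a charAt call per char.
            ((String) s).getChars(start, end, value, count);
        } else {
            // Generic CharSequence case: copy one char at a time.
            for (int i = start, j = count; i < end; i++, j++) {
                value[j] = s.charAt(i);
            }
        }
        count += len;
        return this;
    }

    // Grow the backing array (at least doubling) when it is too small.
    private void ensureCapacity(int minCapacity) {
        if (minCapacity > value.length) {
            value = Arrays.copyOf(value, Math.max(minCapacity, value.length * 2));
        }
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}
```

With this sketch, new PeelingBuilder().append("hello world", 6, 11).toString() yields "world" via the String branch, while passing a StringBuilder as the argument takes the char-by-char branch. Whether such a split pays off in the real class depends on inlining and bytecode size, as discussed above, so it would need to be measured with the microbenchmarks rather than assumed.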