RFR: 8254082: AbstractStringBuilder.insert(int dstOffset, CharSequence s, int start, int end) is missing fast-path for String [v4]

Sat Nov 28 14:12:57 UTC 2020

On Sat, 28 Nov 2020 14:00:10 GMT, Сергей Цыпанов <github.com+10835776+stsypanov at openjdk.org> wrote:

>> Original mail: https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/069197.html
>> 
>> Hello,
>> 
>> while working with `StringBuilder.insert()` I've spotted that its delegate `AbstractStringBuilder.insert()` is missing
>> a fast-path for the most frequent case when its argument is `String`.
>> 
>> Previously they did similart optimization for `StirngBuilder.append(CharSequence, int, int)`,
>> see https://bugs.openjdk.java.net/browse/JDK-8224986
>> 
>> I'd like to contribute a trivial patch that brings improvement for the case when SB's content is Latin1
>> and inserted String is Latin1 as well.
>> 
>> To measure improvement I've used simple benchmark:
>> @BenchmarkMode(Mode.AverageTime)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
>> public class StringBuilderInsertBenchmark {
>> 
>>   @Benchmark
>>   public StringBuilder insert(Data data) {
>>     String string = data.string;
>>     return new StringBuilder().append("ABC").insert(1, string, 1, data.length + 1);
>>   }
>> 
>>   @State(Scope.Thread)
>>   public static class Data {
>>     String string;
>> 
>>     @Param({"true", "false"})
>>     private boolean latin;
>> 
>>     @Param({"8", "64", "128", "1024"})
>>     private int length;
>> 
>>     @Setup
>>     public void setup() {
>>       String alphabet = latin
>>         ? "abcdefghijklmnopqrstuvwxyz"        // English
>>         : "абвгдеёжзиклмнопрстуфхцчшщьыъэюя"; // Russian
>> 
>>       string = new RandomStringGenerator().randomString(alphabet, length + 2);
>>     }
>>   }
>> }
>> 
>> public final class RandomStringGenerator {
>> 
>>   public String randomString(String alphabet, int length) {
>>     char[] chars = alphabet.toCharArray();
>> 
>>     ThreadLocalRandom random = ThreadLocalRandom.current();
>> 
>>     char[] array = new char[length];
>>     for (int i = 0; i < length; i++) {
>>       array[i] = chars[random.nextInt(chars.length)];
>>     }
>> 
>>     return new String(array);
>>   }
>> }
>> Which gives
>> 
>>             (latin)  (length)       original       patched    Units
>> insert         true         8    24.2 ±  0.1   22.2 ±  0.0    ns/op
>> insert         true        64    53.8 ±  0.2   36.1 ±  0.1    ns/op
>> insert         true       128    80.9 ±  0.2   44.6 ±  0.0    ns/op
>> insert         true      1024   365.4 ±  0.5  109.8 ±  3.9    ns/op
>> 
>> insert        false         8    33.5 ±  0.5   32.3 ±  0.2    ns/op
>> insert        false        64    73.2 ±  0.3   73.2 ±  0.2    ns/op
>> insert        false       128   103.9 ±  0.6  103.3 ±  0.1    ns/op
>> insert        false      1024   576.5 ±  4.8  569.5 ±  2.0    ns/op
>> Patch is attached. As of tests tier1 and tier2 are ok.
>> 
>> With best regards,
>> Sergey Tsypanov
>
> Сергей Цыпанов has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8254082: Consolidate putStringAt() methods: make length represent char count

Looks good to me. 

Theoretically the refactored `getBytes(byte[], int, byte)` should have no cost, but could cause some issues in some microbenchmarks that touches this. You mentioned this was used by StringConcatHelper, so could you run the org.openjdk.bench.java.lang.StringConcat micros to check that there are no regressions?

-------------

Marked as reviewed by redestad (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/402