RFR: 8265237: String.join and StringJoiner can be improved further [v3]
Peter Levart
plevart at openjdk.java.net
Sun Apr 18 20:19:47 UTC 2021
On Sun, 18 Apr 2021 19:55:20 GMT, Peter Levart <plevart at openjdk.org> wrote:
>> While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation.
>> Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package.
>> But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner.
>> This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
>>
>> https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
>>
>> The comparative results are here:
>>
>> https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
>>
>> The jmh-result.json files are here:
>>
>> https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
>>
>> Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
>>
>> So WDYT?
>
> Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix overflow checking logic, add test for it, avoid branch in loop, add 1-char strings to JHM test
So here's another iteration where I have taken into account all the suggestions:
- avoid the branch in the append loop by appending the 1st element outside the loop so that delimiter can unconditionally be appended inside the loop for rest of the elements.
- fixed overflow checking logic + added a test for it - previously it was possible to provoke wrong exception: IllegalArgumentException: size is negative. I added another test instead of extending existing one because it needs a 1GiB string and together with 2GiB string in existing test it would not fit in 4GiB heap any more when running with unpatched version. I wanted the tests to pass with unpatched version too.
- extended the parameters of JMH benchmark to include 1-character strings + re-run the benchmarks
The results are not that different from previous version (that used a conditional in the loop) as JIT is probably smart enough to unroll the loop and divide it into at least two parts (the way I did it manually now). What brings the most improvement in this version is the annotation to force in-lining of the String.join package-private method. Without it, it would happen that for one combination of JMH parameters, the results were a little worse compared to unpatched baseline - not for much: about 7%, but with the annotation, all combinations of parameters bring consistent improvement in range from 12% to 102%:
https://jmh.morethan.io/?gist=babf00a655d21ff784ede887af53ec31
-------------
PR: https://git.openjdk.java.net/jdk/pull/3501
More information about the core-libs-dev
mailing list