RFR: 8265237: String.join and StringJoiner can be improved further
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark: https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce The comparative results are here: https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15 The jmh-result.json files are here: https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15 Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation. So WDYT? ------------- Commit messages: - Make JavaLangAccess JLA field private static final - Alternative String.join Changes: https://git.openjdk.java.net/jdk/pull/3501/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3501&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8265237 Stats: 97 lines in 4 files changed: 61 ins; 18 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/3501.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3501/head:pull/3501 PR: https://git.openjdk.java.net/jdk/pull/3501
On Wed, 14 Apr 2021 18:58:57 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Some background: This change was motivated by Sergey Tsypanov's attempt to replace usage of StringBuilder with string concatenation in JDK-8265075 only to find out that string concatenation in java.base module is compiled down to inline usage of StringBuilder, so no improvement was possible. StringJoiner API and String.join static utility methods lend itself to a better implementation, but in last incarnation they are implemented with StringBuilder internally: String.join -> StringJoiner -> StringBuilder. A lot of JDK internal usages of StringBuilder were already replaced with StringJoiner (JDK-8054714) but under the hood the StringBuilder is still used. There were also lots of incremental attempts to improve StringJoiner. I think this one is the last one for some time to come... :-) ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Wed, 14 Apr 2021 18:58:57 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
src/java.base/share/classes/java/lang/String.java line 3230:
3228: 3229: /** 3230: * Designated join routine.
Did you mean "dedicated"? ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Wed, 14 Apr 2021 19:54:26 GMT, Florent Guillaume <github.com+592810+efge@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
src/java.base/share/classes/java/lang/String.java line 3230:
3228: 3229: /** 3230: * Designated join routine.
Did you mean "dedicated"?
No, I meant designated. It is the routine that all other public API entry points call at the end to do the job. Would some other word more accurately describe that? I definitely didn't mean "dedicated". ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Wed, 14 Apr 2021 22:23:57 GMT, Peter Levart <plevart@openjdk.org> wrote:
src/java.base/share/classes/java/lang/String.java line 3230:
3228: 3229: /** 3230: * Designated join routine.
Did you mean "dedicated"?
No, I meant designated. It is the routine that all other public API entry points call at the end to do the job. Would some other word more accurately describe that? I definitely didn't mean "dedicated".
Oh then sorry, I thought it was a typo of some sort. I'd have said something like "Centralized join logic". But whatever works for you. ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Wed, 14 Apr 2021 18:58:57 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
I think this seems reasonable. The String encoding details doesn't leak outside of java.lang, and numbers do look convincing. There's a StringJoinerBenchmark micro added by JDK-8148937 which could perhaps be expanded with the scenarios you've experimented with here? ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Wed, 14 Apr 2021 20:03:27 GMT, Claes Redestad <redestad@openjdk.org> wrote:
There's a StringJoinerBenchmark micro added by JDK-8148937 which could perhaps be expanded with the scenarios you've experimented with here?
I modified that micro benchmark and added a method to also measure String.join static method along with StringJoiner for same parameters and extended the range of parameters to cover more diversity. The results are here: https://jmh.morethan.io/?gist=c38cc13d63774ec505cc8d394c00d502 It is apparent that there is a huge speedup when strings are bigger. But even smaller strings get a substantial speedup. There's also substantial reduction of garbage per operation. Previously the garbage amounted to the internal array of String elements and the StringBuffer with its internal byte[] array of characters. Now only the array of elements is the garbage. ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision: Add String.join benchmark method to StringJoinerBenchmark and adjust some parameters to cover bigger range ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3501/files - new: https://git.openjdk.java.net/jdk/pull/3501/files/62b577fd..6160e5aa Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3501&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3501&range=00-01 Stats: 11 lines in 1 file changed: 8 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/3501.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3501/head:pull/3501 PR: https://git.openjdk.java.net/jdk/pull/3501
On Thu, 15 Apr 2021 19:26:48 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
Add String.join benchmark method to StringJoinerBenchmark and adjust some parameters to cover bigger range
Look very good. src/java.base/share/classes/java/lang/String.java line 3254:
3252: 3253: byte[] value = StringConcatHelper.newArray(((long) icoder << 32) | llen); 3254: int off = 0;
StringConcatHelper.newArray() can double the length (based on the coder) and it is then truncated to 32 bits when passed to UNSAFE.allocatlUnitializedArray. The test of length above only ensures llen can be truncated to 32 bits without loss of data. src/java.base/share/classes/java/lang/String.java line 3256:
3254: int off = 0; 3255: prefix.getBytes(value, off, coder); off += prefix.length(); 3256: for (int i = 0; i < size; i++) {
Can you save a branch inside the loop by handling element 0 outside the loop and then do the loop for the rest? ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Fri, 16 Apr 2021 19:05:25 GMT, Roger Riggs <rriggs@openjdk.org> wrote:
Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
Add String.join benchmark method to StringJoinerBenchmark and adjust some parameters to cover bigger range
src/java.base/share/classes/java/lang/String.java line 3254:
3252: 3253: byte[] value = StringConcatHelper.newArray(((long) icoder << 32) | llen); 3254: int off = 0;
StringConcatHelper.newArray() can double the length (based on the coder) and it is then truncated to 32 bits when passed to UNSAFE.allocatlUnitializedArray. The test of length above only ensures llen can be truncated to 32 bits without loss of data.
I thought about that, yes. And I think we have to do the check for the doubled length before calling the newArray. I checked the StringJoinerTest and it only deals with ascii strings unfortunately. Will have to add a test for that too...
src/java.base/share/classes/java/lang/String.java line 3256:
3254: int off = 0; 3255: prefix.getBytes(value, off, coder); off += prefix.length(); 3256: for (int i = 0; i < size; i++) {
Can you save a branch inside the loop by handling element 0 outside the loop and then do the loop for the rest?
Thanks, I'll do that and then re-test to see if there's any improvement. ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Fri, 16 Apr 2021 23:02:33 GMT, Peter Levart <plevart@openjdk.org> wrote:
src/java.base/share/classes/java/lang/String.java line 3254:
3252: 3253: byte[] value = StringConcatHelper.newArray(((long) icoder << 32) | llen); 3254: int off = 0;
StringConcatHelper.newArray() can double the length (based on the coder) and it is then truncated to 32 bits when passed to UNSAFE.allocatlUnitializedArray. The test of length above only ensures llen can be truncated to 32 bits without loss of data.
I thought about that, yes. And I think we have to do the check for the doubled length before calling the newArray. I checked the StringJoinerTest and it only deals with ascii strings unfortunately. Will have to add a test for that too...
I do the checks before calling `StringConcatHelper.newArray()` now and pass a long value to it that already holds the number of bytes needed and where the upper 32 bits (coder) is always 0. ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Thu, 15 Apr 2021 19:26:48 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
Add String.join benchmark method to StringJoinerBenchmark and adjust some parameters to cover bigger range
Great work, thanks! I tried to do something similar a couple of years ago but did not submit my patch. One thing that stopped me is that joining many one-character strings was slower with my patch, compared to the baseline. Something like this: // setup String[] strings = new String[100]; Arrays.fill(strings, "a"); // benchmark return String.join(",", strings); Could you add this case to your benchmark, for completeness? ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision: Fix overflow checking logic, add test for it, avoid branch in loop, add 1-char strings to JHM test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3501/files - new: https://git.openjdk.java.net/jdk/pull/3501/files/6160e5aa..a2d5e819 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3501&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3501&range=01-02 Stats: 123 lines in 3 files changed: 110 ins; 4 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/3501.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3501/head:pull/3501 PR: https://git.openjdk.java.net/jdk/pull/3501
On Sun, 18 Apr 2021 19:55:20 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
Fix overflow checking logic, add test for it, avoid branch in loop, add 1-char strings to JHM test
So here's another iteration where I have taken into account all the suggestions: - avoid the branch in the append loop by appending the 1st element outside the loop so that delimiter can unconditionally be appended inside the loop for rest of the elements. - fixed overflow checking logic + added a test for it - previously it was possible to provoke wrong exception: IllegalArgumentException: size is negative. I added another test instead of extending existing one because it needs a 1GiB string and together with 2GiB string in existing test it would not fit in 4GiB heap any more when running with unpatched version. I wanted the tests to pass with unpatched version too. - extended the parameters of JMH benchmark to include 1-character strings + re-run the benchmarks The results are not that different from previous version (that used a conditional in the loop) as JIT is probably smart enough to unroll the loop and divide it into at least two parts (the way I did it manually now). What brings the most improvement in this version is the annotation to force in-lining of the String.join package-private method. Without it, it would happen that for one combination of JMH parameters, the results were a little worse compared to unpatched baseline - not for much: about 7%, but with the annotation, all combinations of parameters bring consistent improvement in range from 12% to 102%: https://jmh.morethan.io/?gist=babf00a655d21ff784ede887af53ec31 ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Sun, 18 Apr 2021 19:55:20 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
Fix overflow checking logic, add test for it, avoid branch in loop, add 1-char strings to JHM test
Looks good! test/micro/org/openjdk/bench/java/util/StringJoinerBenchmark.java line 48:
46: @Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) 47: @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) 48: @Fork(value = 1, jvmArgsAppend = {"-Xms1g", "-Xmx1g"})
In general I think it's prudent to default to at least 3 forks, to catch issues with run-to-run variance. ------------- Marked as reviewed by redestad (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3501
On Mon, 19 Apr 2021 10:44:04 GMT, Claes Redestad <redestad@openjdk.org> wrote:
Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
Fix overflow checking logic, add test for it, avoid branch in loop, add 1-char strings to JHM test
test/micro/org/openjdk/bench/java/util/StringJoinerBenchmark.java line 48:
46: @Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) 47: @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS) 48: @Fork(value = 1, jvmArgsAppend = {"-Xms1g", "-Xmx1g"})
In general I think it's prudent to default to at least 3 forks, to catch issues with run-to-run variance.
Thanks, Claes. Will change that to 3 forks before pushing. Any more comments? Will wait for a day, then push it. ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
On Sun, 18 Apr 2021 19:55:20 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
Fix overflow checking logic, add test for it, avoid branch in loop, add 1-char strings to JHM test
Looks good. ------------- Marked as reviewed by rriggs (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3501
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
Peter Levart has updated the pull request incrementally with one additional commit since the last revision: Fork JVM 3 times in JMH test to smooth out variance in a particular fork ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3501/files - new: https://git.openjdk.java.net/jdk/pull/3501/files/a2d5e819..84af9e6c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3501&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3501&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3501.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3501/head:pull/3501 PR: https://git.openjdk.java.net/jdk/pull/3501
On Wed, 14 Apr 2021 18:58:57 GMT, Peter Levart <plevart@openjdk.org> wrote:
While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation. Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package. But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner. This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
The comparative results are here:
https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
The jmh-result.json files are here:
https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
So WDYT?
This pull request has now been integrated. Changeset: 98cb81b3 Author: Peter Levart <plevart@openjdk.org> URL: https://git.openjdk.java.net/jdk/commit/98cb81b3 Stats: 214 lines in 6 files changed: 174 ins; 17 del; 23 mod 8265237: String.join and StringJoiner can be improved further Reviewed-by: rriggs, redestad ------------- PR: https://git.openjdk.java.net/jdk/pull/3501
participants (5)
-
Claes Redestad
-
Florent Guillaume
-
Peter Levart
-
Roger Riggs
-
Tagir F.Valeev