[PATCH] Enhancement proposal for java.util.StringJoiner
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Mon Feb 3 22:38:58 UTC 2020
Hello,
as of JDK 14, java.util.StringJoiner still uses a char[] to assemble the joined Strings.
This happens even when all joined Strings, as well as the delimiter, prefix and suffix, contain only ASCII symbols.
As a result, when StringJoiner.toString() is invoked, the byte[] stored in each String is inflated in order to fill the char[], and
finally the char[] is compressed again when the String constructor is called:
    String delimiter = this.delimiter;
    char[] chars = new char[this.len + addLen];
    int k = getChars(this.prefix, chars, 0);
    if (size > 0) {
        k += getChars(elts[0], chars, k);        // inflate byte[] -> char[]
        for (int i = 1; i < size; ++i) {
            k += getChars(delimiter, chars, k);
            k += getChars(elts[i], chars, k);
        }
    }
    k += getChars(this.suffix, chars, k);
    return new String(chars);                    // compress char[] -> byte[]
This can be improved by detecting the cases when String.isLatin1() returns true for all involved Strings.
I've prepared a patch along with a benchmark to show that this change is correct and brings an improvement.
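For illustration only (this is not the attached patch), the idea can be sketched with public API. The class Latin1JoinSketch and the helpers fitsLatin1() and putLatin1() are hypothetical names: fitsLatin1() scans the chars as a stand-in for the package-private String.isLatin1(), and new String(byte[], ISO_8859_1) stands in for whatever internal construction the real change would use. Elements are assumed to be non-null.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.StringJoiner;

final class Latin1JoinSketch {

    // Hypothetical stand-in for the package-private String.isLatin1():
    // scans the chars instead of reading the String's internal coder.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Writes each char of s as a single byte into dst starting at pos
    // and returns the number of bytes written.
    static int putLatin1(String s, byte[] dst, int pos) {
        for (int i = 0; i < s.length(); i++) {
            dst[pos + i] = (byte) s.charAt(i);
        }
        return s.length();
    }

    static String join(String prefix, String delimiter, String suffix, List<String> elts) {
        boolean allLatin1 = fitsLatin1(prefix) && fitsLatin1(delimiter) && fitsLatin1(suffix);
        int len = prefix.length() + suffix.length();
        for (String e : elts) {
            allLatin1 = allLatin1 && fitsLatin1(e);
            len += e.length();
        }
        if (elts.size() > 1) {
            len += delimiter.length() * (elts.size() - 1);
        }

        if (!allLatin1) {
            // At least one String needs UTF-16: fall back to the existing joiner.
            StringJoiner sj = new StringJoiner(delimiter, prefix, suffix);
            for (String e : elts) {
                sj.add(e);
            }
            return sj.toString();
        }

        // Latin1-only path: fill a byte[] directly, no intermediate char[].
        byte[] bytes = new byte[len];
        int k = putLatin1(prefix, bytes, 0);
        for (int i = 0; i < elts.size(); i++) {
            if (i > 0) {
                k += putLatin1(delimiter, bytes, k);
            }
            k += putLatin1(elts.get(i), bytes, k);
        }
        putLatin1(suffix, bytes, k);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

Unlike this sketch, code inside java.util could skip the per-char scan, which is why access to String.isLatin1() matters.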
The only concern I have is about String.isLatin1(): since String belongs to java.lang and StringJoiner to java.util,
the package-private String.isLatin1() cannot be accessed directly, so we would need to make it public for the code to compile.
Another solution is to create an intermediate utility class in java.lang that delegates the call to String.isLatin1():
    package java.lang;

    public class StringHelper {
        public static boolean isLatin1(String str) {
            return str.isLatin1();
        }
    }
This allows us to keep java.lang.String intact and still have access to its package-private method from outside the java.lang package.
Below I've added benchmark results for the specified case (all Strings are Latin1). The other case (at least one String is UTF-16) uses the existing code, so there will be only a tiny regression due to several if-checks.
With best regards,
Sergey Tsypanov
Benchmark                         (count)  (length)          Original           Patched  Units
stringJoiner                            1         1      26.7 ±   1.3      38.2 ±   1.1  ns/op
stringJoiner                            1         5      27.4 ±   0.0      40.5 ±   2.2  ns/op
stringJoiner                            1        10      29.6 ±   1.9      38.4 ±   1.9  ns/op
stringJoiner                            1       100      61.1 ±   6.9      47.6 ±   0.6  ns/op
stringJoiner                            5         1      91.1 ±   6.7      83.6 ±   2.0  ns/op
stringJoiner                            5         5      96.1 ±  10.7      85.6 ±   1.1  ns/op
stringJoiner                            5        10     105.5 ±  14.3      84.7 ±   1.1  ns/op
stringJoiner                            5       100     266.6 ±  30.1     139.6 ±  14.0  ns/op
stringJoiner                           10         1     190.7 ±  23.0     162.0 ±   2.9  ns/op
stringJoiner                           10         5     200.0 ±  16.9     167.5 ±  11.0  ns/op
stringJoiner                           10        10     216.4 ±  12.4     164.8 ±   1.7  ns/op
stringJoiner                           10       100     545.3 ±  49.7     282.2 ±  12.0  ns/op
stringJoiner                          100         1    1467.0 ±  90.3    1302.0 ±  18.5  ns/op
stringJoiner                          100         5    1491.8 ± 166.2    1493.0 ± 135.4  ns/op
stringJoiner                          100        10    1768.8 ± 160.6    1760.8 ± 111.4  ns/op
stringJoiner                          100       100    3654.3 ± 113.1    3120.9 ± 175.9  ns/op
stringJoiner:·gc.alloc.rate.norm        1         1     120.0 ±   0.0     120.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm        1         5     128.0 ±   0.0     120.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm        1        10     144.0 ±   0.0     136.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm        1       100     416.0 ±   0.0     312.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm        5         1     144.0 ±   0.0     136.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm        5         5     200.0 ±   0.0     168.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm        5        10     272.0 ±   0.0     216.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm        5       100    1632.0 ±   0.0    1128.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm       10         1     256.0 ±   0.0     232.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm       10         5     376.0 ±   0.0     312.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm       10        10     520.0 ±   0.0     408.0 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm       10       100    3224.1 ±   0.0    2216.1 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm      100         1    1760.2 ±  14.9    1544.2 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm      100         5    2960.3 ±  14.9    2344.2 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm      100        10    4440.4 ±   0.0    3336.3 ±   0.0  B/op
stringJoiner:·gc.alloc.rate.norm      100       100   31449.3 ±  12.2   21346.7 ±  14.7  B/op
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sj.patch
Type: text/x-diff
Size: 5066 bytes
Desc: not available
URL: <https://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20200204/89e67945/sj-0001.patch>