[PATCH] Enhancement proposal for java.util.StringJoiner
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Tue Feb 11 08:39:41 UTC 2020
Hello again,
I've reworked the patch according to your recommendation: no SharedSercrets and StringBUilder instead. The patch attached. The results are below:
Benchmark (count) (latin) (length) Original Patched1 SB Units
stringJoiner 1 true 1 26.9 ± 0.7 48.8 ± 2.2 33.4 ± 0.2 ns/op
stringJoiner 1 true 5 30.5 ± 1.0 46.1 ± 2.1 33.0 ± 0.1 ns/op
stringJoiner 1 true 10 31.2 ± 0.6 47.3 ± 1.3 34.3 ± 0.3 ns/op
stringJoiner 1 true 100 62.5 ± 3.3 79.9 ± 4.8 44.4 ± 0.1 ns/op
stringJoiner 5 true 1 78.2 ± 1.6 110.3 ± 2.9 87.8 ± 0.8 ns/op
stringJoiner 5 true 5 94.2 ± 8.7 116.6 ± 0.7 88.2 ± 0.9 ns/op
stringJoiner 5 true 10 95.3 ± 6.9 100.1 ± 0.4 91.6 ± 0.6 ns/op
stringJoiner 5 true 100 188.0 ± 10.2 136.0 ± 0.4 126.1 ± 0.7 ns/op
stringJoiner 10 true 1 160.3 ± 4.5 172.9 ± 0.8 177.6 ± 0.8 ns/op
stringJoiner 10 true 5 169.0 ± 4.7 180.2 ± 9.1 179.4 ± 1.0 ns/op
stringJoiner 10 true 10 205.7 ± 16.4 182.7 ± 1.1 189.5 ± 1.2 ns/op
stringJoiner 10 true 100 366.5 ± 17.0 284.5 ± 3.1 290.0 ± 0.8 ns/op
stringJoiner 100 true 1 1117.6 ± 11.1 2123.7 ± 11.1 1563.8 ± 2.8 ns/op
stringJoiner 100 true 5 1270.7 ± 40.2 2163.6 ± 12.4 1592.4 ± 4.0 ns/op
stringJoiner 100 true 10 1364.4 ± 14.0 2283.8 ± 16.1 1773.7 ± 57.7 ns/op
stringJoiner 100 true 100 3592.9 ± 164.8 3535.2 ± 29.9 2899.2 ± 51.0 ns/op
stringJoiner 1 false 1 35.6 ± 1.2 59.1 ± 3.0 52.7 ± 1.2 ns/op
stringJoiner 1 false 5 39.3 ± 1.2 52.6 ± 2.5 54.4 ± 1.6 ns/op
stringJoiner 1 false 10 42.2 ± 1.6 53.6 ± 0.3 52.2 ± 1.0 ns/op
stringJoiner 1 false 100 70.5 ± 1.8 86.4 ± 0.4 78.6 ± 1.2 ns/op
stringJoiner 5 false 1 89.0 ± 3.5 102.2 ± 1.0 116.3 ± 3.8 ns/op
stringJoiner 5 false 5 87.6 ± 0.7 106.5 ± 1.2 115.2 ± 2.9 ns/op
stringJoiner 5 false 10 109.0 ± 5.6 116.5 ± 1.2 126.5 ± 0.5 ns/op
stringJoiner 5 false 100 324.0 ± 16.5 221.9 ± 0.5 288.9 ± 0.5 ns/op
stringJoiner 10 false 1 183.9 ± 5.9 204.7 ± 5.5 261.2 ± 7.7 ns/op
stringJoiner 10 false 5 198.7 ± 9.7 202.4 ± 1.5 253.3 ± 6.7 ns/op
stringJoiner 10 false 10 196.7 ± 6.9 226.7 ± 6.4 274.3 ± 7.0 ns/op
stringJoiner 10 false 100 535.8 ± 2.3 553.0 ± 5.6 677.3 ± 6.4 ns/op
stringJoiner 100 false 1 1674.6 ± 122.1 1940.8 ± 16.2 2212.5 ± 32.2 ns/op
stringJoiner 100 false 5 1791.9 ± 58.1 2158.1 ± 12.0 2492.8 ± 30.9 ns/op
stringJoiner 100 false 10 2124.1 ± 193.3 2364.0 ± 25.2 2611.7 ± 17.5 ns/op
stringJoiner 100 false 100 4323.4 ± 29.2 4675.5 ± 11.8 5501.3 ± 21.1 ns/op
stringJoiner:·gc.alloc.rate.norm 1 true 1 120.0 ± 0.0 120.0 ± 0.0 144.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 1 true 5 128.0 ± 0.0 120.0 ± 0.0 144.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 1 true 10 144.0 ± 0.0 136.0 ± 0.0 160.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 1 true 100 416.0 ± 0.0 312.0 ± 0.0 336.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 5 true 1 144.0 ± 0.0 136.0 ± 0.0 160.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 5 true 5 200.0 ± 0.0 168.0 ± 0.0 192.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 5 true 10 272.0 ± 0.0 216.0 ± 0.0 240.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 5 true 100 1632.0 ± 0.0 1128.0 ± 0.0 1152.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 true 1 256.0 ± 0.0 232.0 ± 0.0 256.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 true 5 376.0 ± 0.0 316.8 ± 4.9 336.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 true 10 520.0 ± 0.0 408.0 ± 0.0 432.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 true 100 3224.1 ± 0.0 2236.9 ± 21.2 2240.1 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 100 true 1 1748.1 ± 4.0 1592.2 ± 0.0 1568.2 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 100 true 5 2948.2 ± 4.0 2392.3 ± 0.0 2368.2 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 100 true 10 4444.3 ± 4.0 3384.3 ± 0.0 3364.3 ± 4.0 B/op
stringJoiner:·gc.alloc.rate.norm 100 true 100 31441.4 ± 0.0 21385.4 ± 0.0 21365.1 ± 4.1 B/op
stringJoiner:·gc.alloc.rate.norm 1 false 1 144.0 ± 0.0 144.0 ± 0.0 192.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 1 false 5 160.0 ± 0.0 160.0 ± 0.0 208.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 1 false 10 184.0 ± 0.0 184.0 ± 0.0 240.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 1 false 100 640.0 ± 0.0 640.0 ± 0.0 784.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 5 false 1 184.0 ± 0.0 184.0 ± 0.0 240.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 5 false 5 280.0 ± 0.0 280.0 ± 0.0 349.6 ± 2.4 B/op
stringJoiner:·gc.alloc.rate.norm 5 false 10 400.0 ± 0.0 400.0 ± 0.0 496.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 5 false 100 2664.1 ± 0.0 2664.1 ± 0.0 3216.1 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 false 1 320.0 ± 0.0 334.4 ± 7.4 384.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 false 5 520.0 ± 0.0 520.0 ± 0.0 624.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 false 10 760.0 ± 0.0 769.6 ± 6.5 912.0 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 10 false 100 5264.2 ± 0.0 5273.8 ± 6.5 6320.3 ± 0.0 B/op
stringJoiner:·gc.alloc.rate.norm 100 false 1 2204.2 ± 4.0 2216.3 ± 0.0 2436.3 ± 6.8 B/op
stringJoiner:·gc.alloc.rate.norm 100 false 5 4196.3 ± 6.2 4216.4 ± 0.0 4832.4 ± 6.6 B/op
stringJoiner:·gc.alloc.rate.norm 100 false 10 6696.5 ± 5.4 6712.6 ± 0.0 7844.7 ± 4.0 B/op
stringJoiner:·gc.alloc.rate.norm 100 false 100 51702.0 ± 4.1 51714.4 ± 0.0 61838.6 ± 6.2 B/op
StringBuilder indeeed helps a lot the cases when all the Strings are latin1, however it significantly slows down the opposite case, especially regarding memory consumption.
The next step will be to measure performance again with SharedSecrets applied but with latin check in toString() method.
Regards,
Sergey Tsypanov
10.02.2020, 14:08, "Tagir Valeev" <amaembo at gmail.com>:
> Hello!
>
> In many tests, I see little or no performance improvements. E.g.:
> stringJoiner 100 10 1768.8 ±
> 160.6 1760.8 ± 111.4 ns/op
>
> How would you explain that this code change doesn't improve the
> performance for given count and length?
>
> Also, you are using `new String(bytes)` assuming that the platform
> charset is latin1-compatible. This is not always true, so your code
> would produce incorrect results depending on this setting. In general,
> decoding via charset is tons of extra work. Instead, I would
> experiment with pre-sized StringBuilder inside compactElts, instead of
> char[] array. While it does some extra checks, StringBuilder already
> can use the compact string representation, so if it's properly
> pre-sized, no extra memory will be allocated.
>
> Finally, if you optimize for the case when X is true you should always
> benchmark what happens when X is false. It's great that in some cases
> we see a speedup for latin1 strings. But what if not all of the
> strings are latin1? Is there any performance degradation? If yes, can
> we tolerate it?
>
> With best regards,
> Tagir Valeev.
>
> On Mon, Feb 10, 2020 at 3:22 PM Сергей Цыпанов
> <sergei.tsypanov at yandex.ru> wrote:
>> Hello,
>>
>> I've reworked the code, patch is attached. Could you please review my solution regarding usage of SahredSecrets?
>>
>> P.S. After I've created the patch it came to my mind that instead of checking all Strings when calling StringJoiner.add()
>> we can check them in toString() method and fail-fast in case at least one of them is non-latin. This likely to reduce
>> regression related to usage of reflection.
>>
>> Regards,
>> Sergey Tsypanov
>>
>> 05.02.2020, 23:21, "forax at univ-mlv.fr" <forax at univ-mlv.fr>:
>> > ----- Mail original -----
>> >> De: "Сергей Цыпанов" <sergei.tsypanov at yandex.ru>
>> >> À: "Remi Forax" <forax at univ-mlv.fr>, "core-libs-dev" <core-libs-dev at openjdk.java.net>
>> >> Envoyé: Mercredi 5 Février 2020 22:12:34
>> >> Objet: Re: [PATCH] Enhancement proposal for java.util.StringJoiner
>> >
>> >> Hello,
>> >>
>> >>> If you want to optimize StringJoiner, the best way to do it is to use the shared
>> >>> secret mechanism so a java.util class can see implementation details of a
>> >>> java.lang class without exposing those details publicly.
>> >>> As an example, take a look to EnumSet and its implementations.
>> >>
>> >> I've looked into SharedSecrets, it seems there's no ready-to-use method for
>> >> accessing package-private method. Do you mean it's necessary to add particular
>> >> functionality to JavaLangReflectionAccess as they did for JavaLangAccess in
>> >> order to deal with EnumSet?
>> >
>> > yes !
>> > crossing package boundary in a non public way is not free,
>> > but given that StringJoiner is used quite often (directly or indirectly using Collectors.joining()), it may worth the cost.
>> >
>> >> Regards,
>> >> Sergey
>> >
>> > Regards,
>> > Rémi
>> >
>> >> 04.02.2020, 12:12, "Remi Forax" <forax at univ-mlv.fr>:
>> >>> ----- Mail original -----
>> >>>> De: "Сергей Цыпанов" <sergei.tsypanov at yandex.ru>
>> >>>> À: "jonathan gibbons" <jonathan.gibbons at oracle.com>, "core-libs-dev"
>> >>>> <core-libs-dev at openjdk.java.net>
>> >>>> Envoyé: Mardi 4 Février 2020 08:53:31
>> >>>> Objet: Re: [PATCH] Enhancement proposal for java.util.StringJoiner
>> >>>
>> >>>> Hello,
>> >>>
>> >>> Hi Sergey,
>> >>>
>> >>>> I'd probably agree about a new class in java.lang, but what is wrong about
>> >>>> exposing package-private method
>> >>>> which doesn't modify the state of the object and has no side effects?
>> >>>
>> >>> You can not change the implementation anymore,
>> >>> by example if instead of having a split between latin1 and non latin1, we decide
>> >>> in the future to split between utf8 and non utf8.
>> >>>
>> >>> If you want to optimize StringJoiner, the best way to do it is to use the shared
>> >>> secret mechanism so a java.util class can see implementation details of a
>> >>> java.lang class without exposing those details publicly.
>> >>> As an example, take a look to EnumSet and its implementations.
>> >>>
>> >>> regards,
>> >>> Rémi
>> >>>
>> >>>> 04.02.2020, 00:58, "Jonathan Gibbons" <jonathan.gibbons at oracle.com>:
>> >>>>> Sergey,
>> >>>>>
>> >>>>> It is equally bad to create a new class in the java.lang package as it
>> >>>>> is to add a new public method to java.lang.String.
>> >>>>>
>> >>>>> -- Jon
>> >>>>>
>> >>>>> On 2/3/20 2:38 PM, Сергей Цыпанов wrote:
>> >>>>>> Hello,
>> >>>>>>
>> >>>>>> as of JDK14 java.util.StringJoiner still uses char[] as a storage of glued
>> >>>>>> Strings.
>> >>>>>>
>> >>>>>> This applies for the cases when all joined Strings as well as delimiter, prefix
>> >>>>>> and suffix contain only ASCII symbols.
>> >>>>>>
>> >>>>>> As a result when StringJoiner.toString() is invoked, byte[] stored in String is
>> >>>>>> inflated in order to fill in char[] and
>> >>>>>> finally char[] is compressed when constructor of String is called:
>> >>>>>>
>> >>>>>> String delimiter = this.delimiter;
>> >>>>>> char[] chars = new char[this.len + addLen];
>> >>>>>> int k = getChars(this.prefix, chars, 0);
>> >>>>>> if (size > 0) {
>> >>>>>> k += getChars(elts[0], chars, k); // inflate byte[] -> char[]
>> >>>>>>
>> >>>>>> for(int i = 1; i < size; ++i) {
>> >>>>>> k += getChars(delimiter, chars, k);
>> >>>>>> k += getChars(elts[i], chars, k);
>> >>>>>> }
>> >>>>>> }
>> >>>>>>
>> >>>>>> k += getChars(this.suffix, chars, k);
>> >>>>>> return new String(chars); // compress char[] -> byte[]
>> >>>>>>
>> >>>>>> This can be improved by detecting cases when String.isLatin1() returns true for
>> >>>>>> all involved Strings.
>> >>>>>>
>> >>>>>> I've prepared a patch along with benchmark proving that this change is correct
>> >>>>>> and brings improvement.
>> >>>>>> The only concern I have is about String.isLatin1(): as far as String belongs to
>> >>>>>> java.lang and StringJoiner to java.util
>> >>>>>> package-private String.isLatin1() cannot be directly accessed, we need to make
>> >>>>>> it public for successful compilation.
>> >>>>>>
>> >>>>>> Another solution is to create an intermediate utility class located in java.lang
>> >>>>>> which delegates the call to String.isLatin1():
>> >>>>>>
>> >>>>>> package java.lang;
>> >>>>>>
>> >>>>>> public class StringHelper {
>> >>>>>> public static boolean isLatin1(String str) {
>> >>>>>> return str.isLatin1();
>> >>>>>> }
>> >>>>>> }
>> >>>>>>
>> >>>>>> This allows to keep java.lang.String intact and have access to it's
>> >>>>>> package-private method outside of java.lang package.
>> >>>>>>
>> >>>>>> Below I've added results of benchmarking for specified case (all Strings are
>> >>>>>> Latin1). The other case (at least one String is UTF-8) uses existing code so
>> >>>>>> there will be only a tiny regression due to several if-checks.
>> >>>>>>
>> >>>>>> With best regards,
>> >>>>>> Sergey Tsypanov
>> >>>>>>
>> >>>>>> (count) (length) Original Patched Units
>> >>>>>> stringJoiner 1 1 26.7 ± 1.3 38.2 ± 1.1 ns/op
>> >>>>>> stringJoiner 1 5 27.4 ± 0.0 40.5 ± 2.2 ns/op
>> >>>>>> stringJoiner 1 10 29.6 ± 1.9 38.4 ± 1.9 ns/op
>> >>>>>> stringJoiner 1 100 61.1 ± 6.9 47.6 ± 0.6 ns/op
>> >>>>>> stringJoiner 5 1 91.1 ± 6.7 83.6 ± 2.0 ns/op
>> >>>>>> stringJoiner 5 5 96.1 ± 10.7 85.6 ± 1.1 ns/op
>> >>>>>> stringJoiner 5 10 105.5 ± 14.3 84.7 ± 1.1 ns/op
>> >>>>>> stringJoiner 5 100 266.6 ± 30.1 139.6 ± 14.0 ns/op
>> >>>>>> stringJoiner 10 1 190.7 ± 23.0 162.0 ± 2.9 ns/op
>> >>>>>> stringJoiner 10 5 200.0 ± 16.9 167.5 ± 11.0 ns/op
>> >>>>>> stringJoiner 10 10 216.4 ± 12.4 164.8 ± 1.7 ns/op
>> >>>>>> stringJoiner 10 100 545.3 ± 49.7 282.2 ± 12.0 ns/op
>> >>>>>> stringJoiner 100 1 1467.0 ± 90.3 1302.0 ± 18.5 ns/op
>> >>>>>> stringJoiner 100 5 1491.8 ± 166.2 1493.0 ± 135.4 ns/op
>> >>>>>> stringJoiner 100 10 1768.8 ± 160.6 1760.8 ± 111.4 ns/op
>> >>>>>> stringJoiner 100 100 3654.3 ± 113.1 3120.9 ± 175.9 ns/op
>> >>>>>>
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 1 1 120.0 ± 0.0 120.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 1 5 128.0 ± 0.0 120.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 1 10 144.0 ± 0.0 136.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 1 100 416.0 ± 0.0 312.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 5 1 144.0 ± 0.0 136.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 5 5 200.0 ± 0.0 168.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 5 10 272.0 ± 0.0 216.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 5 100 1632.0 ± 0.0 1128.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 10 1 256.0 ± 0.0 232.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 10 5 376.0 ± 0.0 312.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 10 10 520.0 ± 0.0 408.0 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 10 100 3224.1 ± 0.0 2216.1 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 100 1 1760.2 ± 14.9 1544.2 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 100 5 2960.3 ± 14.9 2344.2 ± 0.0 B/op
>> >>>>>> stringJoiner:·gc.alloc.rate.norm 100 10 4440.4 ± 0.0 3336.3 ± 0.0 B/op
>> >> >> >> stringJoiner:·gc.alloc.rate.norm 100 100 31449.3 ± 12.2 21346.7 ± 14.7 B/op
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sj_1.patch
Type: text/x-diff
Size: 2882 bytes
Desc: not available
URL: <https://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20200211/c25ce7ab/sj_1.patch>
More information about the core-libs-dev
mailing list