[PATCH] enhancement proposal for String concatenation
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Sat Mar 9 19:42:35 UTC 2019
Hi Remi,
you are right. For a long time I've used String.join("") or new StringJoiner("") as a trick for String concatenation, since StringJoiner performs better in many cases. It occurred to me that we could keep the existing StringJoiner but add a constructor that takes the expected length of the 'elts' array. Currently the array is allocated on the first add() call with a length of 8, which causes either over-allocation or repeated reallocation with data copying.
Imagine the case where elts has length 1000 and is fully populated, and one more element is added. The current implementation creates a new array of length 2000 that ends up containing only 1001 elements.
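To make the arithmetic concrete, here is a minimal sketch that replays the growth policy described above (allocate on first add, double when full). EltsGrowthDemo and capacityAfter are hypothetical names used only for illustration; this is not the JDK source.

```java
import java.util.Arrays;

// Hypothetical demo class: replays an "allocate on first add, double when
// full" policy and reports the final length of the backing array.
final class EltsGrowthDemo {

    /**
     * Returns the backing array length after adding {@code count} elements
     * one at a time. {@code initialCapacity} must be positive.
     */
    static int capacityAfter(int initialCapacity, int count) {
        String[] elts = null;
        int size = 0;
        for (int i = 0; i < count; i++) {
            if (elts == null) {
                // first add: allocate the initial array
                elts = new String[initialCapacity];
            } else if (size == elts.length) {
                // array is full: double it and copy the old content
                elts = Arrays.copyOf(elts, 2 * size);
            }
            elts[size++] = "x";
        }
        return elts == null ? 0 : elts.length;
    }
}
```

With these assumptions, capacityAfter(1000, 1001) returns 2000, while starting from 8 and adding 1000 elements ends at 1024 after seven reallocations.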
Making it possible to allocate StringJoiner's elts with an exact size avoids adding a new class, yet still allows performance improvements in the cases where we know the number of elements to be added.
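A rough sketch of what such a pre-sizing constructor could look like, written as a standalone class rather than as a change to StringJoiner itself. PresizedJoiner, its expectedCount parameter and capacity() are hypothetical names, and building the result through a StringBuilder sized from the accumulated length is an assumption made to keep the example self-contained.

```java
import java.util.Arrays;

// Hypothetical class, not part of the JDK: a delimiter-free joiner whose
// backing String[] can be sized by the caller up front.
final class PresizedJoiner {
    private String[] elts;
    private int size; // number of strings added so far
    private int len;  // accumulated length of all added strings

    PresizedJoiner(int expectedCount) {
        if (expectedCount < 0) {
            throw new IllegalArgumentException("expectedCount must not be negative");
        }
        // at least one slot, so that doubling below always makes progress
        elts = new String[Math.max(expectedCount, 1)];
    }

    PresizedJoiner add(CharSequence cs) {
        String s = String.valueOf(cs);
        if (size == elts.length) {
            // only reached when the caller's estimate was too low
            elts = Arrays.copyOf(elts, 2 * size);
        }
        len += s.length();
        elts[size++] = s;
        return this;
    }

    // exposed only so the example can show that no resize happened
    int capacity() {
        return elts.length;
    }

    @Override
    public String toString() {
        // buffer capacity comes from the accumulated length, not from a default
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < size; i++) {
            sb.append(elts[i]);
        }
        return sb.toString();
    }
}
```

When the caller knows the count, for example new PresizedJoiner(3).add("a").add("b").add("c"), the array is allocated once and never copied.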
08.03.2019, 21:58, "Remi Forax" <forax at univ-mlv.fr>:
> Hi,
> adding StringChain sends the wrong message IMO: we already have StringBuffer, StringBuilder and StringJoiner,
> so let's try not to add another way to concatenate a variable number of Strings in Java.
>
> I wonder whether you could achieve what you want by specializing String.join() and StringJoiner for the case where the delimiter is an empty String?
>
> Rémi
>
> ----- Original Message -----
>> From: "Сергей Цыпанов" <sergei.tsypanov at yandex.ru>
>> To: "core-libs-dev" <core-libs-dev at openjdk.java.net>
>> Sent: Friday, 8 March 2019 20:22:10
>> Subject: [PATCH] enhancement proposal for String concatenation
>
>> Hello,
>>
>> I have an enhancement proposal for some cases of String concatenation in Java.
>>
>> Currently we concatenate Strings mostly using java.lang.StringBuilder. Its
>> main disadvantage is the underlying char array, or rather the need to resize
>> it when the required capacity exceeds the array length, followed by copying
>> the array content into the newly allocated array.
>>
>> One existing alternative is StringJoiner. Before JDK 9 it was a kind of
>> decorator over StringBuilder, but it was later reworked to store the appended
>> Strings in a String[] and to accumulate their overall length in an int field.
>> This makes it possible to allocate the char[] only once, with the exact size,
>> in the toString() method, reducing allocation cost.
>>
>> My proposal is to copy the code of StringJoiner into a newly created class
>> java.util.StringChain, drop the code responsible for the delimiter, prefix
>> and suffix, and use it instead of StringBuilder in the common
>> StringBuilder::append concatenation pattern.
>>
>> Possible use cases for the proposed code are:
>> - plain String concatenation
>> - String::chain (new methods)
>> - Stream.collect(Collectors.joining())
>> - StringConcatFactory
>>
>> We can create new methods String.chain(Iterable<CharSequence>) and
>> String.chain(CharSequence...) that encapsulate boilerplate code like
>>
>> StringBuilder sb = new StringBuilder();
>> for (CharSequence cs : charSequences) {
>>     sb.append(cs);
>> }
>> String result = sb.toString();
>>
>> into one line:
>>
>> String result = String.chain(charSequences);
>>
>> As for performance, I've done some measurements using JMH on my work machine
>> (Intel i7-7700) for both Latin and non-Latin Strings of different sizes and
>> counts.
>> Here are the results:
>>
>> https://github.com/stsypanov/string-chain/blob/master/results/StringBuilderVsStringChainBenchmark.txt
>>
>> There are a few corner cases (e.g. 1000 Strings of length 1 appended) where
>> StringBuilder outperforms a StringChain constructed with the default capacity
>> of 8, but a StringChain constructed with the exact count of added Strings
>> almost always wins, especially when dealing with non-Latin chars (Russian in
>> my case).
>>
>> I've also created a separate repo on GitHub with benchmarks:
>>
>> https://github.com/stsypanov/string-chain
>>
>> The key feature here is the ability to allocate a String array of exact size
>> in cases where we know the number of elements to be added.
>> Thus I think that if the change is accepted we can also add an overloaded
>> method String.chain(Collection<CharSequence>), as Collection::size allows us
>> to construct a StringChain of exact size.
>>
>> Patch is attached.
>>
>> Kind regards,
>> Sergei Tsypanov