[PATCH] enhancement proposal for String concatenation

Ivan Gerasimov ivan.gerasimov at oracle.com
Mon Mar 11 21:26:52 UTC 2019


Yes, I agree that StringJoiner could benefit from a hint about the 
expected number of elements to join.
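
To make the idea concrete, here is a rough sketch of what such a hint 
could look like. The class name SizedChain, its constructor parameter 
and the static chain() helper are all hypothetical and exist neither in 
the JDK nor in the attached patch. The real StringJoiner builds its 
result through JDK-internal access; the char[] below is only a 
stand-in for that.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch, not JDK code: a delimiter-less joiner with a size hint. */
final class SizedChain {
    private String[] elts;   // references to the appended strings
    private int size;        // number of elements stored so far
    private int len;         // total length of all stored strings (overflow not handled)

    /** expectedCount is the hint under discussion; an assumed API, not an existing one. */
    SizedChain(int expectedCount) {
        elts = new String[Math.max(expectedCount, 1)];
    }

    SizedChain add(CharSequence cs) {
        String s = String.valueOf(cs);
        if (size == elts.length) {
            // Reached only when the hint was too small: double the array.
            elts = Arrays.copyOf(elts, size * 2);
        }
        elts[size++] = s;
        len += s.length();
        return this;
    }

    @Override
    public String toString() {
        // The character buffer is allocated once, with the exact total length.
        char[] chars = new char[len];
        int pos = 0;
        for (int i = 0; i < size; i++) {
            String s = elts[i];
            s.getChars(0, s.length(), chars, pos);
            pos += s.length();
        }
        return new String(chars);
    }

    /** Uses the collection size as the hint, so the array is never resized. */
    static String chain(List<? extends CharSequence> parts) {
        SizedChain chain = new SizedChain(parts.size());
        parts.forEach(chain::add);
        return chain.toString();
    }
}
```

With an accurate hint the resize branch in add() is never taken; with 
a hint that is too small the sketch falls back to doubling.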

On the other hand, with the current allocation scheme, each reference 
stored in elts[] is copied at most twice on average, so the overall 
performance improvement might not be significant enough to justify 
the API change.
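
A back-of-the-envelope check of that estimate, as a small simulation. 
It assumes plain doubling starting from a capacity of 8 (the default 
mentioned in the thread) and only counts reference copies caused by 
resizing. The class and method names are made up for illustration.

```java
/** Illustrative simulation only; it models plain doubling, not the actual JDK code. */
final class GrowthCopies {
    /** Returns how many reference copies doubling growth performs while adding n elements. */
    static long copiesFor(int n, int initialCapacity) {
        int capacity = initialCapacity;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;   // every stored reference moves to the new array
                capacity *= 2;
            }
        }
        return copies;
    }
}
```

Under these assumptions copiesFor(100, 8) gives 120 and 
copiesFor(1000, 8) gives 1016, that is, a little more than one extra 
copy per element. The ratio stays below two for any element count, 
because the copy sizes form a geometric series.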

With kind regards,
Ivan

On 3/9/19 11:52 AM, Сергей Цыпанов wrote:
> Hello Ivan,
>
> Indeed, your solution for Iterables is more compact than mine (it can be even shorter with a method reference); however, it doesn't solve the problem of array reallocation.
>
> See my response to Remi below
>
> 08.03.2019, 22:01, "Ivan Gerasimov" <ivan.gerasimov at oracle.com>:
>> Hi Sergei!
>>
>> As you said, this new class is pretty much like StringJoiner with
>> reduced functionality.
>>
>> For appending all elements of an Iterable you could use
>> list.forEach(s -> sj.add(s)).
>>
>> With kind regards,
>> Ivan
>>
>> On 3/8/19 11:22 AM, Сергей Цыпанов wrote:
>>>   Hello,
>>>
>>>   I have an enhancement proposal for some cases of String concatenation in Java.
>>>
>>>   Currently we concatenate Strings mostly using java.lang.StringBuilder. The main disadvantage of StringBuilder is its underlying char array, or rather the need to resize it when the required capacity exceeds the array length, with the subsequent copying of the array content into a newly allocated array.
>>>
>>>   One existing alternative is StringJoiner. Before JDK 9 it was a kind of decorator over StringBuilder, but it was later reworked to store the appended Strings in a String[] and to accumulate the overall required capacity in an int field. This makes it possible to allocate the char[] only once, with the exact size, in the toString() method, reducing allocation cost.
>>>
>>>   My proposal is to copy the code of StringJoiner into a newly created class java.util.StringChain, drop the code responsible for the delimiter, prefix and suffix, and use it instead of StringBuilder in the common StringBuilder::append concatenation pattern.
>>>
>>>   Possible use-cases for proposed code are:
>>>   - plain String concatenation
>>>   - String::chain (new methods)
>>>   - Stream.collect(Collectors.joining())
>>>   - StringConcatFactory
>>>
>>>   We can create new methods String.chain(Iterable<CharSequence>) and String.chain(CharSequence...) which allow us to encapsulate boilerplate code like
>>>
>>>      StringBuilder sb = new StringBuilder();
>>>      for (CharSequence cs : charSequences) {
>>>        sb.append(cs);
>>>      }
>>>      String result = sb.toString();
>>>
>>>   into one line:
>>>
>>>      String result = String.chain(charSequences);
>>>
>>>   As for performance, I've done some measurements using JMH on my work machine (Intel i7-7700) for both Latin and non-Latin Strings of different sizes and counts.
>>>   Here are the results:
>>>
>>>   https://github.com/stsypanov/string-chain/blob/master/results/StringBuilderVsStringChainBenchmark.txt
>>>
>>>   There are a few corner cases (e.g. 1000 Strings of length 1 appended) where StringBuilder outperforms a StringChain constructed with the default capacity of 8, but a StringChain constructed with the exact count of added Strings almost always wins, especially when dealing with non-Latin chars (Russian in my case).
>>>
>>>   I've also created a separate repo on GitHub with benchmarks:
>>>
>>>   https://github.com/stsypanov/string-chain
>>>
>>>   The key feature here is the ability to allocate a String array of exact size in cases where we know the number of added elements.
>>>   Thus I think that if the change is accepted we can also add an overloaded method String.chain(Collection<CharSequence>), as Collection::size allows us to construct a StringChain of exact size.
>>>
>>>   Patch is attached.
>>>
>>>   Kind regards,
>>>   Sergei Tsypanov
>> --
>> With kind regards,
>> Ivan Gerasimov

-- 
With kind regards,
Ivan Gerasimov



More information about the core-libs-dev mailing list