RFR: 8305774: String.join(CharSequence, Iterable) can be optimized if Iterable is a Collection

Tingjun Yuan duke at openjdk.org
Sun Apr 9 02:40:43 UTC 2023


On Sun, 9 Apr 2023 02:11:44 GMT, jmehrens <duke at openjdk.org> wrote:

>> In the current implementation of `String.join(CharSequence, Iterable)`, the temp array `elems` is always initialized with a length of 8. It will cause many array recreations when the `Iterable` contains more than 8 elements. Furthermore, it's very common that an `Iterable` is also a `Collection`. So if the `Iterable` is an instance of `Collection`, the initial length of the array can be `((Collection<?>)elements).size()`. It will not change the current behavior even if the `Collection` is modified asynchronously.
>> 
>> I don't know whether this change requires a CSR request.
>
> What about using Iterable::spliterator instead of Iterable::iterator (enhanced for loop)?  Spliterator would allow you to call Spliterator::estimateSize and then to default to 8 if less than 8, cast and use if less than or equal to ArraysSupport.SOFT_MAX_ARRAY_LENGTH, or throw OutOfMemoryError if greater than Integer.MAX_VALUE.  The array growth code would have to use ArraysSupport.newLength to deal with huge sizes.
> 
> I'm not sure if bootstrapping issues will prevent you from using lambdas expressions to traverse the Spliterator.  If lambdas expressions are forbidden then this approach will introduce some code smell by having to use inner classes.

Maybe I should consider @jmehrens 's suggestion to check the return value of `elements.spliterator().estimateSize()`. It will return `Long.MAX_VALUE` if the size is expensive to compute (e.g. `ConcurrentLinkedDeque`). This also works if `elements` is not a `Collection`.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13383#issuecomment-1501023984


More information about the core-libs-dev mailing list