RFR: 8305774: String.join(CharSequence, Iterable) can be optimized if Iterable is a Collection [v2]

Chen Liang liach at openjdk.org
Mon Apr 10 04:31:44 UTC 2023


On Mon, 10 Apr 2023 03:36:44 GMT, Chen Liang <liach at openjdk.org> wrote:

>> Tingjun Yuan has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Add benchmark
>
> Nah. I mean like:
> 
>     public static String join(CharSequence delimiter,
>             Iterable<? extends CharSequence> elements) {
>         Objects.requireNonNull(delimiter);
>         Objects.requireNonNull(elements);
>         var delim = delimiter.toString();
>         Object[] elems;
>         final int size;
>         if (elements instanceof Collection<?> c) {
>             elems = c.toArray();
>             size = elems.length;
>             for (int i = 0; i < size; i++) {
>                 elems[i] = String.valueOf(elems[i]);
>             }
>         } else {
>             elems = new String[elements instanceof Collection<?> c ? c.size() : 8]; // or whatever spliterator you decide to use
>             size = 0;
>             for (CharSequence cs: elements) {
>                 if (size >= elems.length) {
>                     elems = Arrays.copyOf(elems, elems.length << 1);
>                 }
>                 elems[size++] = String.valueOf(cs);
>             }
>         }
>         return join("", "", delim, elems, size);
>     }
> // ...
> static String join(String prefix, String suffix, String delimiter, Object[] elements, int size) { // change array type to Object[], cast on access elements

> @liach I don't think it's safe because `c.toArray()` can return an array that may be modified by others. In fact, although the specification of `Collection.toArray()` indicates that the returned array is free to be modified, nobody actually trusts this contract. They only trust `ArrayList` which is most widely-used.

It's safe. We don't keep the array in another object, and it's discarded after we finish the join. The most risky scenario is that the array is shared to another thread that can execute in parallel with the malicious Collection, which can swap out the String elements in the array we've obtained with irrelevant items.

Then you might go for https://github.com/openjdk/jdk/pull/13383#issuecomment-1501340839 but it involves one extra array copy.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13383#issuecomment-1501378416


More information about the core-libs-dev mailing list