RFR: 8305774: String.join(CharSequence, Iterable) can be optimized if Iterable is a Collection [v2]

Tingjun Yuan duke at openjdk.org
Mon Apr 10 04:53:51 UTC 2023


On Mon, 10 Apr 2023 04:28:34 GMT, Chen Liang <liach at openjdk.org> wrote:

>> Nah. I mean like:
>> 
>>     public static String join(CharSequence delimiter,
>>             Iterable<? extends CharSequence> elements) {
>>         Objects.requireNonNull(delimiter);
>>         Objects.requireNonNull(elements);
>>         var delim = delimiter.toString();
>>         Object[] elems;
>>         final int size;
>>         if (elements instanceof Collection<?> c) {
>>             elems = c.toArray();
>>             size = elems.length;
>>             for (int i = 0; i < size; i++) {
>>                 elems[i] = String.valueOf(elems[i]);
>>             }
>>         } else {
>>             elems = new String[elements instanceof Collection<?> c ? c.size() : 8]; // or whatever spliterator you decide to use
>>             size = 0;
>>             for (CharSequence cs: elements) {
>>                 if (size >= elems.length) {
>>                     elems = Arrays.copyOf(elems, elems.length << 1);
>>                 }
>>                 elems[size++] = String.valueOf(cs);
>>             }
>>         }
>>         return join("", "", delim, elems, size);
>>     }
>> // ...
>> static String join(String prefix, String suffix, String delimiter, Object[] elements, int size) { // change array type to Object[], cast on access elements
>
>> @liach I don't think it's safe because `c.toArray()` can return an array that may be modified by others. In fact, although the specification of `Collection.toArray()` indicates that the returned array is free to be modified, nobody actually trusts this contract. They only trust `ArrayList` which is most widely-used.
> 
> It's safe. We don't keep the array in another object, and it's discarded after we finish the join. The most risky scenario is that the array is shared to another thread that can execute in parallel with the malicious Collection, which can swap out the String elements in the array we've obtained with irrelevant items.
> 
> Then you might go for https://github.com/openjdk/jdk/pull/13383#issuecomment-1501340839 but it involves one extra array copy.

But @liach 's sample modifies the returned array directly. It cause problems with this situation:


class BadList implements List<Object> {
    private Object[] array;

    public Object get(int i) {
        return array[i];
    }

    public Object[] toArray() {
        return array;  // no clone
    }

    // ...
}


Modifying the array will accidentally modify the collection implemented as above. That's why we should not trust that the array is *safe to be modified*.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13383#issuecomment-1501387583


More information about the core-libs-dev mailing list