RFR: 8193031: Collections.addAll is likely to perform worse than Collection.addAll
Stuart Marks
stuart.marks at oracle.com
Tue Dec 5 07:27:41 UTC 2017
On 12/4/17 7:50 PM, Martin Buchholz wrote:
> https://bugs.openjdk.java.net/browse/JDK-8193031
> http://cr.openjdk.java.net/~martin/webrevs/jdk/Collections-addAll/
* Adds all of the specified elements to the specified collection.
* Elements to be added may be specified individually or as an array.
* The behavior of this convenience method is identical to that of
- * {@code c.addAll(Arrays.asList(elements))}, but this method is likely
- * to run significantly faster under most implementations.
+ * {@code c.addAll(Arrays.asList(elements))}.
*
* <p>When elements are specified individually, this method provides a
* convenient way to add a few elements to an existing collection:
I agree with the removal of "significantly faster." In some real cases, it's
definitely not! Usually the behavior wording isn't "is identical to". I'd prefer
something like "the behavior is as if...."
public static <T> boolean addAll(Collection<? super T> c, T... elements) {
- boolean result = false;
- for (T element : elements)
- result |= c.add(element);
- return result;
+ return c.addAll(Arrays.asList(elements));
}
So, this seems sensible, but the tradeoffs aren't obvious.
If the destination is an ArrayList, the first thing that addAll() does is call
toArray() on the argument. And Arrays$ArrayList.toArray() makes a copy, as
required by spec. So this redundantly copies every element, compared to the
for-loop.
But I did some quick benchmarks and I found that bulk copies are so fast that
even doing two of them is quite a bit faster than a single copy in a for-loop,
ranging from 4x-6x faster over a variety of sizes.
This also requires temp space the size of the input array. This gives me (and
the garbage collector) a bit of a pause.
Turning to other collections, the AbstractCollection.addAll() implementation
runs essentially the same loop as here, except that it runs it over its
Collection argument instead of an array. Thus it uses an Iterator to loop
instead of indexing over an array. Here, converting to a list has a slight
performance disadvantage, around 10-15% (for HashSet, which uses
AbstractCollection.addAll).
You point out in the bug report that the implementation of Collections.addAll()
misses out on optimizations that implementations can provide for bulk updates.
This is true, but my feeling is that loses something by converting the array
into a Collection before calling the virtual addAll(Collection) method.
How about this alternative: do the same thing we did with sort().
Add a new default method Collection.addEach(T...) whose default implementation
is either the original implementation of Collections.addAll() or your proposed
modification. (We should use a name other than addAll, since overloading with a
varargs method is a bad idea.) Add overrides for key implementations like
ArrayList. Then, have Collections.addAll(c, T... elements) delegate to
c.addEach(elements).
For ArrayList, at least, this would avoid the redundant copy and temporary
allocation, which seems nice.
s'marks
More information about the core-libs-dev
mailing list