RFR: 8193031: Collections.addAll is likely to perform worse than Collection.addAll
Remi Forax
forax at univ-mlv.fr
Tue Dec 5 07:36:34 UTC 2017
+1
being also able to write,
List<String> l = ...
l.addEach("foo", "bar", "baz");
is a nice addition.
Rémi
----- Mail original -----
> De: "Stuart Marks" <stuart.marks at oracle.com>
> À: "Martin Buchholz" <martinrb at google.com>
> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Envoyé: Mardi 5 Décembre 2017 08:27:41
> Objet: Re: RFR: 8193031: Collections.addAll is likely to perform worse than Collection.addAll
> On 12/4/17 7:50 PM, Martin Buchholz wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8193031
>> http://cr.openjdk.java.net/~martin/webrevs/jdk/Collections-addAll/
>
> * Adds all of the specified elements to the specified collection.
> * Elements to be added may be specified individually or as an array.
> * The behavior of this convenience method is identical to that of
> - * {@code c.addAll(Arrays.asList(elements))}, but this method is likely
> - * to run significantly faster under most implementations.
> + * {@code c.addAll(Arrays.asList(elements))}.
> *
> * <p>When elements are specified individually, this method provides a
> * convenient way to add a few elements to an existing collection:
>
> I agree with the removal of "significantly faster." In some real cases, it's
> definitely not! Usually the behavior wording isn't "is identical to". I'd prefer
> something like "the behavior is as if...."
>
> public static <T> boolean addAll(Collection<? super T> c, T... elements) {
> - boolean result = false;
> - for (T element : elements)
> - result |= c.add(element);
> - return result;
> + return c.addAll(Arrays.asList(elements));
> }
>
> So, this seems sensible, but the tradeoffs aren't obvious.
>
> If the destination is an ArrayList, the first thing that addAll() does is call
> toArray() on the argument. And Arrays$ArrayList.toArray() makes a copy, as
> required by spec. So this redundantly copies every element, compared to the
> for-loop.
>
> But I did some quick benchmarks and I found that bulk copies are so fast that
> even doing two of them is quite a bit faster than a single copy in a for-loop,
> ranging from 4x-6x faster over a variety of sizes.
>
> This also requires temp space the size of the input array. This gives me (and
> the garbage collector) a bit of a pause.
>
> Turning to other collections, the AbstractCollection.addAll() implementation
> runs essentially the same loop as here, except that it runs it over its
> Collection argument instead of an array. Thus it uses an Iterator to loop
> instead of indexing over an array. Here, converting to a list has a slight
> performance disadvantage, around 10-15% (for HashSet, which uses
> AbstractCollection.addAll).
>
> You point out in the bug report that the implementation of Collections.addAll()
> misses out on optimizations that implementations can provide for bulk updates.
> This is true, but my feeling is that loses something by converting the array
> into a Collection before calling the virtual addAll(Collection) method.
>
> How about this alternative: do the same thing we did with sort().
>
> Add a new default method Collection.addEach(T...) whose default implementation
> is either the original implementation of Collections.addAll() or your proposed
> modification. (We should use a name other than addAll, since overloading with a
> varargs method is a bad idea.) Add overrides for key implementations like
> ArrayList. Then, have Collections.addAll(c, T... elements) delegate to
> c.addEach(elements).
>
> For ArrayList, at least, this would avoid the redundant copy and temporary
> allocation, which seems nice.
>
> s'marks
More information about the core-libs-dev
mailing list