RFR: 8193031: Collections.addAll is likely to perform worse than Collection.addAll

Remi Forax forax at univ-mlv.fr
Tue Dec 5 07:36:34 UTC 2017


+1

being also able to write,
  List<String> l = ...
  l.addEach("foo", "bar", "baz");
is a nice addition.

Rémi

----- Mail original -----
> De: "Stuart Marks" <stuart.marks at oracle.com>
> À: "Martin Buchholz" <martinrb at google.com>
> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Envoyé: Mardi 5 Décembre 2017 08:27:41
> Objet: Re: RFR: 8193031: Collections.addAll is likely to perform worse than Collection.addAll

> On 12/4/17 7:50 PM, Martin Buchholz wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8193031
>> http://cr.openjdk.java.net/~martin/webrevs/jdk/Collections-addAll/
> 
>       * Adds all of the specified elements to the specified collection.
>       * Elements to be added may be specified individually or as an array.
>       * The behavior of this convenience method is identical to that of
> -     * {@code c.addAll(Arrays.asList(elements))}, but this method is likely
> -     * to run significantly faster under most implementations.
> +     * {@code c.addAll(Arrays.asList(elements))}.
>       *
>       * <p>When elements are specified individually, this method provides a
>       * convenient way to add a few elements to an existing collection:
> 
> I agree with the removal of "significantly faster." In some real cases, it's
> definitely not! Usually the behavior wording isn't "is identical to". I'd prefer
> something like "the behavior is as if...."
> 
>      public static <T> boolean addAll(Collection<? super T> c, T... elements) {
> -        boolean result = false;
> -        for (T element : elements)
> -            result |= c.add(element);
> -        return result;
> +        return c.addAll(Arrays.asList(elements));
>      }
> 
> So, this seems sensible, but the tradeoffs aren't obvious.
> 
> If the destination is an ArrayList, the first thing that addAll() does is call
> toArray() on the argument. And Arrays$ArrayList.toArray() makes a copy, as
> required by spec. So this redundantly copies every element, compared to the
> for-loop.
> 
> But I did some quick benchmarks and I found that bulk copies are so fast that
> even doing two of them is quite a bit faster than a single copy in a for-loop,
> ranging from 4x-6x faster over a variety of sizes.
> 
> This also requires temp space the size of the input array. This gives me (and
> the garbage collector) a bit of a pause.
> 
> Turning to other collections, the AbstractCollection.addAll() implementation
> runs essentially the same loop as here, except that it runs it over its
> Collection argument instead of an array. Thus it uses an Iterator to loop
> instead of indexing over an array. Here, converting to a list has a slight
> performance disadvantage, around 10-15% (for HashSet, which uses
> AbstractCollection.addAll).
> 
> You point out in the bug report that the implementation of Collections.addAll()
> misses out on optimizations that implementations can provide for bulk updates.
> This is true, but my feeling is that loses something by converting the array
> into a Collection before calling the virtual addAll(Collection) method.
> 
> How about this alternative: do the same thing we did with sort().
> 
> Add a new default method Collection.addEach(T...) whose default implementation
> is either the original implementation of Collections.addAll() or your proposed
> modification. (We should use a name other than addAll, since overloading with a
> varargs method is a bad idea.) Add overrides for key implementations like
> ArrayList. Then, have Collections.addAll(c, T... elements) delegate to
> c.addEach(elements).
> 
> For ArrayList, at least, this would avoid the redundant copy and temporary
> allocation, which seems nice.
> 
> s'marks


More information about the core-libs-dev mailing list