Proposal to enhance Stream.collect
August Nagro
augustnagro at gmail.com
Mon Mar 4 00:35:57 UTC 2019
Hi everyone,
My implementation is at
https://github.com/AugustNagro/presize-collectors-bench and I have a webrev
here: http://august.nagro.us/presized-collectors/webrev/.
The changes that I made were:
- Add `default IntFunction<A> sizedSupplier()` to Collector
- Add 2 new Collector.of helper methods taking a sized supplier
- Update the applicable collectors in Collectors to provide a
sizedSupplier
- Have ReferencePipeline & ReduceOps call sizedSupplier when appropriate
Some notes/questions I have:
- I considered adding Collector.Characteristics.PRESIZEABLE, but figured
it was not needed given that sizedSupplier should always delegate to
supplier.
- Collector.sizedSupplier could be a LongFunction instead of IntFunction
(since Sink.begin provides long sizes), but every Collection type I've
looked at uses int for initialCapacity, so I think IntFunction makes more
sense.
- StringJoiner should be updated to take an initalCapacity (should I
commit this change in the webrev, or leave for another enhancement?)
- What tests should I add for this enhancement?
There are some benchmarks in the github repo. I was initially surprised
when I saw that my bare-bones parallel stream did not have a throughput
improvement like the serial one, but after doing some debugging I think it
is from Collector.combiner dominating the runtime.
Regards,
August
On Thu, Feb 28, 2019 at 12:56 AM Tagir Valeev <amaembo at gmail.com> wrote:
> Hello!
>
> I wouldn't use presized HashSet, because you never know how many
> duplicates are expected. What if the input stream has a million of
> elements, but only two of them are distinct? Do you really want to allocate
> a hash table for million elements in advance?
>
> For toMap() without custom supplier and merger preallocation sounds
> reasonable as in this case duplicating key is an exceptional case, so we
> may expect a specific number of elements.
>
> чт, 28 февр. 2019 г., 11:38 August Nagro augustnagro at gmail.com:
>
>> Tagir,
>>
>> Great to see the validating benchmarks.
>>
>> > I think it's the best solution given the fact that very few collectors may
>> benefit from the exact known size, and this benefit usually disappears
>> when collectors are composed (e.g. using groupingBy: downstream toList()
>> would not win anything if it provides a sizedSupplier).
>>
>> I would like to see your benchmarks for that statement too. After all,
>> Hashmap, HashSet, etc all have presizing constructors, aren't those there
>> for a reason?
>>
>> So I am still convinced that presizing is important for other collector
>> types, including custom collectors. Although I'm open to having my mind
>> changed.
>>
>> I'm happy to contribute an implementation as well, I was hoping that
>> this this (smallish) patch would help me learn the OpenJdk process and make
>> future contributions easier!
>>
>> - August
>>
>> On Wed, Feb 27, 2019 at 1:06 AM Tagir Valeev <amaembo at gmail.com> wrote:
>>
>>> Hello!
>>>
>>> > A less intrusive API direction might be a version of Collector whose
>>> > supplier function took a size-estimate argument; this might even help
>>> in
>>> > parallel since it allows for intermediate results to start with a
>>> better
>>> > initial size guess. (And this could be implemented as a default that
>>> > delegates to the existing supplier.) Still, not really sure this
>>> > carries its weight.
>>>
>>> It's interesting that such optimization is possible without API
>>> change, using the "hidden knowledge".
>>> While public API stays the same, the collect method implementation may
>>> check if custom
>>> non-public Collector implementation is supplied, and in this case may
>>> use the sized supplier.
>>> I think it's the best solution given the fact that very few collectors
>>> may benefit from the exact
>>> known size, and this benefit usually disappears when collectors are
>>> composed (e.g. using groupingBy:
>>> downstream toList() would not win anything if it provides a
>>> sizedSupplier).
>>>
>>> I created a quick patch and launched a quick benchmark like this:
>>> return new SplittableRandom(0).ints(length, 0,
>>> 100).boxed().collect(Collectors.toList());
>>> For length = 100, 10000, 1000000
>>>
>>> Here's results for vanilla 13-ea+8:
>>> length = 100, avg time = 1434,469 ± 27,147 ns/op, alloc rate = 1712 B/op
>>> length = 10000, avg time = 155032,390 ± 2657,927 ns/op, alloc rate =
>>> 169280 B/op
>>> length = 1000000, avg time = 27099621,763 ± 1979054,500 ns/op, alloc
>>> rate = 14586750 B/op
>>>
>>> Patched:
>>> length = 100, avg time = 1414,480 ± 30,040 ns/op, alloc rate = 768 B/op
>>> length = 10000, avg time = 137338,320 ± 1316,789 ns/op, alloc rate =
>>> 40368 B/op
>>> length = 1000000, avg time = 17654635,871 ± 210607,441 ns/op, alloc
>>> rate = 4000382 B/op
>>>
>>> As you can see, exact allocation really helps to reduce alloc pressure
>>> and produces better performance for big lists.
>>>
>>> If such kind of patch is acceptable for Stream API, I can file a
>>> ticket and submit a webrev.
>>>
>>> With best regards,
>>> Tagir Valeev
>>>
>>> The patch follows:
>>>
>>> --- src/java.base/share/classes/java/util/stream/Collectors.java
>>> (revision 53626:e2fc434b410a35b28ab433c29863c8a26e4e813a)
>>> +++ src/java.base/share/classes/java/util/stream/Collectors.java
>>> (revision 53626+:e2fc434b410a+)
>>> @@ -50,6 +50,7 @@
>>> import java.util.function.BinaryOperator;
>>> import java.util.function.Consumer;
>>> import java.util.function.Function;
>>> +import java.util.function.IntFunction;
>>> import java.util.function.Predicate;
>>> import java.util.function.Supplier;
>>> import java.util.function.ToDoubleFunction;
>>> @@ -194,23 +195,34 @@
>>> */
>>> static class CollectorImpl<T, A, R> implements Collector<T, A, R> {
>>> private final Supplier<A> supplier;
>>> + private final IntFunction<A> sizedSupplier;
>>> private final BiConsumer<A, T> accumulator;
>>> private final BinaryOperator<A> combiner;
>>> private final Function<A, R> finisher;
>>> private final Set<Characteristics> characteristics;
>>>
>>> CollectorImpl(Supplier<A> supplier,
>>> + IntFunction<A> sizedSupplier,
>>> BiConsumer<A, T> accumulator,
>>> BinaryOperator<A> combiner,
>>> Function<A,R> finisher,
>>> Set<Characteristics> characteristics) {
>>> this.supplier = supplier;
>>> + this.sizedSupplier = sizedSupplier;
>>> this.accumulator = accumulator;
>>> this.combiner = combiner;
>>> this.finisher = finisher;
>>> this.characteristics = characteristics;
>>> }
>>>
>>> + CollectorImpl(Supplier<A> supplier,
>>> + BiConsumer<A, T> accumulator,
>>> + BinaryOperator<A> combiner,
>>> + Function<A,R> finisher,
>>> + Set<Characteristics> characteristics) {
>>> + this(supplier, null, accumulator, combiner, finisher,
>>> characteristics);
>>> + }
>>> +
>>> CollectorImpl(Supplier<A> supplier,
>>> BiConsumer<A, T> accumulator,
>>> BinaryOperator<A> combiner,
>>> @@ -228,6 +240,10 @@
>>> return supplier;
>>> }
>>>
>>> + IntFunction<A> sizedSupplier() {
>>> + return sizedSupplier;
>>> + }
>>> +
>>> @Override
>>> public BinaryOperator<A> combiner() {
>>> return combiner;
>>> @@ -275,8 +291,11 @@
>>> */
>>> public static <T>
>>> Collector<T, ?, List<T>> toList() {
>>> - return new CollectorImpl<>((Supplier<List<T>>)
>>> ArrayList::new, List::add,
>>> + return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new,
>>> + ArrayList::new,
>>> + List::add,
>>> (left, right) -> {
>>> left.addAll(right); return left; },
>>> + castingIdentity(),
>>> CH_ID);
>>> }
>>>
>>> Index: src/java.base/share/classes/java/util/stream/ReduceOps.java
>>> IDEA additional info:
>>> Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
>>> <+>UTF-8
>>> ===================================================================
>>> --- src/java.base/share/classes/java/util/stream/ReduceOps.java
>>> (revision 53626:e2fc434b410a35b28ab433c29863c8a26e4e813a)
>>> +++ src/java.base/share/classes/java/util/stream/ReduceOps.java
>>> (revision 53626+:e2fc434b410a+)
>>> @@ -36,6 +36,7 @@
>>> import java.util.function.BinaryOperator;
>>> import java.util.function.DoubleBinaryOperator;
>>> import java.util.function.IntBinaryOperator;
>>> +import java.util.function.IntFunction;
>>> import java.util.function.LongBinaryOperator;
>>> import java.util.function.ObjDoubleConsumer;
>>> import java.util.function.ObjIntConsumer;
>>> @@ -155,13 +156,20 @@
>>> public static <T, I> TerminalOp<T, I>
>>> makeRef(Collector<? super T, I, ?> collector) {
>>> Supplier<I> supplier =
>>> Objects.requireNonNull(collector).supplier();
>>> + @SuppressWarnings("unchecked")
>>> + IntFunction<I> sizedSupplier = collector instanceof
>>> Collectors.CollectorImpl ?
>>> + ((Collectors.CollectorImpl<?, I, ?>)
>>> collector).sizedSupplier() : null;
>>> BiConsumer<I, ? super T> accumulator = collector.accumulator();
>>> BinaryOperator<I> combiner = collector.combiner();
>>> class ReducingSink extends Box<I>
>>> implements AccumulatingSink<T, I, ReducingSink> {
>>> @Override
>>> public void begin(long size) {
>>> - state = supplier.get();
>>> + if (sizedSupplier != null && size >= 0 && size <=
>>> Integer.MAX_VALUE) {
>>> + state = sizedSupplier.apply((int) size);
>>> + } else {
>>> + state = supplier.get();
>>> + }
>>> }
>>>
>>> @Override
>>>
>>
More information about the core-libs-dev
mailing list