Proposal to enhance Stream.collect

August Nagro augustnagro at gmail.com
Sat Feb 23 22:27:54 UTC 2019


Calling Stream.collect(Collector) is a popular terminal stream operation.
But because the collect methods give the Collector no information about
the stream's characteristics, collectors are not as efficient as they
could be.

For example, consider a non-parallel, sized stream that is to be collected
as a List. This is a very common case for streams with a Collection source.
Because the stream is sequential and its size is known, the list returned
by Collector.supplier() could be created with an initial capacity equal to
the stream's size (the supplier runs once, and Collector.combiner() will
never be called), but the current API gives the Collector no way to do this.
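
To make the missed opportunity concrete, here is a minimal sketch of what a
caller has to do today: pass the size in by hand. The class PresizedCollect
and the method toListWithCapacity are hypothetical names for illustration
only; they are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collector;

final class PresizedCollect {
    // Hypothetical helper: a toList-style collector whose supplier
    // allocates the backing array once, at the expected size.
    static <T> Collector<T, ?, List<T>> toListWithCapacity(int expectedSize) {
        return Collector.<T, List<T>>of(
            () -> new ArrayList<>(expectedSize),
            List::add,
            // Only reached for parallel streams.
            (left, right) -> { left.addAll(right); return left; });
    }

    // The caller knows the size because it still holds the source Collection;
    // the Collector itself has no way to ask the stream for it.
    static <T> List<T> copy(Collection<T> source) {
        return source.stream().collect(toListWithCapacity(source.size()));
    }
}
```

This only works when the caller still has the source at hand, which is
the limitation the proposal is trying to remove.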

I should note that the characteristics important to collectors are those
exposed by Spliterator, namely Spliterator::characteristics,
Spliterator::estimateSize, and Spliterator::getExactSizeIfKnown.

One way this enhancement could be implemented is by adding a method
Stream.collect(Function<ReadOnlySpliterator, Collector> collectorBuilder).
ReadOnlySpliterator would implement the spliterator methods mentioned
above, and Spliterator would be made to implement this interface.

For example, here is a gist with what Collectors.toList could look like:
https://gist.github.com/AugustNagro/e66a0ddf7d47b4f11fec8760281bb538
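
For readers who do not want to follow the link, the shape of the idea can
be sketched roughly as below. This is not the code from the gist. The
interface body, the class CollectSketch, and the static collect helper are
assumptions made for illustration; the helper only emulates the proposed
Stream.collect overload on top of the existing Stream.spliterator() and
StreamSupport.stream().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Assumed shape of the proposed read-only view (not an existing JDK type).
interface ReadOnlySpliterator {
    int characteristics();
    long estimateSize();
    long getExactSizeIfKnown();
}

final class CollectSketch {
    // Hypothetical stand-in for the proposed Stream.collect(Function) overload.
    static <T, R> R collect(
            Stream<T> stream,
            Function<ReadOnlySpliterator, Collector<? super T, ?, R>> collectorBuilder) {
        boolean parallel = stream.isParallel();
        Spliterator<T> sp = stream.spliterator();
        ReadOnlySpliterator view = new ReadOnlySpliterator() {
            public int characteristics() { return sp.characteristics(); }
            public long estimateSize() { return sp.estimateSize(); }
            public long getExactSizeIfKnown() { return sp.getExactSizeIfKnown(); }
        };
        // The collector is built only after the characteristics are visible.
        Collector<? super T, ?, R> collector = collectorBuilder.apply(view);
        return StreamSupport.stream(sp, parallel).collect(collector);
    }

    // A toList variant that presizes the list when the exact size is known.
    static <T> Collector<T, ?, List<T>> toList(ReadOnlySpliterator source) {
        long size = source.getExactSizeIfKnown();
        if (size < 0 || size > Integer.MAX_VALUE) {
            return Collectors.toList(); // size unknown: fall back to today's behavior
        }
        int capacity = (int) size;
        return Collector.<T, List<T>>of(
            () -> new ArrayList<>(capacity),
            List::add,
            (left, right) -> { left.addAll(right); return left; });
    }
}
```

Because the helper goes through Stream.spliterator(), it only sees
whatever characteristics that method reports. Presizing also pays off
mainly for sequential streams, since a parallel collect calls the supplier
more than once.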

ReadOnlySpliterator may need to be replaced with some stream-specific
abstraction, however, since Stream.spliterator() does not return a
spliterator with the characteristics I would expect. The expression below
evaluates to false, for example (is this a bug?):

Stream.of(1, 2, 3).parallel().map(i -> i + 1)
    .spliterator().hasCharacteristics(Spliterator.CONCURRENT)
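
For anyone who wants to reproduce this, a small self-contained program
along the following lines should do. The class and method names
(CharacteristicsCheck, pipelineSpliterator, describe) are made up for this
sketch. It only prints what the spliterator reports and makes no claim
about what the values ought to be.

```java
import java.util.Spliterator;
import java.util.stream.Stream;

final class CharacteristicsCheck {
    // The same pipeline as in the expression above.
    static Spliterator<Integer> pipelineSpliterator() {
        return Stream.of(1, 2, 3).parallel().map(i -> i + 1).spliterator();
    }

    // Lists the names of the characteristics the spliterator reports.
    static String describe(Spliterator<?> sp) {
        StringBuilder sb = new StringBuilder();
        if (sp.hasCharacteristics(Spliterator.ORDERED)) sb.append("ORDERED ");
        if (sp.hasCharacteristics(Spliterator.DISTINCT)) sb.append("DISTINCT ");
        if (sp.hasCharacteristics(Spliterator.SORTED)) sb.append("SORTED ");
        if (sp.hasCharacteristics(Spliterator.SIZED)) sb.append("SIZED ");
        if (sp.hasCharacteristics(Spliterator.NONNULL)) sb.append("NONNULL ");
        if (sp.hasCharacteristics(Spliterator.IMMUTABLE)) sb.append("IMMUTABLE ");
        if (sp.hasCharacteristics(Spliterator.CONCURRENT)) sb.append("CONCURRENT ");
        if (sp.hasCharacteristics(Spliterator.SUBSIZED)) sb.append("SUBSIZED ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Spliterator<Integer> sp = pipelineSpliterator();
        System.out.println("CONCURRENT: "
            + sp.hasCharacteristics(Spliterator.CONCURRENT));
        System.out.println("reported:   " + describe(sp));
        System.out.println("exact size: " + sp.getExactSizeIfKnown());
    }
}
```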

Looking forward to your thoughts,

- August Nagro


More information about the core-libs-dev mailing list