JDK-8072840: Presizing for Stream Collectors

Remi Forax forax at univ-mlv.fr
Sat Feb 15 14:44:09 UTC 2025


> From: "Viktor Klang" <viktor.klang at oracle.com>
> To: "Paul Sandoz" <paul.sandoz at oracle.com>, "Fabian Meumertzheim"
> <fabian at buildbuddy.io>
> Cc: "core-libs-dev" <core-libs-dev at openjdk.org>
> Sent: Thursday, February 13, 2025 11:30:59 PM
> Subject: Re: JDK-8072840: Presizing for Stream Collectors

> Indeed. I hope I didn't sound discouraging about the possibility to propagate
> the stream size information.

> I merely want to emphasize that it may necessitate a slightly broader take on
> the problem of propagation of stream-instance metadata, especially in the face
> of Gatherers becoming a finalized feature.

We already have an abstraction for propagating metadata: the query part of Spliterator (characteristics/estimateSize/comparator etc.; technically, all the abstract methods that do not start with "try").
For a Gatherer, we need a way to say whether a characteristic is preserved or removed.
For a collector, we need a way to have a supplier that takes a Spliterator (a synthetic one, not the one that powers the actual stream) so the characteristics can be queried. 
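
To make the collector part concrete, here is a minimal sketch in plain Java, not an existing or proposed JDK API. The names PresizingSketch, metadataOnly, presizedList and collectPresized are hypothetical. It assumes the "supplier that takes a Spliterator" would receive a metadata-only view that answers the query methods but refuses traversal, and it uses a hand-rolled terminal operation in place of a real Collector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Stream;

class PresizingSketch {
    // Hypothetical helper: a synthetic Spliterator that exposes only the
    // query methods of the source and rejects any attempt to traverse it.
    static <T> Spliterator<T> metadataOnly(Spliterator<T> source) {
        int characteristics = source.characteristics();
        long estimate = source.estimateSize();
        boolean sorted = source.hasCharacteristics(Spliterator.SORTED);
        // A null comparator means natural ordering, as in Spliterator itself.
        Comparator<? super T> comparator = sorted ? source.getComparator() : null;
        return new Spliterator<T>() {
            @Override public boolean tryAdvance(Consumer<? super T> action) {
                throw new UnsupportedOperationException("metadata only");
            }
            @Override public Spliterator<T> trySplit() {
                throw new UnsupportedOperationException("metadata only");
            }
            @Override public long estimateSize() { return estimate; }
            @Override public int characteristics() { return characteristics; }
            @Override public Comparator<? super T> getComparator() {
                if (!sorted) throw new IllegalStateException();
                return comparator;
            }
        };
    }

    // Hypothetical "supplier that takes a Spliterator": presize when the
    // exact size is known, otherwise fall back to the default capacity.
    static <T> ArrayList<T> presizedList(Spliterator<?> metadata) {
        long size = metadata.getExactSizeIfKnown(); // -1 unless SIZED
        if (size >= 0 && size <= Integer.MAX_VALUE) {
            return new ArrayList<>((int) size);
        }
        return new ArrayList<>();
    }

    // Stand-in for a terminal operation that consults the metadata first.
    static <T> List<T> collectPresized(Stream<T> stream) {
        Spliterator<T> source = stream.spliterator();
        ArrayList<T> list = presizedList(metadataOnly(source));
        source.forEachRemaining(list::add);
        return list;
    }
}
```

With this shape, a stream built directly from a sized source gets a presized list, while a pipeline containing filter() no longer reports SIZED and falls back to the default constructor.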

> It's great that you started this conversation, Fabian!

> Cheers,
>
regards, 
Rémi 

> Viktor Klang
> Software Architect, Java Platform Group
> Oracle

> From: core-libs-dev <core-libs-dev-retn at openjdk.org> on behalf of Paul Sandoz
> <paul.sandoz at oracle.com>
> Sent: Thursday, 13 February 2025 20:18
> To: Fabian Meumertzheim <fabian at buildbuddy.io>
> Cc: core-libs-dev <core-libs-dev at openjdk.org>
> Subject: Re: JDK-8072840: Presizing for Stream Collectors
> Hi Fabian,

> Thanks for sharing and reaching out with the idea before getting too beholden to
> it.

> I logged this quite a while ago. It seemed like a possible good idea at the
> time, although I never liked the duplication of suppliers. I have become less
> enthusiastic over time, especially so as Gatherers have been added. (Gatherer is
> the underlying primitive we could not find when we were furiously developing
> streams and meeting the Java 8 deadline.) My sense is if we are going to
> address it we need to think more broadly about Gatherers. And, Viktor being the
> lead on Gatherers has a good take on where this might head.

> Paul.

> > On Feb 12, 2025, at 2:09 AM, Fabian Meumertzheim <fabian at buildbuddy.io> wrote:

> > As an avid user of Guava's ImmutableCollections, I have been
> > interested in ways to close the efficiency gap between the built-in
> > `Stream#toList()` and third-party `Collector` implementations such as
> > `ImmutableList#toImmutableList()`. I've found the biggest problem to
> > be the lack of sizing information in `Collector`s, which led me to
> > draft a solution to JDK-8072840:
> > https://github.com/openjdk/jdk/pull/23461

> > The benchmark shows pretty significant gains for sized streams that
> > mostly reshape data (e.g. slice records or turn a list into a map by
> > associating keys), which I've found to be a pretty common use case.

> > Before I formally send out the PR for review, I would like to gather
> > feedback on the design aspects of it (rather than the exact
> > implementation). I will thus leave it in draft mode for now, but
> > invite anyone to comment on it or on this thread.

> > Fabian