<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
>I could see that being useful for properties such as non-nullness,<br>
which would allow collections such as ImmutableList to skip the null<br>
check in the end.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I'm thinking of things like ordered/unordered, whether the stream is parallel or not (one might want to use a different representation for a sequential stream), etc.<br>
<br>
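To make that concrete, here is a rough sketch (not a proposal) of how such metadata could drive the choice of container. StreamInfo, SimpleStreamInfo, initialCapacity and newContainer are made-up names for illustration, and none of them exist in the JDK. The rule of not presizing for parallel streams is likewise only an assumption.<br>
<br>
<pre>
```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

final class StreamInfoSketch {
    // Hypothetical metadata carrier handed to a Collector's supplier.
    interface StreamInfo {
        OptionalLong exactSize(); // empty when the element count is unknown
        boolean ordered();
        boolean parallel();
    }

    // Trivial implementation, only here to make the sketch runnable.
    record SimpleStreamInfo(OptionalLong exactSize, boolean ordered, boolean parallel)
            implements StreamInfo {}

    static final int DEFAULT_CAPACITY = 10;   // arbitrary fallback
    static final int MAX_PRESIZE = 1 << 24;   // arbitrary guard against huge allocations

    // Decide how much room to reserve up front.
    static int initialCapacity(StreamInfo info) {
        // Assumption: in a parallel evaluation each partial container only
        // receives a slice of the elements, so the full size is not useful.
        if (info.parallel() || info.exactSize().isEmpty()) {
            return DEFAULT_CAPACITY;
        }
        return (int) Math.min(info.exactSize().getAsLong(), MAX_PRESIZE);
    }

    // Stand-in for an info-aware variant of Collector.supplier().
    static <T> List<T> newContainer(StreamInfo info) {
        return new ArrayList<>(initialCapacity(info));
    }
}
```
</pre>
With a plain long only the size could be expressed; a type along these lines would leave room for further properties (non-nullness, say) later on.<br>
<br>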
>Do you think that there could be a need to pass stream information to<br>
anything other than the Gatherer's state initializer? Based on a<br>
cursory glance, it looks straightforward to pass the same info to it<br>
as to the Collector. If that's true and we go with a more extensible<br>
design than a plain long, Gatherers could be opted in in follow-up<br>
work.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
It's more involved than that. Since Gatherers produce output, we would need a scheme that lets each Gatherer communicate upper and lower bounds on its output. That information would then have to be threaded through the chain of Gatherers and
emerge on the other side. This goes beyond communicating characteristics, because the bounds depend on the stream itself and not merely on the operation.</div>
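<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
As a rough illustration of the threading problem, and only a sketch under my own assumptions: suppose every stage declared how it transforms a pair of bounds. Bounds, oneToOne, mayDrop, fixedWindows and through are hypothetical names, not existing or planned API.<br>
<br>
<pre>
```java
import java.util.List;
import java.util.function.UnaryOperator;

final class BoundsSketch {
    // Hypothetical lower/upper bound on the number of elements.
    record Bounds(long lower, long upper) {
        static final long UNKNOWN = Long.MAX_VALUE; // sentinel: no upper bound known

        static Bounds exactly(long n) {
            return new Bounds(n, n);
        }
    }

    // A map-like stage emits exactly one element per input.
    static UnaryOperator<Bounds> oneToOne() {
        return b -> b;
    }

    // A filter-like stage may drop anything, so only the upper bound survives.
    static UnaryOperator<Bounds> mayDrop() {
        return b -> new Bounds(0, b.upper());
    }

    // A fixed-size windowing stage emits ceil(n / size) windows.
    static UnaryOperator<Bounds> fixedWindows(int size) {
        return b -> new Bounds(
                ceilDiv(b.lower(), size),
                b.upper() == Bounds.UNKNOWN ? Bounds.UNKNOWN : ceilDiv(b.upper(), size));
    }

    // Ceiling division for non-negative n, written to avoid overflow.
    static long ceilDiv(long n, int size) {
        return n / size + (n % size == 0 ? 0 : 1);
    }

    // Thread the source bounds through every stage, in pipeline order.
    static Bounds through(Bounds source, List<UnaryOperator<Bounds>> stages) {
        Bounds current = source;
        for (UnaryOperator<Bounds> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }
}
```
</pre>
For a source of exactly 10 elements, oneToOne followed by fixedWindows(3) yields bounds of (4, 4), while mayDrop followed by fixedWindows(3) yields (0, 4). The result is a function of the source size, which is what sets it apart from static characteristics.</div>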
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="color: inherit;" id="Signature">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Cheers,<br>
√</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<b>Viktor Klang</b></div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Software Architect, Java Platform Group<br>
Oracle</div>
</div>
<div id="appendonsend" style="color: inherit;"></div>
<div style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<hr style="display: inline-block; width: 98%;">
<div id="divRplyFwdMsg" dir="ltr" style="color: inherit;"><span style="font-family: Calibri, sans-serif; font-size: 11pt; color: rgb(0, 0, 0);"><b>From:</b> Fabian Meumertzheim <fabian@buildbuddy.io><br>
<b>Sent:</b> Thursday, 13 February 2025 17:11<br>
<b>To:</b> Viktor Klang <viktor.klang@oracle.com><br>
<b>Cc:</b> core-libs-dev@openjdk.org <core-libs-dev@openjdk.org><br>
<b>Subject:</b> [External] : Re: JDK-8072840: Presizing for Stream Collectors</span>
<div> </div>
</div>
<div class="elementToProof" style="font-size: 11pt;">On Thu, Feb 13, 2025 at 3:06 PM Viktor Klang <viktor.klang@oracle.com> wrote:<br>
> While it may look enticing to merely propagate expected element count as an input parameter to the supplier function,<br>
> I think it deserves some extra thought, specifically if it may make more sense to pass some sort of StreamInfo type which can provide more metadata in the future.<br>
<br>
I could see that being useful for properties such as non-nullness,<br>
which would allow collections such as ImmutableList to skip the null<br>
check in the end.<br>
<br>
> Another open question is how to propagate this information through Gatherers (i.e. a bigger scope than Collector-augmentation) to enable more sophisticated optimizations—because ultimately the availability of the information throughout the pipeline is going
to be important for Collector.<br>
<br>
Do you think that there could be a need to pass stream information to<br>
anything other than the Gatherer's state initializer? Based on a<br>
cursory glance, it looks straightforward to pass the same info to it<br>
as to the Collector. If that's true and we go with a more extensible<br>
design than a plain long, Gatherers could be opted in in follow-up<br>
work.<br>
<br>
Best,<br>
Fabian<br>
<br>
><br>
><br>
> Cheers,<br>
> √<br>
><br>
><br>
> Viktor Klang<br>
> Software Architect, Java Platform Group<br>
> Oracle<br>
> ________________________________<br>
> From: core-libs-dev <core-libs-dev-retn@openjdk.org> on behalf of Fabian Meumertzheim <fabian@buildbuddy.io><br>
> Sent: Wednesday, 12 February 2025 11:09<br>
> To: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org><br>
> Subject: JDK-8072840: Presizing for Stream Collectors<br>
><br>
> As an avid user of Guava's ImmutableCollections, I have been<br>
> interested in ways to close the efficiency gap between the built-in<br>
> `Stream#toList()` and third-party `Collector` implementations such as<br>
> `ImmutableList#toImmutableList()`. I've found the biggest problem to<br>
> be the lack of sizing information in `Collector`s, which led me to<br>
> draft a solution to JDK-8072840:<br>
> <a href="https://urldefense.com/v3/__https://github.com/openjdk/jdk/pull/23461__;!!ACWV5N9M2RV99hQ!N-RbriJ93dED1WYLFxFZ4dD5oTx5wqPCPTmv4Oivm3IFJTHNwZ1v3d228Ifs8SdFJwcc7YZnCuNZXG9LmQ3ZCA4$" id="OWA1aba4f11-0826-37c1-6e84-c53b7878075d" class="OWAAutoLink" data-auth="NotApplicable">
https://urldefense.com/v3/__https://github.com/openjdk/jdk/pull/23461__;!!ACWV5N9M2RV99hQ!N-RbriJ93dED1WYLFxFZ4dD5oTx5wqPCPTmv4Oivm3IFJTHNwZ1v3d228Ifs8SdFJwcc7YZnCuNZXG9LmQ3ZCA4$</a><br>
><br>
> The benchmark shows pretty significant gains for sized streams that<br>
> mostly reshape data (e.g. slice records or turn a list into a map by<br>
> associating keys), which I've found to be a pretty common use case.<br>
><br>
> Before I formally send out the PR for review, I would like to gather<br>
> feedback on the design aspects of it (rather than the exact<br>
> implementation). I will thus leave it in draft mode for now, but<br>
> invite anyone to comment on it or on this thread.<br>
><br>
> Fabian</div>
</body>
</html>