[External] : Re: Fw: New candidate JEP: 485: Stream Gatherers

Viktor Klang viktor.klang at oracle.com
Wed Oct 16 11:42:14 UTC 2024


Hi Cay,

>Maybe just a mental block on my side. If other programmers worked with the API and found it consistent, don't mind me. I just thought to bring it up in case this has not received a lot of scrutiny. It is a pretty specialized aspect, after all.

Perhaps the best thing here is to just let it sit for a while and see if some time is all that's needed.

>PS. I also think it would be helpful to be more explicit in the javadoc about the value of exploiting greediness. What is the sweet spot? Recognizing that nothing in the pipeline is short-circuiting and then pushing elements through without checking? Or more aggressive parallelization?

I guess it depends on "sweet spot for whom". From an implementor's perspective, if you don't intend to initiate a short-circuit signal, using ofGreedy is a good way both to remind yourself and to send that signal to Gatherer-evaluators. Omitting ofGreedy won't break anything. If you implement a Gatherer-evaluator, I think it is safe to presume that it depends entirely on the use case, and of course all optimization work should be based on sensible benchmarking.
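
To make the implementor's side concrete, here is a minimal sketch, assuming JDK 24 or later (or JDK 22/23 with preview features enabled). The class name GreedyExample and the helpers passThroughGreedy, limitTo and isGreedy are made up for illustration and are not part of the JDK. The first gatherer never initiates short-circuiting and says so via Integrator.ofGreedy; the second one does initiate it and therefore uses Integrator.of. The instanceof check is only one way an evaluator could notice the difference.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

final class GreedyExample {

    // Hypothetical helper: forwards every element unchanged. It never initiates
    // short-circuiting itself; it only relays what downstream.push reports.
    static <T> Gatherer<T, ?, T> passThroughGreedy() {
        return Gatherer.of(
            Gatherer.Integrator.<Void, T, T>ofGreedy(
                (unused, element, downstream) -> downstream.push(element)));
    }

    // Hypothetical helper: emits at most n elements, then initiates
    // short-circuiting by returning false, so it must not be marked greedy.
    static <T> Gatherer<T, ?, T> limitTo(int n) {
        class Counter { int seen; }
        return Gatherer.ofSequential(
            Counter::new,
            Gatherer.Integrator.<Counter, T, T>of((counter, element, downstream) -> {
                if (counter.seen >= n) {
                    return false;              // nothing more wanted from upstream
                }
                counter.seen++;
                return downstream.push(element) && counter.seen < n;
            }));
    }

    // Hypothetical helper: checks whether the integrator was created as Greedy.
    static boolean isGreedy(Gatherer<?, ?, ?> gatherer) {
        return gatherer.integrator() instanceof Gatherer.Integrator.Greedy;
    }

    public static void main(String[] args) {
        List<Integer> all = Stream.of(1, 2, 3)
            .gather(GreedyExample.<Integer>passThroughGreedy())
            .toList();
        // The source is infinite; the pipeline ends only because limitTo short-circuits.
        List<Integer> firstThree = Stream.iterate(1, i -> i + 1)
            .gather(GreedyExample.<Integer>limitTo(3))
            .toList();
        System.out.println(all);
        System.out.println(firstThree);
        System.out.println(isGreedy(passThroughGreedy()));
        System.out.println(isGreedy(limitTo(3)));
    }
}
```

Both gatherers would still work if the first one used Integrator.of instead; the greedy marker only gives the evaluator more information to work with.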

>I looked at the source code of Gatherer/Gatherers/GathererOp, and it is not obvious to me how one can see a significant performance win with greediness. Can you recommend a benchmark for me to study?

I think it makes sense to look at the mentions of "greedy" in GathererOp and Gatherers (specifically the composition).

There are a fair number of benchmarks here (they all start with "Gather"): https://github.com/openjdk/jdk/tree/master/test/micro/org/openjdk/bench/java/util/stream/ops/ref


Cheers,
√


Viktor Klang
Software Architect, Java Platform Group
Oracle

________________________________
From: Cay Horstmann <cay.horstmann at gmail.com>
Sent: Tuesday, 15 October 2024 20:19
To: Viktor Klang <viktor.klang at oracle.com>; core-libs-dev at openjdk.org <core-libs-dev at openjdk.org>
Subject: Re: [External] : Re: Fw: New candidate JEP: 485: Stream Gatherers

Thanks Viktor, I like your concept of multiple vantage points a lot.

> Implementor: "use ofGreedy to signal that this operation does not itself initiate short-circuiting"
> Reviewer: "when you see ofGreedy verify that this operation does not itself initiate short-circuiting"
> Debugger: "when the expectation was that the stream should be short-circuiting and isn't, one thing to look for is ofGreedy"
>
> Implementor: "use ofSequential to signal that this operation is intended to always be executed sequentially"
> Reviewer: "when you see ofSequential, understand why the operation can't/shouldn't be parallelized for parallel streams"
> Debugger: "if the parallel stream is not performing as well as expected, one thing to look for is ofSequential"
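
For reference, the ofSequential side of the quoted list can be sketched as follows, assuming JDK 24 or later (or JDK 22/23 with preview features enabled). The class name SequentialExample and the helpers runningSum and doubled are hypothetical and exist only for this illustration. The running sum depends on encounter order and carries state, so it is declared with ofSequential; the doubling step is stateless and is declared with Gatherer.of, which leaves it parallelizable.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

final class SequentialExample {

    // Hypothetical helper: emits the running total after each element.
    // The result depends on everything seen so far, hence ofSequential.
    static Gatherer<Integer, ?, Integer> runningSum() {
        class Sum { int total; }
        return Gatherer.ofSequential(
            Sum::new,
            Gatherer.Integrator.<Sum, Integer, Integer>ofGreedy((sum, element, downstream) -> {
                sum.total += element;
                return downstream.push(sum.total);
            }));
    }

    // Hypothetical helper: stateless one-to-one mapping, so Gatherer.of is
    // enough and the operation may be evaluated in parallel.
    static Gatherer<Integer, ?, Integer> doubled() {
        return Gatherer.of(
            Gatherer.Integrator.<Void, Integer, Integer>ofGreedy(
                (unused, element, downstream) -> downstream.push(element * 2)));
    }

    public static void main(String[] args) {
        // Even on a parallel stream, the sequential gatherer sees elements in encounter order.
        List<Integer> sums = Stream.of(1, 2, 3, 4).parallel().gather(runningSum()).toList();
        List<Integer> twice = Stream.of(1, 2, 3, 4).parallel().gather(doubled()).toList();
        System.out.println(sums);   // [1, 3, 6, 10]
        System.out.println(twice);  // [2, 4, 6, 8]
    }
}
```

Both integrators are greedy here, which shows that the sequential/parallelizable choice and the greedy/short-circuiting choice are made independently of each other.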

I just don't see how these scenarios are parallel. To me, the following would be parallel.

Scenario I: Brag about better performance

As implementor, I can always go the easy route with of/of: sequential, non-greedy. It'll work, but performance may suffer. Or I can assert that I have taken the extra care: ofParallel/ofGreedy.

As reviewer, I see the deluxe ofParallel/ofGreedy, and I verify that the operation enables parallel execution/doesn't short circuit.

As debugger, I see slow performance, and I am not surprised if it's the lame of/of.

Scenario II: Warn of worse performance

As implementor, I aim to produce the most performant way with of/of: parallelizable, greedy. Nice names for the good stuff. If I can't, I'll alert with ofSerial/ofShortCircuit.

As reviewer, I see the coveted of/of, and I verify that the operation enables parallel execution/doesn't short circuit.

As debugger, I see slow performance, and I am not surprised if it is ofSerial/ofShortCircuit. They had it coming.

Either naming pair (ofParallel/ofGreedy or ofSerial/ofShortCircuit) feels consistent to me. But I find of/ofGreedy and ofSequential/of awkward to explain.

Maybe just a mental block on my side. If other programmers worked with the API and found it consistent, don't mind me. I just thought to bring it up in case this has not received a lot of scrutiny. It is a pretty specialized aspect, after all.

Cheers,

Cay

PS. I also think it would be helpful to be more explicit in the javadoc about the value of exploiting greediness. What is the sweet spot? Recognizing that nothing in the pipeline is short-circuiting and then pushing elements through without checking? Or more aggressive parallelization?

I looked at the source code of Gatherer/Gatherers/GathererOp, and it is not obvious to me how one can see a significant performance win with greediness. Can you recommend a benchmark for me to study?

--

Cay S. Horstmann | https://horstmann.com
