[External] : Re: Update on JEP-461: Stream Gatherers (Preview)

Viktor Klang viktor.klang at oracle.com
Tue Oct 31 11:21:01 UTC 2023


I will ask for a step further and add an overload to collect that takes a Gatherer and calls gather(gatherer).collect(Collectors.singleton()) so we can directly write
  var neatlyFolded = stream.collect(fold(...));

As a user, I see value in using the more powerful Gatherer API like a collector without having to think too much.

While I can definitely sympathize with this idea, very few Gatherers emit strictly a single element under all executions, and I'd expect that users will not always want that enforced by default (throwing an exception when the number of elements is not exactly 1).

Cheers,
√


Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: Remi Forax <forax at univ-mlv.fr>
Sent: Monday, 30 October 2023 23:39
To: Viktor Klang <viktor.klang at oracle.com>
Cc: Tyler Kindy <me at tylerkindy.com>; core-libs-dev <core-libs-dev at openjdk.org>
Subject: Re: [External] : Re: Update on JEP-461: Stream Gatherers (Preview)



________________________________
From: "Viktor Klang" <viktor.klang at oracle.com>
To: "Tyler Kindy" <me at tylerkindy.com>
Cc: "core-libs-dev" <core-libs-dev at openjdk.org>
Sent: Monday, October 30, 2023 10:59:48 PM
Subject: Re: [External] : Re: Update on JEP-461: Stream Gatherers (Preview)
That's also a good point, and I've heard from multiple sources that sometimes you want to make sure that you only have a single element left at the end, and otherwise throw an exception.

Hello,


Fortunately it is possible to do something to the effect of:

// Proof of concept only
public static <T> Collector<T,?,T> singleton() {
    class State { T value; boolean hasValue; }
    return Collector.of(
        State::new,
        (state, e) -> {
            if (state.hasValue)
                throw new IllegalStateException("Stream has more than one element!");

            state.hasValue = true;
            state.value = e;
        },
        (left, right) -> {
            if (left.hasValue && right.hasValue)
                throw new IllegalStateException("Stream has more than one element!");
            else if (left.hasValue)
                return left;
            else
                return right;
        },
        (state) -> {
            if (!state.hasValue)
                throw new IllegalStateException("Stream has less than one element!");
            else
                return state.value;
        });
}

Which means that you could write things like:

var neatlyFolded = stream.gather(fold(…)).collect(singleton());
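For readers who want to try the proof of concept without a Gatherers-enabled JDK, here is a minimal sketch that exercises the same singleton() collector on plain streams with zero, one, and two elements. The class name SingletonCollectorDemo and the helper failureOf are hypothetical names made up for this illustration; they are not part of the JDK or of the JEP.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class SingletonCollectorDemo {
    // Same logic as the proof of concept above, repeated so the example compiles on its own.
    static <T> Collector<T, ?, T> singleton() {
        class State { T value; boolean hasValue; }
        return Collector.of(
            State::new,
            (state, e) -> {
                if (state.hasValue)
                    throw new IllegalStateException("Stream has more than one element!");
                state.hasValue = true;
                state.value = e;
            },
            (left, right) -> {
                if (left.hasValue && right.hasValue)
                    throw new IllegalStateException("Stream has more than one element!");
                return left.hasValue ? left : right;
            },
            state -> {
                if (!state.hasValue)
                    throw new IllegalStateException("Stream has less than one element!");
                return state.value;
            });
    }

    // Hypothetical helper: returns the exception message, or null if collecting succeeded.
    static String failureOf(Stream<String> stream) {
        try {
            stream.collect(singleton());
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(Stream.of("only").collect(singleton()));  // exactly one element
        System.out.println(failureOf(Stream.empty()));               // no elements
        System.out.println(failureOf(Stream.of("a", "b")));          // too many elements
    }
}
```

The one-element stream yields its element, while the empty and two-element streams fail with the corresponding IllegalStateException. That is the "exactly one, otherwise throw" behaviour discussed in this thread.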

I will ask for a step further and add an overload to collect that takes a Gatherer and calls gather(gatherer).collect(Collectors.singleton()) so we can directly write
  var neatlyFolded = stream.collect(fold(...));

As a user, I see value in using the more powerful Gatherer API like a collector without having to think too much.



Cheers,
√

regards,
Rémi



Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: Tyler Kindy <me at tylerkindy.com>
Sent: Monday, 30 October 2023 18:38
To: Viktor Klang <viktor.klang at oracle.com>
Cc: core-libs-dev at openjdk.org <core-libs-dev at openjdk.org>
Subject: Re: [External] : Re: Update on JEP-461: Stream Gatherers (Preview)

Hi Viktor,

Thanks for the response! I see what you mean, it’s a good point that single-element streams are just as valid as streams with other numbers of elements.

The weird part for me, though, is getting the folded result out of the stream at the end (which, while not always, I believe will be the most common thing to want to do after `fold`). The obvious way, which you used in your presentation, is `findAny`. Since the stream could be empty, it makes sense that `findAny` returns an `Optional`, but with `fold` we know that the stream will have exactly one element in it.

Of course, this API is totally serviceable. You can use `Optional::orElseThrow` on the result of `findAny` to communicate your intent that you expect there to be an element. But that feels a bit roundabout to me; I feel that a cleaner API would be to get the folded result out directly, like how `Collector` works.

To your point, implementing `fold` as a Gatherer doesn’t mean we can never have a terminal `fold`. And having `fold` in any way at all is great; that’s one feature of Streams I commonly find myself wanting when writing code (the other is windowing, which I think you’ve covered the need for well; no notes on those 😄).

But it does make me wish that `Collector` was also flexible enough to implement `fold`. Maybe a topic for future work. 😄

Thanks again,
Tyler

On Oct 30, 2023, at 10:39 AM, Viktor Klang <viktor.klang at oracle.com> wrote:

Hi Tyler,

Thank you for the kind words -- they are much appreciated. And you have a very good question indeed!

My thinking behind making fold a Gatherer is that it is strictly more powerful than "only" having it as a Collector. It couldn't be a Collector anyway, since a Collector requires a combiner and a fold has no general way to merge two partial results. As a Gatherer, it can also be composed with other operations, the output type can be chosen at a later stage in the process, and so forth.
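To make the combiner point concrete, here is a minimal sketch of what happens if a left fold is squeezed into the Collector shape anyway. The helper foldCollector is a hypothetical name, not a JDK API and not something proposed in the JEP. Because a folder of type (R, T) -> R offers no way to merge two partial R values, the combiner in this sketch can only throw, which limits it to sequential streams.

```java
import java.util.function.BiFunction;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

class FoldAsCollectorSketch {
    // Hypothetical helper (not a JDK API): a left fold forced into the Collector shape.
    static <T, R> Collector<T, ?, R> foldCollector(
            Supplier<R> initial,
            BiFunction<? super R, ? super T, ? extends R> folder) {
        // Mutable holder for the running result, seeded from the initial supplier.
        class Box { R value = initial.get(); }
        return Collector.of(
            Box::new,
            (box, element) -> box.value = folder.apply(box.value, element),
            (left, right) -> {
                // A fold only knows how to add one T to an R, not how to merge two Rs.
                throw new UnsupportedOperationException(
                    "fold cannot be combined; use a sequential stream");
            },
            box -> box.value);
    }

    public static void main(String[] args) {
        String folded = Stream.of(1, 2, 3, 4)
            .collect(foldCollector(() -> "", (str, next) -> str + next));
        System.out.println(folded);
    }
}
```

On a sequential stream the combiner is never invoked, so the fold completes normally. On a parallel stream the combiner may be invoked, and this sketch would then fail, which illustrates why a Collector is an awkward fit for fold.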

Also, if you think about it -- single-element Streams are just as valid as empty Streams, N-sized Streams, or even unbounded Streams, and conceptually, there's no difference between the following two Streams:

var a = Stream.of("1234");
var b = Stream.of(1,2,3,4).gather(fold(() -> "", (str, next) -> str + next));

So that was my thinking—allow developers to stay within Stream processing for as long as they want, and choose the terminal operation when they need to break out from the Stream.


Cheers,
√


Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: Tyler Kindy <me at tylerkindy.com>
Sent: Monday, 30 October 2023 12:14
To: Viktor Klang <viktor.klang at oracle.com>
Cc: core-libs-dev at openjdk.org <core-libs-dev at openjdk.org>
Subject: [External] : Re: Update on JEP-461: Stream Gatherers (Preview)

Thanks for the JEP and your talk, Viktor! I think `Stream::gather` will be super useful in my day-to-day as a Java developer.

I’m curious why `fold` is being implemented with gatherers. I recognize `Gatherer` is designed to support intermediate operations, but `fold` feels inherently like a terminal operation to me since it, like `reduce` or `collect`, consumes all the elements in the stream and produces a single result.

Is there a technical limitation to making `fold` a terminal operation? For example, does `Collector` inherently presume parallelization in a way that `Gatherer` does not?

Or is the idea mainly to demonstrate the power of gatherers, and we could also make `fold` a terminal operation with the current `Stream` API?

Thank you!
Tyler Kindy

On Oct 27, 2023, at 9:50 AM, Viktor Klang <viktor.klang at oracle.com> wrote:

Greetings,

As you may have already seen, Stream Gatherers is now a Preview JEP with Candidate status <https://openjdk.org/jeps/461>

Work-in-progress (interfaces, implementation, tests, benches, and documentation) for JEP-461 is currently available here <https://urldefense.com/v3/__https://github.com/viktorklang-ora/jdk/tree/gatherer__;!!ACWV5N9M2RV99hQ!Lz2sC02xW35XpuwoaqBvD_iR80Xrzkbj-60oOKuQklUOm8e69-O3WC9N93leBFbMkFmULJmsat9k1dmt$>.

While the design <https://cr.openjdk.org/~vklang/Gatherers.html> has held up well, some important improvements have been made since the original design document was written.

Notable changes (without any particular order):

  *   Stream::gather() now has a default implementation.

  *   Gatherer::supplier() was renamed to Gatherer::initializer() to better reflect intent.

  *   Gatherer.Sink<R> was renamed to Gatherer.Downstream<R> to better signal what it represents.

  *   Gatherer::collect(Collector) and its companion type Gatherer.ThenCollector were dropped due to compatibility concerns with existing code which operates on Collector.

  *   Gatherer.Characteristics have been eliminated and superseded by having default values that are used as sentinels (discussed further down the list). This is important because, with the Characteristics model, keeping alignment between Characteristics and the actual implementation proved brittle, and under composition of Gatherers, computing the union was inefficient and mostly led to an empty set in the end.

  *   Gatherer.defaultInitializer(), Gatherer.defaultCombiner(), and Gatherer.defaultFinisher() were added as static methods; these are the sentinels used to elide calling the initializer, to elide calling the combiner (avoid parallelization), and to elide calling the finisher, respectively.

  *   Gatherer::initializer(), Gatherer::combiner(), and Gatherer::finisher() default implementations return the respective sentinels.

  *   A subtype of Integrator named Greedy was added, together with a factory method to guide lambda-to-nominal-type conversion. This allows creators of Gatherers to signal that an Integrator will never initiate a short-circuit (but may relay one from downstream), and that signal is available during evaluation to determine whether the operation can short-circuit.

  *   Factories for creating anonymous Gatherers were expanded to include Gatherer.of() and Gatherer.ofSequential() with different sets of parameters, primarily to make code using those factories more ergonomic and easier to read and write.

  *   A curated set of initial built-in Gatherers is located in java.util.stream.Gatherers.

I recently presented Gatherers at Devoxx <https://urldefense.com/v3/__https://www.youtube.com/watch?v=8fMFa6OqlY8__;!!ACWV5N9M2RV99hQ!Lz2sC02xW35XpuwoaqBvD_iR80Xrzkbj-60oOKuQklUOm8e69-O3WC9N93leBFbMkFmULJmsam2MP-_S$>, which I'd recommend watching for an introduction to the feature.
Talk title: "Teaching old Streams new tricks" by Viktor Klang


Cheers,
√


Viktor Klang
Software Architect, Java Platform Group
Oracle

