RFR: JDK-8319123 : Implementation of JEP-461: Stream Gatherers (Preview) [v2]
Tagir F. Valeev
tvaleev at openjdk.org
Thu Nov 9 09:45:24 UTC 2023
On Wed, 8 Nov 2023 16:37:29 GMT, Viktor Klang <vklang at openjdk.org> wrote:
>> Sorry, I'm not sure what you mean by incrementally. But I've realized that to implement this you may need to push to the downstream during combining. E.g., imagine a stream of the numbers 1...1000, and you want to run `parallel().gather(pairMap((left, right) -> left+"->"+right)).collect(toList())`, creating a list of the strings "1->2", "2->3", ..., "999->1000". Now, let's assume that the stream was split into four parts during parallelization: `1..250`, `251..500`, `501..750`, `751..1000`. So you create downstream accumulators and push the pairs there with the integrator, like:
>> "1->2", ..., "249->250" into first toList() accumulator,
>> "251->252", ..., "499->500" into second toList() accumulator
>> "501->502", ..., "749->750" into third toList() accumulator
>> "751->752", ..., "999->1000" into fourth toList() accumulator
>> At this point we have four independent gatherer states, each keeping the first and last element it received from the upstream. Now you combine, for example, the first and second gatherer states. The first one is [first=1, last=250], and the second is [first=251, last=500]. The combined gatherer state should be [first=1, last=500], and during the combining it should also push the "250->251" object to the downstream, where it will land in the first toList() accumulator.
>>
>> Hopefully my explanation is clear. Would it be possible to extend the combiner function and provide the Downstream there as well, so that the combiner can also work as a finisher for the first combined chunk?
>
> Ah, now I see what you mean. In this specific case you might be able to create a combiner (which would also need a Downstream) such that you can add the missing combinations (the end of the "left" and the beginning of the "right"). But imagine something like `windowFixed(3)`, where the presence of 1 or 2 elements to the "left" will skew all the elements to the "right", potentially rippling through the entire stream.
>
> I think the Preview is a perfect place to see if the combiner would be made even better with a Downstream reference or not. Logically, the output of said Downstream would be placed *after* the "left"'s output, and *before* the "right"'s output (so in the middle of the two).
>
> But then someone might want to be able to inject elements outside of that order, but I guess that remains to be seen.
>
> Great thoughts, @amaembo 👍
With windowFixed, true parallelization is not possible unless the stream is SIZED/SUBSIZED, but this information is not available to a Gatherer, so we cannot solve this at all (and this is probably a reasonable limitation of the API).
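To illustrate the skew, here is a minimal sketch in plain Java that simulates the chunks with lists and does not use the Gatherer API at all. The class and method names (`WindowFixedSkewSketch`, `windowPerChunk`, and so on) are made up for this example. It shows that windowing each chunk independently only matches the sequential result when the split point happens to fall on a multiple of the window size, which is something that can be arranged only if the sizes are known up front.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WindowFixedSkewSketch {
    // Groups the input into consecutive windows of the given size;
    // the last window may be shorter.
    static List<List<Integer>> windowFixed(List<Integer> input, int size) {
        List<List<Integer>> windows = new ArrayList<>();
        for (int i = 0; i < input.size(); i += size) {
            int end = Math.min(i + size, input.size());
            windows.add(new ArrayList<>(input.subList(i, end)));
        }
        return windows;
    }

    // Windows every chunk on its own and concatenates the results,
    // the way a naive parallel evaluation without size information would.
    static List<List<Integer>> windowPerChunk(List<List<Integer>> chunks, int size) {
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> chunk : chunks) {
            result.addAll(windowFixed(chunk, size));
        }
        return result;
    }

    // Inclusive integer range helper.
    static List<Integer> range(int from, int to) {
        List<Integer> list = new ArrayList<>();
        for (int i = from; i <= to; i++) {
            list.add(i);
        }
        return list;
    }

    // Split after 4 elements: not a multiple of the window size 3.
    static List<List<Integer>> naiveSplit() {
        return windowPerChunk(Arrays.asList(range(1, 4), range(5, 10)), 3);
    }

    // Split after 6 elements: a multiple of the window size 3.
    static List<List<Integer>> alignedSplit() {
        return windowPerChunk(Arrays.asList(range(1, 6), range(7, 10)), 3);
    }

    public static void main(String[] args) {
        System.out.println(windowFixed(range(1, 10), 3)); // [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
        System.out.println(naiveSplit());                 // [[1, 2, 3], [4], [5, 6, 7], [8, 9, 10]]
        System.out.println(alignedSplit());               // [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
    }
}
```

In the naive split, the single leftover element `4` shifts every window to its right, and no local fix-up at the chunk boundary can repair that.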
> But then someone might want to be able to inject elements outside of that order, but I guess that remains to be seen.
I'm not sure where else one may want to inject elements. If you want to add something after the "right" output, you can do this inside the finisher, or in the next combiner, when the "right" becomes the "left". There are no other places to insert. E.g., you cannot insert "before left", because the downstream collector has no corresponding operation. To me it looks like inserting after the "left" part is the only possibility.
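The pairMap scenario quoted above can be sketched in plain Java as follows. This is only a simulation under stated assumptions: it does not use the real Gatherer API, the combiner that receives the "left" downstream is the hypothetical extension under discussion (it is not part of the current API), and all names (`PairMapCombinerSketch`, `State`, `Chunk`, `merge`) are invented for the example. Each chunk keeps only its first and last element, and the combiner pushes the single missing boundary pair into the "left" accumulator before the accumulators themselves are merged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

class PairMapCombinerSketch {
    static final BiFunction<Integer, Integer, String> MAPPER = (l, r) -> l + "->" + r;

    // Per-chunk gatherer state: only the first and last upstream element.
    static final class State {
        Integer first;
        Integer last;
    }

    // A chunk pairs a gatherer state with its own toList()-like accumulator.
    static final class Chunk {
        State state = new State();
        List<String> acc = new ArrayList<>();
    }

    // Integrator: emits mapper(previous, element) once a previous element exists.
    static void integrate(State s, int element, Consumer<String> downstream) {
        if (s.last != null) {
            downstream.accept(MAPPER.apply(s.last, element));
        }
        if (s.first == null) {
            s.first = element;
        }
        s.last = element;
    }

    // HYPOTHETICAL combiner that also receives the downstream of the left chunk.
    static State combine(State left, State right, Consumer<String> leftDownstream) {
        if (left.first == null) return right;
        if (right.first == null) return left;
        // The boundary pair lands after the left output and before the right output.
        leftDownstream.accept(MAPPER.apply(left.last, right.first));
        State merged = new State();
        merged.first = left.first;
        merged.last = right.last;
        return merged;
    }

    // Feeds the inclusive range [from, to] through the integrator.
    static Chunk process(int from, int to) {
        Chunk chunk = new Chunk();
        for (int i = from; i <= to; i++) {
            integrate(chunk.state, i, chunk.acc::add);
        }
        return chunk;
    }

    // Combines gatherer states first, then merges the downstream accumulators.
    static Chunk merge(Chunk left, Chunk right) {
        Chunk merged = new Chunk();
        merged.acc = left.acc;
        merged.state = combine(left.state, right.state, left.acc::add);
        merged.acc.addAll(right.acc);
        return merged;
    }

    static List<String> parallelLike() {
        Chunk a = process(1, 250), b = process(251, 500);
        Chunk c = process(501, 750), d = process(751, 1000);
        return merge(merge(a, b), merge(c, d)).acc;
    }

    static List<String> sequential() {
        return process(1, 1000).acc;
    }

    public static void main(String[] args) {
        System.out.println(parallelLike().equals(sequential())); // true
        System.out.println(parallelLike().get(249));             // 250->251
    }
}
```

In this sketch the only place the combiner ever writes to is the end of the "left" accumulator, which matches the observation that there is no other position where the downstream could accept an element.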
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16420#discussion_r1387733722