BiCollector
Kirk Pepperdine
kirk.pepperdine at gmail.com
Tue Jun 19 06:17:56 UTC 2018
> On Jun 19, 2018, at 9:11 AM, Zheka Kozlov <orionllmain at gmail.com> wrote:
>
> The function you propose is just a binary variant of mapping:
>
> Collector<T, ?, R> mapping(
> Function<T, U> mapper,
> Collector<U, ?, R> downstream);
>
> (omitted '? super' for readability)
>
> So, it is logical to use the name biMapping:
>
> Collector<T, ?, R> biMapping(
> Function<T, U1> mapper1,
> Function<T, U2> mapper2,
> Collector<U1, ?, R1> downstream1,
> Collector<U2, ?, R2> downstream2,
> BiFunction<R1, R2, R> finisher);
+1
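
For what it's worth, here is a rough sketch of how such a biMapping could be written against the existing Collector API. It is only an illustration under my own assumptions, not a proposed implementation: the class name BiMappingSketch, the Pair holder, and the demo method are made up for this example, and the '? super' / '? extends' wildcards omitted above are put back in.

```java
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical holder class; nothing here is part of the JDK.
final class BiMappingSketch {

    // Mutable holder for the two downstream accumulation containers.
    private static final class Pair<A1, A2> {
        A1 left;
        A2 right;

        Pair(A1 left, A2 right) {
            this.left = left;
            this.right = right;
        }
    }

    static <T, U1, U2, A1, A2, R1, R2, R> Collector<T, ?, R> biMapping(
            Function<? super T, ? extends U1> mapper1,
            Function<? super T, ? extends U2> mapper2,
            Collector<? super U1, A1, R1> downstream1,
            Collector<? super U2, A2, R2> downstream2,
            BiFunction<? super R1, ? super R2, ? extends R> finisher) {
        BiConsumer<A1, ? super U1> acc1 = downstream1.accumulator();
        BiConsumer<A2, ? super U2> acc2 = downstream2.accumulator();
        // No characteristics are declared, so the finisher below always runs.
        return Collector.<T, Pair<A1, A2>, R>of(
                () -> new Pair<>(downstream1.supplier().get(),
                                 downstream2.supplier().get()),
                // Every element is mapped twice and fed to both downstreams.
                (pair, t) -> {
                    acc1.accept(pair.left, mapper1.apply(t));
                    acc2.accept(pair.right, mapper2.apply(t));
                },
                // Merge partial results side by side (used by parallel streams).
                (p, q) -> {
                    p.left = downstream1.combiner().apply(p.left, q.left);
                    p.right = downstream2.combiner().apply(p.right, q.right);
                    return p;
                },
                pair -> finisher.apply(
                        downstream1.finisher().apply(pair.left),
                        downstream2.finisher().apply(pair.right)));
    }

    // Example: total length and upper-cased join, computed in one pass.
    static String demo() {
        Collector<String, ?, String> collector = biMapping(
                String::length,
                String::toUpperCase,
                Collectors.summingInt(Integer::intValue),
                Collectors.joining(","),
                (total, joined) -> total + ":" + joined);
        return Stream.of("a", "bb", "ccc").collect(collector);
    }
}
```

With the three strings above, demo() should return "6:A,BB,CCC". The same effect could be had by wrapping each downstream in mapping() first, which is why the two-collector form without mappers is probably the more primitive operation.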
>
>
> 2018-06-19 7:38 GMT+07:00 John Rose <john.r.rose at oracle.com>:
>
>> On Jun 18, 2018, at 2:29 PM, Brian Goetz <Brian.Goetz at Oracle.COM> wrote:
>>>
>>> "bisecting" sounds like it sends half the elements to one collector and half to the other …
>>
>> The main bisection or splitting operation that's relevant to a stream is
>> what
>> a spliterator does, so this is a concern.
>>
>> Nobody has mentioned "unzipping" yet; this is a term of art which applies
>> to streams
>> of tuples. The image of a zipper is relatively clear and unambiguous, and
>> the tradition
>> is pretty strong. https://en.wikipedia.org/wiki/Convolution_(computer_science)
>>
>> The thing we are looking at differs in two ways from classic "unzipping":
>> First, the
>> two collectors themselves convert the same T elements to whatever internal
>> value
>> (T1, T2) is relevant. Second, we are looking at a new terminal operation
>> (a collector) which
>> consolidates the results from both streams (a notional Stream<T1> and
>> Stream<T2>,
>> if you like), rather than delivering the streams as a pair of outputs.
>>
>> The classic "unzip" operation applies "fst" and "snd" (or some other
>> conventional
>> set of access functions) to each T-element of the input stream. Since we
>> don't
>> have a privileged 2-tuple type (like Pair<T1,T2>) in Java, the user would
>> need
>> to nominate those two functions explicitly, either by folding them into a
>> "mapping"
>> on each collector, or as a utility overloading like this:
>>
>> unzipping(
>> Function<? super T, T1> f1, // defaults to identity
>> Collector<? super T1, ?, R1> c1,
>> Function<? super T, T2> f2, // defaults to identity
>> Collector<? super T2, ?, R2> c2,
>> BiFunction<? super R1, ? super R2, ? extends R> finisher) {
>> return toBoth(mapping(f1, c1), mapping(f2, c2));
>> }
>>
>>
>>> "tee" might be a candidate, though it doesn't follow the `ing convention. "teeing" sounds dumb.
>>
>>
>> "tee" sounds asymmetrical. "diverting" or "detouring" are "*ing" words
>> that might
>> express asymmetrical disposition of derivative streams.
>>
>> An asymmetrical operation might be interesting if it could fork off a
>> stream of
>> its own. It would have to have a side-effecting void-producing terminal
>> operation,
>> so the main (undiverted) stream could continue to progress at the top
>> level of
>> the expression.
>>
>> interface Stream<T> {
>> default Stream<T> diverting(Consumer<Stream<T>> tee) { … }
>> }
>>
>> values.stream().diverting(s2->s2.forEach(System.out::println)).filter(…).collect(…);
>>
>> Or (and this might be a sweet spot) a symmetric stream-tee operation could
>> materialize two sibling streams and rejoin their results with a bifunction:
>>
>> interface Stream<T> {
>> default <R1, R2, R> R unzipping(
>> Function<? super Stream<T>, R1> f1,
>> Function<? super Stream<T>, R2> f2,
>> BiFunction<? super R1, ? super R2, ? extends R> finisher)
>> { … }
>> }
>>
>> values.stream().unzipping(
>> s1->s1.forEach(System.out::println),
>> s2->s2.filter(…).collect(…),
>> (void1, r2)->r2
>> );
>>
>> This would allow each "fork child" of the stream to continue to use the
>> Stream API instead of the more restrictive Collector operators.
>>
>> Optimal code generation for forked/unzipped/teed streams would be tricky,
>> requiring simultaneous loop control logic for each stream.
>> To me that's a feature, not a bug, since hand-writing ad hoc
>> simultaneous loops is a pain.
>>
>> My $0.02.
>>
>> — John
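
To make the symmetric stream-tee idea above a bit more concrete, here is a deliberately naive sketch. It assumes a static helper rather than a new Stream method (the class StreamForkSketch and its demo are hypothetical), and it simply buffers the source into a list and runs the two functions one after the other. That is not the simultaneous loop control John describes; it only shows the shape of the signature and how the two results are rejoined.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of the JDK or of any proposal in this thread.
final class StreamForkSketch {

    static <T, R1, R2, R> R unzipping(
            Stream<T> source,
            Function<? super Stream<T>, ? extends R1> f1,
            Function<? super Stream<T>, ? extends R2> f2,
            BiFunction<? super R1, ? super R2, ? extends R> finisher) {
        // Assumption: the source is finite and small enough to buffer.
        List<T> buffered = source.collect(Collectors.toList());
        // Each "fork child" gets its own stream over the buffered elements.
        R1 r1 = f1.apply(buffered.stream());
        R2 r2 = f2.apply(buffered.stream());
        return finisher.apply(r1, r2);
    }

    // Example: sum all values in one fork, collect the even ones in the other.
    static String demo() {
        return unzipping(
                Stream.of(1, 2, 3, 4),
                s -> s.mapToInt(Integer::intValue).sum(),
                s -> s.filter(i -> i % 2 == 0).collect(Collectors.toList()),
                (sum, evens) -> sum + " " + evens);
    }
}
```

Here demo() should return "10 [2, 4]". Both forks keep the full Stream API, which is the attraction of this shape; the cost in this sketch is the intermediate list and the second pass over it.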