BiCollector

Zheka Kozlov orionllmain at gmail.com
Tue Jun 19 06:11:15 UTC 2018


The function you propose is just a binary variant of mapping:

Collector<T, ?, R> mapping(
                              Function<T, U> mapper,
                              Collector<U, ?, R> downstream);

(omitted '? super' for readability)

So, it is logical to use the name biMapping:

Collector<T, ?, R> biMapping(
                              Function<T, U1> mapper1,
                              Function<T, U2> mapper2,
                              Collector<U1, ?, R1> downstream1,
                              Collector<U2, ?, R2> downstream2,
                              BiFunction<R1, R2, R> finisher);
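
A minimal sketch of how biMapping could be written on top of the existing
Collector API, with the wildcards spelled out. The class name
BiMappingSketch and the Pair holder are made up for illustration, and the
sketch conservatively ignores the downstream collectors' characteristics:

```java
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

final class BiMappingSketch {

    // Hypothetical mutable holder for the two downstream containers.
    private static final class Pair<A1, A2> {
        A1 left;
        A2 right;

        Pair(A1 left, A2 right) {
            this.left = left;
            this.right = right;
        }
    }

    static <T, U1, U2, A1, A2, R1, R2, R> Collector<T, ?, R> biMapping(
            Function<? super T, ? extends U1> mapper1,
            Function<? super T, ? extends U2> mapper2,
            Collector<? super U1, A1, R1> downstream1,
            Collector<? super U2, A2, R2> downstream2,
            BiFunction<? super R1, ? super R2, ? extends R> finisher) {

        // One fresh container per downstream collector.
        Supplier<Pair<A1, A2>> supplier = () -> new Pair<>(
                downstream1.supplier().get(),
                downstream2.supplier().get());

        // Every element is mapped twice and fed to both downstreams.
        BiConsumer<Pair<A1, A2>, T> accumulator = (pair, t) -> {
            downstream1.accumulator().accept(pair.left, mapper1.apply(t));
            downstream2.accumulator().accept(pair.right, mapper2.apply(t));
        };

        // Partial results are merged side by side (used by parallel streams).
        BinaryOperator<Pair<A1, A2>> combiner = (p, q) -> {
            p.left = downstream1.combiner().apply(p.left, q.left);
            p.right = downstream2.combiner().apply(p.right, q.right);
            return p;
        };

        // Finish both downstreams, then merge the two results.
        Function<Pair<A1, A2>, R> finish = pair -> finisher.apply(
                downstream1.finisher().apply(pair.left),
                downstream2.finisher().apply(pair.right));

        // No characteristics are passed, so nothing is assumed about
        // ordering, concurrency or identity finishing.
        return Collector.of(supplier, accumulator, combiner, finish);
    }
}
```

With this sketch, collecting Stream.of("a", "bb", "ccc") with
biMapping(String::length, s -> s.toUpperCase(),
summingInt(Integer::intValue), joining(","),
(sum, joined) -> joined + ":" + sum) should give "A,BB,CCC:6".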


2018-06-19 7:38 GMT+07:00 John Rose <john.r.rose at oracle.com>:

> On Jun 18, 2018, at 2:29 PM, Brian Goetz <Brian.Goetz at Oracle.COM> wrote:
> >
> > "bisecting" sounds like it sends half the elements to one collector and
> > half to the other …
>
> The main bisection or splitting operation that's relevant to a stream is
> what a spliterator does, so this is a concern.
>
> Nobody has mentioned "unzipping" yet; this is a term of art which applies
> to streams of tuples.  The image of a zipper is relatively clear and
> unambiguous, and the tradition is pretty strong.
> https://en.wikipedia.org/wiki/Convolution_(computer_science)
>
> The thing we are looking at differs in two ways from classic "unzipping":
> First, the two collectors themselves convert the same T elements to
> whatever internal value (T1, T2) is relevant.  Second, we are looking at
> a new terminal operation (a collector) which consolidates the results
> from both streams (a notional Stream<T1> and Stream<T2>, if you like),
> rather than delivering the streams as a pair of outputs.
>
> The classic "unzip" operation applies "fst" and "snd" (or some other
> conventional set of access functions) to each T-element of the input
> stream.  Since we don't have a privileged 2-tuple type (like Pair<T1,T2>)
> in Java, the user would need to nominate those two functions explicitly,
> either by folding them into a "mapping" on each collector, or as a
> utility overloading like this:
>
>    unzipping(
>                 Function<? super T, T1> f1,  // defaults to identity
>                 Collector<? super T1, ?, R1> c1,
>                 Function<? super T, T2> f2,  // defaults to identity
>                 Collector<? super T2, ?, R2> c2,
>                 BiFunction<? super R1, ? super R2, ? extends R> finisher) {
>      return toBoth(mapping(f1, c1), mapping(f2, c2));
>   }
>
>
> > "tee" might be a candidate, though it doesn't follow the `ing
> > convention.  "teeing" sounds dumb.
>
>
> "tee" sounds asymmetrical.  "diverting" or "detouring" are "*ing" words
> that might express asymmetrical disposition of derivative streams.
>
> An asymmetrical operation might be interesting if it could fork off a
> stream of its own.  It would have to have a side-effecting void-producing
> terminal operation, so the main (undiverted) stream could continue to
> progress at the top level of the expression.
>
> interface Stream<T> {
>   default Stream<T> diverting(Consumer<Stream<T>> tee) { … }
> }
>
> values.stream().diverting(s2->s2.forEach(System.out::println))
>                .filter(…).collect(…);
>
> Or (and this might be a sweet spot) a symmetric stream-tee operation could
> materialize two sibling streams and rejoin their results with a bifunction:
>
> interface Stream<T> {
>   default <R1, R2, R> R unzipping(
>                 Function<? super Stream<T>, R1> f1,
>                 Function<? super Stream<T>, R2> f2,
>                 BiFunction<? super R1, ? super R2, ? extends R> finisher)
>   { … }
> }
>
> values.stream().unzipping(
>                 s1->s1.forEach(System.out::println),
>                 s2->s2.filter(…).collect(…),
>                 (void1, r2)->r2
>                 );
>
> This would allow each "fork child" of the stream to continue to use the
> Stream API instead of the more restrictive Collector operators.
>
> Optimal code generation for forked/unzipped/teed streams would be tricky,
> requiring simultaneous loop control logic for each stream.
> To me that's a feature, not a bug, since hand-writing ad hoc
> simultaneous loops is a pain.
>
> My $0.02.
>
> — John
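
For comparison, a hand-written simultaneous loop of the kind John mentions
might look like the sketch below (the class and method names are made up
for illustration). It computes a total length and a joined upper-case
string in a single pass. Both accumulations are tangled into one loop body,
which is what a two-collector combinator such as biMapping would let us
write as two independent collectors:

```java
import java.util.List;
import java.util.StringJoiner;

final class HandWrittenLoopSketch {

    // Single pass over the input that maintains two accumulations at once:
    // the sum of the string lengths and the comma-joined upper-case values.
    static String sumAndJoin(List<String> values) {
        int sum = 0;
        StringJoiner joined = new StringJoiner(",");
        for (String s : values) {
            sum += s.length();            // first accumulation
            joined.add(s.toUpperCase());  // second accumulation
        }
        // Merge step, playing the role of the BiFunction finisher.
        return joined + ":" + sum;
    }
}
```

For the input "a", "bb", "ccc" this returns "A,BB,CCC:6".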

