Tabulators, reducers, etc
Brian Goetz
brian.goetz at oracle.com
Thu Dec 27 07:31:43 PST 2012
Currently we have the following reduce-like methods:
    T reduce(T zero, BinaryOperator<T> reducer);
    Optional<T> reduce(BinaryOperator<T> reducer);
    <U> U reduce(U zero, BiFunction<U, T, U> accumulator,
                 BinaryOperator<U> reducer);
    <R> R mutableReduce(MutableReducer<T, R> reducer);
    <R> R mutableReduce(Supplier<R> seedFactory,
                        BiBlock<R, T> accumulator,
                        BiBlock<R, R> reducer);
    <R> R tabulate(Tabulator<T, R> tabulator);
    <R> R tabulate(ConcurrentTabulator<T, R> tabulator);
The first two are "real" reduce; the next three are really "fold"
(mutable or not); the first tabulate() is trivial sugar around fold; and
the last is really sugar around forEach. I think some naming
consolidation is in order.
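To make the reduce-vs-fold distinction concrete, here is a rough sketch of the three reduce forms. It is written against java.util.stream as it appears in Java 8, on the assumption that those signatures are close enough to the draft ones listed above; the class and method names (ReduceForms, sum, max, totalLength) are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ReduceForms {
    // "Real" reduce with a zero: T x T -> T, starting from an identity value.
    public static int sum(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    // "Real" reduce without a zero: an empty input yields an empty Optional.
    public static Optional<Integer> max(List<Integer> xs) {
        return xs.stream().reduce(Integer::max);
    }

    // Fold: the result type U (Integer) differs from the element type T (String).
    // The third argument combines partial results when the input is split.
    public static int totalLength(List<String> words) {
        return words.stream().reduce(0, (len, w) -> len + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));
        System.out.println(max(Arrays.asList(1, 5, 3)));
        System.out.println(totalLength(Arrays.asList("a", "bc")));
    }
}
```

Only the third form needs both an accumulator and a combiner, which is what makes it a fold rather than a plain reduce.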
From a user's perspective, these are all various flavors of the same
thing, whether you call them reduce, summarize, tabulate, accumulate,
aggregate, whatever.
The argument for calling the first two reduce is that there is a
historically consistent meaning for "reduce", so we might as well use
the right word. But the others start to get farther afield from that
consistent meaning, undermining this benefit.
There are a few choices to make here about what we're shooting for.
1: base naming choice. We could:
- Call the first two forms reduce, and call all the other forms
something like "accumulate".
- Call all the forms something like "accumulate".
2: merging vs concurrent. Since the concurrent accumulations are really
based on a different primitive (forEach vs reduce), and have very
different requirements on the user (operations had better be
commutative; target containers had better be concurrent; user had better
not care about encounter order), should these be named differently?
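A sketch of the two primitives side by side may help here. It assumes the Java 8 java.util.stream API as a stand-in for the draft methods; MergeVsConcurrent and its method names are hypothetical. The first method accumulates into per-chunk containers and merges them; the second pushes every element into one shared concurrent container via forEach.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class MergeVsConcurrent {
    // Merging flavor: each chunk fills its own HashMap, and partial maps
    // are merged pairwise. The containers need not be thread-safe.
    public static Map<Integer, Integer> countByLengthMerging(List<String> words) {
        return words.parallelStream().<Map<Integer, Integer>>collect(
            HashMap::new,
            (m, w) -> m.merge(w.length(), 1, Integer::sum),
            (left, right) -> right.forEach((k, v) -> left.merge(k, v, Integer::sum)));
    }

    // Concurrent flavor: all threads write into one shared map through forEach.
    // This relies on the container being concurrent and on the update
    // (integer addition here) being commutative.
    public static ConcurrentMap<Integer, Integer> countByLengthConcurrent(List<String> words) {
        ConcurrentMap<Integer, Integer> shared = new ConcurrentHashMap<>();
        words.parallelStream().forEach(w -> shared.merge(w.length(), 1, Integer::sum));
        return shared;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bc", "de", "fgh");
        System.out.println(countByLengthMerging(words).equals(countByLengthConcurrent(words)));
    }
}
```

Counting is insensitive to order, so both flavors agree here; with an order-sensitive accumulation they need not.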
3: mutative vs pure functional. Should we distinguish between "pure"
reduce and mutable accumulation?
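For comparison, here is the same computation done both ways, again sketched against the Java 8 java.util.stream API rather than the draft names; PureVsMutable and its methods are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class PureVsMutable {
    // Pure functional reduce: each step produces a new String and
    // no intermediate value is ever modified.
    public static String joinPure(List<String> parts) {
        return parts.stream().reduce("", String::concat);
    }

    // Mutable accumulation: a StringBuilder is created by the seed factory
    // and updated in place; partial builders are appended to one another.
    public static String joinMutable(List<String> parts) {
        return parts.stream()
            .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
            .toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("ab", "cd", "ef");
        System.out.println(joinPure(parts));
        System.out.println(joinMutable(parts));
    }
}
```

The results are the same; the difference is whether the intermediate state is a fresh value at each step or a container that is written to.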
One option might be: use "reduce" for the purely functional forms, use
accumulate/accumulateConcurrent for the others:
    T reduce(T zero, BinaryOperator<T> reducer);
    Optional<T> reduce(BinaryOperator<T> reducer);
    <U> U reduce(U zero, BiFunction<U, T, U> accumulator,
                 BinaryOperator<U> reducer);
    <R> R accumulate(Accumulator<T, R> reducer);
    <R> R accumulate(Supplier<R> seedFactory,
                     BiBlock<R, T> accumulator,
                     BiBlock<R, R> reducer);
    <R> R accumulateConcurrent(ConcurrentAccumulator<T, R> tabulator);
This would let us get rid of the Tabulator abstraction (it is
identical to MutableReducer; both get renamed to Accumulator).
Separately, with a small crowbar, we could simplify
ConcurrentAccumulator until it fits into existing SAMs, and the
top-level abstraction could go away.
We would continue to have the same set of combinators for making
tabulators, and would likely have both concurrent and non-concurrent
flavors of the Map ones (since there's a real choice for the user to
make there).
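As an illustration of what that choice could look like to a user, here is a sketch using Collectors.groupingBy and Collectors.groupingByConcurrent from Java 8. These stand in for the Map combinators discussed here, whose draft names are not given in this message; MapFlavors and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

public class MapFlavors {
    // Merging flavor: per-chunk maps are merged, and each group's list
    // keeps the encounter order of the source.
    public static Map<Integer, List<String>> byLength(List<String> words) {
        return words.parallelStream().collect(Collectors.groupingBy(String::length));
    }

    // Concurrent flavor: one shared ConcurrentMap; the order of elements
    // within each group is not guaranteed.
    public static ConcurrentMap<Integer, List<String>> byLengthConcurrent(List<String> words) {
        return words.parallelStream().collect(Collectors.groupingByConcurrent(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bc", "de", "fgh");
        System.out.println(byLength(words).keySet());
        System.out.println(byLengthConcurrent(words).keySet());
    }
}
```

The user picks the concurrent flavor only when encounter order within a group does not matter.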