Question about Iterable.reduce signature

Mon May 21 08:42:59 PDT 2012

I agree.  François is requesting a *fold *operation, which compared to
map-reduce is more expressive (every map-reduce can be expressed as a fold,
but not vice-versa) and has more precisely defined semantics (map-reduce
depends on associativity).  The left fold is a natural operation to provide
on lazy sequences, since it processes values in a well-defined order and
consumes one value at a time from the left.  The right fold, which is what
he requested, requires eager evaluation of the sequence, since it starts
its computation at the tail end.  It therefore has all of the performances
disadvantages of eager evaluation without the parallelizability.
 (Nevertheless, right fold is occasionally the best fit for the semantics
of the problem to be solved)

One streaming analog to the left-fold operation is called *scan* or *prefix
sum*[1], and there is a natural parallel analog called *parallel prefix*[2].
 A surprising number of problems can be reduced naturally to an application
of this primitive.  It is natural to consider this for the set of
operations common between the lazy and parallel collection APIs.

Cheers,
Neal

[1] http://en.wikipedia.org/wiki/Prefix_sum
[2]
http://ocw.mit.edu/courses/mathematics/18-337j-applied-parallel-computing-sma-5505-spring-2005/lecture-notes/chapter_3.pdf

On Mon, May 21, 2012 at 5:15 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

> What you are suggesting is to fuse mapping and reducing into a single
> operation.  While this often makes sense, the approach you suggest gets
> more difficult when you want to execute in parallel.
>
> If you define your reducer over T as a (T,T) -> T (assuming
> associativity), then to compute the reduction in parallel, you can
> decompose your collection into chunks, reduce each chunk to a T, and
> then reduce the reduced values up the tree until you have a single T.
>
> If you define your reducer as a (T,U) -> U, you can reduce each chunk to
> a U, and then you have no way to combine the per-chunk Us into a single
> answer.
>
> On 5/21/2012 2:25 AM, François Sarradin wrote:
> > Hi all,
> >
> > I'm looking for an answer to a question about the method reduce. But, I
> > still haven't found it in archives of the mailing list. So here is my
> > question:
> >
> > The current signature of Iterable<T>.reduce is:
> >
> >      T reduce(T base, BinaryOperator<T>  reducer)
> >
> > I would like to know why the signature doesn't look like this one?
> >
> >      <U>  U reduce (U base, BiOp<T, U>  reducer)
> >
> > where BiOp<T, U>  is supposed to represent a function of two arguments of
> > respective types T and U, that returns a value of type U.
> >
> > I really think that this other signature can lead to more readable
> > code. Below is an example with, say, a set of products that you want to
> buy
> > in whatever store you want. And you want to know the total.
> >
> > With the current version of reduce:
> >
> >      double total = products.map(p ->  p.getPrice()).reduce(0.0, (price,
> > subTotal) ->  price + subTotal);
> >
> > With the other version of reduce in this mail:
> >
> >      double total = products.reduce(0.0, (p, subTotal) ->  p.getPrice() +
> > subTotal);
> >
> > I know that there is a method named mapReduce. But I'm still
> uncomfortable
> > with it:
> >
> >      double total = products.mapReduce(p ->  p.getPrice(), 0.0, (price,
> > subTotal) ->  price + subTotal);
> >
> >
> > Regards,
> >
> > Francois
> >
>
>