Reducing reduce

Joe Bowbeer joe.bowbeer at gmail.com
Mon Feb 11 08:57:41 PST 2013


My parallel string-compare sample provides two implementations (below).

Will both of these survive your changes?

bitbucket.org/joebowbeer/stringcompare

  int compareMapReduce(String s1, String s2) {
    assert s1.length() == s2.length();
    return intRange(0, s1.length()).parallel()
        .map(i -> compare(s1.charAt(i), s2.charAt(i)))
        .reduce(0, (l, r) -> (l != 0) ? l : r);
  }
  int compareBoxedReduce(String s1, String s2) {
    assert s1.length() == s2.length();
    return intRange(0, s1.length()).parallel().boxed()
        .reduce(0, (l, i) -> (l != 0) ? l : compare(s1.charAt(i), s2.charAt(i)),
                   (l, r) -> (l != 0) ? l : r);
  }



The person who sold me the second form told me it would "burn less heat".
He said that I could optimize my map/reduce by having it "not even
calculate *f* if the left operand is nonzero, by combining the map and
reduce steps into a fold."

What is that person going to sell me now?

Joe


On Mon, Feb 11, 2013 at 8:41 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

> Now that we've added all the shapes of map() to Stream (map to
> ref/int/long/double), and we've separated functional reduce (currently
> called reduce) from mutable reduce (currently called collect), I think that
> leaves room for taking out one of the reduce methods from Stream:
>
>     <U> U reduce(U identity,
>                  BiFunction<U, ? super T, U> accumulator,
>                  BinaryOperator<U> reducer);
>
> This is the one that confuses everyone anyway, and I don't think we need
> it any more.
>
> The argument for having this form instead of discrete map+reduce are:
>  - fused map+reduce reduces boxing
>  - this three-arg form can also fold filtering into the accumulation
>
> However, since we now have primitive-bearing map methods, and we can do
> filtering before and after the map, is this form really carrying its
> weight?  Specifically because people find it counterintuitive, we should
> consider dropping it and guiding people towards map+reduce.
>
> For example, "sum of pages" over a stream of Documents is better written
> as:
>
>   docs.map(Document::**getPageCount).sum()
>
> rather than
>
>   docs.reduce(0, (count, d) -> count + d.getPageCount(), Integer::sum)
>
> The big place where we need three-arg reduce is when we're folding into a
> mutable store.  But that's now handled by collect().
>
> Have I missed any use cases that would justify keeping this form?
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130211/27a6f996/attachment-0001.html 


More information about the lambda-libs-spec-experts mailing list