Yet another run at reduce
Tim Peierls
tim at peierls.net
Tue Jan 8 06:41:18 PST 2013
Sounds great -- I'd be interested in seeing some basic examples of both
types of reducer.
--tim
On Mon, Jan 7, 2013 at 8:31 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> I think Doug and I made some progress on reduce forms today.
>
> Recap: there are two ways to do a reduction to a mutable data structure
> like a Map; a classic fold (where you create lots of little maps, one per
> leaf of the computation tree, accumulate leaf data into them in a
> thread-confined manner, then merge their contents up the tree into one big
> map), and a concurrent bombardment (where you create one big
> concurrent-safe map, and blast elements in from many threads.) Call these
> "A" and "B".
>
> The requirement for A is that:
> - your combining functions are associative
> - you can create multiple containers
> - you can incorporate a new element into a container
> - you can merge containers in a way that satisfies the obvious
> distributive requirement
>
> If you meet these requirements, you can do a parallel A reduce that
> respects encounter order.
>
> The requirement for B is that:
> - your combining functions are associative and commutative
> - you can incorporate a new element into a container
> - your container support concurrent incorporation
>
> If you meet these requirements, you can do a parallel B reduce that does
> NOT respect order, but may be faster (or slower, if contention is high
> enough.)
>
> Key observation: A "B" reducer can become an "A" reducer simply by adding
> the merge ability, which is no harder than for regular A reducers.
>
> So rather than have A reducers and B reducers, we can have A reducers and
> A+B reducers. It's only marginally more work for B reducer writers.
>
> So...
>
> public interface Reducer<T, R> {
> boolean isConcurrent();
> R makeResult();
> void accumulate(R result, T value);
> R combine(R result, R other);
> }
>
> A reducers return 'false' for isConcurrent; B reducers return 'true'.
>
> What was Concurrent{Reducer,Tabulator,**Accumulator} goes away.
>
> Now, in addition to the various forms of reduce(), we add overloaded:
>
> reduce(Reducer)
> reduceUnordered(Reducer)
>
> You will get a B reduction if (a) you selected reduceUnordered and (b)
> your reducer is concurrent. Otherwise you will get an A reduction.
>
> This is nice because the knowledge of properties of "user doesn't care
> about order" and "target supports concurrent insertion" naturally live in
> different places; this separates them properly. The latter is a property
> of the reducer implementation; the former only the user knows about and
> therefore should live at the call site. Each contributes their bit and can
> mostly remain ignorant of the other; the library combines these bits, and
> if both are present, you get a B reduction.
>
> The reduceUnordered() method can be cast as an optimization to reduce();
> it is always correct to use reduce(), but may be faster to use
> reduceUnordered(), as long as you are willing to forgo order and the
> reducer is coooperative.
>
> In neither case (assuming a properly implemented reducer) will you ever
> get a thread-unsafe result; if you ask for an unordered reduction with a
> nonconcurrent reducer, you get a safe ordered reduction instead, which
> might be slower (or faster.)
>
> Pros:
> - ConcurrentReducer goes away
> - .unordered() goes away
> - Properly separates order-sensitivity from concurrency
>
> Cons:
> - Two variations on various aspects of reducing are surfaced in the API
> (unordered and concurrent) that will make users say "huh?" (but if you get
> it wrong /choose to stay ignorant you still get the right answer.)
>
> Implementation coming.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130108/4dda030d/attachment.html
More information about the lambda-libs-spec-experts
mailing list