accumulate locally and mutatively but combine concurrently (a use case)

Sat May 11 03:41:57 PDT 2013

On 05/10/13 21:40, John Rose wrote:
> On May 10, 2013, at 4:59 PM, Doug Lea <dl at cs.oswego.edu
> <mailto:dl at cs.oswego.edu>> wrote:
>
>> Parallel reduction/accumulation is of course an old and well-studied
>> area, so there are many known good ways and even more known bad ways
>> to handle particular cases. But for collecting into hash tables,
>> we're pretty sure that using CHM in CONCURRENT mode is the best
>> available choice.
>
> I accept that, with pleasure, assuming there are more table keys than processors.
>
> The case I'm asking about is the keyless case, which seems to benefit from
> buffering before inter-processor communication.  In this case, there is no need
> to move the data once it has been buffered.
>

... although most techniques that avoid copying entail
associating keys with elements (for array-based, an index serves as key).
Expanding the kinds of cases where this can apply (such as using
hash-based buffers for non-Map, non-Set elements) is also among
potential improvements. Hash-based are nice because the
probability of contention is (less than) the probability of
collision. See the JDK8 ConcurrentHashMap internal docs for
some details on this.) Array-index-based are even nicer if you can
structurally guarantee that each partition only uses a
non-overlapping index range.

-Doug