Concerns about parallel streams

Doug Lea dl at cs.oswego.edu
Fri Jul 12 06:47:37 PDT 2013


On 07/11/13 16:52, Joe Bowbeer wrote:
> Aleksey: Can you add memory parameters to your model?  Including both the memory
> overhead of parallel streams and the memory working set of each parallel task.

Memory effects are always complicated. Here are a few issues:

1. The more sequentially-oriented your code, the more
memory it will consume when run in parallel, for less benefit.
And these effects are not small.

2. There are usage idioms that will save vast amounts of memory
in parallel pipelines, for example merge-while-grouping.
(MultiSets/MultiMaps: just say no!) But most programmers won't
be familiar with the tradeoffs because they don't hurt as badly
in most sequential usages.

3. Boxing has such an enormous impact on space (and time)
as to have justified all the work on int/long/double forms.
People really need to use them in cases where they might
not have noticed performance problems when they wrote
non-stream (and thus non-parallel) versions.

4. Using more cores intrinsically uses more memory. These days,
vendors tend not to ship systems with memory proportional to cores,
but most are not so disproportionate as to be a huge concern for
most users.

5. We still face many GC and memory-system implementation issues
that limit scalability.
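
To make items 2 and 3 concrete, here is a rough sketch, not a benchmark. The
class name, method names, and sample data are made up for illustration. The
first pair of methods contrasts building a multimap-style Map of Lists and
counting afterwards with merging (here, counting) while grouping. The second
pair contrasts a boxed pipeline with the int/long forms. Each pair computes
the same result; the difference is only in how much intermediate storage and
boxing the pipeline needs, which this sketch does not measure.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration class; names are not from any library.
class ParallelMemorySketch {

    // Item 2, multimap style: every element is retained in a per-key List,
    // and the counts are only derived in a second pass.
    static Map<Integer, Long> countViaLists(List<String> words) {
        Map<Integer, List<String>> byLength = words.parallelStream()
            .collect(Collectors.groupingBy(String::length));
        return byLength.entrySet().stream()
            .collect(Collectors.toMap(Map.Entry::getKey,
                                      e -> (long) e.getValue().size()));
    }

    // Item 2, merge-while-grouping: the downstream collector reduces each
    // group as elements arrive, so no per-key Lists are built.
    static Map<Integer, Long> countWhileGrouping(List<String> words) {
        return words.parallelStream()
            .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // Item 3, boxed: each element passes through Integer and Long objects.
    static long sumSquaresBoxed(int n) {
        return IntStream.rangeClosed(1, n).parallel()
            .boxed()
            .map(i -> i.longValue() * i)
            .reduce(0L, Long::sum);
    }

    // Item 3, primitive forms: stays in int/long the whole way.
    static long sumSquaresPrimitive(int n) {
        return IntStream.rangeClosed(1, n).parallel()
            .mapToLong(i -> (long) i * i)
            .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc", "ddd", "eee", "fff");
        System.out.println(countViaLists(words));
        System.out.println(countWhileGrouping(words));
        System.out.println(sumSquaresBoxed(1000));
        System.out.println(sumSquaresPrimitive(1000));
    }
}
```

On tiny inputs like these you will not see any difference. The point is only
the shape of the two idioms; any real comparison needs large inputs and
proper measurement.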


> By the way, another concern not mentioned is UI responsiveness.  Adding
> .parallel() competes with the UI thread's ability to respond to user input.

I'd like to just say, Not our problem: The basic execution support
can cope fine if the OS decides not to give us all the CPU time
and instead prioritizes UI tasks. But OSes themselves are still
evolving ways to do this. So these days, if you have <= 4 cores,
you might feel some impact.

-Doug



More information about the lambda-libs-spec-experts mailing list