Processing-Mode Equality
Sam Pullara
spullara at gmail.com
Sun Feb 9 11:15:53 PST 2014
My point is that whether parallel execution will be a win or a loss is not
decidable at compile time -- yet that is where we currently expose the
choice. So it may be a tool, but it isn't one anyone should use except in
extremely constrained circumstances where the runtime context is well
known, i.e. not libraries, and very few applications. However, I feel like
it is presented as if you should always use it if your stream code is
parallelizable. Also, if you look at the javadoc, I'm not sure we could have
picked a more obviously bad example, where it will essentially never make
sense to run the code in parallel:
int sumOfWeights = widgets.parallelStream()
                          .filter(b -> b.getColor() == RED)
                          .mapToInt(b -> b.getWeight())
                          .sum();
I'm not sure where the tip over point would be, but my guess is in the
millions of widgets[1].
Sam
[1] On my machine it turns over somewhere between 1m and 2m widgets. Sadly
YMMV quite a bit.
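A rough, hand-rolled probe of that tip-over point might look like the sketch below. This is not a proper benchmark (a real measurement would use JMH, and the numbers will vary wildly by machine); `weight` is a hypothetical, trivially cheap stand-in for the widget weight lookup, which is exactly the kind of per-element work that makes parallelism a loss at small sizes.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.IntStream;

public class CrossoverProbe {
    // Hypothetical stand-in for the per-widget work: deliberately cheap.
    static int weight(int id) {
        return id & 0xFF;
    }

    static long nanos(Runnable r) {
        long start = System.nanoTime();
        r.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10_000, 100_000, 1_000_000, 4_000_000}) {
            int[] ids = ThreadLocalRandom.current().ints(n, 0, 1 << 20).toArray();
            // Warm both paths so HotSpot compiles them before we time anything.
            for (int i = 0; i < 5; i++) {
                IntStream.of(ids).map(CrossoverProbe::weight).sum();
                IntStream.of(ids).parallel().map(CrossoverProbe::weight).sum();
            }
            long seq = nanos(() -> IntStream.of(ids).map(CrossoverProbe::weight).sum());
            long par = nanos(() -> IntStream.of(ids).parallel().map(CrossoverProbe::weight).sum());
            System.out.printf("n=%d sequential=%dus parallel=%dus%n",
                    n, seq / 1000, par / 1000);
        }
    }
}
```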
On Sun, Feb 9, 2014 at 10:51 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> In my view, there's no need to characterize the Stream API in terms of
> sequential vs parallel. It is simply another tool that allows you to run
> parallel (or sequential) code with less boilerplate, and the syntactic
> difference between the two modes is minimal. The need to measure, tune,
> and design the algo/data structures for a specific workload doesn't go
> away, and the YMMV fine print still applies.
>
> Sent from my phone
> On Feb 9, 2014 1:39 PM, "Sam Pullara" <spullara at gmail.com> wrote:
>
>> Here are just a few of the things you need to know at runtime to decide
>> if you want to run something in parallel:
>>
>> 1) The cost of executing the per-element code. On current architectures
>> and JVMs where Java runs, this spans 3+ orders of magnitude in performance.
>> The variables include CPU, GPU, power constraints, HotSpot effectiveness,
>> etc. As a library author, if you accept a lambda, it is generally
>> impossible to know anything about the performance of that lambda at compile
>> time, and therefore impossible to decide whether to run in parallel.
>>
>> 2) The number of elements that you are going to process. This is only
>> forecastable in very degenerate cases. Rarely will you write a Stream-based
>> function that has any idea how many elements are going to be run through it.
>>
>> 3) The cost of copying the input and output data across memory
>> barriers. The difference between non-NUMA and NUMA architectures is
>> another couple of orders of magnitude. Even the cost of synchronization on
>> a real system may make it too expensive to run some things in
>> parallel.
>>
>> 4) Whether you want to optimize for throughput or latency. This is
>> especially bad for library authors. As Brian stated, it is always more
>> expensive in total CPU and memory usage to run in parallel. Systems that
>> optimize for throughput may not want this additional pessimization.
>>
>> 5) Runtime context. If you are already running in parallel at a higher
>> level, additional parallelism in the leaf code will often be just more
>> noise and will actually interfere with the higher-level splitting of work.
>>
>> I think your straw-man argument, that if only people changed the way they
>> wrote code everything would work fine, is just not true. Runtime measurement
>> and opportunistically parallelizing operations when it makes sense no more
>> requires solving the halting problem than runtime optimization in HotSpot
>> does. Obviously solving it would help, but you can make real progress by
>> characterizing loads and dispatching them intelligently.
>>
>> Sam
>>
>> On Feb 9, 2014, at 4:42 AM, Doug Lea <dl at cs.oswego.edu> wrote:
>>
>> > One slogan is "Data-centric, parallel-agnostic".
>> >
>> > On 02/08/2014 01:36 PM, Sam Pullara wrote:
>> >> That is one way to think about it, and programming-wise you would be
>> >> correct. However, if you run everything in parallel that can be, you
>> >> will likely be disappointed in the performance.
>> >
>> > This of course assumes that people will continue to write programs in
>> > ways that present a high likelihood of disappointment. This is sure to
>> > sometimes happen: Classic object-oriented programmers will tend to use
>> > unpartitionable side-effecting methods, classic functional programmers
>> > will tend to use hopelessly sequential data structures, and classic
>> > event-driven programmers will tend to use one-by-one vs bulk updates.
>> > Among those likely to cope best are database programmers, who are
>> > already comfortable with data-centric bulk updates.
>> >
>> >> A pretty special set of conditions need to
>> >> be present for it to make sense to run things in parallel.
>> >
>> > In some alternative universe, someone is now complaining that a
>> > pretty special set of conditions need to be present for it to make
>> > sense to run things sequentially even if eligible for parallelism:
>> > The combination of intrinsically sequential data structures,
>> > less than a few thousand elements, and trivially cheap per-element
>> > functions. If you designed programs so these cases rarely occurred,
>> > you wouldn't worry about occasional slowdowns when they do.
>> > But we cannot write such code for you. We cannot even tell you
>> > with certainty when you do/don't: Just trying to figure out the
>> > cost of per-element functions hits the Halting problem.
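The "hopelessly sequential data structures" point above is observable directly in the spliterator characteristics a collection reports: an ArrayList splits in O(1) by index arithmetic and advertises SUBSIZED, while a LinkedList must walk nodes to split and does not. A small sketch (the collection contents are arbitrary):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;

public class SplitQuality {
    public static void main(String[] args) {
        List<Integer> array = new ArrayList<>();
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < 100_000; i++) {
            array.add(i);
            linked.add(i);
        }
        // ArrayList's spliterator knows the exact size of every split,
        // so parallel decomposition over it is cheap and balanced.
        System.out.println("ArrayList SUBSIZED: "
                + array.spliterator().hasCharacteristics(Spliterator.SUBSIZED));
        // LinkedList's spliterator must traverse nodes to peel off batches,
        // so its splits are lopsided and expensive by comparison.
        System.out.println("LinkedList SUBSIZED: "
                + linked.spliterator().hasCharacteristics(Spliterator.SUBSIZED));
    }
}
```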
>> >
>> > We will see some bad reactions and experiences along the way
>> > as people decide how and when to use the Stream framework.
>> > And we will surely see people deciding to never use it because
>> > data-centric, parallel-agnostic programming clashes with their
>> > adopted programming style. Fine. Java (among other JVM languages)
>> > succeeds because people with different religious views about
>> > programming can coexist.
>> >
>> > -Doug
More information about the lambda-dev mailing list