Demo for Parallel Core Collection API

Thu Dec 19 04:16:37 PST 2013

Hi Paul and everyone,
Sorry for the late reply.
I took Paul's suggestion and have written two more demos that show
the use of parallel computation. One uses a Monte-Carlo method to
calculate the value of PI. The other finds a big prime of a given
length. Please review them.
http://cr.openjdk.java.net/~tyan/sample/webrev.00/ 
There is another demo, which presents the Mandelbrot set. It was
designed by Alexander Kouznetsov and is already under review, so it is
not part of my code review request.
Thank you very much
Tristan

On 10/15/2013 11:20 PM, Paul Sandoz wrote:
>
> On Oct 15, 2013, at 4:35 PM, Tristan Yan <tristan.yan at oracle.com>
> wrote:
>
>> Hi Paul
>> You commented: "suggest that all streams are sequential. There is
>> an inconsistency in the use and in some cases it is embedded in other
>> stream usages."
>>
>> We do not really understand what exactly is meant; could you
>> elaborate a little? Is it because we want to show people that we
>> should use stream more than parallelStream?
>
> Going parallel is easy to do but not always the right thing to do. 
> Going parallel almost always requires more work with the expectation 
> that work will complete sooner than the work required to get the same 
> result sequentially. There are a number of factors that affect whether 
> parallel is faster than sequential. Two of those factors are N, the
> size of the data, and Q, the cost of processing an element in the
> pipeline. N * Q is a simple cost model: the larger that product, the
> better the chances of parallel speed-up. N is easy to know; Q is not
> so easy but can often be guessed intuitively. (Note that there are other
> factors such as the properties of the stream source and operations 
> that Brian and I talked about in our J1 presentation.)
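
A rough way to see the N * Q model at the low-Q end: the sketch below
runs the same sum-of-squares pipeline sequentially and in parallel. The
class and method names are made up for illustration and are not taken
from the webrev. Q is a single multiplication here, so a speed-up would
only be expected for large N, and any real comparison should be measured
with a harness such as jmh rather than read off this code.

```java
import java.util.stream.LongStream;

// Hypothetical illustration class; not part of the reviewed demos.
class SumOfSquares {

    // Sequential pipeline: a single thread walks the whole range 1..n.
    static long sequential(long n) {
        return LongStream.rangeClosed(1, n)
                .map(i -> i * i)
                .sum();
    }

    // Same pipeline, but the range is split across the common fork/join pool.
    static long parallel(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)
                .sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // N: number of elements; Q is one multiplication
        System.out.println("sequential: " + sequential(n));
        System.out.println("parallel:   " + parallel(n));
    }
}
```

Both methods return the same value; only the execution strategy differs.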
>
> Demo code that just makes everything (or most streams) parallel is 
> sending out the wrong message.
>
> So I think the demo code should present two general things:
>
> 1) various stream functionality, as you have done;
>
> 2) parallel vs. sequential for various cases where it is known that 
> parallel is faster on a multi-core system.
>
> For 2) I strongly recommend measuring using jmh [1]. The data sets you
> have may or may not be amenable to parallel processing; it's worth
> investigating though.
>
> I have ideas for other parallel demos. One is creating probable primes
> (now that SecureRandom is replaced with ThreadLocalRandom); creating a
> probable prime that is a BigInteger is a relatively expensive
> operation, so Q should be high. Another, more advanced demo is a
> Monte-Carlo calculation of PI using SplittableRandom and a special
> Spliterator; in this case N should be largish. But there are other,
> simpler demonstrations, like sum of squares, that get across that N
> should be large. Another demo could be the calculation of a Mandelbrot
> set, which is embarrassingly parallel over an area in the complex plane.
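
The Monte-Carlo idea can be sketched without a custom Spliterator. This
is not the code in the webrev and not necessarily what Paul has in mind:
it is a simplified variant that relies on SplittableRandom.longs(), whose
stream splits the generator when run in parallel. The class name, the
method name, and the trick of taking both coordinates from one 64-bit
value are assumptions made for illustration.

```java
import java.util.SplittableRandom;

// Hypothetical illustration class; not part of the reviewed demos.
class MonteCarloPi {

    // Estimates PI by sampling points in the unit square and counting
    // how many land inside the quarter circle of radius 1.
    static double estimate(long samples, long seed) {
        final double scale = 1.0 / (1L << 32); // maps 32 bits onto [0, 1)
        long inside = new SplittableRandom(seed)
                .longs(samples)      // N random 64-bit values
                .parallel()          // the generator is split per subtask
                .filter(bits -> {
                    double x = (bits >>> 32) * scale;        // upper 32 bits
                    double y = (bits & 0xFFFFFFFFL) * scale; // lower 32 bits
                    return x * x + y * y <= 1.0;
                })
                .count();
        // The quarter circle has area PI / 4, so scale the hit ratio by 4.
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println(estimate(10_000_000L, 42L));
    }
}
```

Q per sample is small here, so N has to be large for the parallel
version to have a chance of paying off.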
>
> So while you should try to fit some parallel vs. sequential execution
> into your existing demos, I do think it is worth having a separate set
> of demos that get across the simple cost model of N * Q. So feel free
> to use some of the ideas mentioned above; I find those ideas fun, so
> perhaps others will too :-)
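
For the high-Q side of that cost model, the probable-prime idea could
look roughly like the sketch below. It is not the code under review;
the class and method names are invented for illustration. It uses
BigInteger.probablePrime with ThreadLocalRandom as the random source, so
each worker thread draws from its own generator.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration class; not part of the reviewed demos.
class ProbablePrimes {

    // Generates `count` probable primes of `bitLength` bits in parallel.
    // N (count) can be small because Q (one probablePrime call) is expensive.
    static List<BigInteger> generate(int count, int bitLength) {
        return IntStream.range(0, count)
                .parallel()
                .mapToObj(i -> BigInteger.probablePrime(
                        bitLength, ThreadLocalRandom.current()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        generate(8, 1024).forEach(System.out::println);
    }
}
```

Dropping the parallel() call gives the sequential baseline to compare
against.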
>
> Paul.
>
> [1] http://openjdk.java.net/projects/code-tools/jmh/
>
> On Oct 15, 2013, at 4:37 PM, Tristan Yan <tristan.yan at oracle.com>
> wrote:
>
>> Also, there is one more question I missed.
>>
>> You suggested: ""ParallelCore" is not a very descriptive name. Suggest
>> "streams"."
>> 1) Yes, we agree this demo is not for parallel computation per se.
>> 2) But we do not have a clear demo for parallel computation.
>> 3) If we are to rename this, we need to develop another one. Do you
>> have a scenario for that?
>