RFR 8003981: Support Parallel Array Sorting - JEP 103

Fri Dec 21 20:09:18 UTC 2012

Alright, I have used Intel's Thread Building Blocks, where the user 
has(or had) to do the splitting up of tasks himself, and I remember 
having to do some level of profiling on different
machines for getting some performance advantage over the serial version 
of any algorithm. But I guess we are not aiming for that kind of 
accuracy here.

Thanks for clarifying,
- Kurchi

On 21.12.2012 12:01, Chris Hegarty wrote:
> The granularity is the threshold beyond which we no longer split the array, for its parts to be sorted as separate tasks, and then merged together. At some point it becomes more costly to do this split and merge than simply sorting the array. I think Paul put it best by calling finding the optimal threshold a black art.
>
> This API is targeted at the average developer that simply wants the sorting to be as past a reasonably possible, on their multi-core machine, without having to run benchmarks, profiling, etc. I am concerned that exposing the threshold will simply confuse most developers.
>
> -Chris
>
> On 21 Dec 2012, at 18:21, Kurchi Hazra<kurchi.subhra.hazra at oracle.com>  wrote:
>
>> Isn't the optimal granularity value different for different number of cores, differing amounts of memory available in a machine etc?
>> If someone is trying to use a parallelSort to take adavntage of all the cores in a machine, he is anyway a little bit of an
>> advanced user I guess.
>>
>> - Kurchi
>>
>> On 21.12.2012 09:43, Remi Forax wrote:
>>> On 12/21/2012 05:49 PM, Chris Hegarty wrote:
>>>> Updated webrev/specdiff:
>>>>    http://cr.openjdk.java.net/~chegar/8003981/ver.01/
>>>>
>>>> I also included 'webrev_diffAgainstVer00', so you can easily see the differences against the last webrev.
>>>>
>>>> Note: I remove any reference to the threshold,MIN_ARRAY_SORT_GRAN, in the API, given the other comments on this change.
>>>>
>>>>
>>>> Remi,
>>>>   Thank you for the detailed review.
>>> Chris,
>>> all is green for me, thumb up.
>>>
>>> [...]
>>>
>>>>> In all Sorter, instead of calling Arrays.sort(), you should call
>>>>> DualPivotQuicksort.sort to avoid unnecessary range checks.
>>>> Yes, I see your point. The implementation note in the spec says Arrays.sort, but I think this suggestion is good. I made the change, and since DualPivotQuicksort.sort is inclusive of toIndex/right -1 from toIndex/right.
>>> good catch !
>>>
>>> [...]
>>>
>>>>
>>>> -Chris.
>>> Rémi
>>>
>>>>
>>>>
>>>> On 12/21/2012 11:19 AM, Remi Forax wrote:
>>>>> On 12/20/2012 06:31 PM, Chris Hegarty wrote:
>>>>>> This is a review request for the addition of utility methods to
>>>>>> java.util.Arrays that provide sorting of arrays in parallel, JEP 103 [1].
>>>>>>
>>>>>> Webrev:
>>>>>>   http://cr.openjdk.java.net/~chegar/8003981/ver.00/
>>>>>>
>>>>>> Current sorting implementations provided by the Java Collections
>>>>>> Framework (Collections.sort and Arrays.sort) all perform the sorting
>>>>>> operation sequentially in the calling thread. This enhancement will
>>>>>> offer the same set of sorting operations currently provided by the
>>>>>> Arrays class, but with a parallel implementation that utilizes the
>>>>>> Fork/Join framework. These new API’s are still synchronous with regard
>>>>>> to the calling thread as it will not proceed past the sorting
>>>>>> operation until the parallel sort is complete.
>>>>>>
>>>>>> The actual sorting API this proposal adds will leverage the
>>>>>> ForkJoinPool.commonPool (default Fork/Join pool defined as part of JEP
>>>>>> 155). The sorting algorithm is that used in Doug Lea’s ParallelArray
>>>>>> implementation.
>>>>>>
>>>>>> This work was originally done over in the lambda/lambda repo by David
>>>>>> Holmes, and is now being cleaned up and brought into jdk8.
>>>>>>
>>>>>> Open issues/differences from version in lambda:
>>>>>> 1) Is is necessary for MIN_ARRAY_SORT_GRAN to be in the public API?
>>>>>>    It is an implementation detail (easy to remove).
>>>>>> 2) The use of FJP.commonPool is an implementation detail, not
>>>>>>    part of the spec. Should not be a problem, just worth pointing
>>>>>>    out, as it differs from what is in lambda/lambda.
>>>>>>
>>>>>> -Chris.
>>>>>>
>>>>>> [1] http://openjdk.java.net/jeps/103
>>>>> Hi Chris,
>>>>> in Arrays.parallelSort(TYPE[] a, int fromIndex, int toIndex),
>>>>> there is no need to have the two supplementary local variable origin and
>>>>> fence,
>>>>> ws is not a great better name  (workspaceArray?), I had to read the code
>>>>> of ArraySortHelpers to understand.
>>>>>
>>>>> ArraySortHelpers should be renamed to ArrayParallelSortHelpers.
>>>>>
>>>>> In ArraySortHelpers, all Sorter, Merger, etc. should declare only one
>>>>> field by line, currently, the formatting of the fields is weird.
>>>>>
>>>>> In all Sorter, instead of calling Arrays.sort(), you should call
>>>>> DualPivotQuicksort.sort to avoid unnecessary range checks.
>>>>>
>>>>> In compute() of Sorter and Merger, you should copy a and w in local
>>>>> variables (a = this.a) to help the VM to figure out that they never
>>>>> changed,
>>>>> it will also reduce the bytecode size.
>>>>>
>>>>> In Sorter.compute, n should be loaded in a local field,
>>>>>    int l = origin;
>>>>>    int g = gran;
>>>>>    int n = this.n;
>>>>>
>>>>> In Merger.compute, the sequence of inits before the second while should be:
>>>>>    int l = lo;
>>>>>    int lFence = l + nleft;  // instead of lo + nleft
>>>>>    int r = ro;
>>>>>    int rFence = r + nright;  // instead of ro + nright
>>>>>    int k = wo;
>>>>> it should not changed the performance but it's more inline with the
>>>>> coding style of the rest of the code IMO.
>>>>>
>>>>> cheers,
>>>>> Rémi
>> -- 
>> -Kurchi
>>

-- 
-Kurchi