The new optimized version of Dual-Pivot Quicksort

Laurent Bourgès bourges.laurent at gmail.com
Sat Nov 10 11:49:40 UTC 2018


Dear Vladimir & other Java sort experts,

I made the port of the DPQS 2018.2 code last night to support a secondary
array to be sorted and use preallocation (aux/run for merge sort):
https://github.com/bourgesl/nearly-optimal-mergesort-code/blob/master/src/wildinter/net/mergesort/DualPivotQuicksort2018Ext.java

I succeded in using this variant in Marlin renderer (dev) and it is
promising. I figured out a rendering bug in 1 test wigh 65535 random
segments, certainly due to array swaps in mergesort (my side)...

I will look again at my changes to check if I miss some permutation (a //
b) or made any mistake on their indices... tricky.

PS: Please share your updated webrev when possible to rebase my changes.

Regards,
Laurent

Le ven. 9 nov. 2018 à 10:08, Laurent Bourgès <bourges.laurent at gmail.com> a
écrit :

> Hi Vladimir,
>
> Thank you for your attention, you are the Sort Master.
>
>
> Le ven. 9 nov. 2018 à 09:02, Vladimir Yaroslavskiy <vlv.spb.ru at mail.ru> a
> écrit :
>
>> Hi Laurent,
>>
>> The new version is still under review, there were a lot of suggestions
>> and ideas from Doug Lea.
>> I needed time to apply and check them. I'm going to send final version in
>> few days.
>>
>
> Excellent news, so it will ship in jdk12...
>
>>
>> As to new method sort(a[], low, high, indices): do you mean this method
>> in Arrays class?
>> Are you going to extend Arrays API?
>>
>
> I benchmarked my MergeSort and adding extra array implies many more moves:
> thresholds should be adjusted and my sort is sometimes better as it makes
> half moves (a <-> buffer swaps), so this complementary sort should have its
> own tuned implementation (tests & benchmarks).
>
> I coded my sort as such need did not match any Arrays.sort() methods.
> Having it publicly in jdk.base would be great for any other data handling
> algorithm (science, AI ?)
>
> If you agree it is useful, I could try proposing an Arrays/Collection API
> extension like :
> sort(<primitive or Value>[], low, high, int[]indices)
>
>
>> About new parameters workbase[]: method sort(...) in DualPivotQuicksort
>> class (version is under review)
>> has extra parameter - parallel context (Sorter sorter) which has required
>> workbase[].
>> If we run sorting sequentially, sorter parameter is null (no any objects)
>> and temporary buffer
>> will be created inside in tryMerge method if we go ahead with merging.
>>
>
> I will have a look. For Marlin, parallel sort is not possible as rendering
> shapes can be parallelized in the application (geoserver map rendering).
>
>
>> I will send new sources in few days and add you in cc, so you can look at
>> it
>> and find if new methods are acceptable for you. Then we can discuss
>> required changes (if any).
>>
>
> Perfect, I started adapting your code and will share soon the link to my
> github repo.
>
>
>> Thank you,
>> Vladimir
>>
>
> Thank you very much for your first thoughts,
> Laurent
>
>
>> Пятница, 9 ноября 2018, 10:44 +03:00 от Laurent Bourgès <
>> bourges.laurent at gmail.com>:
>>
>> Hi,
>>
>> I am currently testing many sort algorithms to improve the Marlin
>> renderer (AA 2D shape rasterizer), I integrated since OpenJDK9 and am still
>> improving for OpenJDK12 .
>>
>> I created my MergeSort (top down, check for sorted parts, array / buffer
>> swap to minimize moves, isort on small sub arrays) that sort 2 int[]:
>> crossing + edge indices.
>>  It is critical as edge crossings are sorted at every scanline, data are
>> almost sorted/reversed, not really random.
>>
>> 3 questions:
>> - why is this patch dormant ? I checked in openjdk12 repo and your
>> changes were not integrated.
>> - I need a sort() method that works with 2 arrays: data + indices
>> (pointer like). Such sorted indices are useful to use for lookups c[idx[k]]
>> or to perform other array permutations...
>> Would you agree adding such new sort(a[], low, high, indices)
>> - Marlin uses a ThreadLocal context to avoid any allocation. JDK sort()
>> impl do not accept parameters to give any buffer[] and avoid allocations.
>> Would you agree adding such optional parameters (like workbase[]) ?
>>
>> I will experiment adapting the DualPivotQuickSort in Marlin renderer and
>> perform array sort race & rendering benchmarks.
>>
>> Cheers,
>> Laurent
>>
>> Le ven. 19 janv. 2018 à 14:38, Vladimir Yaroslavskiy <vlv.spb.ru at mail.ru
>> <https://e.mail.ru/compose/?mailto=mailto%3avlv.spb.ru@mail.ru>> a
>> écrit :
>>
>> Hi team,
>>
>> In Sept 2009 Josh Bloch, Jon Bentley and I introduced new sorting
>> algorithm, Dual-Pivot Quicksort, for primitives in JDK 7 and later
>> I suggested several improvements of Dual-Pivot Quicksort, which
>> were integrated into JDK 8.
>>
>> Now I have more optimized and faster version of Dual-Pivot Quicksort.
>> I have been working on it for the last 5 years. Please, find the
>> summary of changes below and sources / diff at webrev [1].
>>
>> All tests and benchmarking were run on the most recent build of JDK 10,
>> jdk-10-ea+39. The new version shows the better performance on different
>> inputs and guarantees n*log(n) on any data.
>>
>> The new implementation of Dual-Pivot Quicksort is 'all-in-one' version:
>> it contains one code for both parallel and sequential sorting algorithms.
>>
>> Suggested version is 10-20% faster on random data, 1.5-4 times faster
>> on nearly structured arrays, 1.5-2 times faster on period inputs.
>>
>> Parallel Dual-Pivot Quicksort is 1.5-3 times faster than current
>> algorithm based on merge sort.
>>
>> Benchmarking on the test suite, suggested by Jon Bentley, shows the
>> boost of performance in 1.4 times. This test suite contains several
>> types of inputs, such as random data, nearly structured arrays, data
>> with period and so on. Also there are several modifications of inputs
>> and parameters which are used in data building. We run sorting on all
>> combinations to compare two algorithms.
>>
>> Please let me know if you have any questions / comments.
>>
>> Summary of changes:
>>
>> DualPivotQuicksort class
>> ------------------------
>> * Pivots are chosen with another step, the 1-st and 5-th candidates
>>    are taken as pivots instead of 2-nd and 4-th.
>> * Splitting into parts is related to the golden ratio
>> * Pivot candidates are sorted by combination of 5-element
>>    network sorting + insertion sort
>> * New backwards partitioning is simpler and more efficient
>> * Quicksort tuning parameters were updated
>> * Merging sort is invoked on each iteration from Quicksort
>> * Additional steps for float/double were fully rewritten
>> * Heap sort is invoked on the leftmost part
>> * Heap sort is used as a guard against quadratic time
>> * Parallel sorting is based on Dual-Pivot Quicksort
>>    instead of merge sort
>>
>> SortingAlgorithms class
>> -----------------------
>> * Merging sort and pair insertion sort were moved from
>>    DualPivotQuicksort class
>> * Pair insertion sort was simplified and optimized
>> * New nano insertion sort was introduced for tiny arrays
>> * Merging sort was fully rewritten
>> * Optimized merging partitioning is used
>> * Merging parameters were updated
>> * Merging of runs was fully rewritten
>> * Fast version of heap sort was introduced
>> * Parallel merging sort was also provided
>>
>> Sorting / ParallelSorting classes
>> ---------------------------------
>> * New test cases were added
>> * Sources of these classes were unified
>>
>> Arrays class
>> ------------
>> * Calls of Dual-Pivot Quicksort were updated
>> * Parallel sorting of primitives was switched to parallel
>>    Dual-Pivot Quicksort
>> * Javadoc was modified
>>
>> ArraysParallelSortHelpers class
>> -------------------------------
>> * Old implementation of parallel sorting for primitives was removed
>> * Javadoc was updated
>>
>> Thank you,
>> Vladimir
>>
>> --------------------------------------------------------------------
>> [1] http://cr.openjdk.java.net/~alanb/DualPivotSortUpdate/webrev.01/
>> --------------------------------------------------------------------
>>
>>
>>
>> --
>> Vladimir Yaroslavskiy
>>
>


More information about the core-libs-dev mailing list