The new optimized version of Dual-Pivot Quicksort
Zheka Kozlov
orionllmain at gmail.com
Tue Nov 13 09:44:16 UTC 2018
Thanks, Tagir.
The benchmark was indeed incorrect. There was also another mistake:
dualPivot() and radix() sorted different arrays. I believe I fixed it now:
https://gist.github.com/orionll/595d10ff37ffe0d8c3824e734055cf00
Benchmark (seed) (size) Mode Cnt Score Error Units
Sort.dualPivot 1 10 avgt 5 0,066 ? 0,008 us/op
Sort.dualPivot 1 1000 avgt 5 15,611 ? 0,778 us/op
Sort.dualPivot 1 100000 avgt 5 5968,684 ? 883,238 us/op
Sort.dualPivot 1 10000000 avgt 5 873228,528 ? 93489,438 us/op
Sort.dualPivot 2 10 avgt 5 0,053 ? 0,013 us/op
Sort.dualPivot 2 1000 avgt 5 18,489 ? 0,874 us/op
Sort.dualPivot 2 100000 avgt 5 6821,103 ? 1367,378 us/op
Sort.dualPivot 2 10000000 avgt 5 873083,780 ? 55668,302 us/op
Sort.dualPivot 3 10 avgt 5 0,049 ? 0,001 us/op
Sort.dualPivot 3 1000 avgt 5 17,539 ? 2,661 us/op
Sort.dualPivot 3 100000 avgt 5 5959,760 ? 288,375 us/op
Sort.dualPivot 3 10000000 avgt 5 904949,768 ? 95230,256 us/op
Sort.radix 1 10 avgt 5 1,252 ? 0,133 us/op
Sort.radix 1 1000 avgt 5 8,403 ? 1,465 us/op
Sort.radix 1 100000 avgt 5 2106,251 ? 310,883 us/op
Sort.radix 1 10000000 avgt 5 377041,746 ? 42450,236 us/op
Sort.radix 2 10 avgt 5 1,244 ? 0,100 us/op
Sort.radix 2 1000 avgt 5 8,685 ? 1,726 us/op
Sort.radix 2 100000 avgt 5 2201,534 ? 204,659 us/op
Sort.radix 2 10000000 avgt 5 381438,308 ? 47735,981 us/op
Sort.radix 3 10 avgt 5 1,259 ? 0,090 us/op
Sort.radix 3 1000 avgt 5 8,338 ? 1,127 us/op
Sort.radix 3 100000 avgt 5 2081,945 ? 225,230 us/op
Sort.radix 3 10000000 avgt 5 376115,679 ? 47896,963 us/op
Now the benchmark shows that radix() wins significantly except for small
arrays.
вт, 13 нояб. 2018 г. в 15:32, Tagir Valeev <amaembo at gmail.com>:
> Hello, Zheka!
>
> Seems that your benchmark is wrong: after the first iteration (which
> is part of warmup) you're always sorting an array which is already
> sorted, so in fact you are testing how algorithms behave on already
> sorted arrays.
>
> With best regards,
> Tagir Valeev.
> On Mon, Nov 12, 2018 at 10:08 AM Zheka Kozlov <orionllmain at gmail.com>
> wrote:
> >
> > Hi Tagir!
> >
> > I wrote a simple benchmark comparing Arrays.sort() and your
> implementation:
> https://gist.github.com/orionll/595d10ff37ffe0d8c3824e734055cf00
> >
> > Here are the results on my computer (JDK 11):
> >
> > REMEMBER: The numbers below are just data. To gain reusable insights,
> you need to follow up on
> > why the numbers are the way they are. Use profilers (see -prof, -lprof),
> design factorial
> > experiments, perform baseline and negative tests that provide
> experimental control, make sure
> > the benchmarking environment is safe on JVM/OS/HW level, ask for reviews
> from the domain experts.
> > Do not assume the numbers tell you what you want them to tell.
> >
> > Benchmark (size) Mode Cnt Score Error Units
> > Sort.dualPivot 10 avgt 4 0,012 ? 0,002 us/op
> > Sort.dualPivot 1000 avgt 4 0,282 ? 0,014 us/op
> > Sort.dualPivot 100000 avgt 4 31,625 ? 10,063 us/op
> > Sort.dualPivot 10000000 avgt 4 10077,948 ? 601,204 us/op
> > Sort.radix 10 avgt 4 1,310 ? 0,151 us/op
> > Sort.radix 1000 avgt 4 1,297 ? 0,063 us/op
> > Sort.radix 100000 avgt 4 33,009 ? 2,303 us/op
> > Sort.radix 10000000 avgt 4 10150,036 ? 1095,015 us/op
> >
> > I don't want to make any conclusions. These are just numbers. Probably
> your implementation can be optimized so it beats the platform sort at least
> on large arrays.
> >
> > вс, 11 нояб. 2018 г. в 11:48, Tagir Valeev <amaembo at gmail.com>:
> >>
> >> Hello!
> >>
> >> By the way why not using radix sort, at least for int[] arrays? The
> >> implementation is much simpler, it uses constant-size additional
> >> buffer (1024 ints) and performance should be better than DPQS, at
> >> least for large arrays. Here's sample implementation written by me
> >> several years ago (reusing merge sort part from JDK) which degrades to
> >> merge sort if array is nearly sorted.
> >> http://cr.openjdk.java.net/~tvaleev/patches/radix/RadixSort.java
> >>
> >> With best regards,
> >> Tagir Valeev.
> >>
> >> On Fri, Jan 19, 2018 at 8:38 PM Vladimir Yaroslavskiy
> >> <vlv.spb.ru at mail.ru> wrote:
> >> >
> >> > Hi team,
> >> >
> >> > In Sept 2009 Josh Bloch, Jon Bentley and I introduced new sorting
> >> > algorithm, Dual-Pivot Quicksort, for primitives in JDK 7 and later
> >> > I suggested several improvements of Dual-Pivot Quicksort, which
> >> > were integrated into JDK 8.
> >> >
> >> > Now I have more optimized and faster version of Dual-Pivot Quicksort.
> >> > I have been working on it for the last 5 years. Please, find the
> >> > summary of changes below and sources / diff at webrev [1].
> >> >
> >> > All tests and benchmarking were run on the most recent build of JDK
> 10,
> >> > jdk-10-ea+39. The new version shows the better performance on
> different
> >> > inputs and guarantees n*log(n) on any data.
> >> >
> >> > The new implementation of Dual-Pivot Quicksort is 'all-in-one'
> version:
> >> > it contains one code for both parallel and sequential sorting
> algorithms.
> >> >
> >> > Suggested version is 10-20% faster on random data, 1.5-4 times faster
> >> > on nearly structured arrays, 1.5-2 times faster on period inputs.
> >> >
> >> > Parallel Dual-Pivot Quicksort is 1.5-3 times faster than current
> >> > algorithm based on merge sort.
> >> >
> >> > Benchmarking on the test suite, suggested by Jon Bentley, shows the
> >> > boost of performance in 1.4 times. This test suite contains several
> >> > types of inputs, such as random data, nearly structured arrays, data
> >> > with period and so on. Also there are several modifications of inputs
> >> > and parameters which are used in data building. We run sorting on all
> >> > combinations to compare two algorithms.
> >> >
> >> > Please let me know if you have any questions / comments.
> >> >
> >> > Summary of changes:
> >> >
> >> > DualPivotQuicksort class
> >> > ------------------------
> >> > * Pivots are chosen with another step, the 1-st and 5-th candidates
> >> > are taken as pivots instead of 2-nd and 4-th.
> >> > * Splitting into parts is related to the golden ratio
> >> > * Pivot candidates are sorted by combination of 5-element
> >> > network sorting + insertion sort
> >> > * New backwards partitioning is simpler and more efficient
> >> > * Quicksort tuning parameters were updated
> >> > * Merging sort is invoked on each iteration from Quicksort
> >> > * Additional steps for float/double were fully rewritten
> >> > * Heap sort is invoked on the leftmost part
> >> > * Heap sort is used as a guard against quadratic time
> >> > * Parallel sorting is based on Dual-Pivot Quicksort
> >> > instead of merge sort
> >> >
> >> > SortingAlgorithms class
> >> > -----------------------
> >> > * Merging sort and pair insertion sort were moved from
> >> > DualPivotQuicksort class
> >> > * Pair insertion sort was simplified and optimized
> >> > * New nano insertion sort was introduced for tiny arrays
> >> > * Merging sort was fully rewritten
> >> > * Optimized merging partitioning is used
> >> > * Merging parameters were updated
> >> > * Merging of runs was fully rewritten
> >> > * Fast version of heap sort was introduced
> >> > * Parallel merging sort was also provided
> >> >
> >> > Sorting / ParallelSorting classes
> >> > ---------------------------------
> >> > * New test cases were added
> >> > * Sources of these classes were unified
> >> >
> >> > Arrays class
> >> > ------------
> >> > * Calls of Dual-Pivot Quicksort were updated
> >> > * Parallel sorting of primitives was switched to parallel
> >> > Dual-Pivot Quicksort
> >> > * Javadoc was modified
> >> >
> >> > ArraysParallelSortHelpers class
> >> > -------------------------------
> >> > * Old implementation of parallel sorting for primitives was removed
> >> > * Javadoc was updated
> >> >
> >> > Thank you,
> >> > Vladimir
> >> >
> >> > --------------------------------------------------------------------
> >> > [1] http://cr.openjdk.java.net/~alanb/DualPivotSortUpdate/webrev.01/
> >> > --------------------------------------------------------------------
>
More information about the core-libs-dev
mailing list