RFR: JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort) [v11]
Vladimir Yaroslavskiy
duke at openjdk.org
Mon Feb 5 23:55:28 UTC 2024
On Fri, 2 Feb 2024 20:09:57 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:
>> Hi Vamsi (@vamsi-parasa), Laurent(@bourgesl),
>>
>> The latest benchmarking compares compares the following versions:
>> jdk - direct call of Arrays.sort();
>> a15 - the current source of DualPivotQuicksort from the latest build (except renaming)
>> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/DualPivotQuicksort.java
>> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a15.java
>> r20s - new version without Radix sort
>> r20p - new version with Radix sort in parallel case only
>>
>> It is expected that timing of jdk and a15 should be more or less the same, but please look at the results:
>>
>> Benchmark | Data Type | Array Size | Arrays.sort() from jdk | Current source (a15)
>> -- | -- | -- | -- | --
>> ArraysSort.Int.testSort | RANDOM | 600 | 1.612 | 7.096
>> ArraysSort.Int.testSort | RANDOM | 2000 | 6.893 | 44.014
>> ArraysSort.Int.testSort | RANDOM | 90000 | 522.749 | 4451.444
>> ArraysSort.Int.testSort | RANDOM | 400000 | 2424.204 | 22751.966
>> ArraysSort.Int.testSort | RANDOM | 3000000 | 21000.434 | 190326.306
>> ArraysSort.Int.testSort | REPEATED | 600 | 0.496 | 1.044
>> ArraysSort.Int.testSort | REPEATED | 2000 | 1.037 | 2.272
>> ArraysSort.Int.testSort | REPEATED | 90000 | 57.763 | 412.331
>> ArraysSort.Int.testSort | REPEATED | 400000 | 182.238 | 1908.978
>> ArraysSort.Int.testSort | REPEATED | 3000000 | 1708.082 | 15163.443
>> ArraysSort.Int.testSort | STAGGER | 600 | 1.038 | 1.055
>> ArraysSort.Int.testSort | STAGGER | 2000 | 3.434 | 3.408
>> ArraysSort.Int.testSort | STAGGER | 90000 | 148.638 | 149.220
>> ArraysSort.Int.testSort | STAGGER | 400000 | 663.076 | 663.096
>> ArraysSort.Int.testSort | STAGGER | 3000000 | 5212.821 | 5206.890
>> ArraysSort.Int.testSort | SHUFFLE | 600 | 1.926 | 4.611
>> ArraysSort.Int.testSort | SHUFFLE | 2000 | 6.858 | 17.955
>> ArraysSort.Int.testSort | SHUFFLE | 90000 | 473.441 | 1410.357
>> ArraysSort.Int.testSort | SHUFFLE | 400000 | 2153.779 | 5739.311
>> ArraysSort.Int.testSort | SHUFFLE | 3000000 | 18180.141 | 41501.980
>>
>> You can see that a15 (current source) works extremly slower than Arrays.sort(), but the code is the same
>> with minor differences: renaming and repackaging (I put the class to the test package org.openjdk.bench.java.util).
>> How does other package org.openjdk.bench.java.util effect so much?
>>
>> I use this pom.xml file https://github.com/iarosla...
>
> Hi Vladimir (@iaroslavski),
>
> Please see the data below. All tests were run after putting the DPQS code in java.util package and recompiling the JDK for each case.
>
> <html xmlns:v="urn:schemas-microsoft-com:vml"
> xmlns:o="urn:schemas-microsoft-com:office:office"
> xmlns:x="urn:schemas-microsoft-com:office:excel"
> xmlns="http://www.w3.org/TR/REC-html40">
>
> <head>
>
> <meta name=ProgId content=Excel.Sheet>
> <meta name=Generator content="Microsoft Excel 15">
> <link id=Main-File rel=Main-File
> href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip.htm">
> <link rel=File-List
> href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml">
>
>
>
> </head>
>
> <body link="#0563C1" vlink="#954F72">
>
>
> Benchmark (us/op) | (builder) | (size) | Stock JDK | a15 | r20p | r20s
> -- | -- | -- | -- | -- | -- | --
> ArraysSort.Int.testParallelSort | RANDOM | 600 | 2.24 | 2.201 | 2.423 | 2.389
> ArraysSort.Int.testParallelSort | RANDOM | 9000 | 35.318 | 35.961 | 79.028 | 83.774
> ArraysSort.Int.testParallelSort | RANDOM | 20000 | 118.729 | 120.872 | 134.829 | 138.349
> ArraysSort.Int.testParallelSort | RANDOM | 400000 | 822.676 | 822.44 | 1200.858 | 872.264
> ArraysSort.Int.testParallelSort | RANDOM | 3000000 | 5864.514 | 5948.82 | 8800.391 | 6020.616
> ArraysSort.Int.testParallelSort | REPEATED | 600 | 0.924 | 0.936 | 0.752 | 0.733
> ArraysSort.Int.testParallelSort | REPEATED | 9000 | 9.896 | 9.317 | 31.409 | 24.896
> ArraysSort.Int.testParallelSort | REPEATED | 20000 | 58.265 | 42.189 | 40.92 | 40.101
> ArraysSort.Int.testParallelSort | REPEATED | 400000 | 256.952 | 253.217 | 236.568 | 239.163
> ArraysSort.Int.testParallelSort | REPEATED | 3000000 | 2844.107 | 2851.088 | 2752.939 | 3040.423
> ArraysSort.Int.testParallelSort | STAGGER | 600 | 2.245 | 2.296 | 2.15 | 2.219
> ArraysSort.Int.testParallelSort | STAGGER | 9000 | 29.278 | 29.119 | 28.288 | 28.141
> ArraysSort.Int.testParallelSort | STAGGER | 20000 | 50.129 | 50.442 | 49.746 | 49.686
> ArraysSort.Int.testParallelSort | STAGGER | 400000 | 463.309 | 413.619 | 418.077 | 407.519
> ArraysSort.Int.testParallelSort | STAGGER | 3000000 | 3687.198 | 4363.242 | 3732.777 | 3769.898
> ArraysSort.Int.testParallelSort | SHUFFLE | 600 | 1.715 | 1.698 | 2.799 | 2.733
> ArraysSort.Int.testParallelSort | SHUFFLE | 9000 | 27.69 | 27.183 | 32.883 | 32.373
> ArraysSort.Int.testParallelSort | SHUFFLE | 20000 | 62.067 | 60.987 | 63.281 | 52.89
> ArraysSort.Int.testParallelSort | SHUFFLE | 400000 | 467.213 | 495.14 | 650.173 | 476.403
> ArraysSort.Int.testParallelS...
Hello Vamsi (@vamsi-parasa),
Many thanks for the results! Now we can see that intrinsics are applied in all cases,
but there are big differences between the same code.
For example,
parallelSort REPEATED 20000: 58.265(Stock JDK) and 42.189(a15) with speedup x1.38
parallelSort STAGGER 3000000: 3687.198(Stock JDK) 4363.242(a15) with speedup x0.85
Case a15 is the current source code from JDK, but in one benchmarking it is faster,
in other benchmarking it is slower (~15-30%).
Other strange behaviour with new sorting: r20p and r20s have the same code for
sequential sorting (no radix sort at all), but we can see that on case works much slower
sort STAGGER 3000000: 34406.74(r20p) and 10467.03(r20s) - 3.3 times slower,
whereas other sizes show more or less equal values.
Vamsi (@vamsi-parasa),
Could you please run benchmarking of 5 cases with **updated** test class **ArraysSortNew**?
https://github.com/iaroslavski/sorting/blob/master/radixsort/ArraysSortNew.java
Put the DPQS code in java.util package and recompiling the JDK for each case as you
did before, but run new ArraysSortNew.
Find the sources there:
https://github.com/iaroslavski/sorting/blob/master/radixsort/ArraysSortNew.java
https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_jdk.java
https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r20p.java
https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r20s.java
https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r25p.java
https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r25s.java
Thank you,
Vladimir
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13568#issuecomment-1928130680
More information about the core-libs-dev
mailing list