RFR: 8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) [v14]
iaroslavski
duke at openjdk.org
Wed Aug 16 21:49:30 UTC 2023
On Wed, 16 Aug 2023 18:22:19 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:
>> Hello @vamsi-parasa !
>>
>> Do you process negative zeros properly? From one hand -0.0f equals to 0.0f, but negative zeros must be placed before 0.0f.
>> See javadoc for Arrays.sort(float[] a). The same situation with -0.0d (double type).
>
> @iaroslavski
>
> Hello Vladimir,
>
> Please see the `Arrays.sort()` performance comparison between the **enhanced DPQS/Radix sort #13568** vs. **AVX512 sort intrinsic** (this PR) using the `ArraysSort.java` JMH [benchmark](https://github.com/openjdk/jdk/pull/13568/files#diff-dee51b13bd1872ff455cec2f29255cfd25014a5dd33dda55a2fc68638c3dd4b2) provided in the PR for [JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort)](https://github.com/openjdk/jdk/pull/13568/files#top) ( #13568)
>
> - The following command line was used to run the benchmarks: ` java -jar $JDK_HOME/build/linux-x86_64-server-release/images/test/micro/benchmarks.jar -jvmArgs "-XX:CompileThreshold=1 -XX:-TieredCompilation" ArraysSortTests`
> - Please see the performance numbers below. The scores shown are the average time (us/op), thus lower is better. The last column towards the right shows the speedup.
> - Space complexity was not benchmarked as AVX512 sort is an in-place sorting algorithm while the Radix sort needs extra memory.
>
> | Benchmark | Mode | Size | Enhanced DPQS/Radix (us/op) | AVX512 Sort (us/op) | Speedup |
> | --- | --- | --- | --- | --- | --- |
> | ArraysSortTests.Double.testSort | RANDOM | 800 | 8.4 | 2.6 | 3.3x |
> | ArraysSortTests.Double.testSort | RANDOM | 7000 | 94.4 | 28.0 | 3.4x |
> | ArraysSortTests.Double.testSort | RANDOM | 50000 | 634.9 | 238.4 | 2.7x |
> | ArraysSortTests.Double.testSort | RANDOM | 300000 | 3887.6 | 1721.6 | 2.3x |
> | ArraysSortTests.Double.testSort | RANDOM | 2000000 | 29348.9 | 13854.8 | 2.1x |
> | ArraysSortTests.Double.testSort | REPEATED | 800 | 2.0 | 1.6 | 1.2x |
> | ArraysSortTests.Double.testSort | REPEATED | 7000 | 39.6 | 37.2 | 1.1x |
> | ArraysSortTests.Double.testSort | REPEATED | 50000 | 470.0 | 386.3 | 1.2x |
> | ArraysSortTests.Double.testSort | REPEATED | 300000 | 2940.8 | 342.7 | 8.6x |
> | ArraysSortTests.Double.testSort | REPEATED | 2000000 | 19361.4 | 2838.9 | 6.8x |
> | ArraysSortTests.Double.testSort | STAGGER | 800 | 2.3 | 2.7 | 0.8x |
> | ArraysSortTests.Double.testSort | STAGGER | 7000 | 25.6 | 28.0 | 0.9x |
> | ArraysSortTests.Double.testSort | STAGGER | 50000 | 159.8 | 224.0 | 0.7x |
> | ArraysSortTests.Double.testSort | STAGGER | 300000 | 945.3 | 1566.8 | 0.6x |
> | ArraysSortTests.Double.testSort | STAGGER | 2000000 | 6398.7 | 12668.3 | 0.5x |
> | ArraysSortTests.Double.testSort | SHUFFLE | 800 | 4.9 | 2.7 | 1.8x |
> | ArraysSortTests.Double.testSort | SHUFFLE | 7000 | 55.9 | 27.5 | 2.0x |
> | ArraysSortTests.Double.testSort | SHUFFLE | 50000 | 4...
@vamsi-parasa
Hello Vamsi,
Many thanks for the great details!
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14227#issuecomment-1681312328
More information about the hotspot-compiler-dev
mailing list