RFR: 8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) [v8]
Srinivas Vamsi Parasa
duke at openjdk.org
Fri Jun 23 21:14:20 UTC 2023
> The goal is to develop faster sort routines for x86_64 CPUs by taking advantage of AVX512 instructions. This enhancement speeds up Arrays.sort() on int, long, float and double arrays by up to an order of magnitude.
>
> This PR shows up to ~13x improvement for 32-bit datatypes (int, float) and up to ~8x improvement for 64-bit datatypes (long, double), as shown in the performance data below.
>
> **Arrays.sort performance data using JMH benchmarks**
>
> | Arrays.sort benchmark | Array Size | Baseline (us/op) | AVX512 Sort (us/op) | Speedup |
> | --- | --- | --- | --- | --- |
> | ArraysSort.doubleSort | 100 | 0.639 | 0.217 | 2.9x |
> | ArraysSort.doubleSort | 1000 | 8.707 | 3.421 | 2.5x |
> | ArraysSort.doubleSort | 10000 | 349.267 | 43.56 | **8.0x** |
> | ArraysSort.doubleSort | 100000 | 4721.17 | 579.819 | **8.1x** |
> | ArraysSort.floatSort | 100 | 0.722 | 0.129 | 5.6x |
> | ArraysSort.floatSort | 1000 | 9.1 | 2.356 | 3.9x |
> | ArraysSort.floatSort | 10000 | 336.472 | 26.706 | **12.6x** |
> | ArraysSort.floatSort | 100000 | 4804.716 | 427.397 | **11.2x** |
> | ArraysSort.intSort | 100 | 0.61 | 0.111 | 5.5x |
> | ArraysSort.intSort | 1000 | 8.534 | 2.025 | 4.2x |
> | ArraysSort.intSort | 10000 | 310.97 | 24.082 | **12.9x** |
> | ArraysSort.intSort | 100000 | 4484.94 | 381.01 | **11.8x** |
> | ArraysSort.longSort | 100 | 0.636 | 0.28 | 2.3x |
> | ArraysSort.longSort | 1000 | 8.646 | 4.425 | 2.0x |
> | ArraysSort.longSort | 10000 | 322.116 | 53.094 | **6.1x** |
> | ArraysSort.longSort | 100000 | 4448.171 | 696.773 | **6.4x** |
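For readers who want a rough feel for how the Speedup column relates to the two timing columns, here is a minimal sketch. It is not the JMH `ArraysSort` benchmark used for the table; the class name `SortTimingSketch` and all of its helpers are hypothetical, and the `System.nanoTime()` loop is only a crude stand-in for a proper JMH run. The call site is plain `Arrays.sort(int[])`, with no API change assumed.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only: not part of the PR and not the JMH benchmark.
public class SortTimingSketch {

    // Builds a reproducible random int array of the given size.
    static int[] randomInts(int size, long seed) {
        Random rnd = new Random(seed);
        int[] data = new int[size];
        for (int i = 0; i < size; i++) {
            data[i] = rnd.nextInt();
        }
        return data;
    }

    // Crude average time per Arrays.sort call, in microseconds.
    // Each round sorts a fresh copy so the input is never already sorted.
    static double averageMicrosPerSort(int[] data, int rounds) {
        long totalNanos = 0;
        for (int i = 0; i < rounds; i++) {
            int[] copy = data.clone();
            long start = System.nanoTime();
            Arrays.sort(copy);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000.0 / rounds;
    }

    // Speedup as reported in the table: baseline time divided by optimized time,
    // rounded to one decimal place.
    static double speedup(double baselineMicros, double optimizedMicros) {
        return Math.round(baselineMicros / optimizedMicros * 10.0) / 10.0;
    }

    // Returns true if the array is in non-decreasing order.
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] data = randomInts(10_000, 42L);
        averageMicrosPerSort(data, 200); // warm-up rounds, result discarded
        double micros = averageMicrosPerSort(data, 200);
        System.out.printf("Arrays.sort(int[10000]): %.3f us/op (rough)%n", micros);
        // Reproduces the doubleSort / 10000 row from the table above.
        System.out.println("Speedup for 349.267 vs 43.56 us/op: " + speedup(349.267, 43.56) + "x");
    }
}
```

Running the same class on a baseline build and on a build with this patch would give two rough timings to feed into `speedup`; for numbers comparable to the table, the JMH benchmark remains the right tool.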
Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
replace multiple intrinsics with one general intrinsic
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/14227/files
- new: https://git.openjdk.org/jdk/pull/14227/files/3bd12ec5..53a5309d
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=14227&range=07
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=14227&range=06-07
Stats: 84 lines in 6 files changed: 12 ins; 40 del; 32 mod
Patch: https://git.openjdk.org/jdk/pull/14227.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14227/head:pull/14227
PR: https://git.openjdk.org/jdk/pull/14227