RFR: 8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) [v4]
Andrew Haley
aph at openjdk.org
Thu Jun 1 12:34:21 UTC 2023
On Thu, 1 Jun 2023 01:15:57 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:
>> The goal is to develop faster sort routines for x86_64 CPUs by taking advantage of AVX512 instructions. This enhancement provides an order of magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>>
>> This PR shows up to ~13x improvement for 32-bit datatypes (int, float) and up to ~8x improvement for 64-bit datatypes (long, double), as shown in the performance data below.
>>
>> **Arrays.sort performance data using JMH benchmarks**
>>
>> | Arrays.sort benchmark | Array Size | Baseline (us/op) | AVX512 Sort (us/op) | Speedup |
>> | --- | --- | --- | --- | --- |
>> | ArraysSort.doubleSort | 100 | 0.639 | 0.217 | 2.9x |
>> | ArraysSort.doubleSort | 1000 | 8.707 | 3.421 | 2.5x |
>> | ArraysSort.doubleSort | 10000 | 349.267 | 43.56 | **8.0x** |
>> | ArraysSort.doubleSort | 100000 | 4721.17 | 579.819 | **8.1x** |
>> | ArraysSort.floatSort | 100 | 0.722 | 0.129 | 5.6x |
>> | ArraysSort.floatSort | 1000 | 9.1 | 2.356 | 3.9x |
>> | ArraysSort.floatSort | 10000 | 336.472 | 26.706 | **12.6x** |
>> | ArraysSort.floatSort | 100000 | 4804.716 | 427.397 | **11.2x** |
>> | ArraysSort.intSort | 100 | 0.61 | 0.111 | 5.5x |
>> | ArraysSort.intSort | 1000 | 8.534 | 2.025 | 4.2x |
>> | ArraysSort.intSort | 10000 | 310.97 | 24.082 | **12.9x** |
>> | ArraysSort.intSort | 100000 | 4484.94 | 381.01 | **11.8x** |
>> | ArraysSort.longSort | 100 | 0.636 | 0.28 | 2.3x |
>> | ArraysSort.longSort | 1000 | 8.646 | 4.425 | 2.0x |
>> | ArraysSort.longSort | 10000 | 322.116 | 53.094 | **6.1x** |
>> | ArraysSort.longSort | 100000 | 4448.171 | 696.773 | **6.4x** |
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
> fix license
test/micro/org/openjdk/bench/java/util/ArraysSort.java line 59:
> 57:
> 58:
> 59: @Param({"100", "1000", "10000", "100000"})
Suggestion:
@Param({"10", "25", "50", "75", "100", "1000", "10000", "100000"})
Short arrays are important.
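
To illustrate why the extra sizes matter, here is a rough standalone sketch (not the JMH benchmark under review) that runs Arrays.sort over the suggested size list and checks the result. The class name `ShortArraySortSketch` and its helper methods are hypothetical, and the timing loop has no forking or proper warmup control, so its numbers are only indicative; the real measurement should come from the `@Param` change above.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class, not part of the PR or the JDK test suite.
public class ShortArraySortSketch {
    // The suggested @Param values: four short sizes plus the original four.
    static final int[] SIZES = {10, 25, 50, 75, 100, 1000, 10000, 100000};

    // Builds a reproducible random int array of the given size.
    static int[] randomInts(int size, long seed) {
        Random rnd = new Random(seed);
        int[] a = new int[size];
        for (int i = 0; i < size; i++) {
            a[i] = rnd.nextInt();
        }
        return a;
    }

    // Returns true if the array is in non-decreasing order.
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    // Crude average time per sort in nanoseconds. Each iteration sorts a
    // fresh copy so the input is never already sorted.
    static double averageSortNanos(int size, int iterations) {
        int[] source = randomInts(size, 42L);
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            int[] copy = source.clone();
            long start = System.nanoTime();
            Arrays.sort(copy);
            total += System.nanoTime() - start;
            if (!isSorted(copy)) {
                throw new IllegalStateException("unsorted result for size " + size);
            }
        }
        return (double) total / iterations;
    }

    public static void main(String[] args) {
        for (int size : SIZES) {
            // Fewer iterations for large arrays keeps the total run time short.
            int iterations = Math.max(10, 1_000_000 / size);
            averageSortNanos(size, iterations); // rough warmup pass, result discarded
            System.out.printf("size=%d avg=%.1f ns/op%n",
                    size, averageSortNanos(size, iterations));
        }
    }
}
```

At sizes this small, fixed per-call overhead is a much larger share of the total cost, which is the case the four original sizes do not cover.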
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1213079759