RFR: 8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) [v6]
Jatin Bhateja
jbhateja at openjdk.org
Tue Jun 6 19:09:58 UTC 2023
On Thu, 1 Jun 2023 17:22:32 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:
>> The goal is to develop faster sort routines for x86_64 CPUs by taking advantage of AVX512 instructions. This enhancement provides an order of magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>>
>> This PR shows up to ~13x improvement for 32-bit data types (int, float) and up to ~8x improvement for 64-bit data types (long, double), as shown in the performance data below.
>>
>> **Arrays.sort performance data using JMH benchmarks**
>>
>> | Arrays.sort benchmark | Array Size | Baseline (us/op) | AVX512 Sort (us/op) | Speedup |
>> | --- | --- | --- | --- | --- |
>> | ArraysSort.doubleSort | 100 | 0.639 | 0.217 | 2.9x |
>> | ArraysSort.doubleSort | 1000 | 8.707 | 3.421 | 2.5x |
>> | ArraysSort.doubleSort | 10000 | 349.267 | 43.56 | **8.0x** |
>> | ArraysSort.doubleSort | 100000 | 4721.17 | 579.819 | **8.1x** |
>> | ArraysSort.floatSort | 100 | 0.722 | 0.129 | 5.6x |
>> | ArraysSort.floatSort | 1000 | 9.1 | 2.356 | 3.9x |
>> | ArraysSort.floatSort | 10000 | 336.472 | 26.706 | **12.6x** |
>> | ArraysSort.floatSort | 100000 | 4804.716 | 427.397 | **11.2x** |
>> | ArraysSort.intSort | 100 | 0.61 | 0.111 | 5.5x |
>> | ArraysSort.intSort | 1000 | 8.534 | 2.025 | 4.2x |
>> | ArraysSort.intSort | 10000 | 310.97 | 24.082 | **12.9x** |
>> | ArraysSort.intSort | 100000 | 4484.94 | 381.01 | **11.8x** |
>> | ArraysSort.longSort | 100 | 0.636 | 0.28 | 2.3x |
>> | ArraysSort.longSort | 1000 | 8.646 | 4.425 | 2.0x |
>> | ArraysSort.longSort | 10000 | 322.116 | 53.094 | **6.1x** |
>> | ArraysSort.longSort | 100000 | 4448.171 | 696.773 | **6.4x** |
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
> fix license in one file
test/micro/org/openjdk/bench/java/util/ArraysSort.java line 2:
> 1: /*
> 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved.
The copyright year should be 2023.
test/micro/org/openjdk/bench/java/util/ArraysSort.java line 85:
> 83: ints_unsorted[i] = rnd.nextInt();
> 84: longs_unsorted[i] = rnd.nextLong();
> 85: floats_unsorted[i] = rnd.nextFloat();
Can you also introduce NaN, Infinity, +0.0 and -0.0 into the input floating-point arrays?
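To make the request concrete, here is a minimal sketch of how the special values could be mixed into the random input. The class name `SpecialFloatInput` and the helper `withSpecialValues` are hypothetical and not part of the PR; the sketch only assumes the documented behaviour of `Arrays.sort(float[])`, which orders -0.0f before 0.0f and places NaN last.

```java
import java.util.Arrays;
import java.util.Random;

public class SpecialFloatInput {
    // Hypothetical helper (not from the PR): random floats in [0, 1) with the
    // first few slots overwritten by the special values the review asks for.
    static float[] withSpecialValues(int size, long seed) {
        float[] specials = {
            Float.NaN, Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY, 0.0f, -0.0f
        };
        Random rnd = new Random(seed);
        float[] a = new float[size];
        for (int i = 0; i < size; i++) {
            a[i] = rnd.nextFloat();
        }
        // Overwrite the leading slots; the array is still unsorted overall.
        for (int i = 0; i < specials.length && i < size; i++) {
            a[i] = specials[i];
        }
        return a;
    }

    public static void main(String[] args) {
        float[] a = withSpecialValues(16, 42L);
        Arrays.sort(a);
        // Expect -Infinity first, then -0.0, then 0.0, with Infinity and NaN at the end.
        System.out.println(Arrays.toString(a));
    }
}
```

The same pattern would apply to the double array with `Double.NaN`, `Double.POSITIVE_INFINITY`, and so on.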
test/micro/org/openjdk/bench/java/util/ArraysSort.java line 104:
> 102: @Benchmark
> 103: public void floatSort() throws Throwable {
> 104: floats_sorted = floats_unsorted.clone();
We can move the clone out of the benchmark methods.
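One possible shape for this, sketched without the JMH dependency so it compiles on its own: the copy moves into a per-invocation setup method and the measured method only sorts. The class and method names are hypothetical, and the JMH annotations that would apply are shown as comments. Note that the JMH documentation cautions that `Level.Invocation` setup can distort timings for very short benchmarks, so this may need care for the smallest array sizes.

```java
import java.util.Arrays;
import java.util.Random;

public class CloneOutsideBenchmark {
    float[] floatsUnsorted;  // pristine input, created once
    float[] floatsWork;      // fresh copy sorted by each invocation

    // In JMH this would carry @Setup(Level.Trial).
    void setupTrial(int size, long seed) {
        Random rnd = new Random(seed);
        floatsUnsorted = new float[size];
        for (int i = 0; i < size; i++) {
            floatsUnsorted[i] = rnd.nextFloat();
        }
    }

    // In JMH this would carry @Setup(Level.Invocation), so the copy
    // happens outside the measured region.
    void setupInvocation() {
        floatsWork = floatsUnsorted.clone();
    }

    // In JMH this would carry @Benchmark; returning the array keeps
    // the sort from being optimized away.
    float[] floatSort() {
        Arrays.sort(floatsWork);
        return floatsWork;
    }

    public static void main(String[] args) {
        CloneOutsideBenchmark b = new CloneOutsideBenchmark();
        b.setupTrial(1000, 1L);
        b.setupInvocation();
        float[] sorted = b.floatSort();
        System.out.println("min = " + sorted[0] + ", max = " + sorted[sorted.length - 1]);
    }
}
```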
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1220170913
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1220168276
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1220174402
More information about the hotspot-compiler-dev mailing list