RFR: 8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) [v41]
Vladimir Kozlov
kvn at openjdk.org
Thu Oct 5 18:50:30 UTC 2023
On Fri, 22 Sep 2023 16:53:21 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:
>> The goal is to develop faster sort routines for x86_64 CPUs by taking advantage of AVX512 instructions. This enhancement provides an order of magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>>
>> This PR shows upto ~7x improvement for 32-bit datatypes (int, float) and upto ~4.5x improvement for 64-bit datatypes (long, double) as shown in the performance data below.
>>
>>
>> **Arrays.sort performance data using JMH benchmarks for arrays with random data**
>>
>> | Arrays.sort benchmark | Array Size | Baseline (us/op) | AVX512 Sort (us/op) | Speedup |
>> | --- | --- | --- | --- | --- |
>> | ArraysSort.doubleSort | 10 | 0.034 | 0.035 | 1.0 |
>> | ArraysSort.doubleSort | 25 | 0.116 | 0.089 | 1.3 |
>> | ArraysSort.doubleSort | 50 | 0.282 | 0.291 | 1.0 |
>> | ArraysSort.doubleSort | 75 | 0.474 | 0.358 | 1.3 |
>> | ArraysSort.doubleSort | 100 | 0.654 | 0.623 | 1.0 |
>> | ArraysSort.doubleSort | 1000 | 9.274 | 6.331 | 1.5 |
>> | ArraysSort.doubleSort | 10000 | 323.339 | 71.228 | **4.5** |
>> | ArraysSort.doubleSort | 100000 | 4471.871 | 1002.748 | **4.5** |
>> | ArraysSort.doubleSort | 1000000 | 51660.742 | 12921.295 | **4.0** |
>> | ArraysSort.floatSort | 10 | 0.045 | 0.046 | 1.0 |
>> | ArraysSort.floatSort | 25 | 0.103 | 0.084 | 1.2 |
>> | ArraysSort.floatSort | 50 | 0.285 | 0.33 | 0.9 |
>> | ArraysSort.floatSort | 75 | 0.492 | 0.346 | 1.4 |
>> | ArraysSort.floatSort | 100 | 0.597 | 0.326 | 1.8 |
>> | ArraysSort.floatSort | 1000 | 9.811 | 5.294 | 1.9 |
>> | ArraysSort.floatSort | 10000 | 323.955 | 50.547 | **6.4** |
>> | ArraysSort.floatSort | 100000 | 4326.38 | 731.152 | **5.9** |
>> | ArraysSort.floatSort | 1000000 | 52413.88 | 8409.193 | **6.2** |
>> | ArraysSort.intSort | 10 | 0.033 | 0.033 | 1.0 |
>> | ArraysSort.intSort | 25 | 0.086 | 0.051 | 1.7 |
>> | ArraysSort.intSort | 50 | 0.236 | 0.151 | 1.6 |
>> | ArraysSort.intSort | 75 | 0.416 | 0.332 | 1.3 |
>> | ArraysSort.intSort | 100 | 0.63 | 0.521 | 1.2 |
>> | ArraysSort.intSort | 1000 | 10.518 | 4.698 | 2.2 |
>> | ArraysSort.intSort | 10000 | 309.659 | 42.518 | **7.3** |
>> | ArraysSort.intSort | 100000 | 4130.917 | 573.956 | **7.2** |
>> | ArraysSort.intSort | 1000000 | 49876.307 | 6712.812 | **7.4** |
>> | ArraysSort.longSort | 10 | 0.036 | 0.037 | 1.0 |
>> | ArraysSort.longSort | 25 | 0.094 | 0.08 | 1.2 |
>> | ArraysSort.longSort | 50 | 0.218 | 0.227 | 1.0 |
>> | ArraysSort.longSort | 75 | 0.466 | 0.402 | 1.2 |
>> | ArraysSort.longSort | 100 | 0.76 | 0.58 | 1.3 |
>> | ArraysSort.longSort | 1000 | 10.449 | 6....
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
> Update CompileThresholdScaling only for the sort and partition intrinsics; update build script to remove nested if
In general it looks good. I have some code style comments and file name change request.
After you fix that I will need to rerun testing for it before approval.
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4183:
> 4181:
> 4182: // Load x86_64_sort library on supported hardware to enable avx512 sort and partition intrinsics
> 4183: if (UseAVX > 2 && VM_Version::supports_avx512dq()) {
Indention (spacing) is wrong for lines 4183-4190 after you moved check.
src/hotspot/share/opto/library_call.cpp line 5372:
> 5370: bool LibraryCallKit::inline_array_partition() {
> 5371:
> 5372: address stubAddr = nullptr;
Move this declaration to first assignment at line 5387.
src/hotspot/share/opto/library_call.cpp line 5374:
> 5372: address stubAddr = nullptr;
> 5373: const char *stubName;
> 5374: stubName = "array_partition_stub";
It could one line.
src/hotspot/share/opto/library_call.cpp line 5400:
> 5398:
> 5399: // create the pivotIndices array of type int and size = 2
> 5400: Node* pivotIndices = nullptr;
Move to assignment at line 5403.
src/hotspot/share/opto/library_call.cpp line 5414:
> 5412: make_runtime_call(RC_LEAF|RC_NO_FP, OptoRuntime::array_partition_Type(),
> 5413: stubAddr, stubName, TypePtr::BOTTOM,
> 5414: obj_adr, elemType, fromIndex, toIndex, pivotIndices_adr, indexPivot1, indexPivot2);
May be put `indexPivot*` argument on new line for fit screen better.
src/java.base/linux/native/libsimdsort/avx512-32bit-qsort.hpp line 4:
> 2: * Copyright (c) 2021, 2023, Intel Corporation. All rights reserved.
> 3: * Copyright (c) 2021 Serge Sans Paille. All rights reserved.
> 4: * Intel x86-simd-sort source code.
I don't think this line here and in other new files will pass our Copyright header checker. Do you need this line here? You do have comment below `This implementation is based on x86-simd-sort`.
`DO NOT ALTER ..` line should immediately follow `Copyright` lines.
src/java.base/linux/native/libsimdsort/avxsort_linux_x86.cpp line 1:
> 1: /*
I think this file name should follow pattern of other files: `avx512-linux-qsort.cpp
src/java.base/share/classes/java/util/DualPivotQuicksort.java line 418:
> 416:
> 417: /*
> 418: * The first and the last elements to be sorted are moved
Misaligned `/*` here and several places later. `*` should be aligned.
src/java.base/share/classes/java/util/DualPivotQuicksort.java line 1353:
> 1351: /*
> 1352: * Swap the pivot into its final position.
> 1353: */
Indent
src/java.base/share/classes/java/util/DualPivotQuicksort.java line 2941:
> 2939: /*
> 2940: * Swap the pivot into its final position.
> 2941: */
indent
src/java.base/share/classes/java/util/DualPivotQuicksort.java line 3794:
> 3792: /*
> 3793: * Swap the pivot into its final position.
> 3794: */
Indent.
-------------
PR Review: https://git.openjdk.org/jdk/pull/14227#pullrequestreview-1660450589
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347808019
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347810630
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347809581
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347814187
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347816297
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347830769
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347834135
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347828096
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347824813
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347823957
PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1347823053
More information about the build-dev
mailing list