RFR: 8371711: AArch64: SVE intrinsics for Arrays.sort methods (int, float)
Vladimir Ivanov
vlivanov at openjdk.org
Tue Dec 9 00:03:57 UTC 2025
On Fri, 5 Dec 2025 09:44:16 GMT, Bhavana Kilambi <bkilambi at openjdk.org> wrote:
> This patch adds an SVE implementation of primitive array sorting (Arrays.sort()) on AArch64 systems that support SVE. On non-SVE machines, we fall back to the existing Java implementation.
>
> For smaller arrays (length <= 64), we use insertion sort; for larger arrays we use an SVE-vectorized quicksort partitioner followed by an odd-even transposition cleanup pass.
>
> The SVE path is enabled by default for the int type. For the float type, it is available through an experimental flag:
>
> `-XX:+UnlockExperimentalVMOptions -XX:+UseSVELibSimdSortForFP`
>
> The flag is disabled by default; without it, floats are sorted by the default Java implementation.
>
> Float is gated due to observed regressions on some small/medium sizes. On larger arrays, the SVE float path shows up to 1.47x speedup on Neoverse V2 and 2.12x on Neoverse V1.
>
> Following are the performance numbers for **ArraysSort JMH benchmark** -
>
> **Case A:** Ratio between the scores of master branch and `UseSVELibSimdSortForFP` flag disabled (which is the default).
> **Case B:** Ratio between the scores of master branch and `UseSVELibSimdSortForFP` flag enabled (the int numbers will be the same but this now enables SVE vectorized sorting for floats).
> **We would want the ratios to be >= 1 to be at par or better than the default Java implementation (master branch).**
>
> On Neoverse V1:
>
>
> Benchmark                      (size)  Mode  Cnt     A     B
> ArraysSort.floatParallelSort       10  avgt    3  0.98  0.98
> ArraysSort.floatParallelSort       25  avgt    3  1.01  0.83
> ArraysSort.floatParallelSort       50  avgt    3  0.99  0.55
> ArraysSort.floatParallelSort       75  avgt    3  0.99  0.66
> ArraysSort.floatParallelSort      100  avgt    3  0.98  0.66
> ArraysSort.floatParallelSort     1000  avgt    3  1.00  0.84
> ArraysSort.floatParallelSort    10000  avgt    3  1.03  1.52
> ArraysSort.floatParallelSort   100000  avgt    3  1.03  1.46
> ArraysSort.floatParallelSort  1000000  avgt    3  0.98  1.81
> ArraysSort.floatSort               10  avgt    3  1.00  0.98
> ArraysSort.floatSort               25  avgt    3  1.00  0.81
> ArraysSort.floatSort               50  avgt    3  0.99  0.56
> ArraysSort.floatSort               75  avgt    3  0.99  0.65
> ArraysSort.floatSort              100  avgt    3  0.98  0.70
> ArraysSort.floatSort             1000  avgt    3  0.99  0.84
> ArraysSort.floatSort ...
Good work, Bhavana!
It reminds me of an effort I started a year ago to migrate native libraries to FFM.
`vectormath` got integrated, but `libsimdsort` is still a draft:
https://github.com/iwanowww/jdk/tree/libsimdsort.1
Since you are doing substantial refactoring of the libsimdsort library code, I suggest introducing the SVE variant on top of FFM from the beginning. Let me finalize the PR and post it for review. What do you think?
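
For readers following the quoted description, here is a minimal scalar sketch of the size-based dispatch it mentions: insertion sort for ranges of at most 64 elements, quicksort partitioning above that. It is an illustration only and not the code from the patch. The class name `HybridSortSketch`, the median-of-three pivot choice, and the Lomuto partition are assumptions made for the example; the SVE vectorization and the odd-even transposition cleanup pass are left out.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration; not the libsimdsort or JDK implementation.
class HybridSortSketch {
    // Threshold from the quoted description: length <= 64 uses insertion sort.
    static final int INSERTION_SORT_THRESHOLD = 64;

    static void sort(int[] a) {
        sort(a, 0, a.length);
    }

    // Sorts the half-open range a[from, to).
    static void sort(int[] a, int from, int to) {
        while (to - from > INSERTION_SORT_THRESHOLD) {
            int p = partition(a, from, to);
            // Recurse into the smaller side and loop on the larger one
            // so the recursion depth stays logarithmic.
            if (p - from < to - (p + 1)) {
                sort(a, from, p);
                from = p + 1;
            } else {
                sort(a, p + 1, to);
                to = p;
            }
        }
        insertionSort(a, from, to);
    }

    static void insertionSort(int[] a, int from, int to) {
        for (int i = from + 1; i < to; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= from && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Scalar Lomuto partition with a median-of-three pivot (an assumption;
    // the patch uses an SVE-vectorized partitioner instead).
    // Returns the final index of the pivot.
    static int partition(int[] a, int from, int to) {
        int mid = from + (to - from) / 2;
        int last = to - 1;
        if (a[mid] < a[from]) swap(a, mid, from);
        if (a[last] < a[from]) swap(a, last, from);
        if (a[last] < a[mid]) swap(a, last, mid);
        swap(a, mid, last); // move the median to the end as the pivot
        int pivot = a[last];
        int store = from;
        for (int i = from; i < last; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, last);
        return store;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(10_000).toArray();
        int[] expected = data.clone();
        Arrays.sort(expected);
        sort(data);
        System.out.println("matches Arrays.sort: " + Arrays.equals(data, expected));
    }
}
```

The threshold check sits in a single place, so a vectorized partitioner could in principle replace `partition` without touching the small-array path.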
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28675#issuecomment-3629564612