RFR: 8371711: AArch64: SVE intrinsics for Arrays.sort methods (int, float)

Tue Dec 9 00:03:57 UTC 2025

On Fri, 5 Dec 2025 09:44:16 GMT, Bhavana Kilambi <bkilambi at openjdk.org> wrote:

> This patch adds an SVE implementation of primitive array sorting (Arrays.sort()) on AArch64 systems that support SVE. On non-SVE machines, we fall back to the existing Java implementation.
> 
> For smaller arrays (length <= 64), we use insertion sort; for larger arrays we use an SVE-vectorized quicksort partitioner followed by an odd-even transposition cleanup pass.
> 
> The SVE path is enabled by default for the int type. For the float type, it is available through an experimental flag:
> 
> `-XX:+UnlockExperimentalVMOptions -XX:+UseSVELibSimdSortForFP`
> 
> The flag is disabled by default; unless it is enabled, floats are sorted by the default Java implementation.
> 
> The float path is gated because of regressions observed at some small and medium sizes. On larger arrays, the SVE float path shows up to 1.47x speedup on Neoverse V2 and 2.12x on Neoverse V1.
> 
> The performance numbers for the **ArraysSort JMH benchmark** follow:
> 
> **Case A:** Ratio of the master-branch score to the score of this patch with the `UseSVELibSimdSortForFP` flag disabled (the default).
> **Case B:** Ratio of the master-branch score to the score of this patch with the `UseSVELibSimdSortForFP` flag enabled (the int numbers are the same, but this also enables SVE-vectorized sorting for floats).
> **A ratio >= 1 means the patch is on par with or better than the default Java implementation (master branch).**
> 
> On Neoverse V1:
> 
> 
> Benchmark                       (size)   Mode    Cnt    A       B
> ArraysSort.floatParallelSort    10       avgt    3      0.98    0.98
> ArraysSort.floatParallelSort    25       avgt    3      1.01    0.83
> ArraysSort.floatParallelSort    50       avgt    3      0.99    0.55
> ArraysSort.floatParallelSort    75       avgt    3      0.99    0.66
> ArraysSort.floatParallelSort    100      avgt    3      0.98    0.66
> ArraysSort.floatParallelSort    1000     avgt    3      1.00    0.84
> ArraysSort.floatParallelSort    10000    avgt    3      1.03    1.52
> ArraysSort.floatParallelSort    100000   avgt    3      1.03    1.46
> ArraysSort.floatParallelSort    1000000  avgt    3      0.98    1.81
> ArraysSort.floatSort            10       avgt    3      1.00    0.98
> ArraysSort.floatSort            25       avgt    3      1.00    0.81
> ArraysSort.floatSort            50       avgt    3      0.99    0.56
> ArraysSort.floatSort            75       avgt    3      0.99    0.65
> ArraysSort.floatSort            100      avgt    3      0.98    0.70
> ArraysSort.floatSort            1000     avgt    3      0.99    0.84
> ArraysSort.floatSort            ...
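
For readers following the thread, the size-based dispatch described in the quoted summary can be sketched in plain Java. This is only a scalar illustration, not the patch's code: the class name, the `CLEANUP_RUN` cutoff, and the way the partitioning and the odd-even transposition pass are combined are all assumptions made for this example. Only the insertion-sort threshold of 64 comes from the description above.

```java
public class HybridSortSketch {
    // From the quoted description: arrays of length <= 64 use insertion sort.
    static final int INSERTION_SORT_MAX = 64;
    // Hypothetical cutoff: partitions of at most this size are left unsorted
    // by the partitioner and fixed up by the cleanup pass.
    static final int CLEANUP_RUN = 8;

    public static void sort(int[] a) {
        if (a.length <= INSERTION_SORT_MAX) {
            insertionSort(a);
            return;
        }
        partitionRecursively(a, 0, a.length - 1);
        oddEvenCleanup(a);
    }

    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    // Scalar stand-in for the vectorized partitioner (assumption): a classic
    // middle-pivot partition that stops once a range has <= CLEANUP_RUN elements.
    static void partitionRecursively(int[] a, int lo, int hi) {
        while (hi - lo + 1 > CLEANUP_RUN) {
            int pivot = a[lo + (hi - lo) / 2];
            int i = lo;
            int j = hi;
            while (i <= j) {
                while (a[i] < pivot) i++;
                while (a[j] > pivot) j--;
                if (i <= j) {
                    int t = a[i]; a[i] = a[j]; a[j] = t;
                    i++;
                    j--;
                }
            }
            // Now a[lo..j] <= pivot <= a[i..hi]; recurse left, loop on the right.
            partitionRecursively(a, lo, j);
            lo = i;
        }
    }

    // Odd-even transposition: alternate compare-exchange over even-indexed and
    // odd-indexed neighbour pairs until a full round makes no swap.
    static void oddEvenCleanup(int[] a) {
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int start = 0; start <= 1; start++) {
                for (int i = start; i + 1 < a.length; i += 2) {
                    if (a[i] > a[i + 1]) {
                        int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
                        swapped = true;
                    }
                }
            }
        }
    }
}
```

In this sketch the cleanup loop ends only after a round in which no adjacent pair is out of order, so the result is sorted regardless of how the partitioning left the short runs. The compare-exchange steps within one phase touch disjoint pairs, which is what makes this kind of pass a natural fit for vector lanes.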

Good work, Bhavana!

It reminds me of an effort I started a year ago to migrate native libraries to FFM.
`vectormath` got integrated, but `libsimdsort` is still a draft:
   https://github.com/iwanowww/jdk/tree/libsimdsort.1

Since you are substantially refactoring the libsimdsort library code, I suggest introducing the SVE variant on top of FFM from the beginning. Let me finalize the PR and post it for review. What do you think?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28675#issuecomment-3629564612