RFR: 8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) [v13]

David Schlosnagle duke at openjdk.org
Sat Jul 29 16:02:54 UTC 2023


On Tue, 25 Jul 2023 20:30:31 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:

>> The goal is to develop faster sort routines for x86_64 CPUs by taking advantage of AVX512 instructions. This enhancement provides an order of magnitude speedup for Arrays.sort() using int, long, float and double arrays.
>> 
>> This PR shows up to ~13x improvement for 32-bit datatypes (int, float) and up to ~8x improvement for 64-bit datatypes (long, double), as shown in the performance data below.
>> 
>> **Arrays.sort performance data using JMH benchmarks** 
>> 
>> |	Arrays.sort benchmark	|	Array Size	|	Baseline (us/op)	|	AVX512 Sort (us/op)	|	Speedup	|
>> |	---	|	---	|	---	|	---	|	---	|
>> |	ArraysSort.doubleSort	|	100	|	0.639	|	0.217	|	2.9x	|
>> |	ArraysSort.doubleSort	|	1000	|	8.707	|	3.421	|	2.5x	|
>> |	ArraysSort.doubleSort	|	10000	|	349.267	|	43.56	|	**8.0x**	|
>> |	ArraysSort.doubleSort	|	100000	|	4721.17	|	579.819	|	**8.1x**	|
>> |	ArraysSort.floatSort	|	100	|	0.722	|	0.129	|	5.6x	|
>> |	ArraysSort.floatSort	|	1000	|	9.1	|	2.356	|	3.9x	|
>> |	ArraysSort.floatSort	|	10000	|	336.472	|	26.706	|	**12.6x**	|
>> |	ArraysSort.floatSort	|	100000	|	4804.716	|	427.397	|	**11.2x**	|
>> |	ArraysSort.intSort	|	100	|	0.61	|	0.111	|	5.5x	|
>> |	ArraysSort.intSort	|	1000	|	8.534	|	2.025	|	4.2x	|
>> |	ArraysSort.intSort	|	10000	|	310.97	|	24.082	|	**12.9x**	|
>> |	ArraysSort.intSort	|	100000	|	4484.94	|	381.01	|	**11.8x**	|
>> |	ArraysSort.longSort	|	100	|	0.636	|	0.28	|	2.3x	|
>> |	ArraysSort.longSort	|	1000	|	8.646	|	4.425	|	2.0x	|
>> |	ArraysSort.longSort	|	10000	|	322.116	|	53.094	|	**6.1x**	|
>> |	ArraysSort.longSort	|	100000	|	4448.171	|	696.773	|	**6.4x**	|
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   add special cases to float and double arrays

src/java.base/share/classes/java/util/Arrays.java line 100:

> 98:         else if (elemType == float.class) DualPivotQuicksort.sort((float[]) array, 0, fromIndex, toIndex);
> 99:         else if (elemType == double.class) DualPivotQuicksort.sort((double[]) array, 0, fromIndex, toIndex);
> 100:         else throw new UnsupportedOperationException("arraySort intrinsic not supported for this type: " + elemType.toString());

I'm curious whether there would be a performance difference from using a pattern-matching switch on the array itself, which would compile to an `invokedynamic` `typeSwitch` over the primitive array types, e.g.:

Suggestion:

        switch (array) {
            case int[] arr -> DualPivotQuicksort.sort(arr, 0, fromIndex, toIndex);
            case long[] arr -> DualPivotQuicksort.sort(arr, 0, fromIndex, toIndex);
            case float[] arr -> DualPivotQuicksort.sort(arr, 0, fromIndex, toIndex);
            case double[] arr -> DualPivotQuicksort.sort(arr, 0, fromIndex, toIndex);
            default -> throw new UnsupportedOperationException(
                    "arraySort intrinsic not supported for this type: " + elemType);
        }
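
For anyone who wants to try the two dispatch styles outside the JDK, here is a minimal standalone sketch. It is only an illustration under stated assumptions: the class and method names (`SortDispatchSketch`, `sortWithClassChecks`, `sortWithPatternSwitch`) are hypothetical, the public `Arrays.sort(array, fromIndex, toIndex)` overloads stand in for the package-private `DualPivotQuicksort.sort` calls, and the pattern switch needs Java 21 or later. It only checks that both forms sort the same way; it does not measure performance.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; not the code from the PR.
class SortDispatchSketch {

    // Mirrors the if/else chain quoted from Arrays.java: dispatch on the element class.
    // Arrays.sort is used here as a stand-in for DualPivotQuicksort.sort (assumption).
    static void sortWithClassChecks(Class<?> elemType, Object array, int fromIndex, int toIndex) {
        if (elemType == int.class) Arrays.sort((int[]) array, fromIndex, toIndex);
        else if (elemType == long.class) Arrays.sort((long[]) array, fromIndex, toIndex);
        else if (elemType == float.class) Arrays.sort((float[]) array, fromIndex, toIndex);
        else if (elemType == double.class) Arrays.sort((double[]) array, fromIndex, toIndex);
        else throw new UnsupportedOperationException(
                "arraySort not supported for this type: " + elemType);
    }

    // The suggested alternative: a pattern-matching switch on the array object itself.
    // Each case both tests the runtime type and binds a correctly typed variable.
    static void sortWithPatternSwitch(Object array, int fromIndex, int toIndex) {
        switch (array) {
            case int[] arr -> Arrays.sort(arr, fromIndex, toIndex);
            case long[] arr -> Arrays.sort(arr, fromIndex, toIndex);
            case float[] arr -> Arrays.sort(arr, fromIndex, toIndex);
            case double[] arr -> Arrays.sort(arr, fromIndex, toIndex);
            default -> throw new UnsupportedOperationException(
                    "arraySort not supported for this type: " + array.getClass());
        }
    }

    public static void main(String[] args) {
        int[] a = {42, 7, 19, 3, 11};
        int[] b = a.clone();

        sortWithClassChecks(int.class, a, 0, a.length);
        sortWithPatternSwitch(b, 0, b.length);

        // Both dispatch styles should leave the arrays in the same sorted order.
        System.out.println(Arrays.toString(a));
        System.out.println(Arrays.equals(a, b));
    }
}
```

Note that the switch form no longer needs the separate `elemType` argument or the explicit casts, since the type test and the cast are folded into each `case`. Whether that is faster or slower than the class comparisons would have to be measured, for example with the `ArraysSort` JMH benchmarks quoted above.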

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14227#discussion_r1278316538


More information about the hotspot-compiler-dev mailing list