RFR: 8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) [v14]

iaroslavski duke at openjdk.org
Wed Aug 16 21:49:30 UTC 2023


On Wed, 16 Aug 2023 18:22:19 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:

>> Hello @vamsi-parasa !
>> 
>> Do you process negative zeros properly? From one hand -0.0f equals to 0.0f, but negative zeros must be placed before 0.0f.
>> See javadoc for Arrays.sort(float[] a). The same situation with -0.0d (double type).
>
> @iaroslavski 
> 
> Hello Vladimir,
> 
> Please see the `Arrays.sort()` performance comparison between the  **enhanced DPQS/Radix sort #13568** vs. **AVX512 sort intrinsic** (this PR)  using the `ArraysSort.java` JMH [benchmark](https://github.com/openjdk/jdk/pull/13568/files#diff-dee51b13bd1872ff455cec2f29255cfd25014a5dd33dda55a2fc68638c3dd4b2) provided in the PR for [JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort)](https://github.com/openjdk/jdk/pull/13568/files#top) ( #13568)
> 
> - The following command line was used to run the benchmarks: ` java -jar $JDK_HOME/build/linux-x86_64-server-release/images/test/micro/benchmarks.jar -jvmArgs "-XX:CompileThreshold=1 -XX:-TieredCompilation" ArraysSortTests`
> - Please see the performance numbers below. The scores shown are the average time (us/op),  thus lower is better. The last column towards the right shows the speedup. 
> - Space complexity was not benchmarked as AVX512 sort is an in-place sorting algorithm while the Radix sort needs extra memory.
> 
> |	Benchmark	|	Mode	|	Size	|	Enhanced DPQS/Radix (us/op)	|	AVX512 Sort (us/op)	|	Speedup	|
> |	 --- 	|	 --- 	|	 --- 	|	 --- 	|	 --- 	|	 --- 	|
> |	ArraysSortTests.Double.testSort	|	RANDOM	|	800	|	8.4	|	2.6	|	3.3x	|
> |	ArraysSortTests.Double.testSort	|	RANDOM	|	7000	|	94.4	|	28.0	|	3.4x	|
> |	ArraysSortTests.Double.testSort	|	RANDOM	|	50000	|	634.9	|	238.4	|	2.7x	|
> |	ArraysSortTests.Double.testSort	|	RANDOM	|	300000	|	3887.6	|	1721.6	|	2.3x	|
> |	ArraysSortTests.Double.testSort	|	RANDOM	|	2000000	|	29348.9	|	13854.8	|	2.1x	|
> |	ArraysSortTests.Double.testSort	|	REPEATED	|	800	|	2.0	|	1.6	|	1.2x	|
> |	ArraysSortTests.Double.testSort	|	REPEATED	|	7000	|	39.6	|	37.2	|	1.1x	|
> |	ArraysSortTests.Double.testSort	|	REPEATED	|	50000	|	470.0	|	386.3	|	1.2x	|
> |	ArraysSortTests.Double.testSort	|	REPEATED	|	300000	|	2940.8	|	342.7	|	8.6x	|
> |	ArraysSortTests.Double.testSort	|	REPEATED	|	2000000	|	19361.4	|	2838.9	|	6.8x	|
> |	ArraysSortTests.Double.testSort	|	STAGGER	|	800	|	2.3	|	2.7	|	0.8x	|
> |	ArraysSortTests.Double.testSort	|	STAGGER	|	7000	|	25.6	|	28.0	|	0.9x	|
> |	ArraysSortTests.Double.testSort	|	STAGGER	|	50000	|	159.8	|	224.0	|	0.7x	|
> |	ArraysSortTests.Double.testSort	|	STAGGER	|	300000	|	945.3	|	1566.8	|	0.6x	|
> |	ArraysSortTests.Double.testSort	|	STAGGER	|	2000000	|	6398.7	|	12668.3	|	0.5x	|
> |	ArraysSortTests.Double.testSort	|	SHUFFLE	|	800	|	4.9	|	2.7	|	1.8x	|
> |	ArraysSortTests.Double.testSort	|	SHUFFLE	|	7000	|	55.9	|	27.5	|	2.0x	|
> |	ArraysSortTests.Double.testSort	|	SHUFFLE	|	50000	|	4...

@vamsi-parasa 
Hello Vamsi,
Many thanks for the great details!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14227#issuecomment-1681312328


More information about the hotspot-compiler-dev mailing list