RFR: 8317763: Follow-up to AVX512 intrinsics for Arrays.sort() PR [v4]

Srinivas Vamsi Parasa duke at openjdk.org
Wed Oct 11 19:59:16 UTC 2023


On Wed, 11 Oct 2023 17:28:12 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:

>> The goal of this PR is to address the follow-up comments to the SIMD accelerated sort PR (#14227) which implemented AVX512 intrinsics for Arrays.sort() methods.
>> The proposed changes are:
>> 
>> 1) Restriction of the AVX512 sort acceleration to only Intel CPUs. A performance regression (due to micro-architectural differences) was reported for AMD Zen4 CPUs in the comments section of that PR.
>> 2) Addressing the build failure due to a bug in GCC 12 (which was fixed in version 12.3.1). The details of the bug are at: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105593
>> 3) Minor changes in Javadoc strings
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add @ForceInline annotation to insertion and mixedInsertion sort

> > > Also, @ForceInline in these changes only takes effect when the new intrinsics are not used. I would suggest updating the JMH benchmark to cover all cases and measure the effect of @ForceInline without intrinsics. That will tell us which @ForceInline annotations are needed.
> > 
> > 
> > > Added `@ForceInline` annotations to `insertionSort` and `mixedInsertionSort` because they help with small arrays when intrinsics are disabled.
> > Thanks, Vamsi
> 
> @vamsi-parasa Please revert changes (adding @ForceInline to insertionSort and mixedInsertionSort) - I checked: the initial version works faster.

@iaroslavski 
Vladimir, have you used `-XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_arraySort,_arrayPartition` to disable the intrinsics?
Also, it helps to have one warmup iteration of 30 secs to reduce run-to-run variance.
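
For a quick sanity check outside JMH, something like the sketch below could be used. It is only an illustration, not the benchmark used for this PR: the class name `SmallArraySortTiming`, the helper `timeSort`, the array size of 64, and the round counts are all made up for the example. The measured time also includes the cost of `clone()`, and a hand-rolled loop like this is less reliable than a proper JMH run.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical stand-alone timing sketch; names and sizes are illustrative only.
public class SmallArraySortTiming {

    // Sorts `rounds` fresh copies of `data` and returns the elapsed nanoseconds.
    // The elapsed time includes the clone() needed to restore unsorted input.
    static long timeSort(int[] data, int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            int[] copy = data.clone();
            Arrays.sort(copy);
            // Cheap check that also keeps the result observable to the JIT.
            if (copy[0] > copy[copy.length - 1]) {
                throw new AssertionError("array not sorted");
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Warmup length in seconds; defaults to the 30 s suggested above.
        long warmupSeconds = args.length > 0 ? Long.parseLong(args[0]) : 30L;
        long warmupNanos = warmupSeconds * 1_000_000_000L;

        // Small array with a fixed seed so runs are comparable (size is an assumption).
        int[] data = new Random(42).ints(64).toArray();

        // Single warmup phase: keep sorting until the warmup time has elapsed.
        long warmupStart = System.nanoTime();
        while (System.nanoTime() - warmupStart < warmupNanos) {
            timeSort(data, 10_000);
        }

        // Measurement phase.
        int rounds = 1_000_000;
        long elapsed = timeSort(data, rounds);
        System.out.printf("avg ns per clone+sort: %.1f%n", (double) elapsed / rounds);
    }
}
```

Assuming a JDK build that contains the new intrinsics, this could be run once as `java SmallArraySortTiming.java` and once as `java -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_arraySort,_arrayPartition SmallArraySortTiming.java`, comparing the two averages.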

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16124#issuecomment-1758437209


More information about the hotspot-compiler-dev mailing list