RFR: 8299038: Add AArch64 backend support for auto-vectorized FP16 conversions

Tue Jan 3 11:25:34 UTC 2023

This patch adds AArch64 backend support for auto-vectorized FP16 conversions, namely half precision to single precision and vice versa. Both Neon and SVE versions are included. The performance of this patch was tested on AArch64 machines with vector sizes of 128, 256 and 512 bits.
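
For illustration, the kind of loop that becomes a candidate for auto-vectorization might look like the sketch below. It assumes the Float.float16ToFloat and Float.floatToFloat16 methods available since JDK 20. The class and method names are hypothetical, and the actual benchmark code may differ.

```java
// Hypothetical sketch, not taken from the patch or the benchmark.
public class Fp16LoopSketch {

    // Half precision (stored as raw bits in a short) to single precision.
    static void halfToFloat(short[] src, float[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Float.float16ToFloat(src[i]);
        }
    }

    // Single precision to half precision (raw bits in a short).
    static void floatToHalf(float[] src, short[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Float.floatToFloat16(src[i]);
        }
    }

    public static void main(String[] args) {
        // 0x3C00 and 0xC000 are the FP16 bit patterns for 1.0 and -2.0.
        short[] halves = {(short) 0x3C00, (short) 0xC000};
        float[] floats = new float[halves.length];
        halfToFloat(halves, floats);
        System.out.println(floats[0] + " " + floats[1]);
    }
}
```

Whether C2 actually vectorizes such a loop depends on the hardware, the JVM flags and the loop shape, so this should be read as an example of the pattern rather than a guarantee.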

The following throughput improvements were observed for the vectorized version relative to the scalar code (the current implementation), using the JMH microbenchmark -
test/micro/org/openjdk/bench/java/math/Fp16ConversionBenchmark.java

Benchmark                               128-bit  256-bit  512-bit
Fp16ConversionBenchmark.float16ToFloat  6.02     8.72     24.71
Fp16ConversionBenchmark.floatToFloat16  2.00     3.29     10.84

The numbers shown are the ratios of the throughput of the vectorized version to that of the scalar version, so higher is better.

-------------

Commit messages:
 - 8299038: Add AArch64 backend support for auto-vectorized FP16 conversions

Changes: https://git.openjdk.org/jdk/pull/11825/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11825&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8299038
  Stats: 207 lines in 6 files changed: 135 ins; 3 del; 69 mod
  Patch: https://git.openjdk.org/jdk/pull/11825.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11825/head:pull/11825

PR: https://git.openjdk.org/jdk/pull/11825