RFR: 8299038: Add AArch64 backend support for auto-vectorized FP16 conversions
Bhavana Kilambi
bkilambi at openjdk.org
Tue Jan 3 11:25:34 UTC 2023
This patch adds AArch64 backend support for auto-vectorized FP16 conversions, namely half precision to single precision and vice versa. Both Neon and SVE versions are included. The performance of this patch was tested on AArch64 machines with vector sizes of 128, 256 and 512 bits.
The following throughput improvements were observed with the vectorized version versus the scalar code (the current implementation) for the JMH microbenchmark:
test/micro/org/openjdk/bench/java/math/Fp16ConversionBenchmark.java
Benchmark                                 128-bit   256-bit   512-bit
Fp16ConversionBenchmark.float16ToFloat       6.02      8.72     24.71
Fp16ConversionBenchmark.floatToFloat16       2.00      3.29     10.84
Each number is the ratio of the throughput of the vectorized version to that of the scalar version, so higher is better.
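For readers unfamiliar with the conversions being vectorized, here is a minimal sketch of the loop shape an auto-vectorizer can target. It assumes JDK 20 or later, where `Float.float16ToFloat` and `Float.floatToFloat16` are available. The class and method names (`Fp16LoopSketch`, `halfToFloat`, `floatToHalf`) are hypothetical and are not taken from the patch or from the benchmark source.

```java
import java.util.Arrays;

// Hypothetical illustration; not code from the patch or the JMH benchmark.
public class Fp16LoopSketch {

    // Half precision (stored as raw bits in a short) to single precision.
    // A simple counted loop over arrays like this is the kind of code
    // the C2 auto-vectorizer can turn into vector conversions.
    public static void halfToFloat(short[] src, float[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Float.float16ToFloat(src[i]);
        }
    }

    // Single precision to half precision (raw bits in a short).
    public static void floatToHalf(float[] src, short[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Float.floatToFloat16(src[i]);
        }
    }

    public static void main(String[] args) {
        // IEEE 754 binary16 bit patterns for 1.0, 2.0 and -0.5.
        short[] halves = {0x3C00, 0x4000, (short) 0xB800};
        float[] floats = new float[halves.length];
        halfToFloat(halves, floats);
        System.out.println(Arrays.toString(floats));

        // Round-trip back to half precision; these values are exactly
        // representable, so the original bit patterns are recovered.
        short[] back = new short[floats.length];
        floatToHalf(floats, back);
        System.out.println(Arrays.equals(halves, back));
    }
}
```

Whether a given loop is actually vectorized depends on the JIT compiler, the CPU features and the JVM flags in use; the sketch only shows the source-level pattern.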
-------------
Commit messages:
- 8299038: Add AArch64 backend support for auto-vectorized FP16 conversions
Changes: https://git.openjdk.org/jdk/pull/11825/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11825&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8299038
Stats: 207 lines in 6 files changed: 135 ins; 3 del; 69 mod
Patch: https://git.openjdk.org/jdk/pull/11825.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/11825/head:pull/11825
PR: https://git.openjdk.org/jdk/pull/11825