RFR: 8299038: Add AArch64 backend support for auto-vectorized FP16 conversions [v2]
Bhavana Kilambi
bkilambi at openjdk.org
Wed Jan 4 16:26:20 UTC 2023
> This patch adds AArch64 backend support for auto-vectorized FP16 conversions, namely half precision to single precision and vice versa. Both Neon and SVE versions are included. The performance of this patch was tested on AArch64 machines with vector sizes of 128, 256 and 512 bits.
>
> The following throughput improvements were observed with the vectorized version versus the scalar code (the current implementation) for the JMH micro benchmark -
> test/micro/org/openjdk/bench/java/math/Fp16ConversionBenchmark.java
>
>
> Benchmark                                 128-bit   256-bit   512-bit
> Fp16ConversionBenchmark.float16ToFloat       6.02      8.72     24.71
> Fp16ConversionBenchmark.floatToFloat16       2.00      3.29     10.84
>
>
> The numbers shown are the ratios between throughput of vectorized version and that of the scalar version.
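
For readers unfamiliar with the conversion, here is a minimal sketch of the kind of element-wise loop an auto-vectorizer can target. It is not taken from the patch or from the benchmark source: the class and method names (`Fp16LoopSketch`, `halfToFloat`, `convert`) are hypothetical, and the bit-manipulation helper is only a portable scalar reference for the half-to-single direction. On JDK 20 and later the loop body would presumably call `Float.float16ToFloat` instead.

```java
// Hypothetical illustration only; names are not from the patch.
final class Fp16LoopSketch {

    // Portable scalar reference: convert one IEEE 754 binary16 value
    // (stored in a short) to a float.
    static float halfToFloat(short h) {
        int bits = h & 0xFFFF;
        int sign = (bits & 0x8000) << 16;   // move sign to bit 31
        int exp  = (bits >>> 10) & 0x1F;    // 5-bit exponent
        int mant = bits & 0x3FF;            // 10-bit mantissa

        if (exp == 0x1F) {
            // Infinity or NaN: all-ones exponent, mantissa widened to 23 bits.
            return Float.intBitsToFloat(sign | 0x7F800000 | (mant << 13));
        }
        if (exp == 0) {
            // Zero or subnormal: value is mant * 2^-24, exactly representable.
            float magnitude = mant * 0x1.0p-24f;
            return sign != 0 ? -magnitude : magnitude;
        }
        // Normal number: rebias the exponent from 15 to 127 (add 112).
        return Float.intBitsToFloat(sign | ((exp + 112) << 23) | (mant << 13));
    }

    // Element-wise conversion loop with no cross-iteration dependencies.
    static void convert(short[] src, float[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = halfToFloat(src[i]);
        }
    }
}
```

Each iteration reads one 16-bit element and writes one 32-bit element independently of the others, which is the property that lets such a loop be mapped onto vector lanes.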
Bhavana Kilambi has updated the pull request incrementally with one additional commit since the last revision:
Changed the copyright year to 2023
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/11825/files
- new: https://git.openjdk.org/jdk/pull/11825/files/4ba1cf5d..49c0b7b7
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=11825&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=11825&range=00-01
Stats: 6 lines in 4 files changed: 0 ins; 0 del; 6 mod
Patch: https://git.openjdk.org/jdk/pull/11825.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/11825/head:pull/11825
PR: https://git.openjdk.org/jdk/pull/11825