RFR: 8299038: Add AArch64 backend support for auto-vectorized FP16 conversions [v2]

Bhavana Kilambi bkilambi at openjdk.org
Wed Jan 4 16:26:20 UTC 2023


> This patch adds AArch64 backend support for auto-vectorized FP16 conversions, namely half-precision to single-precision and vice versa. Both Neon and SVE versions are included. The performance of this patch was tested on AArch64 machines with vector sizes of 128, 256 and 512 bits.
> 
> The following throughput improvements were observed with the vectorized version relative to the scalar code (the current implementation) for the JMH microbenchmark:
> test/micro/org/openjdk/bench/java/math/Fp16ConversionBenchmark.java
> 
> 
> Benchmark                               128-bit  256-bit  512-bit
> Fp16ConversionBenchmark.float16ToFloat  6.02     8.72     24.71
> Fp16ConversionBenchmark.floatToFloat16  2.00     3.29     10.84
> 
> 
> The numbers shown are the ratios of the vectorized version's throughput to that of the scalar version.
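
For readers unfamiliar with the conversions being vectorized, the following is a minimal sketch of the kind of element-wise loops involved, assuming JDK 20 or later, where `Float.float16ToFloat` and `Float.floatToFloat16` are available. The class name `Fp16ConversionSketch` and its helper methods are hypothetical and are not taken from the patch or the benchmark source. Whether C2 actually vectorizes these loops depends on the hardware and JVM build in use.

```java
public class Fp16ConversionSketch {
    // Half precision (stored as raw 16-bit values in a short) -> single precision.
    // A simple counted loop like this is a candidate for auto-vectorization.
    public static void halfToFloat(short[] in, float[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.float16ToFloat(in[i]);
        }
    }

    // Single precision -> half precision (raw 16-bit values in a short).
    public static void floatToHalf(float[] in, short[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.floatToFloat16(in[i]);
        }
    }

    public static void main(String[] args) {
        // All of these values are exactly representable in FP16,
        // so the round trip is expected to be lossless for them.
        float[] values = {0.0f, 1.0f, -2.5f, 65504.0f};
        short[] halves = new short[values.length];
        float[] back = new float[values.length];

        floatToHalf(values, halves);
        halfToFloat(halves, back);

        for (int i = 0; i < values.length; i++) {
            System.out.println(values[i] + " -> 0x"
                    + Integer.toHexString(halves[i] & 0xffff)
                    + " -> " + back[i]);
        }
    }
}
```

Values that are not exactly representable in half precision would be rounded by `floatToHalf`, so a round trip is not lossless in general.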

Bhavana Kilambi has updated the pull request incrementally with one additional commit since the last revision:

  Changed the copyright year to 2023

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/11825/files
  - new: https://git.openjdk.org/jdk/pull/11825/files/4ba1cf5d..49c0b7b7

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=11825&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11825&range=00-01

  Stats: 6 lines in 4 files changed: 0 ins; 0 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/11825.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11825/head:pull/11825

PR: https://git.openjdk.org/jdk/pull/11825


More information about the hotspot-compiler-dev mailing list