RFR: 8361582: AArch64: Some ConH values cannot be replicated with SVE [v2]

Thu Aug 7 08:24:49 UTC 2025

> After commit https://github.com/openjdk/jdk/commit/a49ecb26c5ff2f949851937f3bb036d7946a103e, the JTREG test
> `test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java` fails for some test cases that use constant values, such as:
> 
> 
>     public void vectorAddConstInputFloat16() {
>         for (int i = 0; i < LEN; ++i) {
>             output[i] = float16ToRawShortBits(add(shortBitsToFloat16(input1[i]), FP16_CONST));
>         }
>     }
> 
> 
> 
> (The full failure log is in the JBS ticket, so it is not reproduced here.)
> 
> The current code in the JDK generates an sve_dup instruction for every 16-bit immediate, whereas the instruction only accepts a signed 8-bit immediate in [-128, 127], optionally shifted left by 8, i.e. a multiple of 256 in [-128 << 8, 127 << 8] for 16-bit signed immediates.
> 
> This patch generates an sve_dup instruction only for 16-bit values within the limits specified above. For out-of-range values, the half-float immediate is loaded from the constant pool into a register ("loadConH" mach node) and then replicated (broadcast) to an SVE register ("replicateHF" mach node).
> 
> Both tests, `test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java` and `test/hotspot/jtreg/compiler/c2/aarch64/TestFloat16Replicate.java`, pass on a 256-bit SVE machine. The JTREG suites hotspot (hotspot_all), langtools (tier1) and jdk (tier1-3) also pass on the same machine.
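
For readers following along, the range check described in the quoted text can be sketched in plain Java. This is only an illustration, assuming the immediate is a signed 8-bit value optionally shifted left by 8; the class and method names (`SveDupImmCheck`, `fitsSveDupImm`, `replicateStrategy`) are hypothetical and are not taken from the patch or from HotSpot.

```java
// Illustrative sketch only; names are hypothetical and not taken from HotSpot.
class SveDupImmCheck {

    // Returns true if 'value' fits a signed 8-bit immediate, either unshifted
    // ([-128, 127]) or shifted left by 8 (a multiple of 256 in [-32768, 32512]).
    static boolean fitsSveDupImm(int value) {
        if (value >= -128 && value <= 127) {
            return true;
        }
        return (value & 0xFF) == 0
            && value >= (-128 << 8)
            && value <= (127 << 8);
    }

    // Names the code shape that would be chosen for a half-float constant,
    // given its raw 16-bit pattern (sign-extended when widened to int).
    static String replicateStrategy(short fp16Bits) {
        return fitsSveDupImm(fp16Bits)
            ? "sve_dup immediate"
            : "loadConH + replicateHF";
    }

    public static void main(String[] args) {
        short[] samples = {
            (short) 0x3C00, // 1.0 in IEEE 754 binary16
            (short) 0x3C01, // next representable value above 1.0
            (short) 0xBC00, // -1.0
            (short) 0x0000  // +0.0
        };
        for (short bits : samples) {
            System.out.printf("0x%04X -> %s%n", bits & 0xFFFF, replicateStrategy(bits));
        }
    }
}
```

Under these assumptions, a constant such as 1.0 (raw bits 0x3C00, a multiple of 256) can be encoded directly, while a constant whose low byte is non-zero and whose value lies outside [-128, 127] falls back to the constant-pool load followed by a replicate.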

Bhavana Kilambi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:

 - Merge branch 'master' into JDK-8361582
 - 8361582: AArch64: Some ConH values cannot be replicated with SVE

   After commit
   https://github.com/openjdk/jdk/commit/a49ecb26c5ff2f949851937f3bb036d7946a103e,
   the JTREG test
   hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java
   fails for some test cases that use constant values, such as:

       public void vectorAddConstInputFloat16() {
           for (int i = 0; i < LEN; ++i) {
               output[i] = float16ToRawShortBits(add(shortBitsToFloat16(input1[i]), FP16_CONST));
           }
       }

   (The full failure log is in the JBS ticket, so it is not reproduced
   here.)

   The current code in the JDK generates an sve_dup instruction for every
   16-bit immediate, whereas the instruction only accepts a signed 8-bit
   immediate in [-128, 127], optionally shifted left by 8, i.e. a multiple
   of 256 in [-128 << 8, 127 << 8] for 16-bit signed immediates.

   This patch generates an sve_dup instruction only for 16-bit values
   within the limits specified above. For out-of-range values, the
   half-float immediate is loaded from the constant pool into a register
   ("loadConH" mach node) and then replicated (broadcast) to an SVE
   register ("replicateHF" mach node).

-------------

Changes: https://git.openjdk.org/jdk/pull/26589/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26589&range=01
  Stats: 194 lines in 7 files changed: 170 ins; 4 del; 20 mod
  Patch: https://git.openjdk.org/jdk/pull/26589.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26589/head:pull/26589

PR: https://git.openjdk.org/jdk/pull/26589