RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3]

Mon Mar 4 08:02:48 UTC 2024

On Wed, 28 Feb 2024 14:19:06 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you review this patch to add instrinsics for VectorCastHF2F/VectorCastF2HF?
>> Thanks!
>> 
>> ## Test
>> 
>> test/jdk/java/lang/Float/Binary16ConversionNaN.java
>> test/jdk/java/lang/Float/Binary16Conversion.java
>> 
>> hotspot/jtreg/compiler/intrinsics/float16
>> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorjava
>> hotspot/jtreg/compiler/vectorization/TestFloatConversionsVectorNaN.java
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   make UseZvfh experimatal; clean code

Several more comments. Could you please move the newly-added assember functions after the scalar versions, that is `C2_MacroAssembler::float16_to_float` and `C2_MacroAssembler::float_to_float16`? Then it will be more convenient for people to compare the two versions.

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1979:

> 1977: 
> 1978:   // widen and sign-extend src data.
> 1979:   __ vwadd_vx(dst, src, zr, Assembler::v0_t);

Can we use vector integer extension instruction here instead of `vwadd_vx`? Like this one from the RVV spec: `vsext.vf2 vd, vs2, vm # Sign-extend SEW/2 source to SEW destination`. This seems more readable and we can do the lines L1980-l1981 on entry, which has the benefit of not relying on the caller's vsetvli for operations here like `vwadd_vx`.

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1995:

> 1993:   auto stub = C2CodeStub::make<VectorRegister, VectorRegister, uint>
> 1994:               (dst, src, length, 24, float16_to_float_v_slow_path);
> 1995: 

I think we should add one assertion here asserting that `dst` and `src` are differenct vector registers.

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1996:

> 1994:               (dst, src, length, 24, float16_to_float_v_slow_path);
> 1995: 
> 1996:   // in riscv, NaN needs a special process as vfwcvt_f_f_v does not work in that case.

Suggestion: s/in riscv/On riscv/

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1997:

> 1995: 
> 1996:   // in riscv, NaN needs a special process as vfwcvt_f_f_v does not work in that case.
> 1997:   // in riscv, Inf does not need a special process as vfwcvt_f_f_v can handle it correctly.

Suggestion: s/in riscv/On riscv/

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2052:

> 2050: void C2_MacroAssembler::float_to_float16_v(VectorRegister dst, VectorRegister src, VectorRegister tmp, uint length) {
> 2051:   auto stub = C2CodeStub::make<VectorRegister, VectorRegister, VectorRegister, uint>
> 2052:               (dst, src, tmp, length, 36, float_to_float16_v_slow_path);

Similar here: I think we should add one assertion here asserting that `dst` and `src` are differenct vector registers.

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2054:

> 2052:               (dst, src, tmp, length, 36, float_to_float16_v_slow_path);
> 2053: 
> 2054:   // in riscv, NaN needs a special process as vfncvt_f_f_w does not work in that case.

Suggestion: s/in riscv/On riscv/

-------------

PR Review: https://git.openjdk.org/jdk/pull/17698#pullrequestreview-1913566853
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510711417
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510713401
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510715794
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510715892
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510714094
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1510716096