RFR: 8320646: RISC-V: C2 VectorCastHF2F [v3]

Mon Mar 4 11:55:55 UTC 2024

On Mon, 4 Mar 2024 07:50:53 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   make UseZvfh experimatal; clean code
>
> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1979:
> 
>> 1977: 
>> 1978:   // widen and sign-extend src data.
>> 1979:   __ vwadd_vx(dst, src, zr, Assembler::v0_t);
> 
> Can we use vector integer extension instruction here instead of `vwadd_vx`? Like this one from the RVV spec:
> `vsext.vf2 vd, vs2, vm # Sign-extend SEW/2 source to SEW destination`.
> This seems more readable and we can do the lines L1980-l1981 on entry, which has the benefit of not relying on the caller's vsetvli for operations here like `vwadd_vx`.

Yes, modified. Thanks!

> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1995:
> 
>> 1993:   auto stub = C2CodeStub::make<VectorRegister, VectorRegister, uint>
>> 1994:               (dst, src, length, 24, float16_to_float_v_slow_path);
>> 1995: 
> 
> I think we should add one assertion here asserting that `dst` and `src` are differenct vector registers.

we could, but it's not necessary, as in `riscv_v.ad` it already has:

instruct vconvHF2F(vReg dst, vReg src, vRegMask_V0 v0) %{
  ...
  effect(TEMP_DEF dst, TEMP v0);

> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2003:
> 
>> 2001:   // i.e. non-NaN and non-Inf cases.
>> 2002: 
>> 2003:   vsetvli_helper(BasicType::T_SHORT, length, Assembler::mf2);
> 
> Will this `vsetvli` setting work when we check for NaNs in L2005-L2010?  The `Assembler::mf2` param doesn't seem correct to me. Should we use `Assembler::m1` here instead?

I think it will work. And please note that we set it to `m1` in `float16_to_float_v_slow_path` in case of any NaN.
Or maybe I miss something?

> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2037:
> 
>> 2035:   __ mv(t0, 26);
>> 2036:   // preserve the sign bit.
>> 2037:   __ vnsra_wx(tmp, src, t0, Assembler::v0_t);
> 
> Will this more simpler version do here? `__ vnsra_wi(tmp, src, 26, Assembler::v0_t);`

Seems not, as by vector spec, `vector-immediate | imm[4:0]` and `The value is sign-extended to SEW bits`, so maximum of imm should be 16-1.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511037729
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511037873
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511037578
PR Review Comment: https://git.openjdk.org/jdk/pull/17698#discussion_r1511038094