RFR: 8318227: RISC-V: C2 ConvHF2F [v2]

Tue Nov 28 13:27:11 UTC 2023

On Mon, 27 Nov 2023 13:29:33 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you review the patch to add the ConvHF2F intrinsic to JDK for riscv?
>> Thanks!
>> 
>> (By latest kernel patch, `#define		RISCV_HWPROBE_EXT_ZFH		(1 << 27)`
>> https://lore.kernel.org/lkml/20231114141256.126749-11-cleger@rivosinc.com/)
>> 
>> ## Test
>> ### Functionality
>> #### hotspot tests
>> test/hotspot/jtreg/compiler/intrinsics/ 
>> test/hotspot/jtreg/compiler/c2/irTests
>> 
>> #### jdk tests
>> test/jdk/java/lang/Float/Binary16Conversion*.java
>> 
>> ### Performance
>> tested on licheepi.
>> 
>> #### with UseZfh enabled & stub out-of-band
>> 
>> Benchmark                                     (size)  Mode  Cnt      Score     Error  Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  avgt   10   3493.376 ±  18.631  ns/op
>> Fp16ConversionBenchmark.float16ToFloatMemory    2048  avgt   10     19.819 ±   0.193  ns/op
>> 
>> 
>> #### with UseZfh enabled only
>> (i.e. enable the intrinsic)
>> 
>> Benchmark                                     (size)  Mode  Cnt      Score     Error  Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  avgt   10   4659.796 ±  13.262  ns/op
>> Fp16ConversionBenchmark.float16ToFloatMemory    2048  avgt   10     22.957 ±   0.098  ns/op
>> 
>> 
>> #### with UseZfh disabled
>> (i.e. disable the intrinsic)
>> 
>> Benchmark                                     (size)  Mode  Cnt      Score    Error  Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  avgt   10  22930.591 ± 72.595  ns/op
>> Fp16ConversionBenchmark.float16ToFloatMemory    2048  avgt   10     25.970 ±  0.063  ns/op
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   optimize perf with stub out-of-line

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1704:

> 1702:   // check whether it's a NaN.
> 1703:   mv(t0, 0x7c00);
> 1704:   andr(tmp, src, t0);

I see from the exponent encoding of float16 in [1] that the value could also be a negative/positive infinity when the exponent is 0b11111; which of the two it is depends on whether the significand is zero. So is this check for NaN sufficient?

[1] https://en.wikipedia.org/wiki/Half-precision_floating-point_format
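
To make the question concrete, here is a minimal standalone sketch of the binary16 classification described in [1]. It is not code from the patch; `Fp16Class` and `classify_fp16` are hypothetical names used only for illustration. It shows that masking with 0x7c00 alone identifies "NaN or infinity", and that the 10 significand bits (0x03ff) are needed to tell the two apart.

```cpp
#include <cstdint>

// Hypothetical helper types/names, for illustration only.
enum class Fp16Class { Finite, Infinity, NaN };

// Classify a raw IEEE 754 binary16 bit pattern.
// Layout: 1 sign bit, 5 exponent bits (0x7c00), 10 significand bits (0x03ff).
constexpr Fp16Class classify_fp16(uint16_t bits) {
  constexpr uint16_t kExpMask  = 0x7c00;
  constexpr uint16_t kFracMask = 0x03ff;
  if ((bits & kExpMask) != kExpMask) {
    // Exponent is not all ones: zero, subnormal or normal value.
    return Fp16Class::Finite;
  }
  // Exponent is all ones: infinity if the significand is zero, NaN otherwise.
  return (bits & kFracMask) == 0 ? Fp16Class::Infinity : Fp16Class::NaN;
}
```

With this split, 0x7c00 and 0xfc00 classify as infinities while 0x7e00 classifies as a NaN, even though all three pass the exponent-only test.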

src/hotspot/cpu/riscv/riscv.ad line 8288:

> 8286:     __ float16_to_float($dst$$FloatRegister, $src$$Register, $tmp$$Register);
> 8287:   %}
> 8288:   ins_pipe(fp_f2i);

Seems we should use `ins_pipe(pipe_slow)` here as this emits multiple instructions.

src/hotspot/os_cpu/linux_riscv/riscv_hwprobe.cpp line 52:

> 50: #define   RISCV_HWPROBE_EXT_ZBB                 (1 << 4)
> 51: #define   RISCV_HWPROBE_EXT_ZBS                 (1 << 5)
> 52: #define   RISCV_HWPROBE_EXT_ZFH                 (1 << 27)

Will this value change in the future? It does not seem to be in the kernel source yet [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/include/uapi/asm/hwprobe.h?h=v6.7-rc3
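
For illustration only, one way the local definition could be kept from clashing with a later header definition is sketched below. This assumes the kernel eventually exports a macro with the same name; the value (1 << 27) is the one taken from the pending kernel patch quoted in the PR description and may still change. `has_hwprobe_ext` is a hypothetical helper, not an existing function in riscv_hwprobe.cpp.

```cpp
#include <cstdint>

// Only define the bit locally if no header has provided it already.
// The value comes from a not-yet-merged kernel patch (assumption).
#ifndef RISCV_HWPROBE_EXT_ZFH
#define RISCV_HWPROBE_EXT_ZFH (1 << 27)
#endif

// Hypothetical helper: test whether an extension bit is set in the
// bitmask value returned for an hwprobe key.
constexpr bool has_hwprobe_ext(uint64_t value, uint64_t ext_mask) {
  return (value & ext_mask) != 0;
}
```

If the bit ends up at a different position upstream, the guarded definition would pick up the header's value wherever that header is included, instead of silently probing the wrong bit.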

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16802#discussion_r1407751915
PR Review Comment: https://git.openjdk.org/jdk/pull/16802#discussion_r1407588997
PR Review Comment: https://git.openjdk.org/jdk/pull/16802#discussion_r1407610639