[lworld+fp16] RFR: 8341414: Add support for FP16 conversion routines [v2]
Jatin Bhateja
jbhateja at openjdk.org
Thu Nov 14 12:35:55 UTC 2024
On Thu, 31 Oct 2024 13:50:40 GMT, Bhavana Kilambi <bkilambi at openjdk.org> wrote:
>> This patch adds intrinsic support for the FP16 conversion routines to int/long/double, together with the aarch64 backend support. It implements both scalar and vector versions of these conversions.
>>
>> Performance numbers on an aarch64 machine with SVE support:
>>
>>
>> Benchmark                         (vectorDim)   Gain
>> Float16OpsBenchmark.fp16ToDouble         1024  18.23
>> Float16OpsBenchmark.fp16ToInt            1024   1.93
>> Float16OpsBenchmark.fp16ToLong           1024   3.95
>>
>>
>> The Gain column is the ratio of the throughput with this patch to the throughput with the intrinsics disabled (in which case FP32 arithmetic is generated).
>
> Bhavana Kilambi has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove intrinsification of conversion methods in Float16
src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5507:
> 5505: case T_DOUBLE: __ flt16_to_flt(v0, r0, v1, T_DOUBLE); break;
> 5506: default: ShouldNotReachHere();
> 5507: }
OK, I revisited this: the conversion stubs used for constant folding call the direct FP16-to-INT/LONG/DOUBLE instructions.
This looks reasonable. However, constant folding happens at compile time, so removing the stubs and casting directly to the target type after the hf2f stub conversion may not incur any runtime penalty.
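
To illustrate the alternative suggested above, here is a minimal Java sketch of "convert FP16 to float first, then cast to the target type". It is not code from the patch. The class `Fp16CastSketch` and its methods are hypothetical names, and the hand-written decoder `fp16ToFloat` only stands in for the hf2f stub. The sketch assumes Java cast semantics are the desired result for int/long (truncation toward zero, NaN to 0, saturation on overflow). Every FP16 value is exactly representable as a float, so the widening step itself loses no information.

```java
// Hypothetical sketch: FP16 bits -> float -> int/long/double via plain casts.
final class Fp16CastSketch {

    // Decode IEEE 754 binary16 bits into a float (stand-in for the hf2f stub).
    static float fp16ToFloat(short bits) {
        int sign = (bits >> 15) & 0x1;
        int exp  = (bits >> 10) & 0x1F;
        int mant = bits & 0x3FF;
        float magnitude;
        if (exp == 0) {
            // Zero or subnormal: mant * 2^-24
            magnitude = mant * 0x1.0p-24f;
        } else if (exp == 0x1F) {
            // Infinity or NaN
            magnitude = (mant == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            // Normal: (1 + mant / 2^10) * 2^(exp - 15)
            magnitude = Math.scalb(1.0f + mant / 1024.0f, exp - 15);
        }
        return (sign == 0) ? magnitude : -magnitude;
    }

    // The casts below are ordinary Java primitive conversions applied
    // after the FP16 value has been widened to float.
    static int toInt(short fp16Bits)       { return (int) fp16ToFloat(fp16Bits); }
    static long toLong(short fp16Bits)     { return (long) fp16ToFloat(fp16Bits); }
    static double toDouble(short fp16Bits) { return (double) fp16ToFloat(fp16Bits); }

    public static void main(String[] args) {
        short pi = (short) 0x4248; // FP16 bit pattern for 3.140625
        System.out.println(toInt(pi));
        System.out.println(toLong(pi));
        System.out.println(toDouble(pi));
    }
}
```

Whether the compiler's constant-folding path could rely on such casts instead of dedicated stubs depends on the direct FP16 instructions producing the same values for all inputs, including NaN and out-of-range cases. That would need checking against the instruction semantics.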
-------------
PR Review Comment: https://git.openjdk.org/valhalla/pull/1283#discussion_r1842141999