RFR: 8345146: [PPC64] Make intrinsic conversions between bit representations of half precision values and floats [v3]

Mon Dec 2 14:15:44 UTC 2024

On Thu, 28 Nov 2024 21:33:23 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of [JDK-8289552](https://bugs.openjdk.org/browse/JDK-8289552). I've implemented some more instructions which may be useful in the future.
>> VectorCastNodes are not yet implemented on PPC64. Power9 is recognized by the availability of the "darn" instruction.
>> 
>> Performance on Power9:
>> Before patch:
>> 
>> Benchmark                                     (size)   Mode  Cnt      Score     Error   Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  thrpt   15     18.995 ±   0.156  ops/ms
>> Fp16ConversionBenchmark.floatToFloat16          2048  thrpt   15     18.730 ±   0.331  ops/ms
>> 
>> 
>> After patch:
>> 
>> Benchmark                                     (size)   Mode  Cnt      Score      Error   Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  thrpt   15    522.637 ±   11.274  ops/ms
>> Fp16ConversionBenchmark.floatToFloat16          2048  thrpt   15    408.112 ±    9.069  ops/ms
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Make sure interpreter entries are not called on Power8 or older.
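
For readers without the JDK-8289552 context: as far as I understand, the Java-level methods these intrinsics back are `Float.floatToFloat16` and `Float.float16ToFloat` (available since JDK 20). Below is a minimal sketch of the kind of array loop the benchmark presumably exercises. The class name `Fp16RoundTrip` and the helpers `toHalf`/`toFloat` are hypothetical and not taken from the benchmark source.

```java
// Hypothetical illustration; requires JDK 20 or newer.
public class Fp16RoundTrip {

    // Converts each float to its IEEE 754 binary16 bit pattern, stored in a short.
    static short[] toHalf(float[] in) {
        short[] out = new short[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.floatToFloat16(in[i]);
        }
        return out;
    }

    // Converts each binary16 bit pattern back to a float.
    static float[] toFloat(short[] in) {
        float[] out = new float[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.float16ToFloat(in[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] values = {1.0f, -0.0f, 0.1f, 65504.0f};
        short[] half = toHalf(values);
        float[] back = toFloat(half);
        for (int i = 0; i < values.length; i++) {
            // Mask to 16 bits so negative shorts print as four hex digits.
            System.out.printf("%s -> 0x%04x -> %s%n",
                    values[i], half[i] & 0xFFFF, back[i]);
        }
    }
}
```

Values that are not exactly representable in half precision (such as 0.1f) come back rounded, so the round trip is only lossless for values that fit in binary16.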

src/hotspot/cpu/ppc/c1_LIRGenerator_ppc.cpp line 700:

> 698:       LIR_Opr tmp = new_register(T_FLOAT);
> 699:       // f2hf treats tmp as live_in. Workaround: initialize to some value.
> 700:       __ move(LIR_OprFact::floatConst(-0.0), tmp); // just to satisfy LinearScan

@vnkozlov What do you think about introducing a dummy `LIR_Op0` which provides an undefined float for such workarounds (separate RFE)? That could be used on multiple platforms.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22433#discussion_r1865923138