RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v8]
Fei Yang
fyang at openjdk.org
Tue Dec 19 01:57:42 UTC 2023
On Mon, 18 Dec 2023 16:05:54 GMT, Olga Mikhaltsova <omikhaltcova at openjdk.org> wrote:
>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>> RISC-V Java
>> (FCVT.W.S) (FCVT.L.D) (long round(double a)) (int round(float a))
>> Minimum valid input (after rounding) −2^31 −2^63 Long.MIN_VALUE Integer.MIN_VALUE
>> Maximum valid input (after rounding) 2^31 − 1 2^63 − 1 Long.MAX_VALUE Integer.MAX_VALUE
>> Output for out-of-range negative input −2^31 −2^63 Long.MIN_VALUE Integer.MIN_VALUE
>> Output for −∞ −2^31 −2^63 Long.MIN_VALUE Integer.MIN_VALUE
>> Output for out-of-range positive input 2^31 − 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE
>> Output for +∞ 2^31 − 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE
>> Output for NaN 2^31 − 1 2^63 - 1 0 0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement::
>>
>> **Before**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 59.555 0.179 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 49.760 0.103 ops/ms
>>
>>
>> **After**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 110.956 0.186 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 115.947 0.122 ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
> Used jint_cast/julong_cast; moved mv between feq and beqz
Thanks. Looks fine to me except for two nits. I guess we can follow the design decisions of RISC-V about dynamic and static rounding mode from the ISA spec and keep an eye on how this may affect new hardware implementations coming out.
The C99 language standard effectively mandates the provision of a dynamic rounding mode register.
In typical implementations, writes to the dynamic rounding mode CSR state will serialize the pipeline.
Static rounding modes are used to implement specialized arithmetic operations that often have to switch
frequently between different rounding modes
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4264:
> 4262: void MacroAssembler::java_round_float(Register dst, FloatRegister src, FloatRegister ftmp) {
> 4263: Label done;
> 4264: li(t0, jint_cast(0.5f));
Nit: Can you change this `li` into `mv`? That will be consistent with other places where we move an immediate.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4281:
> 4279: void MacroAssembler::java_round_double(Register dst, FloatRegister src, FloatRegister ftmp) {
> 4280: Label done;
> 4281: li(t0, julong_cast(0.5));
Same as above here.
-------------
Marked as reviewed by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1787934518
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430796146
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430796295
More information about the hotspot-dev
mailing list