RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v8]

Tue Dec 19 01:57:42 UTC 2023

On Mon, 18 Dec 2023 16:05:54 GMT, Olga Mikhaltsova <omikhaltcova at openjdk.org> wrote:

>> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>> 
>> The table below shows that a NaN argument has to be handled as a special case.
>> 
>>                                                   RISC-V                            Java
>>                                         (FCVT.W.S)    (FCVT.L.D)  (long round(double a)) (int round(float a))
>> Minimum valid input (after rounding)     −2^31         −2^63         Long.MIN_VALUE       Integer.MIN_VALUE
>> Maximum valid input (after rounding)      2^31 − 1      2^63 − 1     Long.MAX_VALUE       Integer.MAX_VALUE
>> Output for out-of-range negative input   −2^31         −2^63         Long.MIN_VALUE       Integer.MIN_VALUE
>> Output for −∞                            −2^31         −2^63         Long.MIN_VALUE       Integer.MIN_VALUE
>> Output for out-of-range positive input    2^31 − 1      2^63 − 1     Long.MAX_VALUE       Integer.MAX_VALUE
>> Output for +∞                             2^31 − 1      2^63 − 1     Long.MAX_VALUE       Integer.MAX_VALUE
>> Output for NaN                            2^31 − 1      2^63 − 1           0                      0
>> 
>> The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>> 
>> **Before**
>> 
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>> 
>> 
>> **After**
>> 
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Used jint_cast/julong_cast; moved mv between feq and beqz

Thanks. Looks fine to me except for two nits. I guess we can follow RISC-V's design decisions on dynamic and static rounding modes from the ISA spec, and keep an eye on how this may affect new hardware implementations as they come out.

The C99 language standard effectively mandates the provision of a dynamic rounding mode register.
In typical implementations, writes to the dynamic rounding mode CSR state will serialize the pipeline.
Static rounding modes are used to implement specialized arithmetic operations that often have to switch
frequently between different rounding modes.
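
To illustrate the dynamic side of that distinction, here is a minimal C++ sketch, assuming a hosted toolchain with `<cfenv>` support. The helper name `round_with_mode` is hypothetical and is not part of HotSpot. Each call changes the process-wide rounding mode and then restores it; a static rounding mode avoids this because the mode is encoded in the instruction itself.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper, for illustration only: round `x` to an integral
// value under the given dynamic rounding mode (FE_DOWNWARD, FE_UPWARD,
// FE_TONEAREST or FE_TOWARDZERO), then restore the previous mode.
double round_with_mode(double x, int mode) {
  const int saved = std::fegetround();
  std::fesetround(mode);                     // switch the dynamic rounding mode
  volatile double in = x;                    // keep the compiler from folding the call
  volatile double out = std::nearbyint(in);  // nearbyint honours the current mode
  std::fesetround(saved);                    // switch back
  return out;
}
```

For example, `round_with_mode(2.5, FE_DOWNWARD)` gives 2.0 and `round_with_mode(2.5, FE_UPWARD)` gives 3.0, at the cost of two mode switches per call.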

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4264:

> 4262: void MacroAssembler::java_round_float(Register dst, FloatRegister src, FloatRegister ftmp) {
> 4263:   Label done;
> 4264:   li(t0, jint_cast(0.5f));

Nit: Can you change this `li` into `mv`? That will be consistent with other places where we move an immediate.
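
For reference, the immediate being materialized here is the raw IEEE 754 bit pattern of the constant, not a converted integer value. A small standalone sketch, assuming `jint_cast`/`julong_cast` behave as plain bit casts; `float_bits` and `double_bits` are hypothetical stand-ins, not the HotSpot functions:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");
static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits");

// Hypothetical stand-in for a float-to-int bit cast: copies the bits unchanged.
std::int32_t float_bits(float f) {
  std::int32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Hypothetical stand-in for a double-to-unsigned-long bit cast.
std::uint64_t double_bits(double d) {
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);
  return bits;
}
```

With these, `float_bits(0.5f)` is `0x3F000000` and `double_bits(0.5)` is `0x3FE0000000000000`.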

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4281:

> 4279: void MacroAssembler::java_round_double(Register dst, FloatRegister src, FloatRegister ftmp) {
> 4280:   Label done;
> 4281:   li(t0, julong_cast(0.5));

Same as above: please use `mv` here as well.
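
As a cross-check for testing, the Java column of the table in the description can be written as a small reference model. This is only a sketch of the expected results for `int round(float a)`, not the assembler sequence from the PR; the name `java_round_float_ref` is hypothetical. It widens to `double` first so that adding 0.5 does not itself round for values near the `int` range.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference model of Java's `int Math.round(float)`:
// NaN maps to 0, out-of-range inputs and infinities saturate, and
// everything else is floor(x + 0.5).
std::int32_t java_round_float_ref(float x) {
  if (std::isnan(x)) {
    return 0;  // the special case: a bare FCVT.W.S would return 2^31 - 1 here
  }
  const double r = std::floor(static_cast<double>(x) + 0.5);
  const double lo = static_cast<double>(std::numeric_limits<std::int32_t>::min());
  const double hi = static_cast<double>(std::numeric_limits<std::int32_t>::max());
  if (r <= lo) return std::numeric_limits<std::int32_t>::min();
  if (r >= hi) return std::numeric_limits<std::int32_t>::max();
  return static_cast<std::int32_t>(r);
}
```

The only row where this differs from the raw conversion instruction is NaN, which is why the intrinsic needs the extra check.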

-------------

Marked as reviewed by fyang (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1787934518
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430796146
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430796295