RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v7]
Fei Yang
fyang at openjdk.org
Mon Dec 18 06:30:45 UTC 2023
On Fri, 8 Dec 2023 22:46:45 GMT, Olga Mikhaltsova <omikhaltcova at openjdk.org> wrote:
>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>> RISC-V Java
>> (FCVT.W.S) (FCVT.L.D) (long round(double a)) (int round(float a))
>> Minimum valid input (after rounding) −2^31 −2^63 Long.MIN_VALUE Integer.MIN_VALUE
>> Maximum valid input (after rounding) 2^31 − 1 2^63 − 1 Long.MAX_VALUE Integer.MAX_VALUE
>> Output for out-of-range negative input −2^31 −2^63 Long.MIN_VALUE Integer.MIN_VALUE
>> Output for −∞ −2^31 −2^63 Long.MIN_VALUE Integer.MIN_VALUE
>> Output for out-of-range positive input 2^31 − 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE
>> Output for +∞ 2^31 − 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE
>> Output for NaN 2^31 − 1 2^63 - 1 0 0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement::
>>
>> **Before**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 59.555 0.179 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 49.760 0.103 ops/ms
>>
>>
>> **After**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 110.956 0.186 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 115.947 0.122 ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
> Optimization against regression on SiFive
So I tried your test on both Unmatched and Licheepi-4A boards. And I see the C2 JIT code is exercised and the test is passing over the full 32-bit range. Several minor comments though.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4264:
> 4262: void MacroAssembler::java_round_float(Register dst, FloatRegister src, FloatRegister ftmp) {
> 4263: Label done;
> 4264: mv(dst, zr);
I see slightly improvement on both platforms when moving `mv` into between `feq_s`/`feq_d` and `beqz`.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4265:
> 4263: Label done;
> 4264: mv(dst, zr);
> 4265: li(t0, 0x3f000000);
Suggestion: `mv(t0, jint_cast(0.5f));`
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4282:
> 4280: Label done;
> 4281: mv(dst, zr);
> 4282: li(t0, 0x3fe0000000000000);
Suggestion: `mv(t0, julong_cast(0.5));`
src/hotspot/cpu/riscv/riscv.ad line 8367:
> 8365: %}
> 8366:
> 8367: ins_pipe(pipe_class_default);
Suggestion: `ins_pipe(pipe_slow);`
src/hotspot/cpu/riscv/riscv.ad line 8381:
> 8379: %}
> 8380:
> 8381: ins_pipe(pipe_class_default);
Suggestion: `ins_pipe(pipe_slow);`
-------------
PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1785876685
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429517638
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429518186
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429518536
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429546156
PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429546238
More information about the hotspot-dev
mailing list