RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6]
Fei Yang
fyang at openjdk.org
Tue Dec 5 03:36:34 UTC 2023
On Wed, 15 Nov 2023 15:44:47 GMT, Olga Mikhaltsova <omikhaltcova at openjdk.org> wrote:
>> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>>
>> The table below shows that a NaN argument must be handled as a special case.
>>
>> | | RISC-V (FCVT.W.S) | RISC-V (FCVT.L.D) | Java (long round(double a)) | Java (int round(float a)) |
>> |---|---|---|---|---|
>> | Minimum valid input (after rounding) | −2^31 | −2^63 | Long.MIN_VALUE | Integer.MIN_VALUE |
>> | Maximum valid input (after rounding) | 2^31 − 1 | 2^63 − 1 | Long.MAX_VALUE | Integer.MAX_VALUE |
>> | Output for out-of-range negative input | −2^31 | −2^63 | Long.MIN_VALUE | Integer.MIN_VALUE |
>> | Output for −∞ | −2^31 | −2^63 | Long.MIN_VALUE | Integer.MIN_VALUE |
>> | Output for out-of-range positive input | 2^31 − 1 | 2^63 − 1 | Long.MAX_VALUE | Integer.MAX_VALUE |
>> | Output for +∞ | 2^31 − 1 | 2^63 − 1 | Long.MAX_VALUE | Integer.MAX_VALUE |
>> | Output for NaN | 2^31 − 1 | 2^63 − 1 | 0 | 0 |
>>
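To make the NaN row concrete, here is a minimal Java sketch of why the raw conversion result cannot be used directly. The class and the methods `modelFcvtLD` and `modelWithNaNCheck` are hypothetical names made up for illustration; they only model the FCVT.L.D column of the table and are not the code in this PR. The model also ignores the rounding mode, so it is only meaningful for the saturating and NaN inputs shown.

```java
public class RoundNaNSketch {
    // Hypothetical model of the FCVT.L.D column in the table:
    // saturate on out-of-range input and return 2^63 - 1 for NaN.
    static long modelFcvtLD(double x) {
        if (Double.isNaN(x)) {
            return Long.MAX_VALUE;
        }
        // A Java narrowing cast saturates for infinities and out-of-range values.
        return (long) x;
    }

    // Java requires round(NaN) == 0, so NaN is filtered out before
    // the modeled conversion result is used.
    static long modelWithNaNCheck(double x) {
        return Double.isNaN(x) ? 0L : modelFcvtLD(x);
    }

    public static void main(String[] args) {
        double[] inputs = {
            Double.NaN, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, 1e300, -1e300
        };
        for (double x : inputs) {
            System.out.println(x + ": Math.round=" + Math.round(x)
                + " model=" + modelFcvtLD(x)
                + " withNaNCheck=" + modelWithNaNCheck(x));
        }
    }
}
```

For the inputs above, only NaN makes the unchecked model disagree with `Math.round`; the saturating cases already match.
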
>> The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 59.555 0.179 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 49.760 0.103 ops/ms
>>
>>
>> **After**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 110.956 0.186 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 115.947 0.122 ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
> Replaced tmp with t0
Unfortunately, I witnessed a performance regression on the SiFive Unmatched board.
Before:
FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.243 ± 0.506 ops/ms
FpRoundingBenchmark.test_floor 2048 thrpt 15 39.448 ± 0.076 ops/ms
FpRoundingBenchmark.test_rint 2048 thrpt 15 39.411 ± 0.134 ops/ms
FpRoundingBenchmark.test_round_double 2048 thrpt 15 31.329 ± 0.085 ops/ms
FpRoundingBenchmark.test_round_float 2048 thrpt 15 31.328 ± 0.031 ops/ms
After:
FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.375 ± 0.125 ops/ms
FpRoundingBenchmark.test_floor 2048 thrpt 15 39.407 ± 0.076 ops/ms
FpRoundingBenchmark.test_rint 2048 thrpt 15 39.387 ± 0.235 ops/ms
FpRoundingBenchmark.test_round_double 2048 thrpt 15 23.940 ± 0.025 ops/ms
FpRoundingBenchmark.test_round_float 2048 thrpt 15 30.629 ± 0.021 ops/ms
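
For a rough sense of scale, the relative change in throughput can be computed from the scores above. This is a small sketch: the class name and the `percentChange` helper are made up for illustration, and it uses only the reported mean scores, ignoring the error margins.

```java
public class RegressionSketch {
    // Relative change in percent; negative means throughput dropped.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Scores in ops/ms, copied from the Before/After numbers above.
        System.out.printf("test_round_double: %.1f%%%n", percentChange(31.329, 23.940));
        System.out.printf("test_round_float:  %.1f%%%n", percentChange(31.328, 30.629));
    }
}
```

That works out to roughly a 24% drop for test_round_double and about 2% for test_round_float, while test_ceil, test_floor and test_rint are essentially unchanged.
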
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1839944647