RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12]
Olga Mikhaltsova
omikhaltcova at openjdk.org
Thu Dec 28 23:13:51 UTC 2023
On Thu, 21 Dec 2023 23:02:55 GMT, Olga Mikhaltsova <omikhaltcova at openjdk.org> wrote:
>> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>>
>> The table below shows that a NaN argument must be handled as a special case.
>>
>> | | RISC-V FCVT.W.S | RISC-V FCVT.L.D | Java long round(double a) | Java int round(float a) |
>> |---|---|---|---|---|
>> | Minimum valid input (after rounding) | −2^31 | −2^63 | Long.MIN_VALUE | Integer.MIN_VALUE |
>> | Maximum valid input (after rounding) | 2^31 − 1 | 2^63 − 1 | Long.MAX_VALUE | Integer.MAX_VALUE |
>> | Output for out-of-range negative input | −2^31 | −2^63 | Long.MIN_VALUE | Integer.MIN_VALUE |
>> | Output for −∞ | −2^31 | −2^63 | Long.MIN_VALUE | Integer.MIN_VALUE |
>> | Output for out-of-range positive input | 2^31 − 1 | 2^63 − 1 | Long.MAX_VALUE | Integer.MAX_VALUE |
>> | Output for +∞ | 2^31 − 1 | 2^63 − 1 | Long.MAX_VALUE | Integer.MAX_VALUE |
>> | Output for NaN | 2^31 − 1 | 2^63 − 1 | 0 | 0 |
>>
>> A benchmark run with the second, fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
> Moved the code up + comments
fadd_s needs the explicit rounding mode RDN (round down, towards −∞) because for some floats the sum src + 0.5f is not representable as a float, so the result of the addition is itself rounded. With the default rounding mode RNE (round to nearest, ties to even), this produces incorrect results for some inputs:
error: src = 8388609.000000 dst = 8388610 etalon = 8388609
error: src = 8388611.000000 dst = 8388612 etalon = 8388611
error: src = 8388613.000000 dst = 8388614 etalon = 8388613
error: src = 8388615.000000 dst = 8388616 etalon = 8388615
error: src = 8388617.000000 dst = 8388618 etalon = 8388617
error: src = 8388619.000000 dst = 8388620 etalon = 8388619
error: src = 8388621.000000 dst = 8388622 etalon = 8388621
error: src = 8388623.000000 dst = 8388624 etalon = 8388623
error: src = 8388625.000000 dst = 8388626 etalon = 8388625
error: src = 8388627.000000 dst = 8388628 etalon = 8388627
error: src = 8388629.000000 dst = 8388630 etalon = 8388629
error: src = 8388631.000000 dst = 8388632 etalon = 8388631
error: src = 8388633.000000 dst = 8388634 etalon = 8388633
error: src = 8388635.000000 dst = 8388636 etalon = 8388635
error: src = 8388637.000000 dst = 8388638 etalon = 8388637
error: src = 8388639.000000 dst = 8388640 etalon = 8388639
etc.
Let’s consider two of them with RNE for fadd.s:
fadd.s rne (src + 0.5f): src = 8388609.000000 dst = 8388610.000000
fcvt.w.s rdn: src = 8388610.000000 dst = 8388610
RESULT: 8388610 (JAVA Math.round: 8388609)
fadd.s rne (src + 0.5f): src = 8388611.000000 dst = 8388612.000000
fcvt.w.s rdn: src = 8388612.000000 dst = 8388612
RESULT: 8388612 (JAVA Math.round: 8388611)
If RDN is set for fadd.s, then:
fadd.s rdn (src + 0.5f): src = 8388609.000000 dst = 8388609.000000
fcvt.w.s rdn: src = 8388609.000000 dst = 8388609
RESULT: 8388609 (JAVA Math.round: 8388609)
fadd.s rdn (src + 0.5f): src = 8388611.000000 dst = 8388611.000000
fcvt.w.s rdn: src = 8388611.000000 dst = 8388611
RESULT: 8388611 (JAVA Math.round: 8388611)
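
The tie case above can also be reproduced outside the VM. The following is a minimal Java sketch, not the intrinsic code itself. Java float addition always rounds to nearest, ties to even, so `x + 0.5f` behaves like fadd.s with RNE. The class name and the `addRoundDown` helper are hypothetical; the helper emulates an RDN add by computing the sum in double, which is only valid when that double sum is exact (it is for the inputs used here).

```java
final class RoundHalfAddDemo {
    // floor(x + 0.5f) where the float add uses Java's default RNE rounding,
    // mirroring "fadd.s rne" followed by "fcvt.w.s rdn".
    static int naiveRound(float x) {
        return (int) Math.floor(x + 0.5f);
    }

    // Hypothetical helper: emulates a float add rounded towards -infinity.
    // Assumption: (double) a + (double) b is exact, which holds for these inputs.
    static float addRoundDown(float a, float b) {
        double exact = (double) a + (double) b;
        float nearest = (float) exact;           // narrowing rounds to nearest, ties to even
        if ((double) nearest > exact) {
            nearest = Math.nextDown(nearest);    // step to the float just below the exact sum
        }
        return nearest;
    }

    // floor of the RDN sum, mirroring "fadd.s rdn" followed by "fcvt.w.s rdn".
    static int roundWithRdnAdd(float x) {
        return (int) Math.floor(addRoundDown(x, 0.5f));
    }

    public static void main(String[] args) {
        float src = 8388609.0f;                   // 2^23 + 1, float spacing here is 1.0
        System.out.println(naiveRound(src));      // 8388610
        System.out.println(roundWithRdnAdd(src)); // 8388609
        System.out.println(Math.round(src));      // 8388609
    }
}
```

Only odd sources show up in the error list because src + 0.5f lies exactly halfway between src and src + 1 in this range, and ties-to-even picks src itself when src is even.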
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1871616845