RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6]

Olga Mikhaltsova omikhaltcova at openjdk.org
Fri Dec 8 23:03:15 UTC 2023


On Tue, 5 Dec 2023 03:33:52 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Replaced tmp with t0
>
> Unfortunately, I witnessed performance regression on sifive unmatched board.
> 
> Before:
> 
> FpRoundingBenchmark.test_ceil                2048  thrpt   15  39.243 ± 0.506  ops/ms
> FpRoundingBenchmark.test_floor               2048  thrpt   15  39.448 ± 0.076  ops/ms
> FpRoundingBenchmark.test_rint                2048  thrpt   15  39.411 ± 0.134  ops/ms
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  31.329 ± 0.085  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  31.328 ± 0.031  ops/ms
> 
> After:
> 
> FpRoundingBenchmark.test_ceil                2048  thrpt   15  39.375 ± 0.125  ops/ms
> FpRoundingBenchmark.test_floor               2048  thrpt   15  39.407 ± 0.076  ops/ms
> FpRoundingBenchmark.test_rint                2048  thrpt   15  39.387 ± 0.235  ops/ms
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  23.940 ± 0.025  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  30.629 ± 0.021  ops/ms

@RealFYang Thanks for pointing out this regression!

Some optimization has been done. Please take a look at the results below!

**VisionFive 2**

Before
Benchmark                              (TESTSIZE)   Mode  Cnt   Score   Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15  39.351 ± 0.150  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  39.323 ± 0.192  ops/ms

After
FpRoundingBenchmark.test_round_double        2048  thrpt   15  36.812 ± 0.171  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  50.179 ± 0.143  ops/ms

**T-Head**

Before
Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.853 ± 0.227  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.889 ± 0.145  ops/ms

After
FpRoundingBenchmark.test_round_double        2048  thrpt   15  119.493 ± 1.591  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  123.546 ± 0.329  ops/ms
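
For reference, the relative change implied by these scores can be worked out with a short sketch like the one below. The class and method names are hypothetical and not part of the PR. The inputs are the Score values from the tables above, and the Error column is ignored, so treat the output as a rough comparison only.

```java
import java.util.Locale;

public class RelativeChange {
    // Percentage change in throughput going from `before` to `after`
    // (both in ops/ms). Positive means faster, negative means slower.
    public static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Labels and scores copied from the tables above (hypothetical layout).
        String[] labels = {
            "VisionFive 2 round_double",
            "VisionFive 2 round_float",
            "T-Head round_double",
            "T-Head round_float",
        };
        double[][] scores = { // {before, after}
            {39.351, 36.812},
            {39.323, 50.179},
            {59.853, 119.493},
            {49.889, 123.546},
        };
        for (int i = 0; i < labels.length; i++) {
            double change = percentChange(scores[i][0], scores[i][1]);
            System.out.printf(Locale.ROOT, "%-28s %+.1f%%%n", labels[i], change);
        }
    }
}
```

By this measure, test_round_double on VisionFive 2 is still slightly slower than before, while the other three cases are faster.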

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1847949976

