RFR: 8312569: RISC-V: Missing intrinsics for Math.ceil, floor, rint [v6]
Ilya Gavrilin
duke at openjdk.org
Mon Aug 28 11:01:48 UTC 2023
> Please review these changes, which add RISC-V intrinsics for double rounding.
>
> Intrinsics for rounding doubles with a given mode (Math.ceil/floor/rint) were missing on RISC-V. RISC-V has no dedicated instruction for this conversion, so a two-step conversion is used: double -> long int -> double (using fcvt.l.d, fcvt.d.l).
>
> We also need to supply a rounding mode to each fcvt.x.x instruction.
>
> Rounding mode selection for ceil (floor and rint are handled similarly) follows the Math.ceil requirements:
>
>> Returns the smallest (closest to negative infinity) value that is greater than or equal to the argument and is equal to a mathematical integer (Math.java:475).
>
> For double -> long int we choose rup (round towards +inf) mode to get an integer that is greater than or equal to the input value.
> For long int -> double we choose rdn (round towards -inf) mode to get the smallest (closest to -inf) representation of the integer obtained from the first conversion.
>
> If the input is inf, NaN, or a value greater than 2^63, we return the input value unchanged (a double greater than 2^63 is guaranteed to be an integer).
> In addition, when storing the result we copy the sign from the input value (needed because ceil must return -0.0 for inputs in (-1.0, 0.0)).
>
> We have observed significant improvement on hifive and thead boards.
>
> testing: tier1, tier2 and hotspot:tier3 on hifive
>
> Performance results on hifive (FpRoundingBenchmark.testceil/floor/rint):
>
> Without intrinsic:
>
> Benchmark                      (TESTSIZE)   Mode  Cnt   Score    Error   Units
> FpRoundingBenchmark.testceil         1024  thrpt   25  39.297  ± 0.037  ops/ms
> FpRoundingBenchmark.testfloor        1024  thrpt   25  39.398  ± 0.018  ops/ms
> FpRoundingBenchmark.testrint         1024  thrpt   25  36.388  ± 0.844  ops/ms
>
> With intrinsic:
>
> Benchmark                      (TESTSIZE)   Mode  Cnt   Score    Error   Units
> FpRoundingBenchmark.testceil         1024  thrpt   25  80.560  ± 0.053  ops/ms
> FpRoundingBenchmark.testfloor        1024  thrpt   25  80.541  ± 0.081  ops/ms
> FpRoundingBenchmark.testrint         1024  thrpt   25  80.603  ± 0.071  ops/ms
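
As an illustration of the rounding scheme described in the quoted text (double -> long -> double, pass-through for special values, sign copy), the following Java sketch emulates the idea in software. It is not the code from the pull request: the class and method names are hypothetical, and because Java's (long) cast always truncates toward zero, the rup/rdn rounding modes are emulated here with an explicit adjustment step. The 2^63 threshold is taken from the description above. rint is left out, since emulating round-to-nearest-even this way would need more code than the sketch is worth.

```java
// Hypothetical software emulation of the double -> long -> double scheme.
// Names are illustrative only; this is not the HotSpot intrinsic code.
final class RoundViaLong {

    // Inputs that are passed through unchanged: NaN, infinities, and
    // values whose magnitude does not fit in a signed 64-bit integer.
    private static boolean passThrough(double x) {
        return Double.isNaN(x) || Double.isInfinite(x) || Math.abs(x) >= 0x1p63;
    }

    static double ceilViaLong(double x) {
        if (passThrough(x)) {
            return x;
        }
        long t = (long) x;   // Java truncates toward zero
        if (t < x) {         // emulate round-up: bump positive non-integers
            t++;
        }
        // Copy the input sign so that inputs in (-1.0, 0.0) yield -0.0.
        return Math.copySign((double) t, x);
    }

    static double floorViaLong(double x) {
        if (passThrough(x)) {
            return x;
        }
        long t = (long) x;   // Java truncates toward zero
        if (t > x) {         // emulate round-down: lower negative non-integers
            t--;
        }
        // Copy the input sign so that -0.0 stays -0.0.
        return Math.copySign((double) t, x);
    }
}
```

For example, RoundViaLong.ceilViaLong(-0.5) yields -0.0 rather than 0.0, which is the case the sign copy exists for, and both helpers agree with Math.ceil and Math.floor on ordinary inputs such as 1.2 and -1.2.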
Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision:
Fix whitespaces in c2_MacroAssembler
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/14991/files
- new: https://git.openjdk.org/jdk/pull/14991/files/f6dd7b16..492fb25c
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=14991&range=05
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=14991&range=04-05
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/14991.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14991/head:pull/14991
PR: https://git.openjdk.org/jdk/pull/14991