RFR: 8312569: RISC-V: Missing intrinsics for Math.ceil, floor, rint

Wed Aug 2 20:48:51 UTC 2023

On Wed, 2 Aug 2023 13:32:36 GMT, Feilong Jiang <fjiang at openjdk.org> wrote:

>> Please review this change, which adds double rounding intrinsics for RISC-V.
>> 
>> On RISC-V, intrinsics for rounding doubles with a given mode (Math.ceil/floor/rint) were missing. RISC-V has no dedicated instruction for this conversion, so a two-step conversion is used: double -> long int -> double (using fcvt.l.d, fcvt.d.l).
>> 
>> We also need to pass a rounding mode to the fcvt.x.x instructions.
>> 
>> Rounding mode selection for ceil (floor and rint are analogous), according to the Math.ceil requirements:
>> 
>>> Returns the smallest (closest to negative infinity) value that is greater than or equal to the argument and is equal to a mathematical integer (Math.java:475).
>> 
>> For double -> long int we choose rup (round towards +inf) mode to get an integer that is greater than or equal to the input value.
>> For long int -> double we choose rdn (round towards -inf) mode to get the smallest (closest to -inf) representation of the integer obtained from the first conversion.
>> 
>> When the input is inf, NaN, or a value greater than 2^63, we return the input value unchanged (a double greater than 2^63 is guaranteed to be an integer).
>> Also, when storing the result we copy the sign from the input value (needed because for inputs in (-1.0, 0.0) ceil must return -0.0).
>> 
>> We have observed a significant improvement on hifive and thead boards.
>> 
>> testing: tier1, tier2 and hotspot:tier3 on hifive
>> 
>> Performance results on hifive (FpRoundingBenchmark.testceil/floor/rint):
>> 
>> Without intrinsic:
>> 
>> Benchmark                      (TESTSIZE)   Mode  Cnt   Score   Error   Units
>> FpRoundingBenchmark.testceil         1024  thrpt   25  39.297 ± 0.037  ops/ms
>> FpRoundingBenchmark.testfloor        1024  thrpt   25  39.398 ± 0.018  ops/ms
>> FpRoundingBenchmark.testrint         1024  thrpt   25  36.388 ± 0.844  ops/ms
>> 
>> With intrinsic:
>> 
>> Benchmark                      (TESTSIZE)   Mode  Cnt   Score   Error   Units
>> FpRoundingBenchmark.testceil         1024  thrpt   25  80.560 ± 0.053  ops/ms
>> FpRoundingBenchmark.testfloor        1024  thrpt   25  80.541 ± 0.081  ops/ms
>> FpRoundingBenchmark.testrint         1024  thrpt   25  80.603 ± 0.071  ops/ms
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4286:
> 
>> 4284:   slli(mask, mask, 63);
>> 4285:   // conversion from double to long
>> 4286:   fcvt_l_d(converted_dbl, src, rm_direct);
> 
> How about using `fclass` [1] to check for the special cases of the input, so that we only do `fcvt.l.d` and `fcvt.d.l` for normal inputs? We can check the result of `fclass`: if the input is NaN/infinity/+0/-0, we could return the value without conversion.
> 
> 1. https://github.com/riscv/riscv-isa-manual/blob/3a6edf7ebf6af9e6ad92ace865c0069090870c20/src/f-st-ext.adoc?plain=1#L487-L500

Hi, thanks for your review. We can indeed use `fclass` to check for NaN/+(-)INF/+(-)0.0, but we still need to check whether the value exceeds `2^63 - 1` (for a positive input value) or `-2^63` (for a negative one). So we would have to keep the check of the converted value and add a branch on the result of `fclass`, which means an additional branch for regular values.
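
To make the range check concrete, here is a minimal Java sketch of the double -> long -> double approach for ceil. It is only an illustration, not the code in macroAssembler_riscv.cpp: the class and method names are hypothetical, the rup conversion is emulated with a truncating cast plus a correction, and the range check is simplified to a symmetric `|x| >= 2^63` test (an assumption on my side, slightly more conservative than the bounds above).

```java
// Hypothetical illustration only; not the HotSpot intrinsic itself.
final class CeilSketch {
    // 2^63 as a double. Finite doubles at or beyond this magnitude are
    // already integers and do not fit into a long.
    private static final double TWO_POW_63 = 0x1.0p63;

    static double ceilViaLong(double x) {
        // Special cases: NaN, infinities and out-of-range values are returned as is.
        if (Double.isNaN(x) || Double.isInfinite(x) || Math.abs(x) >= TWO_POW_63) {
            return x;
        }
        // Java's cast truncates toward zero, so emulate "round towards +inf"
        // by adding one when truncation moved the value down.
        long truncated = (long) x;
        long roundedUp = (truncated < x) ? truncated + 1 : truncated;
        // Copy the sign from the input so that inputs in (-1.0, 0.0) and -0.0 give -0.0.
        return Math.copySign((double) roundedUp, x);
    }

    public static void main(String[] args) {
        double[] samples = {1.5, -1.5, -0.5, -0.0, 0.0, 0x1.0p62, 0x1.0p63,
                            Double.NaN, Double.NEGATIVE_INFINITY};
        for (double v : samples) {
            System.out.println(v + " -> " + ceilViaLong(v)
                    + " (Math.ceil: " + Math.ceil(v) + ")");
        }
    }
}
```

The sign copy at the end is what keeps the -0.0 results correct, and the early return covers the inputs that `fclass` would catch as well as the out-of-range ones that it would not.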

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14991#discussion_r1282402592