RFR: 8320500: [vectorapi] RISC-V: Optimize vector math operations with SLEEF [v4]

Fei Yang fyang at openjdk.org
Tue Sep 24 03:54:48 UTC 2024


On Mon, 23 Sep 2024 19:32:10 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you help to review this patch?
>> Thanks!
>> 
>> This patch is based on https://github.com/openjdk/jdk/pull/20781 which added the sleef source (in particular the generated sleef inline headers). We use sleef api to vectorize the math operations in vector api.
>> 
>> On machine with vector intrinsic support on riscv (e.g. gcc 14+) it will generate libsleef.so with the bridge functions to sleef api, otherwise without the bridge functions.
>> 
>> ### Test
>> test/jdk/jdk/incubator/vector
>> 
>> ### Performance
>> data on bananapi
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark - bananapi | (size) | Mode | Cnt | Score +intrinsic | Error +intrinsic | Score -intrinsic | Error -intrinsic | Units | Improvement
>> -- | -- | -- | -- | -- | -- | -- | -- | -- | --
>> Double128Vector.ACOS | 1024 | avgt | 10 | 112444.388 | 655.761 | 208554.742 | 1508.709 | ns/op | 1.855
>> Double128Vector.ASIN | 1024 | avgt | 10 | 104121.259 | 243.167 | 208314.499 | 2833.61 | ns/op | 2.001
>> Double128Vector.ATAN | 1024 | avgt | 10 | 136941.263 | 243.486 | 284024.53 | 2204.224 | ns/op | 2.074
>> Double128Vector.ATAN2 | 1024 | avgt | 10 | 163228.681 | 435.455 | 427589.587 | 3045.192 | ns/op | 2.62
>> Double128Vector.CBRT | 1024 | avgt | 10 | 146395.753 | 239.355 | 317136.654 | 1330.869 | ns/op | 2.166
>> Double128Vector.COS | 1024 | avgt | 10 | 154865.298 | 235.697 | 305721.518 | 1319.313 | ns/op | 1.974
>> Double128Vector.COSH | 1024 | avgt | 10 | 189212.943 | 262.399 | 220756.27 | 61324.863 | ns/op | 1.167
>> Double128Vector.EXP | 1024 | avgt | 10 | 113941.594 | 219.647 | 252853.07 | 891.272 | ns/op | 2.219
>> Double128Vector.EXPM1 | 1024 | avgt | 10 | 184552.939 | 513.715 | 254087.184 | 2144.997 | ns/op | 1.377
>> Double128Vector.HYPOT | 1024 | avgt | 10 | 111580.194 | 423.282 | 374537.338 | 2091.811 | ns/op | 3.357
>> Double128Vector.LOG | 1024 | avgt | 10 | 110680.548 | 192.731 | 265391.129 | 2653.519 | ns/op | 2.398
>> Double128Vector.LOG10 | 1024 | avgt | 10 | 116708.105 | 167.095 | 285764.405 | 2489.08 | ns/op | 2.449
>> Double128Vector.LOG1P | 1024 | avgt | 10 | 115633.302 | 567.7 | 317235.967 | 1062.848 | ns/op | 2.743
>> Double128Vector.POW | 1024 | avgt | 10 | 321655.14 | 3...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   refine comment

Hi, I have several comments after a cursory look. Please consider.

src/hotspot/cpu/riscv/assembler_riscv.hpp line 51:

> 49:     n_int_register_parameters_c   = 8,   // x10, x11, ... x17 (c_rarg0, c_rarg1, ...)
> 50:     n_float_register_parameters_c = 8,   // f10, f11, ... f17 (c_farg0, c_farg1, ... )
> 51:     n_vector_register_parameters_c = 8,  // v8, v9, ... v15

I know vector registers are not used for passing arguments or return values by the RISCV ABI for now. But I guess it might be better and consistent to align with the numbering of integer and floating-point argument registers? That is v10 - v17.

src/hotspot/cpu/riscv/riscv.ad line 10078:

> 10076:   match(CallLeafVector);
> 10077: 
> 10078:   effect(USE meth);

It's possible for the runtime call to clobber rFlagsReg `cr` (aka `t1`). So safer to add `KILL cr` to the effect.

src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp line 669:

> 667:                                              uint num_bits,
> 668:                                              uint total_args_passed) {
> 669:   // More than 8 argument inputs are not supported now.

Maybe: `// More than 8 argument inputs are not supported for now.`

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 6063:

> 6061: 
> 6062:   void generate_vector_math_stubs() {
> 6063:     if (UseRVV) {

Seems to me cleaner to do this:

if (UseRVV) {
  generate_vector_math_stubs();
}

-------------

PR Review: https://git.openjdk.org/jdk/pull/21083#pullrequestreview-2323927673
PR Review Comment: https://git.openjdk.org/jdk/pull/21083#discussion_r1772520081
PR Review Comment: https://git.openjdk.org/jdk/pull/21083#discussion_r1772527038
PR Review Comment: https://git.openjdk.org/jdk/pull/21083#discussion_r1772521301
PR Review Comment: https://git.openjdk.org/jdk/pull/21083#discussion_r1772522783


More information about the build-dev mailing list