RFR: 8320500: [vectorapi] RISC-V: Optimize vector math operations with SLEEF [v2]
Erik Joelsson
erikj at openjdk.org
Mon Sep 23 13:14:39 UTC 2024
On Mon, 23 Sep 2024 07:30:59 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> Hi,
>> Can you help to review this patch?
>> Thanks!
>>
>> This patch is based on https://github.com/openjdk/jdk/pull/20781 which added the sleef source (in particular the generated sleef inline headers). We use sleef api to vectorize the math operations in vector api.
>>
>> On machine with vector intrinsic support on riscv (e.g. gcc 14+) it will generate libsleef.so with the bridge functions to sleef api, otherwise without the bridge functions.
>>
>> ### Test
>> test/jdk/jdk/incubator/vector
>>
>> ### Performance
>> data on bananapi
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark - bananapi | (size) | Mode | Cnt | Score +intrinsic | Error +intrinsic | Score -intrinsic | Error -intrinsic | Units | Improvement
>> -- | -- | -- | -- | -- | -- | -- | -- | -- | --
>> Double128Vector.ACOS | 1024 | avgt | 10 | 112444.388 | 655.761 | 208554.742 | 1508.709 | ns/op | 1.855
>> Double128Vector.ASIN | 1024 | avgt | 10 | 104121.259 | 243.167 | 208314.499 | 2833.61 | ns/op | 2.001
>> Double128Vector.ATAN | 1024 | avgt | 10 | 136941.263 | 243.486 | 284024.53 | 2204.224 | ns/op | 2.074
>> Double128Vector.ATAN2 | 1024 | avgt | 10 | 163228.681 | 435.455 | 427589.587 | 3045.192 | ns/op | 2.62
>> Double128Vector.CBRT | 1024 | avgt | 10 | 146395.753 | 239.355 | 317136.654 | 1330.869 | ns/op | 2.166
>> Double128Vector.COS | 1024 | avgt | 10 | 154865.298 | 235.697 | 305721.518 | 1319.313 | ns/op | 1.974
>> Double128Vector.COSH | 1024 | avgt | 10 | 189212.943 | 262.399 | 220756.27 | 61324.863 | ns/op | 1.167
>> Double128Vector.EXP | 1024 | avgt | 10 | 113941.594 | 219.647 | 252853.07 | 891.272 | ns/op | 2.219
>> Double128Vector.EXPM1 | 1024 | avgt | 10 | 184552.939 | 513.715 | 254087.184 | 2144.997 | ns/op | 1.377
>> Double128Vector.HYPOT | 1024 | avgt | 10 | 111580.194 | 423.282 | 374537.338 | 2091.811 | ns/op | 3.357
>> Double128Vector.LOG | 1024 | avgt | 10 | 110680.548 | 192.731 | 265391.129 | 2653.519 | ns/op | 2.398
>> Double128Vector.LOG10 | 1024 | avgt | 10 | 116708.105 | 167.095 | 285764.405 | 2489.08 | ns/op | 2.449
>> Double128Vector.LOG1P | 1024 | avgt | 10 | 115633.302 | 567.7 | 317235.967 | 1062.848 | ns/op | 2.743
>> Double128Vector.POW | 1024 | avgt | 10 | 321655.14 | 3...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> modify cflags style
Build changes look ok.
-------------
Marked as reviewed by erikj (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/21083#pullrequestreview-2322170469
More information about the build-dev
mailing list