RFR: 8321010: RISC-V: C2 RoundVF [v16]

Robbin Ehn rehn at openjdk.org
Fri Sep 13 07:12:11 UTC 2024


On Wed, 28 Aug 2024 10:29:59 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you have a review on this patch to add RoundVF/RoundDF intrinsics?
>> 
>> Current test shows that, it bring performance gain when vlenb >= 32 (which is on bananapi), but bring regression when vlenb == 16 (which is on k230). So I only enable the intrinsic when vlenb >= 32.
>> 
>> For double, even vlenb == 32, there is still some regression, so in this pr I only enable it when vlenb >= 64. Although there is no hardware to verify it, I think from the trend of performance data on bananapi and k230, it's promising to bring better performance rather than regression for double when vlenb == 64+. Please compare the data of `Benchmark   on bananapi, +UseSuperWord` and `Benchmark   on k230, +UseSuperWord, enable RoundVF/D when vlenb == 16`.
>> 
>> Thanks!
>> 
>> ## Tests
>> 
>> test/hotspot/jtreg/compiler/vectorization/TestRoundVectRiscv64.java test/hotspot/jtreg/compiler/c2/cr6340864/TestFloatVect.java test/hotspot/jtreg/compiler/c2/cr6340864/TestDoubleVect.java test/hotspot/jtreg/compiler/floatingpoint/TestRound.java
>> 
>> test/jdk/java/lang/Math/RoundTests.java
>> 
>> ## Performance - with Intrinsic
>> 
>> ### on bananapi
>> Benchmark   on bananapi, +UseSuperWord
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark on bananapi, +UseSuperWord | (TESTSIZE) | Mode | Cnt | Score +intrinsic | Score -intrinsic | Error | Units | Improvement
>> -- | -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 2048 | avgt | 10 | 23794.153 | 20557.467 | 2899.266 | ns/op | 0.864
>> FpRoundingBenchmark.test_round_float | 2048 | avgt | 10 | 11531.853 | 16562.431 | 865.779 | ns/op | 1.436
>> 
>> </google-sheets-html-origin>
>> 
>> ### on k230 (enable intrinsic even when vlenb == 16)
>> Benchmark   on k230, +UseSuperWord, enable RoundVF/D when vlenb == 16
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark on k230, +UseSuperWord, enable RoundVF/D ...
>
> Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits:
> 
>  - comments
>  - Merge branch 'master' into round-F+D-v
>  - minor
>  - minor
>  - minor
>  - add additional tests
>  - enable roundVD when MaxVectorSize >= 64
>  - enable intrinsic when MaxVectorSize >= 32
>  - Merge branch 'master' into round-F+D-v
>  - enable when vlenb >= 32
>  - ... and 15 more: https://git.openjdk.org/jdk/compare/be34730f...c35fcddc

Seems alright, thanks!

-------------

Marked as reviewed by rehn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17745#pullrequestreview-2302283083


More information about the hotspot-compiler-dev mailing list