RFR: 8343600: RISC-V: enable CRC32 intrinsic when either Zba or RVV are supported
Fei Yang
fyang at openjdk.org
Wed Nov 6 03:47:28 UTC 2024
On Tue, 5 Nov 2024 09:44:21 GMT, Hamlin Li <mli at openjdk.org> wrote:
> Hi,
> Can you help to review this patch?
>
> There are two intrinsic implementations, a scalar one and a vector one:
> * the plain scalar one optimizes the case where the data size is small, and the scalar one with Zba optimizes the case where the data size is big;
> * the vector one optimizes the case where the data size is big (and does better than the scalar one with Zba).
>
> With RVV supported, small sizes take the scalar code path, and big sizes take the RVV path and then the scalar path, so RVV by itself is sufficient to enable the intrinsic even when Zba is not supported.
>
> Thanks!
Just tried this on one of my OoO machines (one that is not equipped with Zba). The performance data still looks fine.
I have also seen the B extension becoming popular on new hardware (BPI-F3, HiFive P550, etc.), so I think we can just eliminate this dependence on Zba.
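For reference, the path selection described in the quoted text can be sketched roughly as follows. This is only an illustration in plain Java, not the HotSpot code: the class name, the method names, and the SMALL_LIMIT cutoff are all hypothetical, and the fallback for big inputs when neither extension is present is an assumption.

```java
// Sketch only: names and the size cutoff are hypothetical, not taken from the patch.
class Crc32PathSketch {
    enum Path { PLAIN_SCALAR, SCALAR_ZBA, RVV_THEN_SCALAR }

    // Hypothetical cutoff between "small" and "big" inputs; the thread does not state one.
    static final int SMALL_LIMIT = 128;

    // Condition from the RFR title: enable the intrinsic when either extension is present.
    static boolean intrinsicEnabled(boolean hasZba, boolean hasRvv) {
        return hasZba || hasRvv;
    }

    // Path choice as described in the quoted text.
    static Path choosePath(int len, boolean hasZba, boolean hasRvv) {
        if (len < SMALL_LIMIT) {
            return Path.PLAIN_SCALAR;     // small data: plain scalar path
        }
        if (hasRvv) {
            return Path.RVV_THEN_SCALAR;  // big data: RVV path, then scalar path
        }
        if (hasZba) {
            return Path.SCALAR_ZBA;       // big data without RVV: scalar path with Zba
        }
        return Path.PLAIN_SCALAR;         // assumption: fall back to plain scalar
    }

    public static void main(String[] args) {
        System.out.println(intrinsicEnabled(false, true));
        System.out.println(choosePath(64, false, true));
        System.out.println(choosePath(65536, false, true));
        System.out.println(choosePath(65536, true, false));
    }
}
```

Dropping the Zba dependence entirely, as suggested above, would correspond to intrinsicEnabled always returning true in this sketch.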
Without intrinsic:
Benchmark                  (count)   Mode  Cnt     Score    Error  Units
TestCRC32.testCRC32Update       64  thrpt   12  1102.980 ± 23.698  ops/ms
TestCRC32.testCRC32Update      128  thrpt   12   997.338 ± 22.022  ops/ms
TestCRC32.testCRC32Update      256  thrpt   12   796.172 ± 17.290  ops/ms
TestCRC32.testCRC32Update      512  thrpt   12   581.080 ± 18.992  ops/ms
TestCRC32.testCRC32Update     2048  thrpt   12   234.958 ±  4.581  ops/ms
TestCRC32.testCRC32Update    16384  thrpt   12    35.167 ±  0.149  ops/ms
TestCRC32.testCRC32Update    65536  thrpt   12     8.893 ±  0.013  ops/ms
With intrinsic:
Benchmark                  (count)   Mode  Cnt     Score   Error  Units
TestCRC32.testCRC32Update       64  thrpt   12  6875.581 ± 4.371  ops/ms
TestCRC32.testCRC32Update      128  thrpt   12  3848.178 ± 6.049  ops/ms
TestCRC32.testCRC32Update      256  thrpt   12  2085.047 ± 5.650  ops/ms
TestCRC32.testCRC32Update      512  thrpt   12  1090.035 ± 1.054  ops/ms
TestCRC32.testCRC32Update     2048  thrpt   12   281.727 ± 0.149  ops/ms
TestCRC32.testCRC32Update    16384  thrpt   12    35.563 ± 0.041  ops/ms
TestCRC32.testCRC32Update    65536  thrpt   12     8.751 ± 0.023  ops/ms
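To make the comparison easier to read, the ratio of the two score columns can be computed with a small helper like the one below. The scores are copied from the two runs above; the class and method names are made up for this illustration. The ratio is about 6.2x at count 64, shrinks as count grows, and is roughly at parity from 16384 upward.

```java
// Sketch: speedup of the "with intrinsic" run over the "without intrinsic" run.
// The numbers are the JMH scores (ops/ms) quoted above; the names are hypothetical.
class Crc32Speedup {
    static final int[] COUNTS = {64, 128, 256, 512, 2048, 16384, 65536};
    static final double[] WITHOUT_INTRINSIC =
        {1102.980, 997.338, 796.172, 581.080, 234.958, 35.167, 8.893};
    static final double[] WITH_INTRINSIC =
        {6875.581, 3848.178, 2085.047, 1090.035, 281.727, 35.563, 8.751};

    // Throughput ratio for the i-th count; values above 1.0 mean the intrinsic is faster.
    static double speedup(int i) {
        return WITH_INTRINSIC[i] / WITHOUT_INTRINSIC[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < COUNTS.length; i++) {
            System.out.printf("count=%6d  speedup=%.2fx%n", COUNTS[i], speedup(i));
        }
    }
}
```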
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21897#issuecomment-2458667409