RFR: 8343600: RISC-V: enable CRC32 intrinsic when either Zba or RVV are supported

Fei Yang fyang at openjdk.org
Wed Nov 6 03:47:28 UTC 2024


On Tue, 5 Nov 2024 09:44:21 GMT, Hamlin Li <mli at openjdk.org> wrote:

> Hi,
> Can you help to review this patch?
> 
> There are 2 intrinsic optimizations, a scalar one and a vector one:
> * the plain scalar one optimizes cases where the data size is small, and the scalar one with Zba optimizes cases where the data size is big;
> * the vector one optimizes cases where the data size is big (better than the scalar one with Zba).
> 
> With RVV supported, small sizes take the scalar code path and big sizes take the RVV path followed by the scalar path, so RVV by itself is sufficient to enable the intrinsic even without Zba support.
> 
> Thanks!
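
For readers skimming the thread, the dispatch described in the quoted text can be sketched roughly as below. This is only an illustration in Java, not the HotSpot code: the class name, the method names, and the `VECTOR_THRESHOLD` value are all hypothetical, and the real size cut-off is not stated in this thread.

```java
public class Crc32PathSketch {
    // Hypothetical cut-off; the actual threshold is not given in this thread.
    static final int VECTOR_THRESHOLD = 128;

    enum Path { SCALAR, VECTOR_THEN_SCALAR }

    // Proposed condition: either extension alone is enough to enable the intrinsic.
    static boolean intrinsicEnabled(boolean hasZba, boolean hasRvv) {
        return hasZba || hasRvv;
    }

    // With RVV, big inputs take the vector path and then the scalar path;
    // everything else stays on the scalar path.
    static Path choosePath(int length, boolean hasRvv) {
        return (hasRvv && length >= VECTOR_THRESHOLD)
                ? Path.VECTOR_THEN_SCALAR
                : Path.SCALAR;
    }

    public static void main(String[] args) {
        System.out.println(intrinsicEnabled(false, true));
        System.out.println(choosePath(64, true));
        System.out.println(choosePath(65536, true));
    }
}
```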

Just tried this on one of my OoO machines (one that is not equipped with Zba). The performance data still looks fine.
I have also noticed that the B extension is becoming common on new hardware (BPI-F3, HiFive P550, etc.). So I think we can just eliminate this dependence on Zba.


Without intrinsic:
Benchmark                    (count)   Mode  Cnt     Score     Error   Units
TestCRC32.testCRC32Update         64  thrpt   12  1102.980 ±  23.698  ops/ms
TestCRC32.testCRC32Update        128  thrpt   12   997.338 ±  22.022  ops/ms
TestCRC32.testCRC32Update        256  thrpt   12   796.172 ±  17.290  ops/ms
TestCRC32.testCRC32Update        512  thrpt   12   581.080 ±  18.992  ops/ms
TestCRC32.testCRC32Update       2048  thrpt   12   234.958 ±   4.581  ops/ms
TestCRC32.testCRC32Update      16384  thrpt   12    35.167 ±   0.149  ops/ms
TestCRC32.testCRC32Update      65536  thrpt   12     8.893 ±   0.013  ops/ms

With intrinsic:
Benchmark                    (count)   Mode  Cnt     Score     Error   Units
TestCRC32.testCRC32Update         64  thrpt   12  6875.581 ±   4.371  ops/ms
TestCRC32.testCRC32Update        128  thrpt   12  3848.178 ±   6.049  ops/ms
TestCRC32.testCRC32Update        256  thrpt   12  2085.047 ±   5.650  ops/ms
TestCRC32.testCRC32Update        512  thrpt   12  1090.035 ±   1.054  ops/ms
TestCRC32.testCRC32Update       2048  thrpt   12   281.727 ±   0.149  ops/ms
TestCRC32.testCRC32Update      16384  thrpt   12    35.563 ±   0.041  ops/ms
TestCRC32.testCRC32Update      65536  thrpt   12     8.751 ±   0.023  ops/ms
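
To make the two tables above easier to compare, a small sketch like the following computes the throughput ratio (with intrinsic / without intrinsic) per count. The scores are copied from the tables; the class and method names are just for illustration. By this arithmetic the ratio is roughly 6.2x at count 64 and close to 1.0x at counts 16384 and 65536.

```java
public class Crc32Speedup {
    // Values copied from the two benchmark tables (ops/ms).
    static final int[] COUNTS = {64, 128, 256, 512, 2048, 16384, 65536};
    static final double[] WITHOUT_INTRINSIC =
            {1102.980, 997.338, 796.172, 581.080, 234.958, 35.167, 8.893};
    static final double[] WITH_INTRINSIC =
            {6875.581, 3848.178, 2085.047, 1090.035, 281.727, 35.563, 8.751};

    // Throughput ratio for the i-th row; values above 1.0 mean the intrinsic is faster.
    static double speedup(int i) {
        return WITH_INTRINSIC[i] / WITHOUT_INTRINSIC[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < COUNTS.length; i++) {
            System.out.printf("count %6d: %.2fx%n", COUNTS[i], speedup(i));
        }
    }
}
```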

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21897#issuecomment-2458667409

