RFR: 8317721: RISC-V: Implement CRC32 intrinsic
ArsenyBochkarev
duke at openjdk.org
Mon Dec 11 15:59:16 UTC 2023
On Mon, 11 Dec 2023 01:59:33 GMT, ArsenyBochkarev <duke at openjdk.org> wrote:
> Hi everyone! Please review this port of [AArch64](https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4224) `_updateBytesCRC32`, `_updateByteBufferCRC32` and `_updateCRC32` intrinsics. This patch introduces only the plain (non-vectorized, no Zbc) version.
>
> ### Correctness checks
>
> Tier 1 and tier 2 tests pass.
>
> ### Performance results on T-Head board
>
> #### Results for enabled intrinsic:
>
> The benchmark used is `test/micro/org/openjdk/bench/java/util/TestCRC32.java`.
>
> | Benchmark | (count) | Mode | Cnt | Score | Error | Units |
> | --- | ---- | ----- | --- | ---- | --- | ---- |
> | CRC32.TestCRC32.testCRC32Update | 64 | thrpt | 24 | 3730.929 | 37.773 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 128 | thrpt | 24 | 2126.673 | 2.032 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 256 | thrpt | 24 | 1134.330 | 6.714 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 512 | thrpt | 24 | 584.017 | 2.267 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 2048 | thrpt | 24 | 151.173 | 0.346 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 16384 | thrpt | 24 | 19.113 | 0.008 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 65536 | thrpt | 24 | 4.647 | 0.022 | ops/ms |
>
> #### Results for disabled intrinsic:
>
> | Benchmark | (count) | Mode | Cnt | Score | Error | Units |
> | --------------------------------------------------- | ---------- | --------- | ---- | ----------- | --------- | ---------- |
> | CRC32.TestCRC32.testCRC32Update | 64 | thrpt | 15 | 798.365 | 35.486 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 128 | thrpt | 15 | 677.756 | 46.619 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 256 | thrpt | 15 | 552.781 | 27.143 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 512 | thrpt | 15 | 429.304 | 12.518 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 2048 | thrpt | 15 | 166.738 | 0.935 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 16384 | thrpt | 15 | 25.060 | 0.034 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update | 65536 | thrpt | 15 | 6.196 | 0.030 | ops/ms |
Performance comparison with Zba disabled and enabled on the StarFive VisionFive 2 board:
`-XX:-UseZba`:
| Benchmark | (count) | Mode | Cnt | Score | Error | Units |
| --------------------------------------------------- | ---------- | ------- | ----- | ---------- | -------- | --------- |
| CRC32.TestCRC32.testCRC32Update | 64 | thrpt | 12 | 3563.320 | 3.326 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 128 | thrpt | 12 | 1928.837 | 2.234 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 256 | thrpt | 12 | 1005.273 | 1.953 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 512 | thrpt | 12 | 512.550 | 1.718 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 2048 | thrpt | 12 | 130.396 | 0.341 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 16384 | thrpt | 12 | 16.319 | 0.073 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 65536 | thrpt | 12 | 3.913 | 0.011 | ops/ms |
`-XX:+UseZba`:
| Benchmark | (count) | Mode | Cnt | Score | Error | Units |
| --------------------------------------------------- | ---------- | ------- | -------- | -------- | -------- | ---------- |
| CRC32.TestCRC32.testCRC32Update | 64 | thrpt | 12 | 4206.654 | 0.547 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 128 | thrpt | 12 | 2308.843 | 3.565 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 256 | thrpt | 12 | 1214.727 | 0.305 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 512 | thrpt | 12 | 623.173 | 0.651 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 2048 | thrpt | 12 | 158.965 | 0.376 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 16384 | thrpt | 12 | 19.934 | 0.055 | ops/ms |
| CRC32.TestCRC32.testCRC32Update | 65536 | thrpt | 12 | 4.730 | 0.007 | ops/ms |
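
For readers who have not looked at what `java.util.zip.CRC32` computes, here is a minimal Java sketch of a byte-at-a-time, table-driven CRC-32. It is not the stub code from this patch and does not mirror its structure. The class name `Crc32Sketch` and the method `update` are hypothetical names chosen for illustration. The sketch only shows the arithmetic the intrinsic has to reproduce, and checks it against the JDK's own `CRC32` class.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only: a plain table-driven CRC-32, not the RISC-V stub.
public class Crc32Sketch {
    // Reflected form of the CRC-32 polynomial used by zlib and java.util.zip.CRC32.
    private static final int POLY = 0xEDB88320;
    private static final int[] TABLE = new int[256];

    static {
        // Precompute the CRC of every possible byte value.
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    // Continues a running CRC (0 for the first call) over buf[off .. off+len).
    static long update(long crc, byte[] buf, int off, int len) {
        int c = ~(int) crc;                               // pre-invert
        for (int i = off; i < off + len; i++) {
            c = TABLE[(c ^ buf[i]) & 0xFF] ^ (c >>> 8);   // one table lookup per byte
        }
        return (~c) & 0xFFFFFFFFL;                        // post-invert, as unsigned 32-bit
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        long sketch = update(0L, data, 0, data.length);
        System.out.printf("sketch=%08x reference=%08x%n", sketch, reference.getValue());
    }
}
```

Running `main` prints both values side by side. They should match for any input, including when the buffer is fed in several chunks through repeated `update` calls.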
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17046#issuecomment-1850364667
More information about the hotspot-compiler-dev mailing list