RFR: 8317721: RISC-V: Implement CRC32 intrinsic

ArsenyBochkarev duke at openjdk.org
Mon Dec 11 15:59:16 UTC 2023


On Mon, 11 Dec 2023 01:59:33 GMT, ArsenyBochkarev <duke at openjdk.org> wrote:

> Hi everyone! Please review this port of [AArch64](https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4224) `_updateBytesCRC32`, `_updateByteBufferCRC32` and `_updateCRC32` intrinsics. This patch introduces only the plain (non-vectorized, no Zbc) version.
> 
> ### Correctness checks
> 
> Tier 1 and tier 2 tests pass.
> 
> ### Performance results on T-Head board
> 
> #### Results with the intrinsic enabled:
> 
> The benchmark used is `test/micro/org/openjdk/bench/java/util/TestCRC32.java`.
> 
> | Benchmark                       | (count) | Mode  | Cnt |    Score |  Error | Units  |
> | ------------------------------- | ------- | ----- | --- | -------- | ------ | ------ |
> | CRC32.TestCRC32.testCRC32Update |      64 | thrpt |  24 | 3730.929 | 37.773 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |     128 | thrpt |  24 | 2126.673 |  2.032 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |     256 | thrpt |  24 | 1134.330 |  6.714 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |     512 | thrpt |  24 |  584.017 |  2.267 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |    2048 | thrpt |  24 |  151.173 |  0.346 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |   16384 | thrpt |  24 |   19.113 |  0.008 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |   65536 | thrpt |  24 |    4.647 |  0.022 | ops/ms |
> 
> #### Results with the intrinsic disabled:
> 
> | Benchmark                       | (count) | Mode  | Cnt |   Score |  Error | Units  |
> | ------------------------------- | ------- | ----- | --- | ------- | ------ | ------ |
> | CRC32.TestCRC32.testCRC32Update |      64 | thrpt |  15 | 798.365 | 35.486 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |     128 | thrpt |  15 | 677.756 | 46.619 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |     256 | thrpt |  15 | 552.781 | 27.143 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |     512 | thrpt |  15 | 429.304 | 12.518 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |    2048 | thrpt |  15 | 166.738 |  0.935 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |   16384 | thrpt |  15 |  25.060 |  0.034 | ops/ms |
> | CRC32.TestCRC32.testCRC32Update |   65536 | thrpt |  15 |   6.196 |  0.030 | ops/ms |
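
For reference, the value these intrinsics have to produce can be sketched in plain Java. This is only a byte-at-a-time, table-driven illustration checked against `java.util.zip.CRC32`; it is not the code of the RISC-V stub, and the class and method names (`Crc32Sketch`, `updateBytes`) are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class Crc32Sketch {
    // Reflected CRC-32 polynomial, the one java.util.zip.CRC32 (zlib) uses.
    private static final int POLY = 0xEDB88320;
    private static final int[] TABLE = new int[256];

    static {
        // Precompute the CRC of every possible byte value.
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    // Hypothetical helper: folds len bytes of buf, starting at off, into the
    // running checksum crc (pass 0 for a fresh checksum).
    static int updateBytes(int crc, byte[] buf, int off, int len) {
        int c = ~crc;
        for (int i = off; i < off + len; i++) {
            // One table lookup per input byte.
            c = TABLE[(c ^ buf[i]) & 0xFF] ^ (c >>> 8);
        }
        return ~c;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 ref = new CRC32();
        ref.update(data, 0, data.length);
        int got = updateBytes(0, data, 0, data.length);
        // Both values should be the standard CRC-32 check value, cbf43926.
        System.out.printf("sketch=%08x jdk=%08x%n", got, (int) ref.getValue());
    }
}
```

Comparing against `java.util.zip.CRC32` in this way gives the same answer whether the intrinsic is enabled or disabled, which makes it a convenient sanity check next to the tier tests.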

Performance comparison with Zba disabled and enabled on a StarFive VisionFive 2 board:

`-XX:-UseZba`:
| Benchmark                       | (count) | Mode  | Cnt |    Score | Error | Units  |
| ------------------------------- | ------- | ----- | --- | -------- | ----- | ------ |
| CRC32.TestCRC32.testCRC32Update |      64 | thrpt |  12 | 3563.320 | 3.326 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |     128 | thrpt |  12 | 1928.837 | 2.234 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |     256 | thrpt |  12 | 1005.273 | 1.953 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |     512 | thrpt |  12 |  512.550 | 1.718 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |    2048 | thrpt |  12 |  130.396 | 0.341 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |   16384 | thrpt |  12 |   16.319 | 0.073 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |   65536 | thrpt |  12 |    3.913 | 0.011 | ops/ms |

 
`-XX:+UseZba`:
| Benchmark                       | (count) | Mode  | Cnt |    Score | Error | Units  |
| ------------------------------- | ------- | ----- | --- | -------- | ----- | ------ |
| CRC32.TestCRC32.testCRC32Update |      64 | thrpt |  12 | 4206.654 | 0.547 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |     128 | thrpt |  12 | 2308.843 | 3.565 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |     256 | thrpt |  12 | 1214.727 | 0.305 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |     512 | thrpt |  12 |  623.173 | 0.651 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |    2048 | thrpt |  12 |  158.965 | 0.376 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |   16384 | thrpt |  12 |   19.934 | 0.055 | ops/ms |
| CRC32.TestCRC32.testCRC32Update |   65536 | thrpt |  12 |    4.730 | 0.007 | ops/ms |
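
To read the two Zba tables side by side, the ratio of the `-XX:+UseZba` score to the `-XX:-UseZba` score can be computed per count. The sketch below only copies the scores from the tables above; the class name `ZbaSpeedup` and its helper are made up for illustration, and the JMH error margins are ignored.

```java
public class ZbaSpeedup {
    // (count) values and scores in ops/ms, copied from the two tables above.
    static final int[] COUNT = {64, 128, 256, 512, 2048, 16384, 65536};
    static final double[] NO_ZBA = {3563.320, 1928.837, 1005.273, 512.550, 130.396, 16.319, 3.913};
    static final double[] ZBA = {4206.654, 2308.843, 1214.727, 623.173, 158.965, 19.934, 4.730};

    // Throughput ratio for row i: values above 1.0 mean Zba is faster.
    static double speedup(int i) {
        return ZBA[i] / NO_ZBA[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < COUNT.length; i++) {
            System.out.printf("count=%d speedup=%.2fx%n", COUNT[i], speedup(i));
        }
    }
}
```

With these numbers the ratio stays between roughly 1.18x and 1.22x across all counts.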

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17046#issuecomment-1850364667


More information about the hotspot-compiler-dev mailing list