RFR: 8339738: RISC-V: Vectorize crc32 intrinsic [v2]

Ludovic Henry luhenry at openjdk.org
Tue Sep 10 15:18:10 UTC 2024


On Mon, 9 Sep 2024 11:13:47 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you help to review this patch?
>> Thanks.
>> 
>> This improvement is based on java.base/share/native/libzip/zlib/zcrc32.c, I made some modification to N (to 16) related code, then re-generate the tables needed, finally vectorize the code (original implementation in zcrc32.c is just scalar code).
>> 
>> ## Test
>> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java, 
>> test/jdk/java/util/zip/TestCRC32.java
>> 
>> ## Performance
>> 
>> ### on bananapi
>> 
>> with patch
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark -with patch | (count) | Mode | Cnt | Score | Error | Units
>> -- | -- | -- | -- | -- | -- | --
>> TestCRC32.testCRC32Update | 64 | avgt | 10 | 220.884 | 0.03 | ns/op
>> TestCRC32.testCRC32Update | 128 | avgt | 10 | 401.122 | 0.309 | ns/op
>> TestCRC32.testCRC32Update | 256 | avgt | 10 | 680.168 | 0.032 | ns/op
>> TestCRC32.testCRC32Update | 512 | avgt | 10 | 1062.426 | 0.401 | ns/op
>> TestCRC32.testCRC32Update | 2048 | avgt | 10 | 3308.361 | 0.176 | ns/op
>> TestCRC32.testCRC32Update | 16384 | avgt | 10 | 24403.231 | 20.248 | ns/op
>> TestCRC32.testCRC32Update | 65536 | avgt | 10 | 103463.735 | 4.245 | ns/op
>> 
>> </google-sheets-html-origin>
>> 
>> without patch
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark -without patch | (count) | Mode | Cnt | Score | Error | Units
>> -- | -- | -- | -- | -- | -- | --
>> TestCRC32.testCRC32Update | 64 | avgt | 10 | 220.942 | 0.224 | ns/op
>> TestCRC32.testCRC32Update | 128 | avgt | 10 | 364.159 | 0.019 | ns/op
>> TestCRC32.testCRC32Update | 256 | avgt | 10 | 686.106 | 0.1 | ns/op
>> TestCRC32.testCRC32Update | 512 | avgt | 10 | 1328.962 | 0.073 | ns/op
>> TestCRC32.testCRC32Update | 2048 | avgt | 10 | 5191.116 | 0.189 | ns/op
>> TestCRC32.testCRC32Update | 16384 | avgt | 10 | 41286.858 | 4.53 | ns/op
>> TestCRC32.testCRC32Update | 65536 | avgt | 10 | 172340.099 | 11.004 | ns/op
>> 
>> </goo...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   zext_w

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1469:

> 1467: 
> 1468:     // prepare
> 1469:     add(tableN16, table3, 1*256*sizeof(juint), tmp1);

where is that `1*256` coming from? It would be worth a comment

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1481:

> 1479:     }
> 1480:     vmv_v_x(vcrc, zr);
> 1481:     slli(crc, crc, 32);

You can use zero extend here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20910#discussion_r1752180965
PR Review Comment: https://git.openjdk.org/jdk/pull/20910#discussion_r1752177020


More information about the hotspot-dev mailing list