RFR: 8339738: RISC-V: Vectorize crc32 intrinsic [v4]
Hamlin Li
mli at openjdk.org
Wed Sep 11 07:07:38 UTC 2024
> Hi,
> Can you help to review this patch?
> Thanks.
>
> This improvement is based on java.base/share/native/libzip/zlib/zcrc32.c. I modified the N-related code (setting N to 16), regenerated the required tables, and then vectorized the code (the original implementation in zcrc32.c is scalar only).
>
> ## Test
> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java,
> test/jdk/java/util/zip/TestCRC32.java
>
> ## Performance
>
> ### on bananapi
>
> with patch
>
> Benchmark (with patch) | (count) | Mode | Cnt | Score | Error | Units
> -- | -- | -- | -- | -- | -- | --
> TestCRC32.testCRC32Update | 64 | avgt | 10 | 220.884 | 0.03 | ns/op
> TestCRC32.testCRC32Update | 128 | avgt | 10 | 401.122 | 0.309 | ns/op
> TestCRC32.testCRC32Update | 256 | avgt | 10 | 680.168 | 0.032 | ns/op
> TestCRC32.testCRC32Update | 512 | avgt | 10 | 1062.426 | 0.401 | ns/op
> TestCRC32.testCRC32Update | 2048 | avgt | 10 | 3308.361 | 0.176 | ns/op
> TestCRC32.testCRC32Update | 16384 | avgt | 10 | 24403.231 | 20.248 | ns/op
> TestCRC32.testCRC32Update | 65536 | avgt | 10 | 103463.735 | 4.245 | ns/op
>
> without patch
>
> Benchmark (without patch) | (count) | Mode | Cnt | Score | Error | Units
> -- | -- | -- | -- | -- | -- | --
> TestCRC32.testCRC32Update | 64 | avgt | 10 | 220.942 | 0.224 | ns/op
> TestCRC32.testCRC32Update | 128 | avgt | 10 | 364.159 | 0.019 | ns/op
> TestCRC32.testCRC32Update | 256 | avgt | 10 | 686.106 | 0.1 | ns/op
> TestCRC32.testCRC32Update | 512 | avgt | 10 | 1328.962 | 0.073 | ns/op
> TestCRC32.testCRC32Update | 2048 | avgt | 10 | 5191.116 | 0.189 | ns/op
> TestCRC32.testCRC32Update | 16384 | avgt | 10 | 41286.858 | 4.53 | ns/op
> TestCRC32.testCRC32Update | 65536 | avgt | 10 | 172340.099 | 11.004 | ns/op
>
> ### on K230
>
> with patch
> <google-sheets-html-origin style=...
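
For readers unfamiliar with the table-driven scheme that the quoted description builds on, here is a minimal scalar sketch in Java. It is not the code from the patch: the patch works on the braided variant in zcrc32.c with N set to 16 and vectorizes it for RISC-V, whereas this sketch only shows the basic idea of generating a lookup table from the reflected CRC-32 polynomial and consuming one byte per step. The class and method names (`Crc32Sketch`, `makeTable`, `crc32`) are hypothetical and chosen for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only: a byte-at-a-time, table-driven CRC-32.
// The names below are assumptions for this sketch, not identifiers from the patch.
class Crc32Sketch {
    // Reflected CRC-32 polynomial, as used by zlib and java.util.zip.CRC32.
    private static final int POLY = 0xEDB88320;
    private static final int[] TABLE = makeTable();

    // Builds the 256-entry table: entry i is the CRC of the single byte i.
    static int[] makeTable() {
        int[] table = new int[256];
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            table[i] = c;
        }
        return table;
    }

    // Computes the CRC-32 of data, one table lookup per input byte.
    static long crc32(byte[] data) {
        int crc = 0xFFFFFFFF;                      // initial value: all ones
        for (byte b : data) {
            crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
        }
        return (~crc) & 0xFFFFFFFFL;               // final inversion, as unsigned 32-bit
    }

    public static void main(String[] args) {
        byte[] input = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();
        reference.update(input);
        System.out.println(Long.toHexString(crc32(input)));        // prints "cbf43926"
        System.out.println(crc32(input) == reference.getValue());  // prints "true"
    }
}
```

Comparing against `java.util.zip.CRC32` in `main` is a quick sanity check of the same kind the listed jtreg tests perform far more thoroughly for the intrinsic.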
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
fix perf regression
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/20910/files
- new: https://git.openjdk.org/jdk/pull/20910/files/e856e6ad..2420f770
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=20910&range=03
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=20910&range=02-03
Stats: 56 lines in 1 file changed: 25 ins; 22 del; 9 mod
Patch: https://git.openjdk.org/jdk/pull/20910.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/20910/head:pull/20910
PR: https://git.openjdk.org/jdk/pull/20910