RFR: 8317720: RISC-V: Implement Adler32 intrinsic [v11]
ArsenyBochkarev
duke at openjdk.org
Wed Jun 26 11:19:18 UTC 2024
On Wed, 19 Jun 2024 03:37:20 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> Sounds good, thanks! As far as I can see, it is safe to do it for any `MaxVectorSize`, so the code will be even simpler. Please correct me if I'm wrong.
>
> Yeah, it will be simpler if you use it for any MaxVectorSize. But to my surprise, it brings a negative performance impact when I test on my BananaPi-F3 (MaxVectorSize = 32). I am not sure whether it's an issue across the boards. Maybe you can give it a try on your hardware to see?
Hi! Sorry for such a late reply. Here are the results on Kendryte K230 for both options.
Without the condition on `MaxVectorSize`:
Benchmark                              (count)   Mode  Cnt     Score   Error   Units
Adler32.TestAdler32.testAdler32Update       64  thrpt   25  7253.108 ± 7.362  ops/ms
Adler32.TestAdler32.testAdler32Update      128  thrpt   25  5849.102 ± 9.263  ops/ms
Adler32.TestAdler32.testAdler32Update      256  thrpt   25  4220.237 ± 4.695  ops/ms
Adler32.TestAdler32.testAdler32Update      512  thrpt   25  2717.498 ± 3.531  ops/ms
Adler32.TestAdler32.testAdler32Update     1024  thrpt   25  1585.664 ± 2.455  ops/ms
Adler32.TestAdler32.testAdler32Update     2048  thrpt   25   865.849 ± 0.717  ops/ms
Adler32.TestAdler32.testAdler32Update     5012  thrpt   25   372.534 ± 0.293  ops/ms
Adler32.TestAdler32.testAdler32Update     8192  thrpt   25   231.532 ± 0.306  ops/ms
Adler32.TestAdler32.testAdler32Update    16384  thrpt   25   117.180 ± 0.157  ops/ms
Adler32.TestAdler32.testAdler32Update    32768  thrpt   25    55.013 ± 0.152  ops/ms
Adler32.TestAdler32.testAdler32Update    65536  thrpt   25    25.604 ± 0.126  ops/ms
And with a different `vsetvli` depending on `MaxVectorSize`:
Benchmark                              (count)   Mode  Cnt     Score   Error   Units
Adler32.TestAdler32.testAdler32Update       64  thrpt   25  7239.485 ± 8.705  ops/ms
Adler32.TestAdler32.testAdler32Update      128  thrpt   25  5836.018 ± 9.489  ops/ms
Adler32.TestAdler32.testAdler32Update      256  thrpt   25  4212.986 ± 4.596  ops/ms
Adler32.TestAdler32.testAdler32Update      512  thrpt   25  2712.742 ± 3.114  ops/ms
Adler32.TestAdler32.testAdler32Update     1024  thrpt   25  1583.161 ± 2.374  ops/ms
Adler32.TestAdler32.testAdler32Update     2048  thrpt   25   864.321 ± 0.870  ops/ms
Adler32.TestAdler32.testAdler32Update     5012  thrpt   25   371.964 ± 0.463  ops/ms
Adler32.TestAdler32.testAdler32Update     8192  thrpt   25   231.092 ± 0.328  ops/ms
Adler32.TestAdler32.testAdler32Update    16384  thrpt   25   116.995 ± 0.189  ops/ms
Adler32.TestAdler32.testAdler32Update    32768  thrpt   25    54.923 ± 0.075  ops/ms
Adler32.TestAdler32.testAdler32Update    65536  thrpt   25    24.864 ± 0.618  ops/ms
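To compare the two runs row by row, a small sketch like the one below could be used. It is not part of the PR: the class name `Adler32ScoreDiff` and the method `percentChange` are hypothetical, and the scores are copied from the two tables above. It only prints the relative change of the `MaxVectorSize`-dependent variant against the unconditional one, and it does not account for the reported error margins.

```java
public class Adler32ScoreDiff {
    // Relative change of `candidate` against `baseline`, in percent.
    // A negative value means the candidate has lower throughput.
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Values of the (count) benchmark parameter, as listed in the tables.
        int[] counts = {64, 128, 256, 512, 1024, 2048, 5012, 8192, 16384, 32768, 65536};

        // Scores in ops/ms without the condition on MaxVectorSize.
        double[] unconditional = {
            7253.108, 5849.102, 4220.237, 2717.498, 1585.664, 865.849,
            372.534, 231.532, 117.180, 55.013, 25.604
        };

        // Scores in ops/ms with a different vsetvli depending on MaxVectorSize.
        double[] conditional = {
            7239.485, 5836.018, 4212.986, 2712.742, 1583.161, 864.321,
            371.964, 231.092, 116.995, 54.923, 24.864
        };

        for (int i = 0; i < counts.length; i++) {
            System.out.printf("%6d  %+.2f%%%n",
                    counts[i], percentChange(unconditional[i], conditional[i]));
        }
    }
}
```

Whether a given difference is meaningful still has to be judged against the Error column of each row.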
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18382#discussion_r1654623817