RFR: 8335191: RISC-V: verify perf of chacha20
Hamlin Li
mli at openjdk.org
Tue Jul 23 11:27:00 UTC 2024
Hi,
Can you help to review this simple patch?
Previously, we implemented the chacha20 intrinsic using vector instructions. The latest tests on real hardware (k230, bananapi) show that the implementation brings a performance gain only when vlenb == 32 (on bananapi); when vlenb == 16 (on k230) it brings a regression in all test cases.
So we should adjust when the intrinsic is turned on, i.e. enable it only when vlenb == 32.
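As a rough illustration of the intended rule (not the actual patch, which lives in the HotSpot C++ sources and may be written differently), the decision can be sketched as below. The function name and the `has_rvv` parameter are hypothetical and only stand in for "the vector extension is available".

```python
def chacha20_intrinsic_default(has_rvv: bool, vlenb: int) -> bool:
    """Hypothetical helper: should the chacha20 intrinsic be on by default?

    Assumption: the intrinsic needs the vector extension at all, and,
    per the measurements in this mail, only pays off when vlenb == 32.
    """
    return has_rvv and vlenb == 32


if __name__ == "__main__":
    # k230 reports vlenb == 16, bananapi reports vlenb == 32.
    for board, vlenb in (("k230", 16), ("bananapi", 32)):
        enabled = chacha20_intrinsic_default(True, vlenb)
        print(f"{board}: vlenb={vlenb}, intrinsic enabled by default: {enabled}")
```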
Thanks
## Performance
### on k230
vlenb == 16
Benchmark - on k230, vlenb == 16 | (dataSize) | (keyLength) | (mode) | (padding) | (permutation) | Cnt | Score -intrinsic | Score +intrinsic | Error | Units | Non-intrinsic/intrinsic
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4642.694 | 5216.699 | 36.039 | ns/op | 0.89
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15719.612 | 17622.616 | 136.609 | ns/op | 0.892
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 59402.742 | 67124.28 | 651.011 | ns/op | 0.885
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 250056.475 | 269184.924 | 8591.727 | ns/op | 0.929
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4752.081 | 5131.094 | 38.917 | ns/op | 0.926
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15554.484 | 16992.339 | 106.583 | ns/op | 0.915
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 61446.365 | 67359.67 | 548.353 | ns/op | 0.912
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 241653.654 | 270189.531 | 3705.045 | ns/op | 0.894
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 17833.825 | 20610.118 | 688.668 | ns/op | 0.865
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 32872.633 | 36427.148 | 4339.823 | ns/op | 0.902
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 87398.821 | 96112.498 | 1028.342 | ns/op | 0.909
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 314533.305 | 342115.144 | 13633.382 | ns/op | 0.919
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 12190.039 | 14844.154 | 111.009 | ns/op | 0.821
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 25734.516 | 30267.139 | 326.158 | ns/op | 0.85
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 81007.764 | 90623.578 | 572.987 | ns/op | 0.894
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 308229.077 | 343146.562 | 18801.368 | ns/op | 0.898
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 321267.148 | 340960.217 | 22253.659 | ns/op | 0.942
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 307476.57 | 341029.841 | 13851.386 | ns/op | 0.902
### on bananapi
vlenb == 32
Benchmark - on bananapi, vlenb == 32 | (dataSize) | (keyLength) | (mode) | (padding) | (permutation) | Cnt | Score +intrinsic | Score -intrinsic | Error | Units | Improvement (-intrinsic/+intrinsic)
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4804.517 | 4154.869 | 2.951 | ns/op | 0.865
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 10782.788 | 14604.89 | 19.031 | ns/op | 1.354
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 39502.457 | 57211.53 | 69.436 | ns/op | 1.448
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 166005.925 | 228615.833 | 22.311 | ns/op | 1.377
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 5040.652 | 4389.007 | 60.197 | ns/op | 0.871
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 11176.787 | 14530.768 | 12.192 | ns/op | 1.3
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 40875.87 | 56149.493 | 111.238 | ns/op | 1.374
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 166459.572 | 221221.334 | 1078.792 | ns/op | 1.329
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 17781.57 | 14356.974 | 38.96 | ns/op | 0.807
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 26098.932 | 27368.785 | 52.171 | ns/op | 1.049
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 67351.38 | 82535.832 | 111.414 | ns/op | 1.225
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 235767.096 | 295121.502 | 1443.64 | ns/op | 1.252
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 13634.202 | 10476.916 | 21.069 | ns/op | 0.768
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 22209.959 | 24513.545 | 23.072 | ns/op | 1.104
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 62540.238 | 78088.592 | 54.63 | ns/op | 1.249
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 225358.667 | 293718.246 | 314.449 | ns/op | 1.303
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 237810.351 | 295495.242 | 412.976 | ns/op | 1.243
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 230771.689 | 290751.264 | 315.883 | ns/op | 1.26
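For readers who want to check the last column of the tables, it appears to be the score without the intrinsic divided by the score with it, so values above 1 mean the intrinsic is faster (scores are ns/op, lower is better). A minimal sketch of that arithmetic follows, using the ChaCha20.decrypt rows from the bananapi table; the function and variable names are illustrative only.

```python
def ratio(score_without: float, score_with: float) -> float:
    """Time without the intrinsic divided by time with it (both in ns/op).

    A value above 1.0 means the intrinsic run was faster.
    """
    return score_without / score_with


# (dataSize, Score +intrinsic, Score -intrinsic), copied from the
# ChaCha20.decrypt rows of the bananapi (vlenb == 32) table.
bananapi_decrypt = [
    (256, 4804.517, 4154.869),
    (1024, 10782.788, 14604.89),
    (4096, 39502.457, 57211.53),
    (16384, 166005.925, 228615.833),
]

if __name__ == "__main__":
    for size, with_intrinsic, without_intrinsic in bananapi_decrypt:
        r = ratio(without_intrinsic, with_intrinsic)
        verdict = "gain" if r > 1.0 else "regression"
        print(f"{size:>6} bytes: {r:.3f} ({verdict})")
```

In these rows the ratio is below 1 for the 256-byte input and above 1 for the larger sizes, which matches the table.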
-------------
Commit messages:
- Initial commit
Changes: https://git.openjdk.org/jdk/pull/20298/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20298&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8335191
Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/20298.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/20298/head:pull/20298
PR: https://git.openjdk.org/jdk/pull/20298