RFR: 8335191: RISC-V: verify perf of chacha20
Fei Yang
fyang at openjdk.org
Thu Jul 25 04:11:32 UTC 2024
On Tue, 23 Jul 2024 11:21:31 GMT, Hamlin Li <mli at openjdk.org> wrote:
> Hi,
> Can you help to review this simple patch?
>
> Previously, we implemented the ChaCha20 intrinsic using vector instructions. The latest tests on real hardware (k230, bananapi) show that the implementation brings a performance gain only when vlenb == 32 (on bananapi); when vlenb == 16 (on k230), it regresses in all test cases.
> So we should adjust when the intrinsic is turned on, i.e., enable it only when vlenb == 32.
>
> Thanks
>
>
> ## Performance
>
> ### on k230
> vlenb == 16
>
> Benchmark - on k230, vlenb == 16 | (dataSize) | (keyLength) | (mode) | (padding) | (permutation) | Cnt | Score -no-intrinsic | Score +intrinsic | Error | Units | Non-intrinsic/intrinsic
> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4642.694 | 5216.699 | 36.039 | ns/op | 0.89
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15719.612 | 17622.616 | 136.609 | ns/op | 0.892
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 59402.742 | 67124.28 | 651.011 | ns/op | 0.885
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 250056.475 | 269184.924 | 8591.727 | ns/op | 0.929
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4752.081 | 5131.094 | 38.917 | ns/op | 0.926
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15554.484 | 16992.339 | 106.583 | ns/op | 0.915
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 61446.365 | 67359.67 | 548.353 | ns/op | 0.912
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 241653.654 | 270189.531 | 3705.045 | ns/op | 0.894
> o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 17833.825 | 20610.118 | 688.668 | ns/op | 0.865
> o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1...
Thanks for carrying out the test.
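
For readers following along, the gating rule described in the quoted message (turn the intrinsic on only when vlenb == 32) and the ratio column of the table can be sketched roughly as below. This is a minimal standalone illustration, not the HotSpot change itself: the function names `should_enable_chacha20_intrinsic` and `speed_ratio` and the `has_rvv` parameter are hypothetical, and only the vlenb values and scores come from the quoted text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the HotSpot code): decide whether to enable the
// ChaCha20 vector intrinsic from the vector register length in bytes (vlenb).
// Per the quoted measurements, vlenb == 32 (bananapi) shows a gain, while
// vlenb == 16 (k230) regresses in every test case.
constexpr bool should_enable_chacha20_intrinsic(bool has_rvv, uint32_t vlenb) {
  return has_rvv && vlenb == 32;
}

// Ratio as reported in the table: non-intrinsic score / intrinsic score.
// Scores are in ns/op, so a ratio below 1.0 means the intrinsic is slower.
constexpr double speed_ratio(double no_intrinsic_ns, double intrinsic_ns) {
  return no_intrinsic_ns / intrinsic_ns;
}

// Compile-time checks of the rule for the two boards mentioned in the thread.
static_assert(!should_enable_chacha20_intrinsic(true, 16), "k230: keep off");
static_assert(should_enable_chacha20_intrinsic(true, 32), "bananapi: turn on");
static_assert(!should_enable_chacha20_intrinsic(false, 32), "no vector unit");
```

As a usage note, `speed_ratio(4642.694, 5216.699)` reproduces the first table row for the k230 (about 0.89), which is below 1.0 and therefore a regression.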
-------------
Marked as reviewed by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/20298#pullrequestreview-2198265238