RFR: 8335191: RISC-V: verify perf of chacha20
Hamlin Li
mli at openjdk.org
Thu Jul 25 07:52:36 UTC 2024
On Tue, 23 Jul 2024 11:21:31 GMT, Hamlin Li <mli at openjdk.org> wrote:
> Hi,
> Can you help to review this simple patch?
>
> Previously, we implemented the ChaCha20 intrinsic using vector instructions. The latest tests on real hardware (k230, bananapi) show that the implementation brings a performance gain only when vlenb == 32 (on bananapi); when vlenb == 16 (on k230), it causes a regression in all test cases.
> So we should adjust when the intrinsic is turned on, i.e. enable it only when vlenb == 32.
>
> Thanks
>
>
> ## Performance
>
> ### on k230
> vlenb == 16
>
> Benchmark - on k230, vlenb == 16 | (dataSize) | (keyLength) | (mode) | (padding) | (permutation) | Cnt | Score -no-intrinsic | Score +intrinsic | Error | Units | Non-intrinsic/intrinsic
> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4642.694 | 5216.699 | 36.039 | ns/op | 0.89
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15719.612 | 17622.616 | 136.609 | ns/op | 0.892
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 59402.742 | 67124.28 | 651.011 | ns/op | 0.885
> o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 250056.475 | 269184.924 | 8591.727 | ns/op | 0.929
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4752.081 | 5131.094 | 38.917 | ns/op | 0.926
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15554.484 | 16992.339 | 106.583 | ns/op | 0.915
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 61446.365 | 67359.67 | 548.353 | ns/op | 0.912
> o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 241653.654 | 270189.531 | 3705.045 | ns/op | 0.894
> o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 17833.825 | 20610.118 | 688.668 | ns/op | 0.865
> o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1...
As the change is minor and straightforward, I'll push it with one review.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20298#issuecomment-2249677415