RFR: 8335191: RISC-V: verify perf of chacha20

Hamlin Li mli at openjdk.org
Tue Jul 23 11:27:00 UTC 2024


Hi,
Can you help to review this simple patch?

We previously implemented the ChaCha20 intrinsic using vector instructions. The latest tests on real hardware (k230, bananapi) show that the implementation brings more performance gain than regression only when vlenb == 32 (on bananapi); when vlenb == 16 (on k230), it brings a regression in all test cases.
So we should adjust when the intrinsic is turned on, i.e. enable it only when vlenb == 32.
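
As a rough illustration of the proposed rule (this is not the HotSpot change itself, which is in the linked PR), the decision can be sketched in Python. The function and parameter names are hypothetical, and the sketch assumes the intrinsic also requires the vector extension to be present, since it is built on vector instructions:

```python
def should_enable_chacha20_intrinsic(has_vector_ext: bool, vlenb: int) -> bool:
    """Hypothetical sketch of the gating rule described above.

    has_vector_ext: assumed flag meaning vector instructions are usable.
    vlenb: vector register length in bytes.
    """
    # Only vlenb == 32 (bananapi) showed a net gain in the measurements;
    # vlenb == 16 (k230) regressed in every test case.
    return has_vector_ext and vlenb == 32


if __name__ == "__main__":
    for vlenb in (16, 32):
        print(vlenb, should_enable_chacha20_intrinsic(True, vlenb))
```

Other vector lengths were not measured here, so the sketch leaves them disabled rather than guessing.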

Thanks


## Performance

### on k230
vlenb == 16

Benchmark - on k230, vlenb == 16 | (dataSize) | (keyLength) | (mode) | (padding) | (permutation) | Cnt | Score -intrinsic | Score +intrinsic | Error | Units | Non-intrinsic/intrinsic
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4642.694 | 5216.699 | 36.039 | ns/op | 0.89
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15719.612 | 17622.616 | 136.609 | ns/op | 0.892
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 59402.742 | 67124.28 | 651.011 | ns/op | 0.885
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 250056.475 | 269184.924 | 8591.727 | ns/op | 0.929
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4752.081 | 5131.094 | 38.917 | ns/op | 0.926
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 15554.484 | 16992.339 | 106.583 | ns/op | 0.915
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 61446.365 | 67359.67 | 548.353 | ns/op | 0.912
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 241653.654 | 270189.531 | 3705.045 | ns/op | 0.894
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 17833.825 | 20610.118 | 688.668 | ns/op | 0.865
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 32872.633 | 36427.148 | 4339.823 | ns/op | 0.902
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 87398.821 | 96112.498 | 1028.342 | ns/op | 0.909
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 314533.305 | 342115.144 | 13633.382 | ns/op | 0.919
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 12190.039 | 14844.154 | 111.009 | ns/op | 0.821
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 25734.516 | 30267.139 | 326.158 | ns/op | 0.85
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 81007.764 | 90623.578 | 572.987 | ns/op | 0.894
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 308229.077 | 343146.562 | 18801.368 | ns/op | 0.898
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 321267.148 | 340960.217 | 22253.659 | ns/op | 0.942
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 307476.57 | 341029.841 | 13851.386 | ns/op | 0.902

### on bananapi
vlenb == 32

Benchmark - on bananapi, vlenb == 32 | (dataSize) | (keyLength) | (mode) | (padding) | (permutation) | Cnt | Score +intrinsic | Score -intrinsic | Error | Units | Non-intrinsic/intrinsic
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 4804.517 | 4154.869 | 2.951 | ns/op | 0.865
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 10782.788 | 14604.89 | 19.031 | ns/op | 1.354
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 39502.457 | 57211.53 | 69.436 | ns/op | 1.448
o.o.b.j.c.full.CipherBench.ChaCha20.decrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 166005.925 | 228615.833 | 22.311 | ns/op | 1.377
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 256 | 256 | None | NoPadding | ChaCha20 | 10 | 5040.652 | 4389.007 | 60.197 | ns/op | 0.871
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 1024 | 256 | None | NoPadding | ChaCha20 | 10 | 11176.787 | 14530.768 | 12.192 | ns/op | 1.3
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 4096 | 256 | None | NoPadding | ChaCha20 | 10 | 40875.87 | 56149.493 | 111.238 | ns/op | 1.374
o.o.b.j.c.full.CipherBench.ChaCha20.encrypt | 16384 | 256 | None | NoPadding | ChaCha20 | 10 | 166459.572 | 221221.334 | 1078.792 | ns/op | 1.329
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 17781.57 | 14356.974 | 38.96 | ns/op | 0.807
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 26098.932 | 27368.785 | 52.171 | ns/op | 1.049
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 67351.38 | 82535.832 | 111.414 | ns/op | 1.225
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 235767.096 | 295121.502 | 1443.64 | ns/op | 1.252
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 256 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 13634.202 | 10476.916 | 21.069 | ns/op | 0.768
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 1024 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 22209.959 | 24513.545 | 23.072 | ns/op | 1.104
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 4096 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 62540.238 | 78088.592 | 54.63 | ns/op | 1.249
o.o.b.j.c.full.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 225358.667 | 293718.246 | 314.449 | ns/op | 1.303
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.decrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 237810.351 | 295495.242 | 412.976 | ns/op | 1.243
o.o.b.j.c.small.CipherBench.ChaCha20Poly1305.encrypt | 16384 | 256 | None | NoPadding | ChaCha20-Poly1305 | 10 | 230771.689 | 290751.264 | 315.883 | ns/op | 1.26
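
For reading the last column of both tables: the values appear to be the ns/op score without the intrinsic divided by the ns/op score with it, so a value above 1 means the intrinsic is faster and a value below 1 means it is slower. A minimal Python sketch of that arithmetic, using two rows copied from the tables (the helper name is hypothetical):

```python
def speedup(score_without_intrinsic: float, score_with_intrinsic: float) -> float:
    """Ratio of ns/op without the intrinsic to ns/op with it.

    Lower ns/op is better, so a result > 1 means the intrinsic helps.
    """
    return score_without_intrinsic / score_with_intrinsic


if __name__ == "__main__":
    # k230 (vlenb == 16), ChaCha20.decrypt, dataSize 256
    print(round(speedup(4642.694, 5216.699), 3))
    # bananapi (vlenb == 32), ChaCha20.decrypt, dataSize 1024
    print(round(speedup(14604.89, 10782.788), 3))
```

The two rows reproduce the tabulated 0.89 and 1.354 after rounding.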

-------------

Commit messages:
 - Initial commit

Changes: https://git.openjdk.org/jdk/pull/20298/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20298&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8335191
  Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/20298.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20298/head:pull/20298

PR: https://git.openjdk.org/jdk/pull/20298

