Integrated: 8255625: AArch64: Implement Base64.encodeBlock accelerator/intrinsic
Dong Bo
dongbo at openjdk.java.net
Wed Nov 11 01:55:58 UTC 2020
On Mon, 2 Nov 2020 03:05:48 GMT, Dong Bo <dongbo at openjdk.org> wrote:
> A Base64.encodeBlock stub is already implemented for x86_64.
> We can do the same for aarch64 using the SIMD instructions LD3/ST4/TBL.
> A basic idea can be found here: http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>
> The patch passed jtreg tier1-3 tests with a linux-aarch64-server-release build.
> The tests in test/jdk/java/util/Base64/* were also run specifically to check the correctness of the implementation, and they passed.
>
> A JMH micro-benchmark, Base64Encode.java, is added for performance testing.
> With different input lengths (upper-bounded by the parameter `maxNumBytes` in the JMH micro-benchmark),
> we see a ~4x improvement with long inputs and no regression with short inputs on Kunpeng916 and Kunpeng920.
>
> The Base64Encode.java JMH micro-benchmark results:
> Benchmark (maxNumBytes) Mode Cnt Score Error Units
> # kunpeng 916, intrinsic
> Base64Encode.testBase64Encode 1 avgt 10 31.564 ± 0.034 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 33.921 ± 0.362 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 38.015 ± 0.220 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 41.115 ± 0.281 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 42.161 ± 0.630 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 44.797 ± 0.849 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 46.013 ± 0.917 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 67.984 ± 0.777 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 174.494 ± 1.614 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 277.103 ± 0.306 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 4261.018 ± 1.883 ns/op
>
> # kunpeng 916, default
> Base64Encode.testBase64Encode 1 avgt 10 31.710 ± 0.234 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 33.978 ± 0.305 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 40.059 ± 0.444 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 47.958 ± 0.328 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 49.017 ± 1.305 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 53.150 ± 0.769 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 55.418 ± 0.316 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 93.517 ± 0.391 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 494.809 ± 0.413 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 898.581 ± 0.944 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 16464.411 ± 7.582 ns/op
>
> # kunpeng 920, intrinsic
> Base64Encode.testBase64Encode 1 avgt 10 17.494 ± 0.012 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 21.023 ± 0.169 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 25.772 ± 0.138 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 30.121 ± 0.347 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 31.591 ± 0.238 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 32.728 ± 0.395 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 35.110 ± 0.215 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 48.621 ± 0.314 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 113.391 ± 0.554 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 180.749 ± 0.193 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 3273.961 ± 5.706 ns/op
>
> # kunpeng 920, default
> Base64Encode.testBase64Encode 1 avgt 10 17.428 ± 0.037 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 20.926 ± 0.155 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 25.466 ± 0.140 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 32.526 ± 0.190 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 34.132 ± 0.387 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 36.685 ± 0.212 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 38.117 ± 0.246 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 62.447 ± 0.900 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 377.275 ± 0.162 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 700.628 ± 0.509 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 13626.764 ± 3.448 ns/op
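The LD3/TBL/ST4 idea mentioned in the quoted description can be illustrated with a scalar Java model. This is only a sketch of the data flow, not the stub code from the patch: the class name `Base64LaneSketch`, the method `encodeBlocks`, and the 16-lane width are assumptions made for illustration, and the input length is assumed to be a multiple of 3 so that no padding is needed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical scalar model of the LD3 / TBL / ST4 data flow; not the actual intrinsic.
public class Base64LaneSketch {
    // Standard Base64 alphabet; the TBL step is modelled as an index into this table.
    private static final byte[] TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
            .getBytes(StandardCharsets.US_ASCII);

    // Assumed lane count: one 128-bit vector register viewed as 16 bytes.
    static final int LANES = 16;

    static byte[] encodeBlocks(byte[] src) {
        if (src.length % 3 != 0) {
            throw new IllegalArgumentException("sketch handles only lengths divisible by 3");
        }
        byte[] dst = new byte[src.length / 3 * 4];
        int sp = 0, dp = 0;

        // "Vector" loop: 48 input bytes produce 64 output bytes per iteration.
        while (src.length - sp >= 3 * LANES) {
            int[] a = new int[LANES], b = new int[LANES], c = new int[LANES];
            // LD3 model: de-interleave byte triples into three lane arrays.
            for (int i = 0; i < LANES; i++) {
                a[i] = src[sp + 3 * i] & 0xff;
                b[i] = src[sp + 3 * i + 1] & 0xff;
                c[i] = src[sp + 3 * i + 2] & 0xff;
            }
            for (int i = 0; i < LANES; i++) {
                // Shift/mask: split 24 bits into four 6-bit indices per lane.
                int i0 = a[i] >>> 2;
                int i1 = ((a[i] & 0x03) << 4) | (b[i] >>> 4);
                int i2 = ((b[i] & 0x0f) << 2) | (c[i] >>> 6);
                int i3 = c[i] & 0x3f;
                // TBL model (lookup) followed by ST4 model (interleaved store).
                dst[dp + 4 * i] = TABLE[i0];
                dst[dp + 4 * i + 1] = TABLE[i1];
                dst[dp + 4 * i + 2] = TABLE[i2];
                dst[dp + 4 * i + 3] = TABLE[i3];
            }
            sp += 3 * LANES;
            dp += 4 * LANES;
        }

        // Scalar tail for the remaining triples.
        for (; sp < src.length; sp += 3, dp += 4) {
            int bits = (src[sp] & 0xff) << 16 | (src[sp + 1] & 0xff) << 8 | (src[sp + 2] & 0xff);
            dst[dp] = TABLE[(bits >>> 18) & 0x3f];
            dst[dp + 1] = TABLE[(bits >>> 12) & 0x3f];
            dst[dp + 2] = TABLE[(bits >>> 6) & 0x3f];
            dst[dp + 3] = TABLE[bits & 0x3f];
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] in = new byte[99];
        for (int i = 0; i < in.length; i++) {
            in[i] = (byte) (i * 37 + 11);
        }
        // Compare against the JDK encoder as a reference.
        System.out.println(Arrays.equals(Base64.getEncoder().encode(in), encodeBlocks(in)));
    }
}
```

In a real SIMD implementation each inner loop corresponds to a handful of vector instructions working on all lanes at once, which is presumably where the gain on long inputs comes from; short inputs fall through to the scalar path.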
This pull request has now been integrated.
Changeset: 8638cd9a
Author: Dong Bo <dongbo at openjdk.org>
Committer: Fei Yang <fyang at openjdk.org>
URL: https://git.openjdk.java.net/jdk/commit/8638cd9a
Stats: 226 lines in 3 files changed: 226 ins; 0 del; 0 mod
8255625: AArch64: Implement Base64.encodeBlock accelerator/intrinsic
Reviewed-by: aph
-------------
PR: https://git.openjdk.java.net/jdk/pull/992