RFR: 8255625: AArch64: Implement Base64.encodeBlock accelerator/intrinsic [v7]
Dong Bo
dongbo at openjdk.java.net
Tue Nov 10 03:20:09 UTC 2020
> A Base64.encodeBlock stub is already implemented for x86_64.
> We can do the same for aarch64 using the SIMD LD3/ST4/TBL instructions.
> A basic idea can be found here: http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>
> The patch passed jtreg tier1-3 tests with a linux-aarch64-server-release build.
> The tests in test/jdk/java/util/Base64/* were also run specifically to check the correctness of the implementation, and they passed.
>
> A JMH micro, Base64Encode.java, is added for performance testing.
> With different input lengths (upper-bounded by the parameter `maxNumBytes` in the JMH micro),
> we see a ~4x improvement with long inputs and no regression with short inputs on Kunpeng916 and Kunpeng920.
>
> The Base64Encode.java JMH micro-benchmark results:
> Benchmark (maxNumBytes) Mode Cnt Score Error Units
> # kunpeng 916, intrinsic
> Base64Encode.testBase64Encode 1 avgt 10 31.564 ± 0.034 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 33.921 ± 0.362 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 38.015 ± 0.220 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 41.115 ± 0.281 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 42.161 ± 0.630 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 44.797 ± 0.849 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 46.013 ± 0.917 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 67.984 ± 0.777 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 174.494 ± 1.614 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 277.103 ± 0.306 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 4261.018 ± 1.883 ns/op
>
> # kunpeng 916, default
> Base64Encode.testBase64Encode 1 avgt 10 31.710 ± 0.234 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 33.978 ± 0.305 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 40.059 ± 0.444 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 47.958 ± 0.328 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 49.017 ± 1.305 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 53.150 ± 0.769 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 55.418 ± 0.316 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 93.517 ± 0.391 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 494.809 ± 0.413 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 898.581 ± 0.944 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 16464.411 ± 7.582 ns/op
>
> # kunpeng 920, intrinsic
> Base64Encode.testBase64Encode 1 avgt 10 17.494 ± 0.012 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 21.023 ± 0.169 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 25.772 ± 0.138 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 30.121 ± 0.347 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 31.591 ± 0.238 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 32.728 ± 0.395 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 35.110 ± 0.215 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 48.621 ± 0.314 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 113.391 ± 0.554 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 180.749 ± 0.193 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 3273.961 ± 5.706 ns/op
>
> # kunpeng 920, default
> Base64Encode.testBase64Encode 1 avgt 10 17.428 ± 0.037 ns/op
> Base64Encode.testBase64Encode 2 avgt 10 20.926 ± 0.155 ns/op
> Base64Encode.testBase64Encode 3 avgt 10 25.466 ± 0.140 ns/op
> Base64Encode.testBase64Encode 6 avgt 10 32.526 ± 0.190 ns/op
> Base64Encode.testBase64Encode 7 avgt 10 34.132 ± 0.387 ns/op
> Base64Encode.testBase64Encode 9 avgt 10 36.685 ± 0.212 ns/op
> Base64Encode.testBase64Encode 10 avgt 10 38.117 ± 0.246 ns/op
> Base64Encode.testBase64Encode 48 avgt 10 62.447 ± 0.900 ns/op
> Base64Encode.testBase64Encode 512 avgt 10 377.275 ± 0.162 ns/op
> Base64Encode.testBase64Encode 1000 avgt 10 700.628 ± 0.509 ns/op
> Base64Encode.testBase64Encode 20000 avgt 10 13626.764 ± 3.448 ns/op
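
For readers who have not followed the linked article, here is a minimal scalar sketch of the data flow the LD3/TBL/ST4 approach vectorizes. It is not the stub from this patch: the class name `Base64BlockSketch` and the method `encodeWholeGroups` are hypothetical, and the sketch handles only whole 3-byte groups (no tail, no padding). In the NEON version, LD3 de-interleaves the input so each vector register holds one of the three byte streams, shifts and ORs build the four 6-bit index vectors, TBL maps the indices through the 64-entry alphabet, and ST4 writes the four result streams back interleaved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class Base64BlockSketch {
    // Standard Base64 alphabet (RFC 4648); the SIMD version keeps this table in vector registers for TBL.
    private static final byte[] TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
            .getBytes(StandardCharsets.ISO_8859_1);

    // Encodes only the whole 3-byte groups of src; any 1- or 2-byte tail is ignored.
    static byte[] encodeWholeGroups(byte[] src) {
        int groups = src.length / 3;
        byte[] dst = new byte[groups * 4];
        for (int g = 0; g < groups; g++) {
            // "LD3" step: the three bytes of one group (one lane of each de-interleaved register).
            int b0 = src[3 * g] & 0xff;
            int b1 = src[3 * g + 1] & 0xff;
            int b2 = src[3 * g + 2] & 0xff;

            // Shift/OR step: split 24 bits into four 6-bit indices.
            int i0 = b0 >> 2;
            int i1 = ((b0 & 0x03) << 4) | (b1 >> 4);
            int i2 = ((b1 & 0x0f) << 2) | (b2 >> 6);
            int i3 = b2 & 0x3f;

            // "TBL" step (table lookup) followed by the "ST4" step (interleaved store).
            dst[4 * g] = TABLE[i0];
            dst[4 * g + 1] = TABLE[i1];
            dst[4 * g + 2] = TABLE[i2];
            dst[4 * g + 3] = TABLE[i3];
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] input = new byte[48]; // 16 whole groups, so no padding is needed
        for (int i = 0; i < input.length; i++) {
            input[i] = (byte) (i * 7);
        }
        byte[] sketch = encodeWholeGroups(input);
        byte[] reference = Base64.getEncoder().encode(input);
        System.out.println("matches java.util.Base64: " + Arrays.equals(sketch, reference));
    }
}
```

The scalar loop does one group per iteration; the vector form does the same arithmetic on many groups per iteration, which is consistent with the gain showing up mainly for the longer inputs in the table above.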
Dong Bo has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
fix register naming style
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/992/files
- new: https://git.openjdk.java.net/jdk/pull/992/files/e3380c84..5f4bc36c
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=992&range=06
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=992&range=05-06
Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/992.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/992/head:pull/992
PR: https://git.openjdk.java.net/jdk/pull/992