RFR: 8256245: AArch64: Implement Base64 decoding intrinsic

Dong Bo dongbo at openjdk.java.net
Sat Mar 27 09:05:45 UTC 2021


In JDK-8248188, an IntrinsicCandidate and API were added for Base64 decoding.
Base64 decoding can be improved on AArch64 with the ld4/tbl/tbx/st3 instructions; the basic idea is described at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
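
For readers unfamiliar with the approach, here is a minimal scalar sketch of the work the stub vectorizes. It is not the stub code: the class and method names are hypothetical, and it only handles unpadded input in the basic alphabet. Roughly, ld4 de-interleaves the input into four vectors of characters, tbl/tbx perform the table lookup from character to 6-bit value, and st3 stores the three resulting byte vectors interleaved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of the patch.
public class ScalarBase64Sketch {
    // Lookup table: ASCII code -> 6-bit value, or -1 for characters
    // outside the basic Base64 alphabet.
    private static final byte[] DECODE = new byte[256];
    static {
        Arrays.fill(DECODE, (byte) -1);
        String alphabet =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        for (int i = 0; i < alphabet.length(); i++) {
            DECODE[alphabet.charAt(i)] = (byte) i;
        }
    }

    // Decodes unpadded input whose length is a multiple of 4:
    // every 4 characters (4 x 6 bits) become 3 bytes (3 x 8 bits).
    static byte[] decodeBlocks(byte[] src) {
        if (src.length % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        byte[] dst = new byte[src.length / 4 * 3];
        int dp = 0;
        for (int sp = 0; sp < src.length; sp += 4) {
            int a = DECODE[src[sp] & 0xff];
            int b = DECODE[src[sp + 1] & 0xff];
            int c = DECODE[src[sp + 2] & 0xff];
            int d = DECODE[src[sp + 3] & 0xff];
            // Any -1 from the table sets the sign bit of the OR.
            if ((a | b | c | d) < 0) {
                throw new IllegalArgumentException("illegal character near index " + sp);
            }
            int bits = a << 18 | b << 12 | c << 6 | d;
            dst[dp++] = (byte) (bits >> 16);
            dst[dp++] = (byte) (bits >> 8);
            dst[dp++] = (byte) bits;
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] out = decodeBlocks("SGVsbG8h".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(out, StandardCharsets.US_ASCII)); // prints "Hello!"
    }
}
```

The SIMD version applies the same lookup and bit packing to many characters per iteration instead of four.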

The patch passed jtreg tier1-3 tests with a linux-aarch64-server-fastdebug build.
The tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were also run specifically to check the correctness of the implementation.

If the data is MIME encoded, there can be illegal characters at the start of the input.
SIMD would bring no benefit in this case, so the stub currently uses non-SIMD instructions for MIME-encoded data.
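
To illustrate the difference at the API level, here is a small sketch using the existing `java.util.Base64` decoders. The class and helper names are hypothetical, and the input string is made up for the example. The MIME decoder ignores line separators and other characters outside the alphabet, while the basic decoder rejects them.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration only; not part of the patch.
public class MimeDecodeSketch {
    // The MIME decoder skips characters that are not in the Base64 alphabet.
    static String decodeMime(String s) {
        return new String(Base64.getMimeDecoder().decode(s), StandardCharsets.US_ASCII);
    }

    // The basic decoder throws IllegalArgumentException on such characters.
    static boolean basicDecoderRejects(String s) {
        try {
            Base64.getDecoder().decode(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Leading and embedded CRLF are illegal for the basic decoder.
        String input = "\r\nSGVs\r\nbG8h";
        System.out.println(decodeMime(input));          // prints "Hello!"
        System.out.println(basicDecoderRejects(input)); // prints "true"
    }
}
```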

A JMH micro-benchmark, Base64Decode.java, is added for performance testing.
With different input lengths (upper-bounded by the parameter `maxNumBytes` in the JMH micro),
we see ~2.5x improvement with long inputs and no regression with short inputs for raw Base64 decoding, and minor improvements (~10.95%) for MIME decoding on Kunpeng916.

The Base64Decode.java JMH micro-benchmark results:

# Kunpeng916, intrinsic
Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±     0.609  ns/op
Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±     1.650  ns/op
Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±     0.931  ns/op
Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±     1.687  ns/op
Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±     9.217  ns/op
Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±     1.667  ns/op
Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±     1.751  ns/op
Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±     1.178  ns/op
Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±     0.584  ns/op
Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±     1.050  ns/op
Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±    45.172  ns/op
Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±   192.070  ns/op
Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±   0.759  ns/op
Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op

# Kunpeng916, default
Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±   0.571  ns/op
Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±   0.505  ns/op
Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±   1.452  ns/op
Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±   1.243  ns/op
Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±   1.188  ns/op
Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±   0.572  ns/op
Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±   0.177  ns/op
Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±   0.572  ns/op
Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±   1.559  ns/op
Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±   0.813  ns/op
Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±  24.669  ns/op
Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ± 523.155  ns/op
Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±   0.238  ns/op
Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±   0.205  ns/op
Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±   0.610  ns/op
Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±   4.732  ns/op
Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±   0.177  ns/op
Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±   4.572  ns/op
Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±   0.301  ns/op
Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±   0.389  ns/op
Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±   2.519  ns/op
Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±   9.281  ns/op
Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ± 157.920  ns/op
Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ± 216.542  ns/op

-------------

Commit messages:
 - 8256245: AArch64: Implement Base64 decoding intrinsic

Changes: https://git.openjdk.java.net/jdk/pull/3228/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8256245
  Stats: 410 lines in 3 files changed: 410 ins; 0 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3228.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228

PR: https://git.openjdk.java.net/jdk/pull/3228

