RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2]
CoreyAshford
github.com+51754783+coreyashford at openjdk.java.net
Mon Oct 5 18:29:58 UTC 2020
> This patch set encompasses the following commits:
>
> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for
> the intrinsic. The API is similar to the existing encodeBlock intrinsic.
> - Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic
> implementation
> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
> - Adds a JMH microbenchmark for both Base64 encoding and encoding.
> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now
contains ten commits:
- AOT: Revert change to aotCodeHeap.cpp for decodeBlock
Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all
arches that implement AOT, implement the decodeBlock intrinsic.
- Base64.java decodeBlock: Changes from PR review
* Make comparison safer and consistent with the while loop
* Update comment about the decodeBlock intrinsic so that it matches the new structure
* Add comment about the lack of a length check on the destination buffer
* As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate
- stubGenerator_ppc.cpp: Changes from PR review
* Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction)
* change and document loop_unrolls setting from 8 to 2 after re-running the benchmark
* align unrolled loop on a 32-byte boundary
* replace instruction used for checking isURL from a double word to single
word instruction since the register is effectively 32 bits wide
* cosmetic change to realign register comments.
- TestBase64.java: Changes from PR review
* Use Utils.toByteArrays() method instead of a locally-defined method
* Generate the two non-Base64 tables dynamically rather than use static initialization
* Added comments describing the two above-mentioned arrays
- Expand the Base64 intrinsic regression test to cover decodeBlock
This patch makes four significant changes:
1) The Power implementation of the decodeBlock intrinsic, at least,
requires a decode length of at least 128 bytes, but the existing test cases
are much shorter, maxing out at 111 bytes. So the patch adds a new input
data file which has longer test cases in it.
2) The original test cases only covers the encoding of just the printable
subset of the 7-bit ASCII characters. However, Base64 encoding requires
being able to encode arbitrary binary data, i.e. it must handle all 256
8-bit byte encodings. To remedy this, but keep the original line-oriented
style of the input data, I added another input file type that uses a simple
ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte. When
test0 is called, a new parameter is passed that specifies the type of the
input file, which is either the original ASCII type or the hexadecimal
format. So to test both longer input data and arbitrary 8-bit data, the
newly added input test file has test cases which are both longer and
encoded in ASCII hex so as to give full 8-bit capability. When reading
this type of file, test0 calls a newly-added function to translate the
ASCII hex to binary data. Except for the first line of input data, which
contains all possible 8-bit values sequentially, the input data was
generated using a random length (between 111 and 520 bytes) buffer filled
with random 8-bit data, which should give adequate coverage.
3) The original test did not test that the decoder detects illegal Base64
bytes. This change chooses a random location in the encoded data to
corrupt with a randomly-chosen byte which is illegal for the specific
Base64 encoding that is chosen (i.e. standard or URLsafe). It then calls
the decode function to verify that the illegal byte is detected and the
proper exception is thrown.
4) The test iteration count was originally 100K, but that is far more than
enough iterations to test the intrinsic. It takes 20K iterations on each
instrinsic for HotSpot C2 to begin calling it. The test originally had
three types of encodings to test and called the encode intrinsic four times
for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to
encode. Decode was called four times as well (now five because of the
illegal byte test). I believe this is excessive and with the extra test
data I have added, the test was timing out after ten minutes of execution.
It appears that it is timing out, not because the intrinsics take a long
time to run, but because test0 generates an enormous number of discarded
data buffers for the GC system to recover (the test runs at about 39GB of
virtual memory on my test machine). To remedy the timeout problem, I have
changed the code so that a warmup function of 20K repetitions is performed
on a fixed buffer, to activate the instrinsic(s). After the warmup, I have
reduced the number of iterations to 5K on each test0 call. This should
give adequate coverage.
- Add JMH benchmark for Base64 variable length buffer decoding
- Add Power9+ intrinsic implementation for Base64 decoding
- Add HotSpot code to implement Base64 decodeBlock API
- Add HotSpotIntrinsicCandidate and API for Base64 decoding
-------------
Changes: https://git.openjdk.java.net/jdk/pull/293/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=01
Stats: 1873 lines in 23 files changed: 1846 ins; 4 del; 23 mod
Patch: https://git.openjdk.java.net/jdk/pull/293.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293
PR: https://git.openjdk.java.net/jdk/pull/293
More information about the core-libs-dev
mailing list