RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v2]

CoreyAshford github.com+51754783+coreyashford at openjdk.java.net
Mon Oct 5 18:29:58 UTC 2020


> This patch set encompasses the following commits:
> 
> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for
>   the intrinsic.  The API is similar to the existing encodeBlock intrinsic.
> - Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic
>   implementation
> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
> - Adds a JMH microbenchmark for both Base64 encoding and encoding.
> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.

CoreyAshford has updated the pull request with a new target base due to a merge or a rebase. The pull request now
contains ten commits:

 - AOT: Revert change to aotCodeHeap.cpp for decodeBlock
   
   Don't add the SET_AOT_GLOBAL_SYMBOL_VALUE macro for decode block until all
   arches that implement AOT, implement the decodeBlock intrinsic.
 - Base64.java decodeBlock: Changes from PR review
   
    * Make comparison safer and consistent with the while loop
    * Update comment about the decodeBlock intrinsic so that it matches the new structure
    * Add comment about the lack of a length check on the destination buffer
    * As per issue 8138732, change HotSpotIntrinsicCandidate to IntrinsicCandidate
 - stubGenerator_ppc.cpp: Changes from PR review
   
    * Fix clearing of upper bits to clear 32 bits instead of 31 (due to misreading of clrldi instruction)
    * change and document loop_unrolls setting from 8 to 2 after re-running the benchmark
    * align unrolled loop on a 32-byte boundary
    * replace instruction used for checking isURL from a double word to single
      word instruction since the register is effectively 32 bits wide
    * cosmetic change to realign register comments.
 - TestBase64.java: Changes from PR review
   
    * Use Utils.toByteArrays() method instead of a locally-defined method
    * Generate the two non-Base64 tables dynamically rather than use static initialization
    * Added comments describing the two above-mentioned arrays
 - Expand the Base64 intrinsic regression test to cover decodeBlock
   
   This patch makes four significant changes:
   
   1) The Power implementation of the decodeBlock intrinsic, at least,
   requires a decode length of at least 128 bytes, but the existing test cases
   are much shorter, maxing out at 111 bytes.  So the patch adds a new input
   data file which has longer test cases in it.
   
   2) The original test cases only covers the encoding of just the printable
   subset of the 7-bit ASCII characters.  However, Base64 encoding requires
   being able to encode arbitrary binary data, i.e. it must handle all 256
   8-bit byte encodings.  To remedy this, but keep the original line-oriented
   style of the input data, I added another input file type that uses a simple
   ASCII hexadecimal encoding - two ASCII hex characters per 8-bit byte.  When
   test0 is called, a new parameter is passed that specifies the type of the
   input file, which is either the original ASCII type or the hexadecimal
   format.  So to test both longer input data and arbitrary 8-bit data, the
   newly added input test file has test cases which are both longer and
   encoded in ASCII hex so as to give full 8-bit capability.  When reading
   this type of file, test0 calls a newly-added function to translate the
   ASCII hex to binary data.  Except for the first line of input data, which
   contains all possible 8-bit values sequentially, the input data was
   generated using a random length (between 111 and 520 bytes) buffer filled
   with random 8-bit data, which should give adequate coverage.
   
   3) The original test did not test that the decoder detects illegal Base64
   bytes.  This change chooses a random location in the encoded data to
   corrupt with a randomly-chosen byte which is illegal for the specific
   Base64 encoding that is chosen (i.e. standard or URLsafe).  It then calls
   the decode function to verify that the illegal byte is detected and the
   proper exception is thrown.
   
   4) The test iteration count was originally 100K, but that is far more than
   enough iterations to test the intrinsic.  It takes 20K iterations on each
   instrinsic for HotSpot C2 to begin calling it.  The test originally had
   three types of encodings to test and called the encode intrinsic four times
   for each iteration, which works out to 100K * 3 * 4 = 1.2M calls just to
   encode.  Decode was called four times as well (now five because of the
   illegal byte test).  I believe this is excessive and with the extra test
   data I have added, the test was timing out after ten minutes of execution.
   It appears that it is timing out, not because the intrinsics take a long
   time to run, but because test0 generates an enormous number of discarded
   data buffers for the GC system to recover (the test runs at about 39GB of
   virtual memory on my test machine).  To remedy the timeout problem, I have
   changed the code so that a warmup function of 20K repetitions is performed
   on a fixed buffer, to activate the instrinsic(s).  After the warmup, I have
   reduced the number of iterations to 5K on each test0 call.  This should
   give adequate coverage.
 - Add JMH benchmark for Base64 variable length buffer decoding
 - Add Power9+ intrinsic implementation for Base64 decoding
 - Add HotSpot code to implement Base64 decodeBlock API
 - Add HotSpotIntrinsicCandidate and API for Base64 decoding

-------------

Changes: https://git.openjdk.java.net/jdk/pull/293/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=293&range=01
  Stats: 1873 lines in 23 files changed: 1846 ins; 4 del; 23 mod
  Patch: https://git.openjdk.java.net/jdk/pull/293.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/293/head:pull/293

PR: https://git.openjdk.java.net/jdk/pull/293


More information about the core-libs-dev mailing list