RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v8]

Thu Oct 22 14:59:21 UTC 2020

On Wed, 21 Oct 2020 20:43:30 GMT, CoreyAshford <github.com+51754783+CoreyAshford at openjdk.org> wrote:

>> This patch set encompasses the following commits:
>> 
>> - Adds a new HotSpot intrinsic candidate to the java.lang.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic.  The API is similar to the existing encodeBlock intrinsic.
>> - Adds the code in HotSpot to check and martial the new intrinsic's arguments to the arch-specific intrinsic implementation
>> - Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
>> - Adds a JMH microbenchmark for both Base64 encoding and encoding.
>> - Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
>
> CoreyAshford has updated the pull request incrementally with one additional commit since the last revision:
> 
>   TestBase64.java: remove jdk.test.lib.Utils from @build which was causing Tier3 failures.

I took a look at the VSX algo.  I haven't looked much beyond it. I had a few questions I've inlined.  It does look like a faithful VSX implementation of the linked algo.

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3817:

> 3815:         __ xxperm(offsets->to_vsr(), offsetLUT, higher_nibble->to_vsr());
> 3816: 
> 3817:         // Find out which elemets are the special case character (isURL ? '/' : '-')

Trivial nit, s/elemets/elements/

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3820:

> 3818:         __ vcmpequb_(eq_special_case_char, input, vec_special_case_char);
> 3819:         //
> 3820:         // There's a (63/64)^16 = 77.7% chance that there are no special

I think that assumes uniformly randomized data, is that a good assumption?   Is it measurably faster to skip around the xxsel instead of doing it unconditionally?

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3858:

> 3856: 
> 3857:         // The Base64 characters had no errors, so add the offsets
> 3858:         __ vaddubm(input, input, offsets);

I think this looks like a correct implementation of the algo in VSX.

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3878:

> 3876:             // |    Element    |             |                      |                      |             |             |                      |                      |             |
> 3877:             // +===============+=============+======================+======================+=============+=============+======================+======================+=============+
> 3878:             // | after vaddubm | 00||b0:0..5 | 00||b0:6..7||b1:0..3 | 00||b1:4..7||b2:0..1 | 00||b2:2..7 | 00||b3:0..5 | 00||b3:6..7||b4:0..3 | 00||b4:4..7||b5:0..1 | 00||b5:2..7 |

An extra line here showing how the 8 6-bit values above get mapping into 6 bytes greatly help my brain out. (likewise for the <P10 case below too)

src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3884:

> 3882:             // |   vec_0x3fs   |  00111111   |       00111111       |       00111111       |  00111111   |  00111111   |       00111111       |       00111111       |  00111111   |
> 3883:             // +---------------+-------------+----------------------+----------------------+-------------+-------------+----------------------+----------------------+-------------+
> 3884:             // | after vpextd  |   b5:0..7   |       b4:0..7        |       b3:0..7        |   b2:0..7   |   b1:0..7   |       b0:0..7        |       00000000       |  00000000   |

Are theses comments correct or am I misunderstanding this? I read the final result as something starting as `b5:2..7 || b4:4..7|| b5:0..1` from vpextd.

-------------

PR: https://git.openjdk.java.net/jdk/pull/293