RFR: 8248188: Add IntrinsicCandidate and API for Base64 decoding [v4]
CoreyAshford
github.com+51754783+coreyashford at openjdk.java.net
Mon Oct 12 17:55:14 UTC 2020
On Mon, 12 Oct 2020 12:36:19 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:
>> CoreyAshford has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - TestBase64.java: fix comment to correctly reflect actual intrinsic names.
>>
>> The intrinsic names that are visible with -XX:+PrintCompilation are encode
>> and decode, rather than encodeBlock and decodeBlock.
>> - stubGenerator_ppc.cpp: fix regression caused by change to using loop counter
>>
>> My original fix didn't account for the case where sl < block_size. In the
>> event sl < block_size, the shifted sl will become zero, so it should
>> jump to the code that computes how much data was processed - 0 - and return.
>
> src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 3803:
>
>> 3801: // Base64 class will be used to process the last 12 characters.
>> 3802: __ sub(sl, sl, sp);
>> 3803: __ subi(sl, sl, 12);
>
> I think we should subtract 4, now. srawi will round it down below. We have no guarantee that we can subtract more than 4
> without getting a negative value.
The original paper this code is based on subtracts 12 because of the overwrite issue, which is also discussed in
the preceding code comment. So I think the subtraction of 12 needs to be retained, but I do need to check for a
negative result after the subtract.
According to the `srawi.` specification:
> CA and CA32 are set to 1 if the low-order 32 bits of (RS) contain a negative number and any 1-bits are shifted out of
> position 63; otherwise CA and CA32 are set to 0.
Because the `sub` instruction is a 64-bit subtract, all of the upper bits should be 1s if sl is negative after the
subtract, so I think the `srawi.` should catch the negative case if I also check CA after the `srawi.`, via:
__ srawi_(sl, sl, block_size_shift);
// if XER CA is set, sl was less than zero.
__ mcrxrx(CCR2); // moves XER's OV, OV32, CA, CA32 to CCR2's LT, GT, EQ, SO bits, respectively.
__ beq_predict_not_taken(CCR2, unrolled_loop_exit);
__ beq_predict_not_taken(CCR0, unrolled_loop_exit);
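To sanity-check the reasoning about CA, here is a minimal C++ sketch of the `srawi` semantics as quoted above. It is only a model written from that quoted wording, not HotSpot or ISA reference code. `SrawiResult` and `srawi_model` are hypothetical names, and the sketch assumes a shift amount in the range 0..31 and a compiler where `>>` on a negative `int32_t` is an arithmetic shift.

```cpp
#include <cstdint>

// Hypothetical model of the result and carry bit described in the quoted text.
struct SrawiResult {
    int64_t value;  // sign-extended 32-bit arithmetic shift result
    bool ca;        // modelled XER[CA]
};

// Assumption: sh is in 0..31, as for a 32-bit shift-immediate.
SrawiResult srawi_model(int64_t rs, unsigned sh) {
    // Only the low-order 32 bits of the source take part in the shift.
    int32_t low = static_cast<int32_t>(static_cast<uint32_t>(rs));
    int32_t shifted = low >> sh;  // assumed arithmetic (sign-propagating) shift

    // Bits that fall off the low end of the word.
    uint32_t mask = (sh == 0) ? 0u : ((1u << sh) - 1u);
    bool ones_shifted_out = (static_cast<uint32_t>(low) & mask) != 0;

    // CA is set only if the source is negative AND a 1-bit was shifted out.
    return { static_cast<int64_t>(shifted), low < 0 && ones_shifted_out };
}
```

Under this model, `srawi_model(-5, 2)` sets CA, while `srawi_model(-8, 2)` leaves CA clear because no 1-bits are shifted out. So a negative sl that happens to be an exact multiple of the block size may need a separate check.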
-------------
PR: https://git.openjdk.java.net/jdk/pull/293