RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11]
Sandhya Viswanathan
sviswanathan at openjdk.org
Tue Feb 7 02:52:46 UTC 2023
On Tue, 7 Feb 2023 00:12:21 GMT, Scott Gibbons <duke at openjdk.org> wrote:
>> Added code for Base64 acceleration (encode and decode), which speeds up Base64 by roughly 4x on AVX2 platforms.
>>
>> Encode performance:
>> **Old:**
>>
>> Benchmark (maxNumBytes) Mode Cnt Score Error Units
>> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms
>>
>>
>> **New:**
>>
>> Benchmark (maxNumBytes) Mode Cnt Score Error Units
>> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms
>>
>>
>> Decode performance:
>> **Old:**
>>
>> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms
>>
>> **New:**
>> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
> Add algorithm comments
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2227:
> 2225:
> 2226: // lut_roll URL
> 2227: __ emit_data64(0xb9b9bfbf04111000, relocInfo::none);
The lut_roll URL doesn't seem to be correct:
0x5F ('_', the URL-safe Base64 character that replaces '/') would need an offset of -20H (-32 decimal), i.e. 0xE0.
However, the other characters whose upper nibble is 5 ('P' through 'Z') need an offset of -41H (-65 decimal), i.e. 0xBF.
It looks to me that the adjustment for 0x5F should be -4 instead of -1 at line 2722.
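
The offset arithmetic can be checked with a small standalone sketch. It is not part of the patch: the helper names roll_offset and lut_byte and the constant name kLutRollUrl are made up for illustration. It also assumes that each table byte is added modulo 256 to the ASCII code to produce the 6-bit value, and that byte 0 of the table is the least significant byte of the emitted 64-bit word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the byte that, added modulo 256 to an ASCII code,
// yields the 6-bit Base64 value of that character.
constexpr std::uint8_t roll_offset(unsigned char ascii, unsigned value) {
    return static_cast<std::uint8_t>(value - ascii);
}

// Hypothetical helper: byte `index` of a 64-bit table word, with index 0
// taken as the least significant byte (assumed little-endian layout).
constexpr std::uint8_t lut_byte(std::uint64_t word, unsigned index) {
    return static_cast<std::uint8_t>(word >> (8 * index));
}

// Table word as quoted from line 2227 of the patch under review.
constexpr std::uint64_t kLutRollUrl = 0xb9b9bfbf04111000ULL;

// '_' (0x5F) has value 63 in the URL-safe alphabet: 63 - 95 = -32 = 0xE0.
static_assert(roll_offset('_', 63) == 0xE0, "'_' needs an offset of -20H");

// 'P' (0x50) has value 15 and 'Z' (0x5A) has value 25: both need -65 = 0xBF.
static_assert(roll_offset('P', 15) == 0xBF, "'P' needs an offset of -41H");
static_assert(roll_offset('Z', 25) == 0xBF, "'Z' needs an offset of -41H");

// The slot for upper nibble 5 holds 0xBF, which is right for 'P'..'Z' ...
static_assert(lut_byte(kLutRollUrl, 5) == 0xBF, "slot 5 holds the letter offset");

// ... but applying that same byte to '_' gives 30 rather than 63, so '_'
// has to be steered to a different slot that holds its own offset.
static_assert(static_cast<std::uint8_t>('_' + lut_byte(kLutRollUrl, 5)) == 30,
              "'_' decodes incorrectly through slot 5");
```

Because one slot cannot hold both 0xBF and 0xE0, 0x5F needs an index adjustment that lands it on a slot of its own. An adjustment of -4 from upper nibble 5 would select slot 1.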
-------------
PR: https://git.openjdk.org/jdk/pull/12126