RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2
Scott Gibbons
duke at openjdk.org
Mon Jan 23 18:16:37 UTC 2023
On Mon, 23 Jan 2023 11:58:58 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> Added code for Base64 acceleration (encode and decode) which will accelerate ~4x for AVX2 platforms.
>>
>> Encode performance:
>> **Old:**
>>
>> Benchmark (maxNumBytes) Mode Cnt Score Error Units
>> Base64Encode.testBase64Encode 1024 thrpt 3 4309.439 ± 2.632 ops/ms
>>
>>
>> **New:**
>>
>> Benchmark (maxNumBytes) Mode Cnt Score Error Units
>> Base64Encode.testBase64Encode 1024 thrpt 3 24211.397 ± 102.026 ops/ms
>>
>>
>> Decode performance:
>> **Old:**
>>
>> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 3961.768 ± 93.409 ops/ms
>>
>> **New:**
>> Benchmark (errorIndex) (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>> Base64Decode.testBase64Decode 144 4 1024 thrpt 3 14738.051 ± 24.383 ops/ms
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2661:
>
>> 2659: __ vpbroadcastq(xmm4, Address(r13, 0), Assembler::AVX_256bit);
>> 2660: __ vmovdqu(xmm11, Address(r13, 0x28));
>> 2661: __ vpbroadcastb(xmm10, Address(r13, 0), Assembler::AVX_256bit);
>
> Sorry in advance since I'm probably reading this wrong: the data that `r13` is pointing to appears to be a repeated byte pattern (`0x2f2f2f...`), does this mean this `vpbroadcastb` and the `vpbroadcastq` above end up filling up their respective registers with the exact same bits? If so, and since neither of them is mutated in the code below, then perhaps this can be simplified a bit.
You're reading it correctly - this is redundant and could be handled differently, as the same value is being loaded into ymm4 and ymm10. I don't think there will be any significant performance gain either way. It was written this way to ease the transition to URL acceleration when it is implemented, since the URL-safe alphabet uses '-' and '_' in place of '+' and '/' ('/' = 0x2f).
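
To illustrate the point, here is a minimal plain-Java sketch (not the stub code itself) that models the two broadcasts on a 32-byte array standing in for a ymm register. The class name `BroadcastSketch`, its helper methods, and the 8-byte table of 0x2f are hypothetical, assumed from the `0x2f2f2f...` pattern mentioned in the review comment. It also shows where the standard and URL-safe alphabets differ, using the existing `java.util.Base64` encoders.

```java
import java.util.Arrays;
import java.util.Base64;

public class BroadcastSketch {
    // Width of a ymm register in bytes.
    static final int YMM_BYTES = 32;

    // Models vpbroadcastb: replicate the first source byte into all 32 lanes.
    static byte[] broadcastByte(byte[] src) {
        byte[] ymm = new byte[YMM_BYTES];
        Arrays.fill(ymm, src[0]);
        return ymm;
    }

    // Models vpbroadcastq: replicate the first 8 source bytes into all 4 qword lanes.
    static byte[] broadcastQword(byte[] src) {
        byte[] ymm = new byte[YMM_BYTES];
        for (int i = 0; i < YMM_BYTES; i += 8) {
            System.arraycopy(src, 0, ymm, i, 8);
        }
        return ymm;
    }

    // Encodes with the standard alphabet ('+' and '/').
    static String encodeStandard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Encodes with the URL-safe alphabet ('-' and '_').
    static String encodeUrl(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // Assumed table contents: eight copies of '/' (0x2f).
        byte[] table = new byte[8];
        Arrays.fill(table, (byte) 0x2f);

        // With a repeated-byte source, both broadcasts give identical registers.
        System.out.println(Arrays.equals(broadcastByte(table), broadcastQword(table))); // prints "true"

        // These two bytes encode to 6-bit indices 62, 63, 60, which hit both
        // characters that differ between the two alphabets.
        byte[] input = {(byte) 0xfb, (byte) 0xff};
        System.out.println(encodeStandard(input)); // prints "+/8="
        System.out.println(encodeUrl(input));      // prints "-_8="
    }
}
```

The two registers only coincide while the 8-byte source is a single repeated byte; if the table entries ever diverge, the byte and qword broadcasts would no longer be interchangeable.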
-------------
PR: https://git.openjdk.org/jdk/pull/12126