RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v6]
Scott Gibbons
duke at openjdk.org
Wed Feb 1 21:02:38 UTC 2023
On Fri, 27 Jan 2023 21:45:37 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision:
>>
>> - Merge branch 'openjdk:master' into Base64-AVX2
>> - Merge branch 'openjdk:master' into Base64-AVX2
>> - Merge branch 'openjdk:master' into Base64-AVX2
>> - Merge branch 'Base64-AVX2' of https://github.com/asgibbons/jdk into Base64-AVX2
>> - Merge branch 'openjdk:master' into Base64-AVX2
>> - Address review comment
>> - Remove whitespace
>> - Fix wrong register usage
>> - Working version of Base64 decode with AVX2 (4x perf improvement). No URL support
>> - Merge branch 'Base64-AVX2' of https://github.com/asgibbons/jdk into Base64-AVX2
>> - ... and 3 more: https://git.openjdk.org/jdk/compare/52737e0d...3e66f7be
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2651:
>
>> 2649: __ jcc(Assembler::notZero, L_tailProc);
>> 2650:
>> 2651: __ cmpl(length, 44);
>
> Perform `length` checks first to avoid unnecessary branches on small inputs?
>
> Ideal might be to move this length check up just before the `__ cmpl(length, 128)` in the AVX-512 block, so that if `AVX=3`, short inputs branch directly to the scalar tail procedure without jumping around. This might also apply to the encode stub, though that's pre-existing.

Done.
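
For anyone reading this thread without the diff open, here is a rough sketch of the ordering the review comment asks for. It is plain C++ rather than the MacroAssembler code in stubGenerator_x86_64.cpp, and it is not the actual stub. The thresholds 44 and 128 are the values compared against `length` in the quoted code; `choose_path`, `Path`, and `use_avx` are hypothetical names, and the sketch assumes inputs below a threshold fall through to the scalar tail. Other conditions in the real stub (such as the test before line 2651) are left out.

```cpp
#include <cassert>

// Hypothetical labels for the three code paths; the real stub jumps to
// assembler labels such as L_tailProc instead of returning a value.
enum class Path { ScalarTail, Avx2, Avx512 };

// Hypothetical dispatch helper. use_avx stands in for the CPU feature
// level (assumed here: 3 = AVX-512 available, 2 = AVX2 available).
Path choose_path(int length, int use_avx) {
    assert(length >= 0);

    // Length check first: short inputs leave immediately, before any
    // feature-level branches are evaluated.
    if (length < 44) {
        return Path::ScalarTail;
    }
    // Assumed: the AVX-512 block is only worthwhile from 128 bytes up.
    if (use_avx >= 3 && length >= 128) {
        return Path::Avx512;
    }
    // Assumed: everything else of at least 44 bytes can use AVX2.
    if (use_avx >= 2) {
        return Path::Avx2;
    }
    return Path::ScalarTail;
}
```

With this ordering, a call such as `choose_path(20, 3)` returns `Path::ScalarTail` after a single comparison, which is the "without jumping around" behaviour the comment describes.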
-------------
PR: https://git.openjdk.org/jdk/pull/12126