RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v6]

Scott Gibbons duke at openjdk.org
Wed Feb 1 21:02:38 UTC 2023


On Fri, 27 Jan 2023 21:45:37 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 13 additional commits since the last revision:
>> 
>>  - Merge branch 'openjdk:master' into Base64-AVX2
>>  - Merge branch 'openjdk:master' into Base64-AVX2
>>  - Merge branch 'openjdk:master' into Base64-AVX2
>>  - Merge branch 'Base64-AVX2' of https://github.com/asgibbons/jdk into Base64-AVX2
>>  - Merge branch 'openjdk:master' into Base64-AVX2
>>  - Address review comment
>>  - Remove whitespace
>>  - Fix wrong register usage
>>  - Working version of Base64 decode with AVX2 (4x perf improvement). No URL support
>>  - Merge branch 'Base64-AVX2' of https://github.com/asgibbons/jdk into Base64-AVX2
>>  - ... and 3 more: https://git.openjdk.org/jdk/compare/52737e0d...3e66f7be
>
> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2651:
> 
>> 2649:     __ jcc(Assembler::notZero, L_tailProc);
>> 2650: 
>> 2651:     __ cmpl(length, 44);
> 
> Perform `length` checks first to avoid unnecessary branches on small inputs?
> 
> Ideally, move this length check up to just before the `__ cmpl(length, 128)` in the AVX-512 block, so that when `AVX=3`, short inputs branch directly to the scalar tail procedure without jumping around. This might also apply to the encode stub, though that's pre-existing.

Done.
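
For readers following the thread, the reordering requested above can be sketched in plain C++ rather than in HotSpot's macro assembler. This is only an illustration under stated assumptions, not the code in the PR: the names `DecodePath` and `choose_decode_path` are hypothetical, the comparison directions are assumed, and only the constants 44 and 128 come from the quoted review.

```cpp
// Hypothetical illustration; not the stub generator code from the PR.
// Only the thresholds 44 and 128 are taken from the review thread.
enum class DecodePath { ScalarTail, Avx2, Avx512 };

// Sketch of the suggested ordering: compare the length first, so that
// short inputs reach the scalar tail after a single branch regardless
// of which AVX level is enabled.
DecodePath choose_decode_path(int length, int use_avx) {
    // Assumed direction: inputs shorter than the smallest vector
    // threshold go straight to the scalar tail procedure.
    if (length < 44) {
        return DecodePath::ScalarTail;
    }
    // Assumed: the AVX-512 path needs AVX level 3 and at least 128 bytes.
    if (use_avx >= 3 && length >= 128) {
        return DecodePath::Avx512;
    }
    // Assumed: anything else with AVX2 available uses the AVX2 path.
    if (use_avx >= 2) {
        return DecodePath::Avx2;
    }
    return DecodePath::ScalarTail;
}
```

With the length test first, a 10-byte input takes one comparison and one branch, which is the effect the review comment asks for.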

-------------

PR: https://git.openjdk.org/jdk/pull/12126


More information about the core-libs-dev mailing list