RFR: 8318650: Optimized subword gather for x86 targets. [v10]
Jatin Bhateja
jbhateja at openjdk.org
Tue Jan 16 06:20:24 UTC 2024
On Mon, 15 Jan 2024 14:27:43 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits:
>>
>> - Accelerating masked sub-word gathers for AVX2 targets, this gives additional 1.5-4x speedups over existing implementation.
>> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8318650
>> - Removing JDK-8321648 related changes.
>> - Refined AVX3 implementation with integral gather.
>> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8318650
>> - Fix incorrect comment
>> - Review comments resolutions.
>> - Review comments resolutions.
>> - Review comments resolutions.
>> - Restricting masked sub-word gather to AVX512 target to align with integral gather support.
>> - ... and 2 more: https://git.openjdk.org/jdk/compare/518ec971...de47076e
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1900:
>
>> 1898: vgather8b(elem_ty, xtmp3, base, idx_base, rtmp, vlen_enc);
>> 1899: } else {
>> 1900: LP64_ONLY(vgather8b_masked(elem_ty, xtmp3, base, idx_base, mask, midx, rtmp, vlen_enc));
>
> What happens if not `LP64_ONLY`?
The check that skips 32-bit targets is part of `match_rule_supported_vector`: https://github.com/openjdk/jdk/pull/16354/files#diff-d6a3624f0f0af65a98a47378a5c146eed5016ca09b4de1acd0a3acc823242e82R1921
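
To make the interaction concrete, here is a minimal standalone sketch of how the two pieces fit together. It is not the HotSpot code: `match_rule_supported_vector_sketch` and `emit_gather_sketch` are hypothetical stand-ins, the exact condition in the real `match_rule_supported_vector` is assumed rather than copied from the PR, and only the `LP64_ONLY` definition follows the usual HotSpot convention (the argument is kept on 64-bit builds and dropped otherwise).

```cpp
#include <cassert>
#include <string>

// Same shape as the HotSpot macro: the argument survives only when _LP64 is defined.
#ifdef _LP64
#define LP64_ONLY(code) code
#else
#define LP64_ONLY(code)
#endif

// Hypothetical stand-in for the matcher-level check (assumed condition):
// on 32-bit builds the masked subword gather is reported as unsupported,
// so the emit path below is never asked to handle it.
bool match_rule_supported_vector_sketch(bool masked_subword_gather) {
#ifndef _LP64
  if (masked_subword_gather) {
    return false;
  }
#endif
  return true;
}

// Hypothetical stand-in for the emit site quoted above. It returns the name
// of the helper that would be called instead of emitting instructions.
std::string emit_gather_sketch(bool masked) {
  std::string emitted;
  if (!masked) {
    emitted = "vgather8b";
  } else {
    // On 32-bit builds this statement is compiled out and the branch is empty.
    LP64_ONLY(emitted = "vgather8b_masked";)
  }
  return emitted;
}

// Usage check: the masked branch emits something exactly when the
// matcher-level check says the operation is supported.
void usage_check() {
  assert(emit_gather_sketch(false) == "vgather8b");
  assert(match_rule_supported_vector_sketch(true) ==
         !emit_gather_sketch(true).empty());
}
```

The point of the sketch is that the empty 32-bit branch is harmless only because the supported-vector check rejects the masked case before code emission is reached.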
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1452969364