RFR: 8318650: Optimized subword gather for x86 targets. [v11]
Jatin Bhateja
jbhateja at openjdk.org
Fri Feb 2 03:38:05 UTC 2024
On Wed, 31 Jan 2024 21:29:08 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. Incremental views are not available.
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1613:
>
>> 1611: vpand(xtmp, idx_vec, xtmp, vlen_enc);
>> 1612: // Load double words from normalized indices.
>> 1613: evpgatherdd(dst, gmask, Address(base, xtmp, scale), vlen_enc);
>
> Considering the byte vector case, could we not do here directly:
> evpgatherdd(dst, gmask, Address(base, idx_vec, scale), vlen_enc);
> Then we don't need lines 1609-1611 or lines 1616-1621.
Gathering happens at doubleword granularity, whereas masks are applied over sub-word lanes.
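
To make the granularity mismatch concrete, here is a minimal scalar sketch of the general idea. It is not the code from the PR. The function names are hypothetical, and it assumes a little-endian target (as on x86) and a source buffer padded to a multiple of 4 bytes. Each byte index is first normalized to the doubleword that contains it, that doubleword is loaded, and the wanted byte is then extracted. The mask is per byte lane, so it is applied to the extracted bytes rather than to the doubleword load.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical scalar model of a single lane: fetch the byte at `idx`
// using only a 32-bit load. Assumes little-endian byte order and that
// the buffer length is a multiple of 4 bytes.
inline uint8_t gather_byte_via_dword(const uint8_t* base, uint32_t idx) {
    uint32_t dword_idx = idx >> 2;   // index of the containing doubleword
    uint32_t dword;
    std::memcpy(&dword, base + 4u * dword_idx, sizeof(dword));
    uint32_t shift = (idx & 3u) * 8u;   // bit offset of the byte in the doubleword
    return static_cast<uint8_t>(dword >> shift);
}

// Hypothetical masked gather over byte lanes. The mask has one entry per
// byte lane; masked-off lanes are left as zero in this sketch.
inline std::vector<uint8_t> gather_bytes_masked(const uint8_t* base,
                                                const std::vector<uint32_t>& idx,
                                                const std::vector<bool>& mask) {
    assert(idx.size() == mask.size());
    std::vector<uint8_t> dst(idx.size(), 0);
    for (std::size_t i = 0; i < idx.size(); ++i) {
        if (mask[i]) {
            dst[i] = gather_byte_via_dword(base, idx[i]);
        }
    }
    return dst;
}
```

In this model, passing the raw byte index straight to the doubleword load would address the wrong element, and a mask with one bit per doubleword could not express which individual byte lanes are active.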
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1475481601