RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v5]
Emanuel Peter
epeter at openjdk.org
Tue Jan 16 07:11:25 UTC 2024
On Tue, 16 Jan 2024 06:13:43 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5309:
>>
>>> 5307: assert(bt == T_LONG || bt == T_DOUBLE, "");
>>> 5308: vmovmskpd(rtmp, mask, vec_enc);
>>> 5309: shlq(rtmp, 5); // for 64 bit rows (4 longs)
>>
>> Suggestion:
>>
>> shlq(rtmp, 5); // for 32 bit rows (4 longs)
>
> Each long/double permute lane holds 64 bit value.
@jatin-bhateja so why do you shift by 5? I thought 4 longs are 32 bit?
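
For reference, the arithmetic in question can be sketched as below. It assumes each permute-table row holds 4 lanes of 64 bits, as the quoted comment "(4 longs)" and the reply above suggest. On that assumption a row is 256 bits, i.e. 32 bytes, and a left shift by 5 turns the lane mask into a byte offset. All names here are hypothetical and not taken from the HotSpot sources.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical constants for illustration only (not HotSpot identifiers).
constexpr std::size_t kLanesPerRow = 4;                          // "(4 longs)" per the quoted comment
constexpr std::size_t kLaneBytes   = sizeof(std::uint64_t);      // one 64-bit lane = 8 bytes
constexpr std::size_t kRowBytes    = kLanesPerRow * kLaneBytes;  // 4 * 8 = 32 bytes

// Byte offset of the table row selected by a 4-bit lane mask,
// i.e. the kind of value vmovmskpd leaves in a general-purpose register.
constexpr std::size_t row_byte_offset(std::uint32_t mask) {
    return static_cast<std::size_t>(mask) << 5;  // same as mask * 32
}

static_assert(kRowBytes == 32, "a row of 4 longs is 32 bytes");
static_assert(kRowBytes * 8 == 256, "which is 256 bits, not 32 or 64 bits");
static_assert(row_byte_offset(1) == kRowBytes, "shift by 5 steps one full row");
```

Under these assumptions neither "64 bit rows" nor "32 bit rows" describes the row size; "32 byte rows" would match the shift amount.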
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17261#discussion_r1453003935