RFR: 8322768: Optimize non-subword vector compress and expand APIs for AVX2 target. [v2]
Jatin Bhateja
jbhateja at openjdk.org
Fri Jan 5 07:08:37 UTC 2024
On Thu, 4 Jan 2024 13:41:40 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Updating copyright year of modified files.
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5307:
>
>> 5305: assert(bt == T_LONG || bt == T_DOUBLE, "");
>> 5306: vmovmskpd(rtmp, mask, vec_enc);
>> 5307: shlq(rtmp, 5);
>
> Might this need to be 6? If I understand right, then you want to have a 64bit stride, hence 2^6, right?
> If that is correct, then this did not show in your tests, and you need a regression test anyway.
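This computes the byte offset from the start of the table. Both the integer and the long permute tables have the same row size: 8 int elements vs. 4 long elements, i.e. 32 bytes per row in either case, so the shift by 5 (2^5 = 32) is the row stride in bytes, not the element width in bits.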
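
To make the arithmetic concrete, here is a minimal standalone sketch of the offset computation. It is not the HotSpot code: the constant and function names are hypothetical, and the row layout (8 x 32-bit or 4 x 64-bit entries per row) is assumed from the description above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration only; these are not HotSpot identifiers.
// Assumed layout: one table row per mask value, each row holding a full permute vector.
constexpr std::size_t kIntLanesPerRow  = 8;  // 8 x 32-bit entries (int/float case)
constexpr std::size_t kLongLanesPerRow = 4;  // 4 x 64-bit entries (long/double case)

constexpr std::size_t kIntRowBytes  = kIntLanesPerRow  * sizeof(std::int32_t);
constexpr std::size_t kLongRowBytes = kLongLanesPerRow * sizeof(std::int64_t);

// Both layouts give the same row stride: 32 bytes = 2^5.
static_assert(kIntRowBytes == 32, "int row is 32 bytes");
static_assert(kLongRowBytes == 32, "long row is 32 bytes");

constexpr unsigned kRowShift = 5;  // log2(32)

// Byte offset of the row selected by the mask bits,
// mirroring the effect of "shlq(rtmp, 5)" on the extracted mask.
constexpr std::size_t row_byte_offset(std::uint32_t mask_bits) {
  return static_cast<std::size_t>(mask_bits) << kRowShift;
}
```

With a 4-lane mask (values 0 to 15), row_byte_offset(1) is 32 and row_byte_offset(15) is 480, so consecutive rows are 32 bytes apart regardless of the element type.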
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17261#discussion_r1442555037
More information about the hotspot-compiler-dev mailing list