RFR: 8318650: Optimized subword gather for x86 targets. [v10]
Emanuel Peter
epeter at openjdk.org
Tue Jan 16 07:28:22 UTC 2024
On Tue, 16 Jan 2024 06:08:40 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1627:
>>
>>> 1625: vpsrlvd(dst, dst, xtmp, vlen_enc);
>>> 1626: // Pack double word vector into byte vector.
>>> 1627: vpackI2X(T_BYTE, dst, ones, xtmp, vlen_enc);
>>
>> I would prefer if there was less code duplication here. I think there are just a few values which you could set to variables, and then apply for both versions.
>
> The meaty part of the algorithm accepts different operands (lines #1593, #1599 and #1601); keeping the two flows for SHORT and BYTE separate will be more maintainable.
All I see is replacing a 3 with a 4, etc. The duplication just makes this very long to review and makes the differences hard to spot.
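
To illustrate the kind of refactoring I mean, here is a minimal scalar sketch. It is not the code from the PR. The names `ElemType`, `SubwordParams`, `params_for` and `gather_one` are hypothetical, and the constants are my assumption of what differs between a byte flow and a short flow. The point is only that the type-dependent values are computed once and a single flow uses them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the element type; only the two subword cases.
enum class ElemType { Byte, Short };

// Hypothetical bundle of the values that differ between the two flows.
struct SubwordParams {
  int lanes_per_dword;  // elements packed into one 32-bit word
  int index_shift;      // log2(lanes_per_dword)
  int lane_mask;        // lanes_per_dword - 1
  int bits_per_lane;    // element size in bits
};

// Assumed constants: 4 bytes or 2 shorts per 32-bit word.
constexpr SubwordParams params_for(ElemType t) {
  return t == ElemType::Byte ? SubwordParams{4, 2, 3, 8}
                             : SubwordParams{2, 1, 1, 16};
}

// Scalar model of a subword gather: load the enclosing 32-bit word,
// shift the wanted lane down to bit 0, and mask off the rest.
// Lane 0 is taken to be the low-order bits of the word.
uint32_t gather_one(const uint32_t* words, uint32_t index, ElemType t) {
  const SubwordParams p = params_for(t);
  const uint32_t word  = words[index >> p.index_shift];
  const uint32_t shift = (index & p.lane_mask) * p.bits_per_lane;
  const uint32_t mask  = (1u << p.bits_per_lane) - 1u;
  return (word >> shift) & mask;
}
```

With this shape, the BYTE and SHORT cases share one body, and a reviewer only has to compare the two rows of constants in `params_for`.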
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1453017659
More information about the hotspot-compiler-dev mailing list