RFR: 8318650: Optimized subword gather for x86 targets. [v10]

Emanuel Peter epeter at openjdk.org
Tue Jan 16 07:28:22 UTC 2024


On Tue, 16 Jan 2024 06:08:40 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1627:
>> 
>>> 1625:     vpsrlvd(dst, dst, xtmp, vlen_enc);
>>> 1626:     // Pack double word vector into byte vector.
>>> 1627:     vpackI2X(T_BYTE, dst, ones, xtmp, vlen_enc);
>> 
>> I would prefer less code duplication here. I think only a few values differ; you could store those in variables and then use the same code for both versions.
>
> The meaty part of the algorithm accepts different operands (lines #1593, #1599 and #1601), so keeping the two flows for SHORT and BYTE separate will be more maintainable.

All I see is replacing a 3 with a 4, etc. The duplication just makes this very long to review, and makes it hard to spot the differences.
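
A minimal sketch of the kind of refactoring being asked for, written as standalone scalar C++ rather than MacroAssembler code. Everything in it is an assumption made for illustration: the names `ElemKind`, `SubwordParams`, `params_for` and `gather_subword` are hypothetical, and the constants are those of a plain scalar subword extraction, not values taken from the PR. The point is only that the BYTE and SHORT flows can share one body once the differing constants are pulled into variables.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not the HotSpot MacroAssembler code.
enum class ElemKind { Byte, Short };

struct SubwordParams {
  uint32_t index_shift;  // log2(elements per 32-bit word)
  uint32_t lane_mask;    // selects the element's position inside the word
  uint32_t bit_shift;    // log2(bits per element)
  uint32_t value_mask;   // keeps only the element's bits
};

// The only place where the two element kinds differ.
constexpr SubwordParams params_for(ElemKind kind) {
  return kind == ElemKind::Byte
      ? SubwordParams{2u, 3u, 3u, 0xFFu}
      : SubwordParams{1u, 1u, 4u, 0xFFFFu};
}

// One shared flow: load the containing 32-bit word, shift the element
// down to bit 0, and mask off the remaining bits.
uint32_t gather_subword(const std::vector<uint32_t>& words,
                        uint32_t index, ElemKind kind) {
  const SubwordParams p = params_for(kind);
  const uint32_t word = words[index >> p.index_shift];
  const uint32_t shift = (index & p.lane_mask) << p.bit_shift;
  return (word >> shift) & p.value_mask;
}
```

With this shape, a reviewer reads the shared flow once and checks the per-type constants in a single table, instead of comparing two near-identical blocks line by line.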

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1453017659

