[vectorIntrinsics] RFR: 8265312: Unsigned comparison operators

Jatin Bhateja jbhateja at openjdk.java.net
Wed Apr 21 16:56:45 UTC 2021


On Wed, 21 Apr 2021 16:24:44 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 2183:
>> 
>>> 2181:     vpand(vtmp3, vtmp3, ExternalAddress(StubRoutines::x86::vector_short_to_byte_mask()), vlen_enc, scratch);
>>> 2182:     vpackuswb(dst, dst, vtmp3, vlen_enc);
>>> 2183:     vpermpd(dst, dst, 0xd8, vlen_enc);
>> 
>> Since the comparison is performed at lane level (128 bits, by the x86 definition) and we later pack the results of the two lanes, I am not sure the final permute instruction is needed.
>
> If you look at vpackuswb, the packing is done per 128-bit lane, alternating between src1 and src2,
> i.e.
> 0-127 bits from src1 -> 0-63 bits in dst
> 0-127 bits from src2 -> 64-127 bits in dst
> 128-255 bits from src1 -> 128-191 bits in dst
> 128-255 bits from src2 -> 192-255 bits in dst
> The src1 and src2 results are interleaved in dst and need to be put in their proper place by vpermpd.

Yes, that's correct. Thanks!
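
For illustration, the lane behaviour described in the quoted explanation can be modelled in scalar C++. This is only a sketch and not HotSpot code: the names Words256, Bytes256, model_vpackuswb, model_vpermpd and check_pack_then_permute are hypothetical, and the model assumes 256-bit operands (two 128-bit lanes). It shows that after the pack the quadwords come out as src1-low, src2-low, src1-high, src2-high, and that a quadword permute with immediate 0xd8 (selectors 0, 2, 1, 3) restores the order src1 followed by src2.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Words256 = std::array<int16_t, 16>;  // 256-bit vector viewed as 16 signed words
using Bytes256 = std::array<uint8_t, 32>;  // 256-bit vector viewed as 32 bytes

// Unsigned saturation of a signed word to a byte, as the pack does.
static uint8_t saturate_u8(int16_t w) {
  if (w < 0) return 0;
  if (w > 255) return 255;
  return static_cast<uint8_t>(w);
}

// Scalar model of a 256-bit pack-words-to-bytes: each 128-bit lane is
// packed independently, src1's lane into the low 64 bits of the
// destination lane and src2's lane into the high 64 bits.
Bytes256 model_vpackuswb(const Words256& src1, const Words256& src2) {
  Bytes256 dst{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    for (std::size_t i = 0; i < 8; ++i) {
      dst[16 * lane + i]     = saturate_u8(src1[8 * lane + i]);
      dst[16 * lane + 8 + i] = saturate_u8(src2[8 * lane + i]);
    }
  }
  return dst;
}

// Scalar model of a 256-bit quadword permute: destination quadword q
// is the source quadword selected by bits [2q+1:2q] of imm8.
Bytes256 model_vpermpd(const Bytes256& src, uint8_t imm8) {
  Bytes256 dst{};
  for (std::size_t q = 0; q < 4; ++q) {
    std::size_t sel = (imm8 >> (2 * q)) & 3;
    for (std::size_t b = 0; b < 8; ++b) {
      dst[8 * q + b] = src[8 * sel + b];
    }
  }
  return dst;
}

// Small self-check: src1 holds 0..15 and src2 holds 100..115.
void check_pack_then_permute() {
  Words256 src1{}, src2{};
  for (std::size_t i = 0; i < 16; ++i) {
    src1[i] = static_cast<int16_t>(i);
    src2[i] = static_cast<int16_t>(100 + i);
  }

  Bytes256 packed = model_vpackuswb(src1, src2);
  // Without the permute, bytes 8..15 already come from src2.
  assert(packed[8] == 100);
  assert(packed[16] == 8);

  Bytes256 fixed = model_vpermpd(packed, 0xd8);
  // With the permute, all of src1 comes first, then all of src2.
  for (std::size_t i = 0; i < 16; ++i) {
    assert(fixed[i] == i);
    assert(fixed[16 + i] == 100 + i);
  }
}
```

Calling check_pack_then_permute() from a test or a main() exercises both steps; the asserts fail if the 0xd8 permute is left out.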

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/68

