[vectorIntrinsics] RFR: Improve mask reduction operations on AVX [v4]

Sandhya Viswanathan sviswanathan at openjdk.java.net
Wed Nov 17 01:17:57 UTC 2021


On Tue, 16 Nov 2021 08:16:11 GMT, Mai Đặng Quân Anh <duke at openjdk.java.net> wrote:

>> Hi,
>> This patch improves the logic of vector mask reduction operations on AVX, especially int, float, long, double, by using vmovmskpd and vmovmskps instructions. I also do a little refactoring to reduce duplication in toLong. The patch temporarily disables these operations on Neon, though.
>> Thank you very much.
>
> Mai Đặng Quân Anh has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use duplicate dst instead of noreg, use lzcnt on hardware that supports it
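
For readers following along, here is a rough model of the idea in the patch description: collapse the vector mask into an integer bit pattern the way vmovmskps does (one sign bit per 32-bit lane), then reduce that integer. This is only a portable sketch, not the code in the PR. The function names are made up for illustration, and it assumes a true lane is stored as all ones, so its sign bit is set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of vmovmskps: bit i of the result is the sign bit of
// 32-bit lane i. Assumes a true mask lane is all ones (negative as int32_t).
uint32_t movmsk_ps_model(const int32_t* lanes, size_t n) {
  assert(n <= 32);  // the result only has room for 32 lanes
  uint32_t bits = 0;
  for (size_t i = 0; i < n; ++i) {
    if (lanes[i] < 0) {
      bits |= (1u << i);
    }
  }
  return bits;
}

// trueCount: number of set lanes (what popcnt would compute on the bits).
int true_count(uint32_t bits) {
  int count = 0;
  while (bits != 0) {
    bits &= bits - 1;  // clear the lowest set bit
    ++count;
  }
  return count;
}

// firstTrue: index of the lowest set lane, or masklen when no lane is set.
int first_true(uint32_t bits, int masklen) {
  for (int i = 0; i < masklen; ++i) {
    if ((bits >> i) & 1u) {
      return i;
    }
  }
  return masklen;
}
```

Once the mask is an integer, toLong is just the bit pattern itself, and the remaining reductions become scalar bit operations.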

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4058:

> 4056: void C2_MacroAssembler::vector_mask_operation(int opc, Register dst, KRegister mask,
> 4057:                                               int masklen, int masksize, int vec_enc) {
> 4058:   assert(VM_Version::supports_bmi1(), "Strange hardware");

This assert also needs to check VM_Version::supports_lzcnt().

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4148:

> 4146:       break;
> 4147:     case Op_VectorMaskLastTrue:
> 4148:       if (VM_Version::supports_bmi1()) {

This should be VM_Version::supports_lzcnt(), not supports_bmi1(). LZCNT has its own CPUID feature bit and is not implied by BMI1.
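
To illustrate why the feature check matters, here is a small portable model, not the code in the PR. On a CPU without LZCNT, the lzcnt encoding is executed as bsr, which returns the index of the highest set bit rather than the leading-zero count, so guarding on the wrong flag can silently give a different answer. The helper names below are hypothetical, and the sketch assumes the mask has already been reduced to at most 32 bits in a general-purpose register.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for lzcnt on a 32-bit operand: counts leading zero
// bits and returns 32 for a zero input.
int lzcnt32(uint32_t x) {
  int n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1) {
    ++n;
  }
  return n;
}

// Portable stand-in for bsr on a nonzero operand: index of the highest
// set bit. The real instruction leaves its destination undefined for zero.
int bsr32(uint32_t x) {
  assert(x != 0);
  int idx = 31;
  while ((x & (1u << idx)) == 0) {
    --idx;
  }
  return idx;
}

// Hypothetical lastTrue on a mask bit pattern: index of the highest set
// lane, or -1 when no lane is set. With lzcnt semantics no zero check is
// needed, because lzcnt32(0) == 32 gives 31 - 32 == -1.
int last_true(uint32_t mask_bits, int masklen) {
  assert(masklen >= 1 && masklen <= 32);
  uint32_t lanes =
      (masklen == 32) ? mask_bits : (mask_bits & ((1u << masklen) - 1u));
  return 31 - lzcnt32(lanes);
}
```

For the bit pattern 0b0101, lzcnt32 gives 29 while bsr32 gives 2, so a 31 - lzcnt computation that actually ran as bsr would produce 29 instead of lane index 2.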

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/158

