[vectorIntrinsics] RFR: 8284459: Add x86 back-end implementation for LEADING and TRAILING ZEROS COUNT operations [v4]

Jatin Bhateja jbhateja at openjdk.java.net
Fri Apr 22 05:19:02 UTC 2022


On Thu, 21 Apr 2022 00:51:34 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

> I only have one comment remaining. We do auto-vectorize PopCountVL today. The masked support for auto-vectorized tail loop was added recently on mainline. So on masked path we should handle the conversion from long to int when the result type is int vector. Rest of the patch looks good to me. Please fix and integrate.

Hi @sviswa7, thanks for your comments. Post-loop tail vectorization currently does not generate predicated operations other than vector loads/stores. The SLP algorithm operates over expression trees delimited by memory operations [or ending in scalarized users], so predicating just the loads/stores is sufficient to ensure correct semantics.
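To illustrate why predicating only the memory operations can be enough, here is a minimal scalar model in plain Java. It is a sketch, not HotSpot code: the class `MaskedTailSketch`, the helper names, and the 4-lane width are all hypothetical. The lane-wise operation runs unpredicated on every lane; inactive lanes are never loaded from memory and never stored back, so their values cannot be observed.

```java
import java.util.Arrays;

// Hypothetical scalar model of a masked tail loop; names and lane count are assumptions.
final class MaskedTailSketch {
    static final int LANES = 4; // assumed vector width, in long lanes

    // Masked load: inactive lanes read as zero and never touch memory.
    static long[] maskedLoad(long[] src, int offset, boolean[] mask) {
        long[] v = new long[LANES];
        for (int i = 0; i < LANES; i++) {
            if (mask[i]) v[i] = src[offset + i];
        }
        return v;
    }

    // Unpredicated lane-wise operation: computed on every lane, active or not.
    static long[] popCount(long[] v) {
        long[] r = new long[LANES];
        for (int i = 0; i < LANES; i++) r[i] = Long.bitCount(v[i]);
        return r;
    }

    // Masked store: only active lanes are written back.
    static void maskedStore(long[] dst, int offset, long[] v, boolean[] mask) {
        for (int i = 0; i < LANES; i++) {
            if (mask[i]) dst[offset + i] = v[i];
        }
    }

    // Processes the whole array in LANES-sized steps; the last step may be partial.
    static long[] run(long[] src) {
        long[] dst = new long[src.length];
        for (int off = 0; off < src.length; off += LANES) {
            boolean[] mask = new boolean[LANES];
            for (int i = 0; i < LANES; i++) mask[i] = off + i < src.length;
            maskedStore(dst, off, popCount(maskedLoad(src, off, mask)), mask);
        }
        return dst;
    }

    public static void main(String[] args) {
        // Six elements: one full step plus a two-lane tail.
        System.out.println(Arrays.toString(run(new long[] {1L, 3L, 7L, 15L, 31L, 255L})));
    }
}
```

In this model the second step computes `popCount` on two inactive lanes as well, but the masked store discards those results.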

> src/hotspot/cpu/x86/x86.ad line 9251:
> 
>> 9249:     __ evmovdquq($dst$$XMMRegister, $src$$XMMRegister, vlen_enc);
>> 9250:     __ vector_count_leading_zeros_evex(bt, $dst$$XMMRegister, $src$$XMMRegister, xnoreg, xnoreg,
>> 9251:                                        xnoreg, $mask$$KRegister, noreg, true, vlen_enc);
> 
> For the PopCountVL and CountLeadingZerosV for long, if the result type is T_INT the evpmovqd instruction needs to be generated.

Only the Vector API reaches this control path, and for the Vector API the result lane size matches the source lane size.
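The lane-size distinction can be sketched in plain Java. This is only a scalar model under stated assumptions: `LaneSizeSketch` and its method names are hypothetical, and the second method merely imitates a lane-wise Vector API operation rather than calling `jdk.incubator.vector`. The first method has the scalar shape where `Long.numberOfLeadingZeros` returns `int`, so a vectorized form would need a long-to-int narrowing; the second keeps the result in long lanes, so no narrowing is needed.

```java
// Hypothetical scalar model; does not use the incubating Vector API itself.
final class LaneSizeSketch {
    // Scalar loop shape: long input, int result (Long.numberOfLeadingZeros returns int).
    // A vectorized version of this shape would have to narrow each lane from long to int.
    static int[] scalarStyle(long[] in) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Long.numberOfLeadingZeros(in[i]);
        }
        return out;
    }

    // Model of a lane-wise operation on long lanes: the result stays in long lanes,
    // so the result lane size equals the source lane size.
    static long[] vectorApiStyle(long[] in) {
        long[] out = new long[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Long.numberOfLeadingZeros(in[i]); // int count widened into a long lane
        }
        return out;
    }

    public static void main(String[] args) {
        long[] in = {0L, 1L, -1L, 1L << 32};
        System.out.println(java.util.Arrays.toString(scalarStyle(in)));
        System.out.println(java.util.Arrays.toString(vectorApiStyle(in)));
    }
}
```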

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/189