[vectorIntrinsics+compress] RFR: 8274971: Add PrefixMask API

Mon Oct 11 09:58:22 UTC 2021

On Fri, 8 Oct 2021 11:56:05 GMT, Joshua Zhu <jzhu at openjdk.org> wrote:

> I have split my implementation of the "compress" API into several patches for easier review.
> This change introduces the PrefixMask API for VectorMask.
> It works together with the compress/expand API. (See its usage in the ALIBABA selectiveStore use case.)
> It returns a prefix mask, based on the true count of the mask.
> If "N" is the true count of the mask, the lanes numbered 0 through "N-1" are set and all remaining lanes are unset.
> For now, mask.prefixMask() is implemented as
> 
>     vectorSpecies.iota().compare(VectorOperators.LT, trueCount());
> 
> The alternative implementation is:
> 
>     vectorSpecies().indexInRange(0, m.trueCount())
> 
> I chose the former implementation because the latter depends on intrinsic support for indexVector.
> 
> I'm looking for instructions that could accelerate indexVector/iota, so that a vector-to-vector operation combined with a store/load and a prefix mask could be optimized further into a single memory-form instruction.
> Intel experts, do you have any suggestions on SIMD instructions for iota vector generation?

src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMask.java line 640:

> 638:         return species.iota().compare(VectorOperators.LT, trueCount());
> 639:     }
> 640: 

I am not sure this API is needed as a public interface, since all of its constituents are already publicly exposed APIs.
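
For reference, here is a minimal scalar sketch of the semantics described in the quoted text, assuming a mask can be modeled as a boolean[] with one element per lane. It does not use the Vector API. The class name PrefixMaskSketch and its helper methods are hypothetical and exist only for illustration; compress is modeled as "pack the selected lanes to the front, zero the rest", which is my reading of the selectiveStore use case rather than something stated in the patch.

```java
import java.util.Arrays;

// Hypothetical scalar model; not part of jdk.incubator.vector.
public class PrefixMaskSketch {
    // Models VectorMask.trueCount(): the number of set lanes.
    static int trueCount(boolean[] mask) {
        int n = 0;
        for (boolean b : mask) {
            if (b) n++;
        }
        return n;
    }

    // Models prefixMask(): lane i is set iff i < trueCount(mask),
    // mirroring iota().compare(VectorOperators.LT, trueCount()).
    static boolean[] prefixMask(boolean[] mask) {
        int n = trueCount(mask);
        boolean[] out = new boolean[mask.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = i < n;
        }
        return out;
    }

    // Assumed model of compress: selected lanes are packed to the
    // front in lane order; the remaining lanes are zero.
    static int[] compress(int[] v, boolean[] mask) {
        int[] out = new int[v.length];
        int j = 0;
        for (int i = 0; i < v.length; i++) {
            if (mask[i]) out[j++] = v[i];
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] m = {false, true, false, true, true, false, false, true};
        int[] v = {10, 11, 12, 13, 14, 15, 16, 17};
        System.out.println(Arrays.toString(prefixMask(m)));
        System.out.println(Arrays.toString(compress(v, m)));
    }
}
```

In this model, storing compress(v, m) under prefixMask(m) writes exactly the selected elements contiguously, which is how the two operations would cooperate in a selective store.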

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/148