[vectorIntrinsics+compress] RFR: 8274971: Add PrefixMask API
Jatin Bhateja
jbhateja at openjdk.java.net
Mon Oct 11 09:58:22 UTC 2021
On Fri, 8 Oct 2021 11:56:05 GMT, Joshua Zhu <jzhu at openjdk.org> wrote:
> I have split my implementation of the "compress" API into several patches for easier review.
> This change introduces the PrefixMask API for VectorMask.
> It works together with the compress/expand API. (See the usage in ALIBABA selectiveStore use case.)
> It returns a prefix mask derived from the true count of the mask.
> If "N" is the true count of the mask, lanes 0 through "N-1" are set and all remaining lanes are unset.
> For now, mask.prefixMask() is implemented as:
>
>     vectorSpecies.iota().compare(VectorOperators.LT, trueCount());
>
> The alternative implementation is:
>
>     vectorSpecies().indexInRange(0, m.trueCount())
>
> I chose the former implementation because the latter depends on intrinsic support for indexVector.
>
> I'm looking for instructions that could accelerate indexVector/iota, so that a vector-to-vector operation combined with a load/store and a prefix mask could be optimized further into a single memory-form instruction.
> Intel experts, do you have any suggestions on SIMD instructions for iota vector generation?
src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMask.java line 640:
> 638: return species.iota().compare(VectorOperators.LT, trueCount());
> 639: }
> 640:
I'm not sure this API is needed as a public interface, since all of its constituents are already publicly exposed APIs.
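
For reference, the semantics under discussion can be modeled in plain scalar Java, without the incubating Vector API. This is only a sketch: the class PrefixMaskModel and all of its method names are hypothetical and not part of the patch, boolean[] and int[] stand in for VectorMask and IntVector, and compress() is assumed to zero-fill the unused tail lanes. It shows that the two formulations quoted above (iota < N and indexInRange(0, N)) yield the same mask, and how that mask marks the valid lanes after a compress.

```java
import java.util.Arrays;

// Scalar reference model of the prefix-mask semantics.
// All names here are hypothetical; arrays stand in for vectors and masks.
public class PrefixMaskModel {

    // Number of set lanes, modeling VectorMask.trueCount().
    public static int trueCount(boolean[] mask) {
        int n = 0;
        for (boolean lane : mask) {
            if (lane) n++;
        }
        return n;
    }

    // Formulation 1: compare an iota vector (0, 1, 2, ...) against N with LT.
    public static boolean[] prefixMaskViaIota(boolean[] mask) {
        int n = trueCount(mask);
        boolean[] out = new boolean[mask.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = i < n; // lane index i plays the role of iota[i]
        }
        return out;
    }

    // Formulation 2: models indexInRange(offset, limit), where lane i is set
    // iff offset + i lies in [0, limit).
    public static boolean[] indexInRange(int lanes, int offset, int limit) {
        boolean[] out = new boolean[lanes];
        for (int i = 0; i < lanes; i++) {
            int index = offset + i;
            out[i] = index >= 0 && index < limit;
        }
        return out;
    }

    // Models compress: selected lanes are packed to the front in order.
    // Assumption: the remaining tail lanes are zero-filled.
    public static int[] compress(int[] v, boolean[] mask) {
        int[] out = new int[v.length];
        int j = 0;
        for (int i = 0; i < v.length; i++) {
            if (mask[i]) out[j++] = v[i];
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] m = {false, true, true, false, true, false, false, true};
        int[] v = {10, 11, 12, 13, 14, 15, 16, 17};

        boolean[] viaIota = prefixMaskViaIota(m);
        boolean[] viaRange = indexInRange(m.length, 0, trueCount(m));

        System.out.println(Arrays.toString(viaIota));
        System.out.println(Arrays.equals(viaIota, viaRange));
        System.out.println(Arrays.toString(compress(v, m)));
    }
}
```

In this model, a selective store amounts to storing compress(v, m) under the prefix mask of m, so only the first trueCount(m) lanes are written to memory.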
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/148