[vectorIntrinsics+compress] RFR: 8274971: Add PrefixMask API
Joshua Zhu
jzhu at openjdk.java.net
Fri Oct 8 12:01:31 UTC 2021
I have split my implementation of the "compress" API into several patches to make review easier.
This change introduces the PrefixMask API for VectorMask.
It works together with the compress/expand API. (See the usage in the ALIBABA selectiveStore use case.)
It returns a prefix mask based on the true count of the mask.
If "N" is the true count of the mask, lanes 0 through "N-1" are set in the result and all remaining lanes are unset.
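To make the semantics concrete, here is a minimal scalar sketch in plain Java. It models a mask as a boolean[] rather than using the incubating Vector API, and the class and method names (PrefixMaskModel, trueCount, prefixMask) are chosen for illustration only; they are not the code in this patch.

```java
import java.util.Arrays;

public class PrefixMaskModel {
    // Counts the set lanes, mirroring what VectorMask.trueCount() reports.
    static int trueCount(boolean[] mask) {
        int n = 0;
        for (boolean b : mask) {
            if (b) n++;
        }
        return n;
    }

    // Scalar model of the proposed prefix mask:
    // lanes 0..N-1 are set and the rest are unset, where N = trueCount(mask).
    static boolean[] prefixMask(boolean[] mask) {
        int n = trueCount(mask);
        boolean[] out = new boolean[mask.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = i < n;
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] m = {false, true, false, true, true, false, false, false};
        // Three lanes are set, so the prefix mask has lanes 0, 1 and 2 set.
        System.out.println(Arrays.toString(prefixMask(m)));
        // prints [true, true, true, false, false, false, false, false]
    }
}
```

Note that the result depends only on how many lanes are set in the input, not on which ones.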
For now, mask.prefixMask() is implemented as:
vectorSpecies.iota().compare(VectorOperators.LT, trueCount());
An alternative implementation is:
vectorSpecies().indexInRange(0, m.trueCount())
I chose the former implementation because the latter depends on intrinsic support for indexVector.
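The two formulations should produce the same mask. The following scalar sketch in plain Java models both: comparing an iota vector (0, 1, 2, ...) against N with LT, and the documented rule of VectorSpecies.indexInRange(offset, limit), which sets lane i when offset + i falls in [0, limit). The class and method names are hypothetical and the masks are modelled as boolean[]; this says nothing about how either form is compiled.

```java
import java.util.Arrays;

public class PrefixMaskFormulations {
    // Models species.iota().compare(LT, n): lane i is set when iota[i] < n.
    static boolean[] viaIotaCompare(int lanes, int n) {
        int[] iota = new int[lanes];
        for (int i = 0; i < lanes; i++) {
            iota[i] = i;
        }
        boolean[] out = new boolean[lanes];
        for (int i = 0; i < lanes; i++) {
            out[i] = iota[i] < n;
        }
        return out;
    }

    // Models species.indexInRange(offset, limit):
    // lane i is set when offset + i lies in [0, limit).
    static boolean[] viaIndexInRange(int lanes, int offset, int limit) {
        boolean[] out = new boolean[lanes];
        for (int i = 0; i < lanes; i++) {
            int idx = offset + i;
            out[i] = idx >= 0 && idx < limit;
        }
        return out;
    }

    public static void main(String[] args) {
        int lanes = 8;
        // Check every possible true count for an 8-lane mask.
        for (int n = 0; n <= lanes; n++) {
            boolean same = Arrays.equals(viaIotaCompare(lanes, n),
                                         viaIndexInRange(lanes, 0, n));
            System.out.println("N=" + n + " equal=" + same);
        }
    }
}
```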
I'm looking for instructions that could accelerate indexVector/iota, so that a vector-to-vector operation combined with a load/store and a prefix mask could be optimized further into the single memory-operand form of the instruction.
Intel experts, do you have any suggestions on SIMD instructions for iota vector generation?
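For readers unfamiliar with the pattern, here is a scalar sketch in plain Java of how a prefix mask could pair with compress in a selective store. It assumes that compress packs the selected lanes to the front of the vector, and that the store is then masked by the prefix mask so that exactly N elements are written. All names (SelectiveStoreModel, compress, selectiveStore) are hypothetical, and the details of the actual use case may differ.

```java
import java.util.Arrays;

public class SelectiveStoreModel {
    // Assumed compress semantics: selected lanes are packed to the front,
    // the remaining lanes are zero.
    static int[] compress(int[] v, boolean[] mask) {
        int[] out = new int[v.length];
        int k = 0;
        for (int i = 0; i < v.length; i++) {
            if (mask[i]) out[k++] = v[i];
        }
        return out;
    }

    // Lanes 0..N-1 set, where N is the number of set lanes in mask.
    static boolean[] prefixMask(boolean[] mask) {
        int n = 0;
        for (boolean b : mask) {
            if (b) n++;
        }
        boolean[] out = new boolean[mask.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = i < n;
        }
        return out;
    }

    // Compress, then store only the lanes covered by the prefix mask.
    // Returns the next free offset in dst.
    static int selectiveStore(int[] v, boolean[] mask, int[] dst, int offset) {
        int[] packed = compress(v, mask);
        boolean[] pm = prefixMask(mask);
        int written = 0;
        for (int i = 0; i < packed.length; i++) {
            if (pm[i]) {
                dst[offset + i] = packed[i];
                written++;
            }
        }
        return offset + written;
    }

    public static void main(String[] args) {
        int[] v = {10, 20, 30, 40};
        boolean[] mask = {false, true, false, true};
        int[] dst = new int[6];
        int next = selectiveStore(v, mask, dst, 1);
        System.out.println(Arrays.toString(dst) + " next=" + next);
        // prints [0, 20, 40, 0, 0, 0] next=3
    }
}
```

In this model the compress step and the masked store are two separate steps; fusing them is the kind of single memory-form operation the question above is about.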
-------------
Commit messages:
- 8274971: Add PrefixMask API
Changes: https://git.openjdk.java.net/panama-vector/pull/148/files
Webrev: https://webrevs.openjdk.java.net/?repo=panama-vector&pr=148&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8274971
Stats: 16 lines in 1 file changed: 16 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/panama-vector/pull/148.diff
Fetch: git fetch https://git.openjdk.java.net/panama-vector pull/148/head:pull/148
PR: https://git.openjdk.java.net/panama-vector/pull/148