[vectorIntrinsics+mask] RFR: 8273949: Intrinsic creation for VectorMask.toLong operation.

Paul Sandoz psandoz at openjdk.java.net
Fri Sep 17 20:34:58 UTC 2021


On Fri, 17 Sep 2021 19:50:37 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> Summary of changes:
> - Intrinsification of VectorMask.toLong() API.
> - Supports inline expansion for both AVX512 and non-AVX512 targets.
> - Used toLong() API to optimize existing Java API implementation of VectorMask.laneIsSet() operation.
> 
> The following performance numbers were generated using the JMH benchmark modification included with the patch.
> 
> System:  Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (Cascade Lake Server 28C 2S)
> 
> Benchmark | VECSIZE | Baseline Score (ops/ms) | With Opt (ops/ms) | Gain Ratio
> -- | -- | -- | -- | --
> MaskQueryOperationsBenchmark.testToLongByte | 128 | 90451.424 | 346941.379 | 3.835665196
> MaskQueryOperationsBenchmark.testToLongByte | 256 | 63127.764 | 338331.425 | 5.359471072
> MaskQueryOperationsBenchmark.testToLongByte | 512 | 40543.264 | 313836.333 | 7.740776199
> MaskQueryOperationsBenchmark.testToLongLong | 128 | 171989.714 | 152872.758 | 0.88884826
> MaskQueryOperationsBenchmark.testToLongLong | 256 | 164702.273 | 324794.578 | 1.972010295
> MaskQueryOperationsBenchmark.testToLongLong | 512 | 122667.916 | 318060.096 | 2.59285481
> MaskQueryOperationsBenchmark.testToLongShort | 128 | 122656.408 | 346691.082 | 2.826522378
> MaskQueryOperationsBenchmark.testToLongShort | 256 | 96838.555 | 360909.28 | 3.726917239
> MaskQueryOperationsBenchmark.testToLongShort | 512 | 63119.009 | 313075.159 | 4.960077225
> MaskQueryOperationsBenchmark.testToLonglong | 128 | 180855.623 | 324620.433 | 1.794914792
> MaskQueryOperationsBenchmark.testToLonglong | 256 | 122705.631 | 324315.916 | 2.643040204
> MaskQueryOperationsBenchmark.testToLonglong | 512 | 90396.687 | 324318.095 | 3.58772103

src/jdk.incubator.vector/share/classes/jdk/incubator/vector/AbstractMask.java line 66:

> 64:     @Override
> 65:     public boolean laneIsSet(int i) {
> 66:         if (length() <= Long.SIZE) {

We need to check that `i >= 0 && i < length()` and, if not, throw `IllegalArgumentException`.

Currently the behavior is unspecified. Can you please add the following to the documentation of `VectorMask.laneIsSet`? (I will handle any CSR-related tasks.)


     * @throws IllegalArgumentException if the index is out of range
     * ({@code < 0 || >= length()})
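
To make the requested check concrete, here is a minimal sketch of a range-checked `laneIsSet` built on a `toLong`-style bit packing. `MaskSketch` and its `boolean[]` storage are hypothetical stand-ins for illustration only; this is not the `AbstractMask` code from the patch, and the exception message is an assumption.

```java
// Hypothetical stand-in for a vector mask; NOT the jdk.incubator.vector implementation.
final class MaskSketch {
    private final boolean[] bits;

    MaskSketch(boolean... bits) {
        this.bits = bits.clone();
    }

    int length() {
        return bits.length;
    }

    // Packs lane i into bit i of the result; only valid for length() <= Long.SIZE.
    long toLong() {
        if (length() > Long.SIZE) {
            throw new UnsupportedOperationException("too many lanes for one long");
        }
        long res = 0L;
        for (int i = 0; i < bits.length; i++) {
            if (bits[i]) {
                res |= 1L << i;
            }
        }
        return res;
    }

    boolean laneIsSet(int i) {
        // Reject indexes outside [0, length()) before touching any lane.
        if (i < 0 || i >= length()) {
            throw new IllegalArgumentException(
                "Index " + i + " out of range [0, " + length() + ")");
        }
        if (length() <= Long.SIZE) {
            // Fast path: test bit i of the packed mask.
            return ((toLong() >>> i) & 1L) != 0;
        }
        // Fallback for masks wider than 64 lanes (assumed here to read the lane directly).
        return bits[i];
    }
}
```

Note that the lower bound is inclusive: lane 0 is a valid index, so the rejection condition is `i < 0`, matching the `{@code < 0 || >= length()}` wording above.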

src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte128Vector.java line 712:

> 710:         public int trueCount() {
> 711:             return (int) VectorSupport.maskReductionCoerced(VECTOR_OP_MASK_TRUECOUNT, Byte128Mask.class, byte.class, VLENGTH, this,
> 712:                                                       (m) -> (long)trueCountHelper(((Byte128Mask)m).getBits()));

Suggestion:

            return (int) VectorSupport.maskReductionCoerced(VECTOR_OP_MASK_TRUECOUNT, Byte128Mask.class, byte.class, VLENGTH, this,
                                                      (m) -> trueCountHelper(m.getBits()));


We don't need the cast to `long` for the return from the lambda expression. And I think the cast of the mask is redundant too. The same applies to the other three methods.
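
As a rough illustration of why both casts can go, the sketch below uses a hypothetical generic `reduce` method and `DemoMask` class (neither is the real `VectorSupport.maskReductionCoerced` signature or `Byte128Mask`). It assumes the lambda parameter type is inferred from the mask argument, and that the helper returns `int` as in the reviewed code.

```java
import java.util.function.ToLongFunction;

final class LambdaCastSketch {
    // Hypothetical mask type exposing its bits, standing in for a concrete mask class.
    static final class DemoMask {
        private final boolean[] bits;

        DemoMask(boolean... bits) {
            this.bits = bits.clone();
        }

        boolean[] getBits() {
            return bits;
        }
    }

    // Returns int; the body is an assumption made for this sketch.
    static int trueCountHelper(boolean[] bits) {
        int count = 0;
        for (boolean b : bits) {
            if (b) {
                count++;
            }
        }
        return count;
    }

    // Hypothetical generic reduction: M is inferred from the 'mask' argument.
    static <M> long reduce(M mask, ToLongFunction<M> defaultImpl) {
        return defaultImpl.applyAsLong(mask);
    }

    static int trueCount(DemoMask m) {
        // No (long) cast: the int result widens to long in the lambda's return.
        // No (DemoMask) cast: 'x' is already typed as DemoMask by inference.
        return (int) reduce(m, x -> trueCountHelper(x.getBits()));
    }
}
```

Whether the mask cast is redundant in the actual patch depends on how the real method's type parameters are declared, so this only shows the general Java typing rule.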

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/126

