RFR: 8256973: Intrinsic creation for VectorMask query (lastTrue, firstTrue, trueCount) APIs [v2]

Fri May 14 08:11:26 UTC 2021

On Fri, 7 May 2021 19:04:01 GMT, Paul Sandoz <psandoz at openjdk.org> wrote:

>>> These mask operations can be considered a form of reduction.
>>> 
>>> Do you think it makes sense to reuse `VectorSupport.reductionCoerced` instead of adding a new intrinsic? (Note that we reuse `VectorSupport.binaryOp` for mask logical binary operations).
>>> 
>>> Perhaps that allows for further reuse later if/when we add operations to integral vectors to count bits like we already have with scalars, such as `Integer.bitCount`, `Integer.numberOfLeadingZeros` etc?
>> 
>> Hi @PaulSandoz, that's a nice suggestion. I think that instead of a reduction, which may emit a bulky sequence, `VectorMask.toLong()` + `Long.bitCount()` could have been used for `trueCount`. But since `toLong` may not work for ARM SVE, intrinsifying at the API level looked reasonable in the meantime.
> 
> Do you mean that reusing `VectorSupport.reductionCoerced` as the intrinsic entry point may emit a bulky sequence?
> 
> Note that I was not suggesting reusing `Long.bitCount()` etc. I was just using that as an example that the bitwise reduction operations on masks can also apply to integral vectors, suggesting there might be some sharing in C2, just as is done for binary operations such as logical AND.
> 
> For example:
> 
>         @Override
>         @ForceInline
>         public Int256Mask and(VectorMask<Integer> mask) {
>             Objects.requireNonNull(mask);
>             Int256Mask m = (Int256Mask)mask;
>             return VectorSupport.binaryOp(VECTOR_OP_AND, Int256Mask.class, int.class, VLENGTH,
>                                              this, m,
>                                              (m1, m2) -> m1.bOp(m2, (i, a, b) -> a & b));
>         }
> 
> 
> And notice that `VECTOR_OP_AND` is reused for vector lane-wise binary and reduction operations on `IntVector` etc. Can we do the same for other bitwise reduction-like operations, first implementing optimal support for masks, then later expanding for integral vectors?
> 
> So rather than introducing specific constants, such as `VECTOR_OP_MASK_TRUECOUNT` etc, we can generalize to `VECTOR_OP_BITCOUNT` etc that can apply to both masks and integral vectors, where for masks we interpret `BIT` appropriately to mean `boolean` true value.

Hi @PaulSandoz, thanks. Your comments on JMH have been addressed. @neliasso @iwanowww, kindly share your feedback/comments on the compiler-side changes.
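
For readers following the thread, here is a minimal sketch of how the three mask queries relate to the scalar bit operations mentioned above. It is plain Java over a `boolean[]`, not the Vector API and not the intrinsic code in this PR. The class name `MaskQueries` and its helpers are hypothetical, and the sketch assumes at most 64 lanes with lane `i` mapped to bit `i`, plus the documented `VectorMask` conventions that `firstTrue` returns the lane count and `lastTrue` returns -1 when no lane is set.

```java
// Hypothetical illustration only: models a mask as boolean[] lanes.
class MaskQueries {
    // Pack lanes into a long: lane i becomes bit i (assumes <= 64 lanes).
    static long toLong(boolean[] lanes) {
        long bits = 0L;
        for (int i = 0; i < lanes.length; i++) {
            if (lanes[i]) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Number of set lanes, expressed as a scalar population count.
    static int trueCount(boolean[] lanes) {
        return Long.bitCount(toLong(lanes));
    }

    // Index of the first set lane, or the lane count if none is set.
    static int firstTrue(boolean[] lanes) {
        long bits = toLong(lanes);
        return bits == 0L ? lanes.length : Long.numberOfTrailingZeros(bits);
    }

    // Index of the last set lane, or -1 if none is set
    // (numberOfLeadingZeros(0) is 64, so 63 - 64 gives -1).
    static int lastTrue(boolean[] lanes) {
        return 63 - Long.numberOfLeadingZeros(toLong(lanes));
    }

    public static void main(String[] args) {
        boolean[] lanes = {false, true, false, true, true, false, false, false};
        System.out.println("trueCount = " + trueCount(lanes));
        System.out.println("firstTrue = " + firstTrue(lanes));
        System.out.println("lastTrue  = " + lastTrue(lanes));
    }
}
```

This only shows the semantic equivalence. Whether a `toLong`-based lowering is practical on a given target (for example ARM SVE) is a separate question, as noted in the quoted discussion.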

-------------

PR: https://git.openjdk.java.net/jdk/pull/3916