[vector] RFR (L): Generalized intrinsics for vector operations (first batch)

Fri Feb 9 22:42:07 UTC 2018

> On Feb 9, 2018, at 2:24 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
> Thanks, Paul.
> 
> On 2/10/18 1:14 AM, Paul Sandoz wrote:
>> Looks good.
>>  139     @Override
>>  140     @ForceInline
>>  141     public $vectortype$ add(Vector<$Boxtype$,Shapes.$shape$> v) {
>>  142         Objects.requireNonNull(v);
>>  143         return ($vectortype$) VectorIntrinsics.binaryOp(
>>  144             VECTOR_OP_ADD, $vectortype$.class, $type$.class, LENGTH,
>>  145             this, ($vectortype$)v,
>>  146             (v1, v2) -> (($vectortype$)v1).bOp(v2, (i, a, b) -> ($type$)(a + b)));
>>  147     }
>> Given you perform the cast at line 145, so both input vectors are of the same concrete type, do you require it on v1 in the lambda?
> 
> Good point. It can be even further simplified to:
> 
>    public Int256Vector add(Vector<Integer,Shapes.S256Bit> v) {
>        Objects.requireNonNull(v);
>        return VectorIntrinsics.binaryOp(
>            VECTOR_OP_ADD, Int256Vector.class, int.class, LENGTH,
>            this, v,
>            (v1, v2) -> v1.bOp(v2, (i, a, b) -> (int)(a + b)));
>    }
> 

Right, I thought you might want the explicit cast for the reasons you give, so I am ok with it.

Paul.

> But I'd prefer to keep the cast at line 145 explicit to stress there's a type check in place on the argument.
> 
> I'll use the following shape:
> 
> 139     @Override
> 140     @ForceInline
> 141     public $vectortype$ add(Vector<$Boxtype$,Shapes.$shape$> v) {
> 142         Objects.requireNonNull(v);
> 143         return VectorIntrinsics.binaryOp(
> 144             VECTOR_OP_ADD, $vectortype$.class, $type$.class, LENGTH,
> 145             this, ($vectortype$)v,
> 146             (v1, v2) -> v1.bOp(v2, (i, a, b) -> ($type$)(a + b)));
> 147     }
> 
> Are you ok with it?
> 
> Best regards,
> Vladimir Ivanov
> 
>> Paul.
>>> On Feb 9, 2018, at 1:38 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>>> 
>>> http://cr.openjdk.java.net/~vlivanov/panama/vector.generalized_intrinsics/webrev.06/
>>> 
>>> Here's the first batch of rewritten intrinsics for vector operations.
>>> 
>>> The main motivations for the new implementation are:
>>>  * reduce the number of intrinsics needed;
>>>  * improve intrinsification robustness;
>>>  * minimize changes needed in shared C2 code.
>>> 
>>> The idea of generalized intrinsics is to parameterize them with additional information (passed as constant arguments), so that the JIT compiler has enough information to dispatch to the proper implementation during intrinsification.
>>> 
>>> For example, a binary vector operation can be represented as:
>>> 
>>>  // (V,V) -> V
>>>  @HotSpotIntrinsicCandidate
>>>  static <V> V binaryOp(int oprId, Class<V> vectorClass,
>>>                        Class<?> elementType, int vlen,
>>>                        V v1, V v2,
>>>                        BiFunction<V,V,V> defaultImpl) {
>>>      return defaultImpl.apply(v1, v2);
>>>  }
>>> 
>>> and used as:
>>> 
>>>  // (Int256Vector,Int256Vector) -> Int256Vector
>>> 
>>>  public Int256Vector add(Vector<Integer,Shapes.S256Bit> v) {
>>>    return (Int256Vector) VectorIntrinsics.binaryOp(
>>>            VECTOR_OP_ADD,
>>>            Int256Vector.class, int.class, 8,
>>>            this, (Int256Vector)v,
>>>            (v1, v2) -> ((Int256Vector)v1).bOp(v2,
>>>                                   (i, a, b) -> (int)(a+b)));
>>>  }
>>> 
>>> where:
>>>  oprId encodes the actual operation (VECTOR_OP_ADD);
>>> 
>>>  vectorClass, elementType, vlen describe the concrete vector class (Int256Vector);
>>> 
>>>  v1, v2 are the actual arguments of the vector binary operation;
>>> 
>>>  defaultImpl is the scalar implementation, which is used when intrinsification fails.
>>> 
>>> 
>>> Generalized intrinsics are declared on jdk.incubator.vector.VectorIntrinsics and the patch contains 6 of them:
>>>  * broadcastCoerced
>>>  * reductionCoerced
>>>  * binaryOp
>>>  * load/store
>>>  * test
>>> 
>>> The following vector operations were ported to the new mechanism:
>>>  * broadcast: zero, broadcast, trueMask, falseMask
>>>  * reduction: addAll, mulAll
>>>  * binaryOp: add, sub, mul, div, and, or, xor
>>>  * load/store: intoArray, fromArray
>>>  * test: anyTrue, allTrue
>>> 
>>> There's a new flag added to turn the new intrinsics on/off:
>>> 
>>> +  product(bool, UseVectorApiGeneralizedIntrinsics, true,
>>> 
>>> Previous discussions [1] [2].
>>> 
>>> The patch adds alternative implementations, but doesn't remove existing ones, since there are some code dependencies on them. Once the dependencies are broken, the code will go away.
>>> 
>>> Thanks!
>>> 
>>> Best regards,
>>> Vladimir Ivanov
>>> 
>>> [1] http://mail.openjdk.java.net/pipermail/panama-dev/2017-November/000748.html
>>> 
>>> [2] http://mail.openjdk.java.net/pipermail/panama-dev/2017-December/000884.html
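
For readers who want to try the pattern discussed above outside the Panama repository, here is a minimal, self-contained sketch of the fallback path only. It is not the code from the webrev: ToyInt256Vector, LaneOp, the value chosen for VECTOR_OP_ADD, and the class name GeneralizedIntrinsicSketch are all hypothetical stand-ins, and the local binaryOp only mirrors the signature quoted in the thread. Without JIT support nothing is intrinsified here, so every call simply runs defaultImpl.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.function.BiFunction;

public class GeneralizedIntrinsicSketch {
    // Hypothetical operation id; the real VECTOR_OP_ADD value is not shown in the thread.
    static final int VECTOR_OP_ADD = 0;

    // Mirrors the binaryOp signature quoted in the thread. The constant arguments
    // (oprId, vectorClass, elementType, vlen) are what a JIT could dispatch on;
    // in this sketch they are ignored and the scalar fallback always runs.
    static <V> V binaryOp(int oprId, Class<V> vectorClass,
                          Class<?> elementType, int vlen,
                          V v1, V v2,
                          BiFunction<V, V, V> defaultImpl) {
        return defaultImpl.apply(v1, v2);
    }

    // Hypothetical per-lane scalar operation: (lane index, a, b) -> result.
    interface LaneOp {
        int apply(int i, int a, int b);
    }

    // Toy stand-in for an 8-lane int vector (assumption, not the real Int256Vector).
    static final class ToyInt256Vector {
        static final int LENGTH = 8;
        final int[] lanes;

        ToyInt256Vector(int[] lanes) {
            if (lanes.length != LENGTH) {
                throw new IllegalArgumentException("expected " + LENGTH + " lanes");
            }
            this.lanes = lanes.clone();
        }

        // Scalar lane-by-lane implementation used as the default.
        ToyInt256Vector bOp(ToyInt256Vector o, LaneOp f) {
            int[] res = new int[LENGTH];
            for (int i = 0; i < LENGTH; i++) {
                res[i] = f.apply(i, lanes[i], o.lanes[i]);
            }
            return new ToyInt256Vector(res);
        }

        // Both arguments already have the same concrete type, so V is inferred
        // as ToyInt256Vector and no cast is needed on v1 inside the lambda.
        ToyInt256Vector add(ToyInt256Vector v) {
            Objects.requireNonNull(v);
            return binaryOp(VECTOR_OP_ADD, ToyInt256Vector.class, int.class, LENGTH,
                    this, v,
                    (v1, v2) -> v1.bOp(v2, (i, a, b) -> a + b));
        }
    }

    public static void main(String[] args) {
        ToyInt256Vector a = new ToyInt256Vector(new int[] {1, 2, 3, 4, 5, 6, 7, 8});
        ToyInt256Vector b = new ToyInt256Vector(new int[] {10, 20, 30, 40, 50, 60, 70, 80});
        System.out.println(Arrays.toString(a.add(b).lanes));
    }
}
```

In this toy the argument type is already the concrete class, so the question about the cast at line 145 does not arise; in the quoted code the parameter is the more general Vector type, and the explicit cast documents the type check on the argument.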