RFR: 8293409: [vectorapi] Intrinsify VectorSupport.indexVector
Xiaohong Gong
xgong at openjdk.org
Thu Oct 13 07:32:07 UTC 2022
On Thu, 13 Oct 2022 07:18:24 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> "`VectorSupport.indexVector()`" computes a vector of index values from a given vector and a scale value (i.e. `index = vec + iota * scale`). This function is widely used by other APIs such as "`VectorMask.indexInRange`", which is useful for tail loop vectorization, and it can be implemented easily with vector instructions.
>>
>> This patch adds the vector intrinsic implementation of it. The steps are:
>>
>> 1) Load the const "iota" vector.
>>
>> We extend the "`vector_iota_indices`" stubs from byte to other integral types. For floating point vectors, it needs an additional vector cast to get the right iota values.
>>
>> 2) Compute indexes with "`vec + iota * scale`"
>>
>> Here are the performance results for the newly added micro benchmark on ARM NEON:
>>
>> Benchmark                                Gain
>> IndexVectorBenchmark.byteIndexVector     1.477
>> IndexVectorBenchmark.doubleIndexVector   5.031
>> IndexVectorBenchmark.floatIndexVector    5.342
>> IndexVectorBenchmark.intIndexVector      5.529
>> IndexVectorBenchmark.longIndexVector     3.177
>> IndexVectorBenchmark.shortIndexVector    5.841
>>
>>
>> Please help to review and share the feedback! Thanks in advance!
>
> src/hotspot/share/opto/vectorIntrinsics.cpp line 2978:
>
>> 2976: case T_DOUBLE: {
>> 2977: scale = gvn().transform(new ConvI2LNode(scale));
>> 2978: scale = gvn().transform(new ConvL2DNode(scale));
>
> Any specific reason for not directly using ConvI2D for the double case?
Good catch, I think it's ok to use ConvI2D here. I will change this. Thanks!
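
For reference, here is a minimal scalar sketch in plain Java of why the direct conversion should be safe. It is not the patch itself and does not use any HotSpot node classes. The class name `IndexVectorSketch` and its helper methods are hypothetical, chosen only for illustration. It models the double-lane case of `index = vec + iota * scale` and checks that converting the int scale straight to double gives the same value as going through long first, since every int is exactly representable as a double.

```java
import java.util.Arrays;

public class IndexVectorSketch {
    // Hypothetical scalar model of index[i] = vec[i] + iota[i] * scale,
    // where iota[i] == i, for double lanes.
    static double[] indexVector(double[] vec, double scale) {
        double[] out = new double[vec.length];
        for (int i = 0; i < vec.length; i++) {
            out[i] = vec[i] + i * scale;
        }
        return out;
    }

    // Two-step conversion, analogous to int -> long -> double.
    static double viaLong(int scale) {
        return (double) (long) scale;
    }

    // One-step conversion, analogous to int -> double.
    static double direct(int scale) {
        return (double) scale;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 7, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int s : samples) {
            // Both conversion paths are exact for every int value.
            if (Double.compare(viaLong(s), direct(s)) != 0) {
                throw new AssertionError("conversion mismatch for " + s);
            }
        }
        double[] vec = {10.0, 10.0, 10.0, 10.0};
        // prints [10.0, 13.0, 16.0, 19.0]
        System.out.println(Arrays.toString(indexVector(vec, direct(3))));
    }
}
```

The sample values include both int extremes, which are the cases most likely to expose a difference if one existed.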
-------------
PR: https://git.openjdk.org/jdk/pull/10332