Integrated: 8293409: [vectorapi] Intrinsify VectorSupport.indexVector
Xiaohong Gong
xgong at openjdk.org
Wed Oct 19 09:28:04 UTC 2022
On Mon, 19 Sep 2022 08:51:24 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
> `VectorSupport.indexVector()` computes a vector of index values from a given vector and a scale value (i.e. `index = vec + iota * scale`). It is used widely by other APIs such as `VectorMask.indexInRange`, which is useful for tail loop vectorization, and it can be implemented easily with vector instructions.
>
> This patch adds the vector intrinsic implementation of it. The steps are:
>
> 1) Load the const "iota" vector.
>
> We extend the `vector_iota_indices` stubs from byte to the other integral types. Floating-point vectors need an additional vector cast to get the right iota values.
>
> 2) Compute indexes with "`vec + iota * scale`"
>
> Here are the performance results for the newly added microbenchmark on ARM NEON:
>
> Benchmark                                 Gain
> IndexVectorBenchmark.byteIndexVector      1.477
> IndexVectorBenchmark.doubleIndexVector    5.031
> IndexVectorBenchmark.floatIndexVector     5.342
> IndexVectorBenchmark.intIndexVector       5.529
> IndexVectorBenchmark.longIndexVector      3.177
> IndexVectorBenchmark.shortIndexVector     5.841
>
>
> Please help review and share your feedback. Thanks in advance!
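
For readers who want the quoted formula spelled out, here is a minimal scalar sketch of `index = vec + iota * scale` and of how an `indexInRange`-style tail mask can be derived from it. This is not the intrinsic or the JDK implementation: the class name `IndexVectorSketch`, both method signatures, and the use of plain `int[]` and `boolean[]` arrays in place of vectors and masks are assumptions made for illustration only.

```java
import java.util.Arrays;

public class IndexVectorSketch {
    // Scalar model of index = vec + iota * scale, where iota = {0, 1, 2, ...}.
    // Hypothetical helper; each array element stands in for one vector lane.
    static int[] indexVector(int[] vec, int scale) {
        int[] result = new int[vec.length];
        for (int lane = 0; lane < vec.length; lane++) {
            result[lane] = vec[lane] + lane * scale;
        }
        return result;
    }

    // Scalar model of an indexInRange-style mask (assumed semantics):
    // lane i is set when 0 <= offset + i < limit.
    static boolean[] indexInRange(int lanes, int offset, int limit) {
        int[] broadcast = new int[lanes];
        Arrays.fill(broadcast, offset);           // broadcast the loop offset
        int[] index = indexVector(broadcast, 1);  // offset + iota * 1
        boolean[] mask = new boolean[lanes];
        for (int lane = 0; lane < lanes; lane++) {
            mask[lane] = index[lane] >= 0 && index[lane] < limit;
        }
        return mask;
    }

    public static void main(String[] args) {
        // Four lanes, every lane starts at 10, scale 2.
        System.out.println(Arrays.toString(indexVector(new int[] {10, 10, 10, 10}, 2)));
        // Tail of a 10-element array processed 4 lanes at a time, starting at offset 8.
        System.out.println(Arrays.toString(indexInRange(4, 8, 10)));
    }
}
```

In the second call only the first two lanes fall inside the array, which is the situation a vectorized tail loop has to mask for.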
This pull request has now been integrated.
Changeset: 857b0f9b
Author: Xiaohong Gong <xgong at openjdk.org>
URL: https://git.openjdk.org/jdk/commit/857b0f9b05bc711f3282a0da85fcff131fffab91
Stats: 391 lines in 14 files changed: 361 ins; 9 del; 21 mod
8293409: [vectorapi] Intrinsify VectorSupport.indexVector
Reviewed-by: eliu, jbhateja
-------------
PR: https://git.openjdk.org/jdk/pull/10332