RFR: 8262989: Vectorize VectorShuffle checkIndexes, wrapIndexes and laneIsValid methods
Sandhya Viswanathan
sviswanathan at openjdk.java.net
Thu Mar 4 18:48:39 UTC 2021
On Thu, 4 Mar 2021 16:47:30 GMT, Paul Sandoz <psandoz at openjdk.org> wrote:
>> The hot path of VectorShuffle checkIndexes, wrapIndexes and laneIsValid methods can be implemented using Vector API methods.
>>
>> For the attached jmh TestSlice.java, performance improves as below.
>>
>> Before:
>> Benchmark                           (size)   Mode  Cnt      Score    Error   Units
>> TestSlice.vectorSliceOrigin           1024  thrpt    5   1224.698 ± 53.825  ops/ms
>> TestSlice.vectorSliceUnsliceOrigin    1024  thrpt    5    657.895 ± 31.945  ops/ms
>>
>> After:
>> Benchmark                           (size)   Mode  Cnt      Score    Error   Units
>> TestSlice.vectorSliceOrigin           1024  thrpt    5  11221.532 ± 88.616  ops/ms
>> TestSlice.vectorSliceUnsliceOrigin    1024  thrpt    5   6509.519 ± 18.102  ops/ms
>
> Looks good, a nice incremental improvement.
>
> I suppose `checkIndexes` and `wrapIndexes` could call `laneIsValid`, and then call `anyFalse` on the resulting mask. Dunno if that would affect the generated code.
Calling laneIsValid from checkIndexes and wrapIndexes would look like this:
VectorMask<E> vecmask = this.laneIsValid();
if (!vecmask.allTrue()) {
I observe a small overhead (~2%) due to a couple of extra instructions generated for allTrue.
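
For readers unfamiliar with the three methods, here is a rough scalar model of the relationship discussed above. It is only a sketch, not the code in the PR: a plain int[] stands in for the shuffle's source indexes, a boolean[] stands in for VectorMask, and the class name ShuffleIndexModel and its helpers are hypothetical. It assumes the convention, as I read the VectorShuffle javadoc, that valid indexes are non-negative and exceptional indexes lie in [-VLENGTH, -1].

```java
import java.util.Arrays;

// Scalar model only (hypothetical names): int[] stands in for the shuffle's
// source indexes and boolean[] stands in for VectorMask.
class ShuffleIndexModel {

    // Models laneIsValid(): a lane is valid when its index is non-negative.
    static boolean[] laneIsValid(int[] indexes) {
        boolean[] mask = new boolean[indexes.length];
        for (int i = 0; i < indexes.length; i++) {
            mask[i] = indexes[i] >= 0;
        }
        return mask;
    }

    // Models VectorMask.allTrue(): true only if every lane is set.
    static boolean allTrue(boolean[] mask) {
        for (boolean lane : mask) {
            if (!lane) {
                return false;
            }
        }
        return true;
    }

    // Models checkIndexes() built on laneIsValid(): throw if any lane
    // holds an exceptional (negative) index, otherwise return unchanged.
    static int[] checkIndexes(int[] indexes) {
        if (!allTrue(laneIsValid(indexes))) {
            throw new IndexOutOfBoundsException(
                "exceptional index in " + Arrays.toString(indexes));
        }
        return indexes;
    }

    // Models wrapIndexes() built on laneIsValid(): the hot path returns the
    // input as-is; otherwise exceptional indexes are wrapped by adding the
    // lane count (assumed range of exceptional indexes: [-VLENGTH, -1]).
    static int[] wrapIndexes(int[] indexes) {
        boolean[] valid = laneIsValid(indexes);
        if (allTrue(valid)) {
            return indexes;
        }
        int[] wrapped = indexes.clone();
        for (int i = 0; i < wrapped.length; i++) {
            if (!valid[i]) {
                wrapped[i] += indexes.length;
            }
        }
        return wrapped;
    }

    public static void main(String[] args) {
        int[] partlyWrapped = {0, -1, 2, -4};
        System.out.println(Arrays.toString(laneIsValid(partlyWrapped)));
        System.out.println(Arrays.toString(wrapIndexes(partlyWrapped)));
    }
}
```

In this model the mask is materialized first and then reduced with allTrue, which mirrors the extra step that accounts for the small overhead mentioned above; the model itself says nothing about the generated code.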
-------------
PR: https://git.openjdk.java.net/jdk/pull/2819