RFR: 8259278: Optimize Vector API slice and unslice operations

Paul Sandoz psandoz at openjdk.java.net
Wed Jan 6 17:20:55 UTC 2021


On Tue, 5 Jan 2021 22:59:29 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

> This pull request optimizes Vector API slice and unslice operations.
> All the slice and unslice variants that take more than one argument are now implemented in terms of already-intrinsified methods, along the same lines as slice(origin) and unslice(origin).
> In addition, the slice and unslice intrinsics for 256-bit byte/short vectors are implemented as a sequence of instructions on x86 platforms that support AVX2.
> 
> For TestSlice.java attached to JBS (https://bugs.openjdk.java.net/browse/JDK-8259278):
> Before:
> Benchmark                                     (size)   Mode  Cnt     Score    Error  Units
> TestSlice.vectorSliceOrigin                     1024  thrpt    5    17.665 ±  0.980  ops/ms
> TestSlice.vectorSliceOriginVector               1024  thrpt    5   604.179 ±  5.795  ops/ms
> TestSlice.vectorSliceUnsliceOrigin              1024  thrpt    5     9.286 ±  0.088  ops/ms
> TestSlice.vectorSliceUnsliceOriginVector        1024  thrpt    5   435.678 ± 30.171  ops/ms
> TestSlice.vectorSliceUnsliceOriginVectorPart    1024  thrpt    5   440.492 ± 24.592  ops/ms
> 
> After:
> Benchmark                                     (size)   Mode  Cnt     Score    Error  Units
> TestSlice.vectorSliceOrigin                     1024  thrpt    5  1319.044 ± 67.862  ops/ms
> TestSlice.vectorSliceOriginVector               1024  thrpt    5   969.514 ± 33.411  ops/ms
> TestSlice.vectorSliceUnsliceOrigin              1024  thrpt    5   687.804 ± 31.511  ops/ms
> TestSlice.vectorSliceUnsliceOriginVector        1024  thrpt    5   560.807 ± 20.600  ops/ms
> TestSlice.vectorSliceUnsliceOriginVectorPart    1024  thrpt    5   560.202 ±  4.012  ops/ms
> 
> Please review.
> 
> Best Regards,
> Sandhya

The following bounds check can be replaced with Objects.checkIndex:
        if ((origin < 0) || (origin >= VLENGTH)) {
            throw new ArrayIndexOutOfBoundsException("Index " + origin + " out of bounds for vector length " + VLENGTH);
        }
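
For illustration, a minimal standalone sketch of that replacement. The class name SliceBoundsSketch, the method names, and the VLENGTH value of 8 are placeholders chosen for this example, not code from the PR. Note that Objects.checkIndex (available since Java 9) throws IndexOutOfBoundsException rather than ArrayIndexOutOfBoundsException, so the reported exception type and message would change.

```java
import java.util.Objects;

// Hypothetical holder class; in the PR the check lives inside the vector classes.
final class SliceBoundsSketch {
    // Placeholder lane count standing in for a concrete vector's VLENGTH.
    static final int VLENGTH = 8;

    // The explicit check as quoted in the review.
    static int checkOriginExplicit(int origin) {
        if ((origin < 0) || (origin >= VLENGTH)) {
            throw new ArrayIndexOutOfBoundsException(
                "Index " + origin + " out of bounds for vector length " + VLENGTH);
        }
        return origin;
    }

    // The same range test (0 <= origin < VLENGTH) expressed with Objects.checkIndex,
    // which returns the index when it is in range and throws
    // IndexOutOfBoundsException otherwise.
    static int checkOrigin(int origin) {
        return Objects.checkIndex(origin, VLENGTH);
    }
}
```

Both methods accept exactly the same origins; ArrayIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException, so callers that catch the latter would be unaffected.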

In general, I am wondering why the slice/unslice implementations in the concrete classes cannot delegate to template code in the abstract superclass, which would avoid much of the duplication.
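
To make the suggestion concrete, here is a rough sketch of the shape I have in mind, using plain int arrays in place of real vectors. All names here (VecSketch, Vec4Sketch, sliceTemplate, make) are hypothetical and do not reflect the actual class hierarchy or the intrinsic calls in the PR; it only shows the slice logic written once in the superclass with a one-line delegation in the concrete class.

```java
import java.util.Objects;

// Hypothetical abstract superclass holding the shared "template" logic.
abstract class VecSketch {
    abstract int length();                 // lane count of the concrete shape
    abstract int lane(int i);              // read one lane
    abstract VecSketch make(int[] lanes);  // build a vector of the same shape

    // slice(origin) written once: lanes from origin onward move to the front,
    // and the remaining lanes are left as zero.
    final VecSketch sliceTemplate(int origin) {
        Objects.checkIndex(origin, length());
        int[] out = new int[length()];
        for (int i = 0; i + origin < length(); i++) {
            out[i] = lane(i + origin);
        }
        return make(out);
    }
}

// Hypothetical concrete class: only shape-specific pieces plus a thin delegate.
final class Vec4Sketch extends VecSketch {
    private final int[] lanes;

    Vec4Sketch(int... lanes) { this.lanes = lanes.clone(); }

    @Override int length() { return lanes.length; }
    @Override int lane(int i) { return lanes[i]; }
    @Override VecSketch make(int[] l) { return new Vec4Sketch(l); }

    // The concrete override stays a one-liner instead of repeating the logic.
    VecSketch slice(int origin) { return sliceTemplate(origin); }
}
```

Whether the real implementations can be arranged this way depends on how the intrinsics need to see the concrete types, which this sketch does not attempt to model.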

-------------

PR: https://git.openjdk.java.net/jdk/pull/1950

