RFR: 8303762: [vectorapi] Intrinsification of Vector.slice [v6]
Vladimir Ivanov
vlivanov at openjdk.org
Tue Apr 11 19:06:38 UTC 2023
On Tue, 4 Apr 2023 13:46:12 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:
>> `Vector::slice` is a method at the top-level class of the Vector API that concatenates its two inputs into an intermediate composite and extracts a window equal to the size of one input as the result. It is used in vector conversion methods where the part number is not 0 to slice the parts to the correct positions. Slicing is also used in text processing such as UTF-8 and UTF-16 validation. x86, starting from SSSE3, has `palignr`, which does vector slicing very efficiently. As a result, I think it is beneficial to add a C2 node for this operation as well as to intrinsify the `Vector::slice` method.
>>
>> A slice is currently implemented as `v2.rearrange(iota).blend(v1.rearrange(iota), blendMask)`, which requires preparing the index vector and the blending mask. Even with these preparations hoisted out of the loops, microbenchmarks show an improvement with the slice intrinsic. Some show tremendous increases in throughput because a mask of length 2 cannot currently be intrinsified, which forces a fallback to the Java implementation.
>>
>> Please take a look and review. Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>
> style
src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ShortVector.java line 2295:
> 2293: // to be performant
> 2294: @ForceInline
> 2295: public ShortVector apply(ShortVector v1, ShortVector v2, int o) {
Have you considered matching the corresponding IR during GVN to produce VectorSlice nodes rather than going through a VM intrinsic?
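
For reference, the slice semantics and the rearrange-plus-blend formulation from the quoted description can be modeled on plain arrays. This is only a scalar sketch, not the Vector API or the C2 code: the class `SliceModel`, both method names, and the wrap-around `iota` / `mask` definitions are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical scalar model; arrays stand in for vectors of equal lane count.
final class SliceModel {

    // Lane i of the result is lane (i + origin) of the concatenation v1 ++ v2.
    static short[] slice(short[] v1, short[] v2, int origin) {
        int n = v1.length;
        if (v2.length != n || origin < 0 || origin > n) {
            throw new IllegalArgumentException("bad operands or origin");
        }
        short[] r = new short[n];
        for (int i = 0; i < n; i++) {
            int j = i + origin;
            r[i] = (j < n) ? v1[j] : v2[j - n];
        }
        return r;
    }

    // Same result expressed as two rearranges and a blend.
    // Assumed here: iota wraps modulo the lane count, and the mask is set
    // for lanes that still come from v1.
    static short[] sliceViaRearrangeBlend(short[] v1, short[] v2, int origin) {
        int n = v1.length;
        if (v2.length != n || origin < 0 || origin > n) {
            throw new IllegalArgumentException("bad operands or origin");
        }
        int[] iota = new int[n];
        boolean[] mask = new boolean[n];
        for (int i = 0; i < n; i++) {
            iota[i] = (i + origin) % n;   // wrapped source lane
            mask[i] = i + origin < n;     // true: take the lane from v1
        }
        short[] r = new short[n];
        for (int i = 0; i < n; i++) {
            // v2.rearrange(iota).blend(v1.rearrange(iota), mask)
            r[i] = mask[i] ? v1[iota[i]] : v2[iota[i]];
        }
        return r;
    }

    public static void main(String[] args) {
        short[] a = {0, 1, 2, 3, 4, 5, 6, 7};
        short[] b = {8, 9, 10, 11, 12, 13, 14, 15};
        System.out.println(Arrays.toString(slice(a, b, 3)));
        System.out.println(Arrays.toString(sliceViaRearrangeBlend(a, b, 3)));
    }
}
```

Both methods agree for every origin from 0 to the lane count, which is the equivalence a GVN pattern match would have to recognize in the IR.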
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/12909#discussion_r1163216924