RFR: 8261542: X86 slice and unslice intrinsics for 256-bit byte/short vectors
Sandhya Viswanathan
sviswanathan at openjdk.java.net
Thu Feb 18 21:26:41 UTC 2021
On Thu, 18 Feb 2021 19:14:37 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> The slice and unslice intrinsics for 256-bit byte/short vectors can be implemented for x86 platforms supporting AVX2 using a sequence of instructions.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8261542
>>
>> The PerfSliceOrigin.java jmh test attached to the JBS shows the following performance on AVX2 platform.
>>
>> Before:
>> Benchmark                                 (size)   Mode  Cnt      Score     Error  Units
>> PerfSliceOrigin.vectorSliceOrigin           1024  thrpt    5     18.887 ±   1.128  ops/ms
>> PerfSliceOrigin.vectorSliceUnsliceOrigin    1024  thrpt    5      9.374 ±   0.370  ops/ms
>>
>> After:
>> Benchmark                                 (size)   Mode  Cnt      Score     Error  Units
>> PerfSliceOrigin.vectorSliceOrigin           1024  thrpt    5  13861.420 ±  19.071  ops/ms
>> PerfSliceOrigin.vectorSliceUnsliceOrigin    1024  thrpt    5   7895.199 ± 142.580  ops/ms
>
> Please add a test that verifies correctness of results when this code is used, if we don't have one already.
@vnkozlov thanks a lot for the review.
The tests for slice and unslice are already part of test/jdk/jdk/incubator/vector/Byte256VectorTests.java and Short256VectorTests.java.
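
For readers less familiar with the two operations, here is a minimal scalar sketch of what the single-argument slice(origin) and unslice(origin) compute, assuming a zero background as in the Vector API. The class and method names (SliceSketch, slice, unslice) are hypothetical and exist only for illustration; this is neither the intrinsic nor the code in the existing tests.

```java
import java.util.Arrays;

// Hypothetical scalar model of Vector.slice(origin) / Vector.unslice(origin).
// Assumption: the "other" vector / background is all zeros.
public class SliceSketch {
    // slice: result lane i takes a[i + origin]; lanes that would read
    // past the end of the vector are left as zero.
    public static byte[] slice(byte[] a, int origin) {
        byte[] r = new byte[a.length];
        for (int i = 0; i + origin < a.length; i++) {
            r[i] = a[i + origin];
        }
        return r;
    }

    // unslice: result lane i + origin takes a[i]; lanes below origin
    // stay zero and the top `origin` lanes of a are dropped.
    public static byte[] unslice(byte[] a, int origin) {
        byte[] r = new byte[a.length];
        for (int i = 0; i + origin < a.length; i++) {
            r[i + origin] = a[i];
        }
        return r;
    }

    public static void main(String[] args) {
        byte[] v = new byte[32]; // 32 byte lanes, as in a 256-bit byte vector
        for (int i = 0; i < v.length; i++) {
            v[i] = (byte) (i + 1);
        }
        System.out.println(Arrays.toString(slice(v, 4)));
        System.out.println(Arrays.toString(unslice(v, 4)));
    }
}
```

With origin 4, slice shifts the lanes down by four and zero-fills the top four lanes, while unslice shifts them up by four and zero-fills the bottom four.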
> src/hotspot/cpu/x86/x86.ad line 7550:
>
>> 7548: // only byte shuffle instruction available on these platforms
>> 7549: int vlen_in_bytes = vector_length_in_bytes(this);
>> 7550: if (UseAVX == 0) {
>
> This code will not be executed with vector length 16 because match_rule_supported_vector() bails out with (size_in_bits == 256 && UseAVX < 2).
Yes, you are right, but this code will execute for vector length 16 when UseAVX == 2.
It will also execute for vector length 16 when UseAVX == 3 && !VM_Version::supports_avx512bw().
> src/hotspot/cpu/x86/x86.ad line 7506:
>
>> 7504: instruct rearrangeB_avx(legVec dst, legVec src, vec shuffle, legVec vtmp1, legVec vtmp2, rRegP scratch) %{
>> 7505: predicate(vector_element_basic_type(n) == T_BYTE &&
>> 7506: vector_length(n) == 32 && !VM_Version::supports_avx512_vbmi());
>
> The predicate matches the bail-out condition in match_rule_supported_vector(). Does it mean this code was never used before?
> So you are implementing it now. Right?
Yes, this rule was not used before and I am implementing it now.
-------------
PR: https://git.openjdk.java.net/jdk/pull/2520