RFR: 8261542: X86 slice and unslice intrinsics for 256-bit byte/short vectors [v3]
Sandhya Viswanathan
sviswanathan at openjdk.java.net
Fri Feb 19 01:30:41 UTC 2021
On Thu, 18 Feb 2021 23:31:28 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> Yes, you are right, but this code will execute for vector length 16 when UseAVX == 2.
>> It will also execute for vector length 16 when UseAVX == 3 &&
>> !VM_Version::supports_avx512bw.
>
> The next assert checks <= 16 when the code is guarded by (UseAVX == 0), not by (UseAVX == 2).
> Also, the } else { case is for UseAVX > 0, which includes AVX=1, but vpaddb() (avx3) is used there.
> The UseAVX checks seem wrong here.
The assert checks for vlen_in_bytes <= 16 (128 bits), so it is the correct check for UseAVX=0.
vpaddb is not limited to AVX3; it is supported on AVX1 and AVX2 as well:
on AVX1 for vectors up to 128 bits,
on AVX2 for vectors up to 256 bits, and
on AVX3 (512) for vectors up to 512 bits.
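
The width limits above can be sketched as a small lookup. This is only an illustration, not HotSpot code: the names max_byte_add_vector_bytes and byte_add_supported are hypothetical, the UseAVX=0 case assumes the SSE2 paddb form, and the separate supports_avx512bw check mentioned earlier in the thread is deliberately left out.

```cpp
#include <cassert>

// Hypothetical helper (not part of HotSpot): widest vector, in bytes, on which
// a packed byte add is assumed usable at a given UseAVX level.
constexpr int max_byte_add_vector_bytes(int use_avx) {
    switch (use_avx) {
        case 0:  return 16;  // SSE paddb: 128-bit only
        case 1:  return 16;  // AVX1 vpaddb: up to 128 bits
        case 2:  return 32;  // AVX2 vpaddb: up to 256 bits
        default: return 64;  // AVX3 vpaddb: up to 512 bits (AVX512BW check ignored here)
    }
}

// Hypothetical predicate mirroring the "vlen_in_bytes <= 16" style of assert.
bool byte_add_supported(int use_avx, int vlen_in_bytes) {
    assert(use_avx >= 0 && use_avx <= 3);
    return vlen_in_bytes <= max_byte_add_vector_bytes(use_avx);
}
```

With these assumptions, a 16-byte vector passes at every UseAVX level, which is why an assert of vlen_in_bytes <= 16 fits the UseAVX=0 branch.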
I have tested this with UseAVX=0, UseAVX=1, UseAVX=2, and UseAVX=3.
The check is on UseAVX because, with any flavor of AVX, we can use fewer instructions to do this operation.
This is because AVX allows the destination to be separate from both sources.
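
To illustrate the instruction-count point, here is a toy emitter for a single byte add, dst = src1 + src2. It is a sketch under stated assumptions: emit_byte_add and the register numbering are hypothetical, it only builds mnemonic strings, and it does not reproduce the actual sequence in the PR.

```cpp
#include <string>
#include <vector>

// Hypothetical toy emitter (not HotSpot's MacroAssembler): returns the
// instructions needed to compute dst = src1 + src2 on packed bytes.
std::vector<std::string> emit_byte_add(bool have_avx, int dst, int src1, int src2) {
    auto reg = [](int n) { return "xmm" + std::to_string(n); };
    std::vector<std::string> code;
    if (have_avx) {
        // Three-operand form: the destination is independent of both sources.
        code.push_back("vpaddb " + reg(dst) + ", " + reg(src1) + ", " + reg(src2));
    } else if (dst == src1) {
        // Two-operand form: the destination doubles as the first source.
        code.push_back("paddb " + reg(dst) + ", " + reg(src2));
    } else if (dst == src2) {
        // Addition is commutative, so the operands can be swapped.
        code.push_back("paddb " + reg(dst) + ", " + reg(src1));
    } else {
        // A separate destination costs an extra register copy first.
        code.push_back("movdqa " + reg(dst) + ", " + reg(src1));
        code.push_back("paddb " + reg(dst) + ", " + reg(src2));
    }
    return code;
}
```

In this sketch, emit_byte_add(true, 0, 1, 2) yields one instruction, while emit_byte_add(false, 0, 1, 2) yields two because the non-AVX form overwrites one of its sources.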
Please let me know if I am missing something.
-------------
PR: https://git.openjdk.java.net/jdk/pull/2520