RFR(M):8214751: X86: Support for VNNI instruction

Deshpande, Vivek R vivek.r.deshpande at intel.com
Thu Dec 6 18:31:32 UTC 2018


Hi All

Could you please review the patch for VNNI VPDPWSSD instruction support?
It would be great if we could get it into JDK 12.

The webrev is here:
http://cr.openjdk.java.net/~vdeshpande/8214751/VNNI/webrev.00/
The JBS entry is here:
https://bugs.openjdk.java.net/browse/JDK-8214751


Regards,
Vivek

From: Deshpande, Vivek R
Sent: Monday, December 3, 2018 8:58 PM
To: hotspot-compiler-dev at openjdk.java.net compiler <hotspot-compiler-dev at openjdk.java.net>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
Subject: RFR(M):8214751: X86: Support for VNNI instruction

Hi All

Could you please review the VNNI VPDPWSSD instruction support with autovectorization?
It can vectorize this operation in a loop:
out[i] += ((in1[2*i] * in2[2*i]) + (in1[2*i+1] * in2[2*i+1]));
More information on VNNI can be found here:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
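
For reference, the loop shape above can be written as a small self-contained Java method. This is only a sketch and is not taken from the webrev: the class and method names are hypothetical, and the element types (short inputs, int accumulator) are an assumption based on VPDPWSSD operating on 16-bit words and accumulating into 32-bit lanes.

```java
import java.util.Arrays;

// Hypothetical class name; not part of the patch.
final class DotProductKernel {
    // Assumed types: 16-bit inputs, 32-bit accumulator.
    // Each out[i] accumulates the sum of two adjacent 16-bit products.
    static void accumulate(short[] in1, short[] in2, int[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] += ((in1[2 * i] * in2[2 * i]) + (in1[2 * i + 1] * in2[2 * i + 1]));
        }
    }

    public static void main(String[] args) {
        short[] in1 = {1, 2, 3, 4};
        short[] in2 = {5, 6, 7, 8};
        int[] out = {10, 20};
        accumulate(in1, in2, out);
        // out[0] = 10 + 1*5 + 2*6, out[1] = 20 + 3*7 + 4*8
        System.out.println(Arrays.toString(out)); // prints [27, 73]
    }
}
```

The input arrays need at least 2 * out.length elements; the sketch does not check this.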

The initial performance gain with a microbenchmark on Skylake with AVX3 is 10.8x,
and it generates:
vmovdqu xmm3, xmmword ptr [rbp+r8*2+0x10]
vmovdqu xmm6, xmmword ptr [rdx+r8*2+0x10]
vpmaddwd xmm3, xmm6, xmm3
vpaddd xmm3, xmm3, xmmword ptr [r9+rdi*4+0x10]
vmovdqu xmmword ptr [r9+rdi*4+0x10], xmm3

On Cascade Lake, it can generate the vpdpwssd instruction.
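
To illustrate why the two code shapes are interchangeable, here is a scalar Java model of the lane arithmetic. It is a sketch under my reading of the instruction semantics, not code from the patch: the class and method names are hypothetical, and it models the non-saturating vpdpwssd form with ordinary wrapping int arithmetic.

```java
import java.util.Arrays;

// Hypothetical scalar model; names are illustrative only.
final class VnniLaneModel {
    // Models vpmaddwd: multiply signed 16-bit pairs and add
    // adjacent products into one 32-bit lane.
    static int[] pmaddwd(short[] a, short[] b) {
        int[] r = new int[a.length / 2];
        for (int i = 0; i < r.length; i++) {
            r[i] = a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1];
        }
        return r;
    }

    // Models vpaddd: lane-wise 32-bit add with wraparound.
    static int[] paddd(int[] x, int[] y) {
        int[] r = new int[x.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = x[i] + y[i];
        }
        return r;
    }

    // Models vpdpwssd: the multiply, pair-add, and accumulate in one step.
    static int[] dpwssd(int[] acc, short[] a, short[] b) {
        int[] r = new int[acc.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = acc[i] + a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1];
        }
        return r;
    }

    public static void main(String[] args) {
        // Eight 16-bit lanes in, four 32-bit lanes out (one xmm register's worth).
        short[] a = {1, 2, -3, 4, 5, 6, Short.MIN_VALUE, Short.MIN_VALUE};
        short[] b = {5, 6, 7, 8, -1, 1, Short.MIN_VALUE, Short.MIN_VALUE};
        int[] acc = {10, 20, 30, 40};
        int[] twoStep = paddd(pmaddwd(a, b), acc);
        int[] fused = dpwssd(acc, a, b);
        System.out.println(Arrays.equals(twoStep, fused)); // prints true
    }
}
```

The last lane deliberately overflows 32 bits; both paths wrap the same way, which is why the results still match.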

The webrev is here:
http://cr.openjdk.java.net/~vdeshpande/8214751/VNNI/webrev.00/
The JBS entry is here:
https://bugs.openjdk.java.net/browse/JDK-8214751

Regards,
Vivek