RFR(M):8167065: Add intrinsic support for double precision shifting on x86_64
Kamath, Smita
smita.kamath at intel.com
Thu Dec 19 02:33:08 UTC 2019
Hi Vladimir,
I have made the code changes you suggested (please look at the email below).
I have also enabled the intrinsic to run only when VBMI2 feature is available.
The intrinsic shows gains of >1.5x above 4k bit BigInteger.
Webrev link: https://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev01/
Thanks,
Smita
-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
Sent: Wednesday, December 11, 2019 10:55 AM
To: Kamath, Smita <smita.kamath at intel.com>; 'hotspot compiler' <hotspot-compiler-dev at openjdk.java.net>; Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
Subject: Re: RFR(M):8167065: Add intrinsic support for double precision shifting on x86_64
Hi Kamath,
First, general question. What performance you see when VBMI2 instructions are *not* used with your new code vs code generated by C2.
What improvement you see when VBMI2 is used. This is to understand if we need only VBMI2 version of intrinsic or not.
Second. Sandhya recently pushed 8235510 changes to rollback avx512 code for CRC32 due to performance issues. Does you change has any issues on some Intel's CPU too? Should it be excluded on such CPUs?
Third. I would suggest to wait after we fork JDK 14 with this changes. I think it may be too late for 14 because we would need test this including performance testing.
In assembler_x86.cpp use supports_vbmi2() instead of UseVBMI2 in assert.
For that to work in vm_version_x86.cpp#l687 clear CPU_VBMI2 bit when UseAVX < 3 ( < avx512). You can also use supports_vbmi2() instead of (UseAVX > 2 && UseVBMI2) in stubGenerator_x86_64.cpp combinations with that.
Smita >>>done
I don't think we need separate flag UseVBMI2 - it could be controlled by UseAVX flag. We don't have flag for VNNI or other avx512 instructions subset.
Smita >> removed UseVBMI2 flag
In vm_version_x86.cpp you need to add more %s in print statement for new output.
Smita >>> done
You need to add @requires vm.compiler2.enabled to new test's commands to run it only with C2.
Smita >>> done
You need to add intrinsics to Graal's test to ignore them:
http://hg.openjdk.java.net/jdk/jdk/file/d188996ea355/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java#l416
Smita >>>done
Thanks,
Vladimir
On 12/10/19 5:41 PM, Kamath, Smita wrote:
> Hi,
>
>
> As per Intel Architecture Instruction Set Reference [1] VBMI2 Operations will be supported in future Intel ISA. I would like to contribute optimizations for BigInteger shift operations using AVX512+VBMI2 instructions. This optimization is for x86_64 architecture enabled.
>
> Link to Bug: https://bugs.openjdk.java.net/browse/JDK-8167065
>
> Link to webrev :
> http://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev00/
>
>
>
> I ran jtreg test suite with the algorithm on Intel SDE [2] to confirm that encoding and semantics are correctly implemented.
>
>
> [1]
> https://software.intel.com/sites/default/files/managed/39/c5/325462-sd
> m-vol-1-2abcd-3abcd.pdf (vpshrdv -> Vol. 2C 5-477 and vpshldv -> Vol.
> 2C 5-471)
>
> [2]
> https://software.intel.com/en-us/articles/intel-software-development-e
> mulator
>
>
> Regards,
>
> Smita Kamath
>
More information about the hotspot-compiler-dev
mailing list