RFR(M):8167065: Add intrinsic support for double precision shifting on x86_64
Vladimir Kozlov
vladimir.kozlov at oracle.com
Sat Dec 21 05:12:15 UTC 2019
Testing results are good after fixing the typo.
We should consider implementing this intrinsic in Graal too. We have to update the AOT and Graal tests anyway.
Thanks,
Vladimir
On 12/20/19 3:47 PM, Vladimir Kozlov wrote:
> Hi Smita,
>
> You have typo (should be supports_vbmi2):
>
> src/hotspot/cpu/x86/assembler_x86.cpp:6547:22: error: 'support_vbmi2' is not a member of 'VM_Version'
> assert(VM_Version::support_vbmi2(), "requires vbmi2");
> ^~~~~~~~~~~~~
>
> Debug build failed. I am retesting with local fix.
>
> Regards,
> Vladimir K
>
> On 12/20/19 2:19 PM, Vladimir Kozlov wrote:
>> We should have added core-libs to the review since you modified BigInteger.java.
>>
>> webrev02 looks good to me. Let me test it.
>>
>> Thanks,
>> Vladimir
>>
>> On 12/20/19 1:52 PM, Kamath, Smita wrote:
>>> Hi Vladimir,
>>>
>>> Thank you for reviewing the code. I have updated the code as per your recommendations (please see the email below).
>>> Link to the updated webrev: https://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev02/
>>>
>>> Regards,
>>> Smita
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>>> Sent: Thursday, December 19, 2019 5:17 PM
>>> To: Kamath, Smita <smita.kamath at intel.com>
>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; 'hotspot compiler' <hotspot-compiler-dev at openjdk.java.net>
>>> Subject: Re: RFR(M):8167065: Add intrinsic support for double precision shifting on x86_64
>>>
>>> We missed the AOT and JVMCI (in HS) changes, similar to those for Base64 [1], to record the StubRoutines pointers:
>>>
>>> StubRoutines::_bigIntegerRightShiftWorker
>>> StubRoutines::_bigIntegerLeftShiftWorker
>>> Smita>>>done
>>>
>>> In the test, add another @run command with default settings (without -XX:-TieredCompilation -Xbatch).
>>> Smita>>>done
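A jtreg header covering both run modes might look like the sketch below (the class name and bug summary line are illustrative, not the actual test file):

```java
/*
 * @test
 * @summary Illustrative jtreg header: one @run with the explicit compiler
 *          flags, plus one @run with default VM settings as requested.
 * @run main/othervm -XX:-TieredCompilation -Xbatch TestBigIntegerShift
 * @run main/othervm TestBigIntegerShift
 */
```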
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> [1] http://cr.openjdk.java.net/~srukmannagar/Base64/webrev.01/
>>>
>>> On 12/18/19 6:33 PM, Kamath, Smita wrote:
>>>> Hi Vladimir,
>>>>
>>>> I have made the code changes you suggested (please look at the email below).
>>>> I have also enabled the intrinsic to run only when VBMI2 feature is available.
>>>> The intrinsic shows gains of >1.5x for BigIntegers above 4k bits.
>>>>
>>>> Webrev link:
>>>> https://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev01/
>>>>
>>>> Thanks,
>>>> Smita
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>>>> Sent: Wednesday, December 11, 2019 10:55 AM
>>>> To: Kamath, Smita <smita.kamath at intel.com>; 'hotspot compiler'
>>>> <hotspot-compiler-dev at openjdk.java.net>; Viswanathan, Sandhya
>>>> <sandhya.viswanathan at intel.com>
>>>> Subject: Re: RFR(M):8167065: Add intrinsic support for double
>>>> precision shifting on x86_64
>>>>
>>>> Hi Kamath,
>>>>
>>>> First, a general question: what performance do you see when the VBMI2 instructions are *not* used, comparing your new code vs. code
>>>> generated by C2? What improvement do you see when VBMI2 is used? This is to understand whether we need only the VBMI2 version of the intrinsic or not.
>>>>
>>>> Second. Sandhya recently pushed the 8235510 changes to roll back the avx512 code for CRC32 due to performance issues. Does
>>>> your change have any issues on some of Intel's CPUs too? Should it be excluded on such CPUs?
>>>>
>>>> Third. I would suggest waiting until after we fork JDK 14 before pushing these changes. I think it may be too late for 14 because we
>>>> would need to test this, including performance testing.
>>>>
>>>> In assembler_x86.cpp, use supports_vbmi2() instead of UseVBMI2 in the assert.
>>>> For that to work, in vm_version_x86.cpp#l687 clear the CPU_VBMI2 bit when UseAVX < 3 ( < avx512). In combination with that, you can also use
>>>> supports_vbmi2() instead of (UseAVX > 2 && UseVBMI2) in stubGenerator_x86_64.cpp.
>>>> Smita >>>done
>>>>
>>>> I don't think we need a separate flag UseVBMI2 - it could be controlled by the UseAVX flag. We don't have a flag for VNNI or
>>>> other avx512 instruction subsets.
>>>> Smita >> removed UseVBMI2 flag
>>>>
>>>> In vm_version_x86.cpp you need to add more %s in the print statement for the new output.
>>>> Smita >>> done
>>>>
>>>> You need to add @requires vm.compiler2.enabled to the new test's commands so that it runs only with C2.
>>>> Smita >>> done
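The @requires line sits alongside the other jtreg tags in the test header, roughly like this (illustrative class name, not the actual test):

```java
/*
 * @test
 * @requires vm.compiler2.enabled
 * @summary Illustrative header showing the C2-only constraint; jtreg
 *          skips the test entirely on VMs without the C2 compiler.
 * @run main/othervm TestBigIntegerShift
 */
```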
>>>>
>>>> You need to add the intrinsics to Graal's test so that it ignores them:
>>>>
>>>> http://hg.openjdk.java.net/jdk/jdk/file/d188996ea355/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java#l416
>>>> Smita >>>done
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 12/10/19 5:41 PM, Kamath, Smita wrote:
>>>>> Hi,
>>>>>
>>>>>
>>>>> As per the Intel Architecture Instruction Set Reference [1], VBMI2 operations will be supported in future Intel ISAs. I
>>>>> would like to contribute optimizations for BigInteger shift operations using AVX512+VBMI2 instructions. This
>>>>> optimization is enabled for the x86_64 architecture.
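The "double precision shift" the intrinsic vectorizes can be sketched in scalar Java: each output word funnels bits from two adjacent input words, which is exactly what vpshldv/vpshrdv do for whole vectors of words at once. This is an illustrative sketch, not the actual BigInteger implementation:

```java
public class FunnelShiftSketch {
    // Left-shift a big-endian int-limb magnitude by n bits (0 < n < 32).
    // Each output limb combines bits from two adjacent input limbs,
    // mirroring the per-element behavior of vpshldv on a vector.
    static int[] shiftLeft(int[] mag, int n) {
        int[] out = new int[mag.length];
        for (int i = 0; i < mag.length; i++) {
            int hi = mag[i] << n;
            int lo = (i + 1 < mag.length) ? (mag[i + 1] >>> (32 - n)) : 0;
            out[i] = hi | lo;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] mag = {0x00000001, 0x80000000};
        int[] r = shiftLeft(mag, 4);
        // 0x00000001_80000000 << 4 = 0x00000018_00000000
        System.out.printf("%08x %08x%n", r[0], r[1]);
    }
}
```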
>>>>>
>>>>> Link to Bug: https://bugs.openjdk.java.net/browse/JDK-8167065
>>>>>
>>>>> Link to webrev :
>>>>> http://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev00/
>>>>>
>>>>>
>>>>>
>>>>> I ran the jtreg test suite with the algorithm on Intel SDE [2] to confirm that the encoding and semantics are correctly
>>>>> implemented.
>>>>>
>>>>>
>>>>> [1] https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf (vpshrdv -> Vol. 2C 5-477 and vpshldv -> Vol. 2C 5-471)
>>>>>
>>>>> [2] https://software.intel.com/en-us/articles/intel-software-development-emulator
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Smita Kamath
>>>>>