RFR(M):8167065: Add intrinsic support for double precision shifting on x86_64

Fri Dec 20 23:47:21 UTC 2019

Hi Smita,

You have typo (should be supports_vbmi2):

src/hotspot/cpu/x86/assembler_x86.cpp:6547:22: error: 'support_vbmi2' is not a member of 'VM_Version'
     assert(VM_Version::support_vbmi2(), "requires vbmi2");
                        ^~~~~~~~~~~~~

Debug build failed. I am retesting with local fix.

Regards,
Vladimir K

On 12/20/19 2:19 PM, Vladimir Kozlov wrote:
> We should have added core-libs to review since you modified BigInteger.java.
> 
> webrev02 looks good to me. Let me test it.
> 
> Thanks,
> Vladimir
> 
> On 12/20/19 1:52 PM, Kamath, Smita wrote:
>> Hi Vladimir,
>>
>> Thank you for reviewing the code. I have updated the code as per your recommendations ( please look at the email below).
>> Link to the updated webrev: https://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev02/
>>
>> Regards,
>> Smita
>>
>> -----Original Message-----
>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>> Sent: Thursday, December 19, 2019 5:17 PM
>> To: Kamath, Smita <smita.kamath at intel.com>
>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; 'hotspot compiler' <hotspot-compiler-dev at openjdk.java.net>
>> Subject: Re: RFR(M):8167065: Add intrinsic support for double precision shifting on x86_64
>>
>> We missed AOT and JVMCI (in HS) changes similar for Base64 [1] to record StubRoutines pointers:
>>
>> StubRoutines::_bigIntegerRightShiftWorker
>> StubRoutines::_bigIntegerLeftShiftWorker
>> Smita>>>done
>>
>> In the test add an other @run command with default setting (without -XX:-TieredCompilation -Xbatch).
>> Smita>>>done
>>
>> Thanks,
>> Vladimir
>>
>> [1] http://cr.openjdk.java.net/~srukmannagar/Base64/webrev.01/
>>
>> On 12/18/19 6:33 PM, Kamath, Smita wrote:
>>> Hi Vladimir,
>>>
>>> I have made the code changes you suggested (please look at the email below).
>>> I have also enabled the intrinsic to run only when VBMI2 feature is available.
>>> The intrinsic shows gains of >1.5x above 4k bit BigInteger.
>>>
>>> Webrev link:
>>> https://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev01/
>>>
>>> Thanks,
>>> Smita
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>>> Sent: Wednesday, December 11, 2019 10:55 AM
>>> To: Kamath, Smita <smita.kamath at intel.com>; 'hotspot compiler'
>>> <hotspot-compiler-dev at openjdk.java.net>; Viswanathan, Sandhya
>>> <sandhya.viswanathan at intel.com>
>>> Subject: Re: RFR(M):8167065: Add intrinsic support for double
>>> precision shifting on x86_64
>>>
>>> Hi Kamath,
>>>
>>> First, general question. What performance you see when VBMI2 instructions are *not* used with your new code vs code 
>>> generated by C2.
>>> What improvement you see when VBMI2 is used. This is to understand if we need only VBMI2 version of intrinsic or not.
>>>
>>> Second. Sandhya recently pushed 8235510 changes to rollback avx512 code for CRC32 due to performance issues. Does you 
>>> change has any issues on some Intel's CPU too? Should it be excluded on such CPUs?
>>>
>>> Third. I would suggest to wait after we fork JDK 14 with this changes. I think it may be too late for 14 because we 
>>> would need test this including performance testing.
>>>
>>> In assembler_x86.cpp use supports_vbmi2() instead of UseVBMI2 in assert.
>>> For that to work in vm_version_x86.cpp#l687 clear CPU_VBMI2 bit when UseAVX < 3 ( < avx512). You can also use 
>>> supports_vbmi2() instead of (UseAVX > 2 && UseVBMI2) in stubGenerator_x86_64.cpp combinations with that.
>>> Smita >>>done
>>>
>>> I don't think we need separate flag UseVBMI2 - it could be controlled by UseAVX flag. We don't have flag for VNNI or 
>>> other avx512 instructions subset.
>>> Smita >> removed UseVBMI2 flag
>>>
>>> In vm_version_x86.cpp you need to add more %s in print statement for new output.
>>> Smita  >>> done
>>>
>>> You need to add @requires vm.compiler2.enabled to new test's commands to run it only with C2.
>>> Smita >>> done
>>>
>>> You need to add intrinsics to Graal's test to ignore them:
>>>
>>> http://hg.openjdk.java.net/jdk/jdk/file/d188996ea355/src/jdk.internal.
>>> vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/gr
>>> aalvm/compiler/hotspot/test/CheckGraalIntrinsics.java#l416
>>> Smita >>>done
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 12/10/19 5:41 PM, Kamath, Smita wrote:
>>>> Hi,
>>>>
>>>>
>>>> As per Intel Architecture Instruction Set Reference [1] VBMI2 Operations will be supported in future Intel ISA. I 
>>>> would like to contribute optimizations for BigInteger shift operations using AVX512+VBMI2 instructions. This 
>>>> optimization is for x86_64 architecture enabled.
>>>>
>>>> Link to Bug: https://bugs.openjdk.java.net/browse/JDK-8167065
>>>>
>>>> Link to webrev :
>>>> http://cr.openjdk.java.net/~svkamath/bigIntegerShift/webrev00/
>>>>
>>>>
>>>>
>>>> I ran jtreg test suite with the algorithm on Intel SDE [2] to confirm that encoding and semantics are correctly 
>>>> implemented.
>>>>
>>>>
>>>> [1]
>>>> https://software.intel.com/sites/default/files/managed/39/c5/325462-s
>>>> d m-vol-1-2abcd-3abcd.pdf (vpshrdv -> Vol. 2C 5-477 and vpshldv ->
>>>> Vol.
>>>> 2C 5-471)
>>>>
>>>> [2]
>>>> https://software.intel.com/en-us/articles/intel-software-development-
>>>> e
>>>> mulator
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Smita Kamath
>>>>