RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Apr 20 20:04:20 UTC 2016


One thing caught during the build is the trailing ',' on the last line of the enum:

+  STACK_SIZE = _RSP      + _RSP_SIZE,
+};

The compiler complains about it, so I removed it in my local repo.
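
For context, a trailing comma after the last enumerator is accepted by C99 and C++11 but is not valid in C++98/03, so older or stricter compilers reject it or warn about it. A minimal sketch of the portable form follows. The enum name, the enumerators, and their values are hypothetical and are not the patch's actual frame layout.

```cpp
// Hypothetical frame-layout enum, loosely modeled on the snippet above.
// Names and values are illustrative assumptions only.
enum FrameLayout {
  kRspSize   = 8,
  kRspOffset = 32,
  kStackSize = kRspOffset + kRspSize   // last enumerator: no trailing comma
};

// Returns the total size computed by the enum, as a plain int.
int stack_size() {
  return static_cast<int>(kStackSize);
}
```

Leaving the comma off keeps the declaration valid under every C++ standard, which matters when one source file is built by several toolchains.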

Vladimir

On 4/20/16 12:07 PM, Civlin, Jan wrote:
> Thank you!
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, April 20, 2016 11:38 AM
> To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)
>
> Looks good to me. I submitted testing on all platforms before integrating.
>
> Thanks,
> Vladimir
>
> On 4/20/16 3:11 AM, Civlin, Jan wrote:
>> Vladimir,
>>
>> Please look at the updated patch at
>> http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.01/
>>
>> I removed the definitions of the unused [v]movdqa(), vpsrldq(), vpslldq().
>>
>> k256_W is actually a table twice the size of k256: each line of k256 is repeated twice. As you suggested, I changed the code to generate k256_W from k256.
>>
>> The patch was tested in three configurations (slowdebug, release and fastdebug) on 64-bit Windows and Linux.
>>
>> Thank you,
>>
>> J
>>
>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-release/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>> provider = SUN algorithm = SHA-256 msgSize = 1024 bytes offset = 0 iters = 10000000 warmupIters = 20000
>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>> TestSHA runtime = 28.756324129 seconds
>> TestSHA throughput = 356.09558280340946 MB/s
>>
>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>> provider = SUN algorithm = SHA-256 msgSize = 1024 bytes offset = 0 iters = 10000000 warmupIters = 20000
>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>> TestSHA runtime = 28.912701124 seconds
>> TestSHA throughput = 354.1696071938408 MB/s
>>
>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-slowdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>> provider = SUN algorithm = SHA-256 msgSize = 1024 bytes offset = 0 iters = 10000000 warmupIters = 20000
>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>> TestSHA runtime = 29.339789962 seconds
>> TestSHA throughput = 349.01408678325697 MB/s
>>
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Monday, April 18, 2016 5:09 PM
>> To: Civlin, Jan; hotspot compiler
>> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
>> supports_sha() available)
>>
>> Hi Jan,
>>
>> The patch was generated on Windows and has ^M at the end of each line, so I can't apply it to our sources.
>>
>> I don't see any use of the new [v]movdqa(), vpsrldq(), vpslldq() instructions.
>>
>> Please move the new code in macroAssembler_x86_sha.cpp to the end of the file.
>>
>> _k256_W[] is the same as _k256[] with each group of 4 values repeated. I would suggest generating it dynamically in stubGenerator_x86_64.cpp based on _k256:
>>
>> StubRoutines::x86::_k256_W_adr = generate_k256_W();
>>
>> What testing was done? Did you run with a fastdebug build? I am concerned about the size of the new stub and whether the current code_size2 is enough.
>>
>> Thanks,
>> Vladimir
>>
>> On 4/18/16 2:44 PM, Civlin, Jan wrote:
>>> == Correction in the subject line ===
>>>
>>> We would like to contribute the SHA256 AVX2 intrinsic.
>>>
>>> This intrinsic is for the x86 AVX2 architecture when supports_sha() is not available. It is x64 code only.
>>>
>>> The patch delivers a 2x performance gain in both latency and throughput, each measured on an average message.
>>>
>>> Contributor: Jan Civlin.
>>>
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8154495
>>> webrev: http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.00/
>>>
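
On the k256_W table discussed in this thread: the messages describe it as k256 with each line of four values repeated twice, and suggest building it from k256 rather than hard-coding it. The sketch below illustrates only that doubling step in portable C++. The function name make_doubled_table is hypothetical, this is not the code from the webrev, and it assumes the input length is a multiple of four.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a table in which every row of four 32-bit
// constants from src appears twice in succession.
// Assumes count is a multiple of 4; any trailing partial row is ignored.
std::vector<std::uint32_t> make_doubled_table(const std::uint32_t* src,
                                              std::size_t count) {
  std::vector<std::uint32_t> out;
  out.reserve(2 * count);
  for (std::size_t row = 0; row + 4 <= count; row += 4) {
    for (int copy = 0; copy < 2; ++copy) {      // emit the same row twice
      for (std::size_t i = 0; i < 4; ++i) {
        out.push_back(src[row + i]);
      }
    }
  }
  return out;
}
```

Applied to the 64 SHA-256 round constants, this would yield a 128-entry table, matching the "size of two k256" described above.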

