RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Apr 20 18:37:55 UTC 2016


Looks good to me. I submitted testing on all platforms before integrating.

Thanks,
Vladimir

On 4/20/16 3:11 AM, Civlin, Jan wrote:
> Vladimir,
>
> Please look at the updated patch at
> http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.01/
>
> I removed the definitions of the unused [v]movdqa(), vpsrldq(), vpslldq().
>
> The k256_W table is actually twice the size of k256: each line of k256 is repeated twice. As you suggested, I changed the code to generate k256_W from k256.
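
A minimal sketch of the table layout described above, assuming a "line" of k256 is a group of four 32-bit constants (the thread elsewhere describes the table as k256 "with repeated 4 values"). The helper name duplicate_groups_of_four is hypothetical; this is not the generate_k256_W() code from the webrev, which is not shown in this thread. The lane comments reflect an assumption that the duplication exists so that one 256-bit load yields the same four constants in both 128-bit halves.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not the HotSpot implementation: expands a table of
// 32-bit constants so that every group of four values appears twice in a row.
template <std::size_t N>
std::array<std::uint32_t, 2 * N> duplicate_groups_of_four(
    const std::array<std::uint32_t, N>& k) {
  static_assert(N % 4 == 0, "table length must be a multiple of 4");
  std::array<std::uint32_t, 2 * N> w{};
  for (std::size_t group = 0; group < N / 4; ++group) {
    for (std::size_t i = 0; i < 4; ++i) {
      const std::uint32_t v = k[4 * group + i];
      w[8 * group + i] = v;      // first copy (assumed: low 128-bit half)
      w[8 * group + 4 + i] = v;  // second copy (assumed: high 128-bit half)
    }
  }
  return w;
}
```

Applied to the 64-entry SHA-256 round-constant table, this would produce a 128-entry table, which matches the "size of two k256" description.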
>
> The patch was tested in three build configurations (slowdebug, release and fastdebug) on 64-bit Windows and Linux.
>
> Thank you,
>
> J
>
> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-release/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
> provider = SUN
> algorithm = SHA-256
> msgSize = 1024 bytes
> offset = 0
> iters = 10000000
> warmupIters = 20000
> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
> TestSHA runtime = 28.756324129 seconds
> TestSHA throughput = 356.09558280340946 MB/s
>
> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
> provider = SUN
> algorithm = SHA-256
> msgSize = 1024 bytes
> offset = 0
> iters = 10000000
> warmupIters = 20000
> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
> TestSHA runtime = 28.912701124 seconds
> TestSHA throughput = 354.1696071938408 MB/s
>
> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-slowdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
> provider = SUN
> algorithm = SHA-256
> msgSize = 1024 bytes
> offset = 0
> iters = 10000000
> warmupIters = 20000
> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
> TestSHA runtime = 29.339789962 seconds
> TestSHA throughput = 349.01408678325697 MB/s
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Monday, April 18, 2016 5:09 PM
> To: Civlin, Jan; hotspot compiler
> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)
>
> Hi Jan,
>
> The patch was generated on Windows and has ^M at the end of lines, so I can't apply it to our sources.
>
> I don't see any use of the new [v]movdqa(), vpsrldq(), vpslldq() instructions.
>
> Please move the new code in macroAssembler_x86_sha.cpp to the end of the file.
>
> _k256_W[] is the same as _k256[] with each group of 4 values repeated. I would suggest generating it dynamically in stubGenerator_x86_64.cpp based on _k256:
>
> StubRoutines::x86::_k256_W_adr = generate_k256_W();
>
> What testing was done? Did you run with a fastdebug build? I am concerned about the size of the new stub and whether the current code_size2 is enough.
>
> Thanks,
> Vladimir
>
> On 4/18/16 2:44 PM, Civlin, Jan wrote:
>> == Correction in the subject line ===
>>
>> We would like to contribute the SHA256 AVX2 intrinsic.
>>
>> This intrinsic is for x86 AVX2 architecture when no supports_sha() is available. It is L64 code only.
>>
>> The patch delivers a 2x performance gain in both latency and throughput, measured on an average message.
>>
>> Contributor: Jan Civlin.
>>
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8154495
>> webrev: http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.00/
>>

