RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)
Civlin, Jan
jan.civlin at intel.com
Thu Apr 21 18:15:11 UTC 2016
Vladimir,
I corrected the assertion guards in the added instructions, and also the guard for the sha-avx2 function itself.
Please look at
http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.02/
Thank you,
J
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Wednesday, April 20, 2016 3:51 PM
To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)
Testing is still running, but it has already found the following problem when running tests with -XX:UseSSE=2:
# Internal Error (/opt/jprt/T/P1/185544.vkozlov/s/hotspot/src/cpu/x86/vm/assembler_x86.cpp:3693), pid=52652, tid=3587
# Error: assert(VM_Version::supports_ssse3()) failed
V [libjvm.dylib+0x4193d7] report_vm_error(char const*, int, char const*, char const*, ...)+0xcd
V [libjvm.dylib+0x1eedd2] Assembler::vpshufb(XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, int)+0x4e
V [libjvm.dylib+0x87c237] MacroAssembler::sha256_AVX2(XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, RegisterImpl*, RegisterImpl*, RegisterImpl*, RegisterImpl*, RegisterImpl*, bool, XMMRegisterImpl*)+0x1297
V [libjvm.dylib+0xa4dc47] StubGenerator::generate_sha256_implCompress(bool, char const*)+0x27b
Vladimir
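
For illustration only, the kind of guard at issue in the assert above can be sketched as follows. This is a minimal, self-contained sketch and not the code in the webrev: CpuFeatures and select_sha256_impl are hypothetical names, and it assumes that a run with -XX:UseSSE=2 makes the SSSE3 query report false. The idea is simply that the AVX2 path is chosen only when every feature its instructions assert on is reported as available.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the CPU feature queries.
// This is NOT the HotSpot VM_Version API.
struct CpuFeatures {
    bool sha;    // dedicated SHA instructions available
    bool avx2;   // AVX2 available
    bool ssse3;  // SSSE3 available (the feature checked by the failing assert)
};

// Choose a SHA-256 implementation. The AVX2 path is selected only when
// all features it depends on are reported as available; otherwise no
// AVX2 stub is selected and the plain Java code is used.
std::string select_sha256_impl(const CpuFeatures& f) {
    if (f.sha) {
        return "sha";
    }
    if (f.avx2 && f.ssse3) {
        return "avx2";
    }
    return "java";
}
```

With features reported as {sha = false, avx2 = true, ssse3 = false}, this sketch falls back to "java" instead of reaching code that would trip the assert.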
On 4/20/16 1:13 PM, Civlin, Jan wrote:
> Thank you, Vladimir.
>
> I guess it was a warning.
> I usually keep a comma on the last line of an enum so that I do not need to change the existing lines when I add new ones.
>
>
> Section 6.7.2.2 of C99 lists the syntax as:
>
> enum-specifier:
>     enum identifier_opt { enumerator-list }
>     enum identifier_opt { enumerator-list , }
>     enum identifier
> enumerator-list:
>     enumerator
>     enumerator-list , enumerator
> enumerator:
>     enumeration-constant
>     enumeration-constant = constant-expression
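
A short sketch of the portable form may help here. The C99 grammar quoted above permits a trailing comma in the enumerator list, and C++11 does as well, but C++98/03 does not, which may be why a C++ compiler complained. The names and values below are hypothetical and are not the ones in the patch; only the shape of the enum matters.

```cpp
#include <cassert>

// Illustrative names and values only; the real offsets live in the patch.
enum {
    RSP_OFFSET = 512,                    // hypothetical offset
    RSP_SIZE   = 8,                      // hypothetical size
    STACK_SIZE = RSP_OFFSET + RSP_SIZE   // no trailing comma: valid in C99, C++98/03 and C++11
};

// Small accessor so the value can be checked at run time.
inline int stack_size() {
    return STACK_SIZE;
}
```

Dropping the final comma costs one extra changed line when an enumerator is added later, but it compiles cleanly under every language mode mentioned above.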
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, April 20, 2016 1:04 PM
> To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler
> <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
> supports_sha() available)
>
> One thing caught during the build is the ',' on the last line of the enum:
>
> + STACK_SIZE = _RSP + _RSP_SIZE,
> +};
>
> The compiler complains about it, so I removed it in my local repo.
>
> Vladimir
>
> On 4/20/16 12:07 PM, Civlin, Jan wrote:
>> Thank you!
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Wednesday, April 20, 2016 11:38 AM
>> To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler
>> <hotspot-compiler-dev at openjdk.java.net>
>> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
>> supports_sha() available)
>>
>> Looks good to me. I submitted testing on all platforms before integrating.
>>
>> Thanks,
>> Vladimir
>>
>> On 4/20/16 3:11 AM, Civlin, Jan wrote:
>>> Vladimir,
>>>
>>> Please look at the updated patch at
>>> http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.01/
>>>
>>> I removed the definitions of unused [v]movdqa(), vpsrldq(), vpslldq().
>>>
>>> k256_W is actually a table twice the size of k256: each line of k256 is repeated twice. As you suggested, I changed the code to generate k256_W from k256.
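
The table layout described above can be sketched as follows. This is a minimal illustration and not the stub generator code from the webrev: make_doubled_rows is a hypothetical name, and it assumes that a "line" of k256 is a group of four 32-bit constants, so each group of four is emitted twice in a row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a table twice the size of the input in which every group of
// four 32-bit words (assumed here to be one "line") appears twice in a row.
std::vector<std::uint32_t> make_doubled_rows(const std::vector<std::uint32_t>& k) {
    assert(k.size() % 4 == 0);  // input must consist of whole lines
    std::vector<std::uint32_t> w;
    w.reserve(k.size() * 2);
    for (std::size_t row = 0; row < k.size(); row += 4) {
        for (int copy = 0; copy < 2; ++copy) {
            // Append the same four words once per copy.
            w.insert(w.end(), k.begin() + row, k.begin() + row + 4);
        }
    }
    return w;
}
```

Given the first two lines of the SHA-256 constants, the result holds line 0, line 0, line 1, line 1, so a source table of 64 words yields a doubled table of 128 words.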
>>>
>>> The patch was tested in three configurations (slowdebug, release and fastdebug) on 64-bit Windows and Linux.
>>>
>>> Thank you,
>>>
>>> J
>>>
>>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-release/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>>> provider = SUN
>>> algorithm = SHA-256
>>> msgSize = 1024 bytes
>>> offset = 0
>>> iters = 10000000
>>> warmupIters = 20000
>>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>>> TestSHA runtime = 28.756324129 seconds
>>> TestSHA throughput = 356.09558280340946 MB/s
>>>
>>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>>> provider = SUN
>>> algorithm = SHA-256
>>> msgSize = 1024 bytes
>>> offset = 0
>>> iters = 10000000
>>> warmupIters = 20000
>>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>>> TestSHA runtime = 28.912701124 seconds
>>> TestSHA throughput = 354.1696071938408 MB/s
>>>
>>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-slowdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>>> provider = SUN
>>> algorithm = SHA-256
>>> msgSize = 1024 bytes
>>> offset = 0
>>> iters = 10000000
>>> warmupIters = 20000
>>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>>> TestSHA runtime = 29.339789962 seconds
>>> TestSHA throughput = 349.01408678325697 MB/s
>>>
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Monday, April 18, 2016 5:09 PM
>>> To: Civlin, Jan; hotspot compiler
>>> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
>>> supports_sha() available)
>>>
>>> Hi Jan,
>>>
>>> The patch was generated on Windows and has ^M at the end of lines, so I can't apply it to our sources.
>>>
>>> I don't see any use of the new [v]movdqa(), vpsrldq() and vpslldq() instructions.
>>>
>>> Please move the new code in macroAssembler_x86_sha.cpp to the end of the file.
>>>
>>> _k256_W[] is the same as _k256[] with each group of 4 values repeated. I would suggest generating it dynamically in stubGenerator_x86_64.cpp based on _k256:
>>>
>>> StubRoutines::x86::_k256_W_adr = generate_k256_W();
>>>
>>> What testing was done? Did you run with a fastdebug build? I am concerned about the size of the new stub and whether the current code_size2 is enough.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 4/18/16 2:44 PM, Civlin, Jan wrote:
>>>> == Correction in the subject line ===
>>>>
>>>> We would like to contribute the SHA256 AVX2 intrinsic.
>>>>
>>>> This intrinsic is for x86 AVX2 architecture when no supports_sha() is available. It is L64 code only.
>>>>
>>>> The patch delivers a 2x performance gain in both latency and throughput, measured on an average message.
>>>>
>>>> Contributor: Jan Civlin.
>>>>
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8154495
>>>> webrev: http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.00/
>>>>