RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Apr 21 22:08:43 UTC 2016


I think some tests assume that x86 does not support SHA-2.

The following tests failed:

compiler/intrinsics/sha/cli/TestUseSHA256IntrinsicsOptionOnUnsupportedCPU.java

java.lang.AssertionError: Expected message not found: 'Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU.'.
JVM should start with '-XX:+UseSHA256Intrinsics' flag, but output should contain warning.

compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java

java.lang.AssertionError: Expected message not found: 'SHA instructions are not available on this CPU'.
JVM should start with '-XX:+UseSHA' flag, but output should contain warning.

compiler/intrinsics/sha/sanity/TestSHA256Intrinsics.java

java.lang.RuntimeException: Unexpected count of intrinsic  _sha2_implCompress is expected:false, matched: 2, suspected: 5

compiler/intrinsics/sha/sanity/TestSHA256MultiBlockIntrinsics.java

java.lang.RuntimeException: Unexpected count of intrinsic  _digestBase_implCompressMB is expected:false, matched: 1, suspected: 6
	
Regards,
Vladimir

On 4/21/16 12:58 PM, Civlin, Jan wrote:
> I know how to run jtreg.
> I tested it with TestSHA.jar, although admittedly that was a standalone run rather than a jtreg run.
>
> $ /cygdrive/E/Java/sha-041116/build/windows-x86_64-normal-server-fastdebug/jdk/bin/java.exe -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:UseSSE=2 -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
> provider = SUN
> algorithm = SHA-256
> msgSize = 1024 bytes
> offset = 0
> iters = 10000000
> warmupIters = 20000
> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
> TestSHA runtime = 33.507065719 seconds
> TestSHA throughput = 305.60718404516876 MB/s
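The reported throughput can be reproduced from the other printed values. The sketch below assumes that throughput is computed as iters * msgSize / runtime and that 1 MB means 10^6 bytes; the TestSHA source is not shown in this thread, so the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from TestSHA: total bytes hashed divided by
// elapsed seconds, expressed in MB/s with 1 MB assumed to be 10^6 bytes.
double throughput_mb_per_s(std::uint64_t iters, std::uint64_t msg_size_bytes,
                           double runtime_seconds) {
  assert(runtime_seconds > 0.0);
  const double total_bytes =
      static_cast<double>(iters) * static_cast<double>(msg_size_bytes);
  return total_bytes / runtime_seconds / 1e6;
}
```

With iters = 10000000, msgSize = 1024 and runtime = 33.507065719 this gives roughly 305.6 MB/s, which agrees with the figure printed above.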
>
>
> jcivlin at JCIVLIN-DESK /cygdrive/C/Java/hotspot/test/civlin/TestSHA/TestSHA/dist
> $
>
> I'll take a look at what is wrong with jtreg.
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Thursday, April 21, 2016 12:31 PM
> To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no supports_sha() available)
>
> Good. But testing found that the jtreg SHA tests failed because they don't expect SHA2 to be supported:
>
> "Expected message not found: 'SHA instructions are not available on this CPU'"
>
> hotspot/test/compiler/intrinsics/sha/
>
> Do you know how to run jtreg tests? Maybe Sandhya or Michael can help.
>
> Thanks,
> Vladimir
>
> On 4/21/16 11:15 AM, Civlin, Jan wrote:
>> Vladimir,
>>
>> I corrected the assertion guards in the added instructions, and also the guard for the sha-avx2 function itself.
>> Please look at
>>
>> http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.02/
>>
>> Thank you,
>>
>> J
>>
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Wednesday, April 20, 2016 3:51 PM
>> To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler
>> <hotspot-compiler-dev at openjdk.java.net>
>> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
>> supports_sha() available)
>>
>> Testing is continuing, but it has already found the following problem when running tests with -XX:UseSSE=2:
>>
>> #  Internal Error (/opt/jprt/T/P1/185544.vkozlov/s/hotspot/src/cpu/x86/vm/assembler_x86.cpp:3693), pid=52652, tid=3587
>> #  Error: assert(VM_Version::supports_ssse3()) failed
>>
>> V  [libjvm.dylib+0x4193d7]  report_vm_error(char const*, int, char const*, char const*, ...)+0xcd
>> V  [libjvm.dylib+0x1eedd2]  Assembler::vpshufb(XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, int)+0x4e
>> V  [libjvm.dylib+0x87c237]  MacroAssembler::sha256_AVX2(XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, XMMRegisterImpl*, RegisterImpl*, RegisterImpl*, RegisterImpl*, RegisterImpl*, RegisterImpl*, bool, XMMRegisterImpl*)+0x1297
>> V  [libjvm.dylib+0xa4dc47]  StubGenerator::generate_sha256_implCompress(bool, char const*)+0x27b
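The trace shows the AVX2 stub being generated even though the SSSE3 check asserted inside vpshufb fails under -XX:UseSSE=2. A minimal sketch of the kind of guard this calls for, assuming the stub selection should require every feature that its instructions assert. The struct, enum and function names are hypothetical, and this is not the condition used in the webrev.

```cpp
#include <cassert>

// Hypothetical feature flags as seen after command-line options are applied.
struct CpuFeatures {
  bool sha;    // dedicated SHA instructions
  bool avx2;   // AVX2
  bool ssse3;  // SSSE3 (asserted by vpshufb in the trace above)
};

enum Sha256Stub { STUB_NONE, STUB_SHA_EXTENSIONS, STUB_AVX2 };

// Pick a stub only if all features its instructions assert are reported.
Sha256Stub choose_sha256_stub(const CpuFeatures& f) {
  Sha256Stub stub = STUB_NONE;
  if (f.sha) {
    stub = STUB_SHA_EXTENSIONS;
  } else if (f.avx2 && f.ssse3) {
    stub = STUB_AVX2;
  }
  // The AVX2 stub must never be chosen while SSSE3 is reported as absent.
  assert(stub != STUB_AVX2 || f.ssse3);
  return stub;
}
```

Under this sketch, a configuration that reports AVX2 but not SSSE3 falls back to no intrinsic instead of hitting the assertion during stub generation.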
>>
>>
>> Vladimir
>>
>> On 4/20/16 1:13 PM, Civlin, Jan wrote:
>>> Thank you, Vladimir.
>>>
>>> I guess it was a warning.
>>> I usually keep a comma after the last enumerator so that I do not need to change existing lines when I add new ones.
>>>
>>>
>>> Section 6.7.2.2 of C99 lists the syntax as:
>>>
>>> enum-specifier:
>>>        enum identifier(opt) { enumerator-list }
>>>        enum identifier(opt) { enumerator-list , }
>>>        enum identifier
>>> enumerator-list:
>>>        enumerator
>>>        enumerator-list , enumerator
>>> enumerator:
>>>        enumeration-constant
>>>        enumeration-constant = constant-expression
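As an illustration of the point under discussion: the C99 grammar above accepts a trailing comma, and C++ accepts it from C++11 on, while C++98/03 compilers may reject it or warn about it. A minimal sketch that stays portable by omitting the comma; the enumerator names and values are hypothetical and are not those of the patch.

```cpp
#include <cassert>

// Hypothetical frame layout; names and values are illustrative only.
enum FrameLayout {
  XFER_OFFSET = 0,
  XFER_SIZE   = 32,
  RSP_OFFSET  = XFER_OFFSET + XFER_SIZE,
  RSP_SIZE    = 8,
  STACK_SIZE  = RSP_OFFSET + RSP_SIZE  // no trailing comma: valid in C++98/03 too
};

// Returns true when the byte range [offset, offset + size) lies inside the frame.
bool fits_in_frame(int offset, int size) {
  assert(offset >= 0 && size >= 0);
  return offset + size <= STACK_SIZE;
}
```

The cost of the portable form is the one mentioned in the message: adding an enumerator after STACK_SIZE means touching the existing last line as well.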
>>>
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Wednesday, April 20, 2016 1:04 PM
>>> To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler
>>> <hotspot-compiler-dev at openjdk.java.net>
>>> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
>>> supports_sha() available)
>>>
>>> One thing that was caught during the build is the ',' on the last line of the enum:
>>>
>>> +  STACK_SIZE = _RSP      + _RSP_SIZE,
>>> +};
>>>
>>> The compiler complains about it, so I removed it in my local repo.
>>>
>>> Vladimir
>>>
>>> On 4/20/16 12:07 PM, Civlin, Jan wrote:
>>>> Thank you!
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Wednesday, April 20, 2016 11:38 AM
>>>> To: Civlin, Jan <jan.civlin at intel.com>; hotspot compiler
>>>> <hotspot-compiler-dev at openjdk.java.net>
>>>> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
>>>> supports_sha() available)
>>>>
>>>> Looks good to me. I submitted testing on all platforms before integrating.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 4/20/16 3:11 AM, Civlin, Jan wrote:
>>>>> Vladimir,
>>>>>
>>>>> Please look at the updated patch at
>>>>> http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.01/
>>>>>
>>>>> I removed the definitions of unused [v]movdqa(), vpsrldq(), vpslldq().
>>>>>
>>>>> k256_W is actually a table twice the size of k256: each line of k256 is repeated twice. As you suggested, I changed the code to generate k256_W from k256.
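A minimal sketch of deriving the doubled table from the original one, assuming that a "line" of k256 is four 32-bit constants (128 bits). The function name is hypothetical and this is not the code from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: one "line" of the table is four 32-bit constants (128 bits).
const std::size_t kWordsPerLine = 4;

// Builds a table twice the size of k256 in which every line appears twice
// in a row: line0, line0, line1, line1, ...
std::vector<std::uint32_t> make_k256_w(const std::vector<std::uint32_t>& k256) {
  assert(k256.size() % kWordsPerLine == 0);
  std::vector<std::uint32_t> k256_w;
  k256_w.reserve(2 * k256.size());
  for (std::size_t line = 0; line < k256.size(); line += kWordsPerLine) {
    for (int copy = 0; copy < 2; ++copy) {
      k256_w.insert(k256_w.end(),
                    k256.begin() + line,
                    k256.begin() + line + kWordsPerLine);
    }
  }
  return k256_w;
}
```

Generating the table this way keeps a single source of truth for the constants, so the two tables cannot drift apart.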
>>>>>
>>>>> The patch was tested in three configurations (slowdebug, release and fastdebug) on 64-bit Windows and Linux.
>>>>>
>>>>> Thank you,
>>>>>
>>>>> J
>>>>>
>>>>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-release/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>>>>> provider = SUN
>>>>> algorithm = SHA-256
>>>>> msgSize = 1024 bytes
>>>>> offset = 0
>>>>> iters = 10000000
>>>>> warmupIters = 20000
>>>>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>>>>> TestSHA runtime = 28.756324129 seconds
>>>>> TestSHA throughput = 356.09558280340946 MB/s
>>>>>
>>>>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>>>>> provider = SUN
>>>>> algorithm = SHA-256
>>>>> msgSize = 1024 bytes
>>>>> offset = 0
>>>>> iters = 10000000
>>>>> warmupIters = 20000
>>>>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>>>>> TestSHA runtime = 28.912701124 seconds
>>>>> TestSHA throughput = 354.1696071938408 MB/s
>>>>>
>>>>> [jcivlin at HSW-EP02 TestSHA]$ ../../sha-041116/build/linux-x86_64-normal-server-slowdebug/jdk/bin/java -Xbatch -XX:+UseSHA -XX:+UseSHA256Intrinsics -XX:+ShowMessageBoxOnError -Dalgorithm=SHA-256 -jar TestSHA.jar 10000000
>>>>> provider = SUN
>>>>> algorithm = SHA-256
>>>>> msgSize = 1024 bytes
>>>>> offset = 0
>>>>> iters = 10000000
>>>>> warmupIters = 20000
>>>>> hash [32]: 78 5b 07 51 fc 2c 53 dc 14 a4 ce 3d 80 0e 69 ef 9c e1 00 9e b3 27 cc f4 58 af e0 9c 24 2c 26 c9
>>>>> TestSHA runtime = 29.339789962 seconds
>>>>> TestSHA throughput = 349.01408678325697 MB/s
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>> Sent: Monday, April 18, 2016 5:09 PM
>>>>> To: Civlin, Jan; hotspot compiler
>>>>> Subject: Re: RFR: 8154495: SHA256 AVX2 intrinsic (when no
>>>>> supports_sha() available)
>>>>>
>>>>> Hi Jan,
>>>>>
>>>>> The patch was generated on Windows and has ^M at the end of lines, so I can't apply it to our sources.
>>>>>
>>>>> I don't see any use of the new [v]movdqa(), vpsrldq(), vpslldq() instructions.
>>>>>
>>>>> Please move the new code in macroAssembler_x86_sha.cpp to the end of the file.
>>>>>
>>>>> _k256_W[] is the same as _k256[] with each group of 4 values repeated. I would suggest generating it dynamically in stubGenerator_x86_64.cpp based on _k256:
>>>>>
>>>>> StubRoutines::x86::_k256_W_adr = generate_k256_W();
>>>>>
>>>>> What testing was done? Did you run with a fastdebug build? I am concerned about the size of the new stub and whether the current code_size2 is enough.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 4/18/16 2:44 PM, Civlin, Jan wrote:
>>>>>> == Correction in the subject line ===
>>>>>>
>>>>>> We would like to contribute the SHA256 AVX2 intrinsic.
>>>>>>
>>>>>> This intrinsic is for the x86 AVX2 architecture when supports_sha() is not available. It is L64 code only.
>>>>>>
>>>>>> The patch delivers a 2x performance gain in both latency and throughput, each measured on an average message.
>>>>>>
>>>>>> Contributor: Jan Civlin.
>>>>>>
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8154495
>>>>>> webrev: http://cr.openjdk.java.net/~vdeshpande/8154495/webrev.00/
>>>>>>

