RFR(S): 8209544: AES encrypt performance regression in jdk11b11

Dmitry Chuyko dmitry.chuyko at bell-sw.com
Wed Sep 5 17:11:44 UTC 2018


On 09/05/2018 07:24 PM, Dmitry Chuyko wrote:
> On 09/05/2018 07:00 PM, Vladimir Kozlov wrote:
>> Hi Dmitry,
>>
>> What are the (* bytes) values? Is it bytecode size? Why is it different?
> It is the distance between the first and last captured event addresses
> in a particular hot region.
> Perf attributes one more instruction (2 instrs down) in the 132 bytes
> case; it is just a comparison with 52 (0.37%). The code is the same, so
> this doesn't look too suspicious to me.
Or it does :-) That may be a branch / branch miss inside the stub. Then
we may see extra instructions attributed, as well as the branch itself.
The extra part of region 1 is

     __ cmpw(keylen, 44);
     __ br(Assembler::EQ, L_doLast);

     __ aese(v0, v1);
     __ aesmc(v0, v0);
     __ aese(v0, v2);
     __ aesmc(v0, v0);

     __ ld1(v1, v2, __ T16B, __ post(key, 32));
     __ rev32(v1, __ T16B, v1);
     __ rev32(v2, __ T16B, v2);

     __ cmpw(keylen, 52);
     __ br(Assembler::EQ, L_doLast);


Region 2 is what happens in L_doLast.
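
To make the key-length checks in the excerpt concrete, here is a minimal Java sketch of how the two compares play out per key size. It assumes keylen is the length of the expanded key in 32-bit ints (44, 52 or 60 for AES-128/192/256). The class and method names are hypothetical, and it models only the quoted fragment, not the whole stub.

```java
public final class AesKeyLenSketch {

    // Expanded key length in 32-bit ints is 4 * (rounds + 1),
    // so 44 -> 10 rounds, 52 -> 12 rounds, 60 -> 14 rounds.
    static int rounds(int keylenInts) {
        switch (keylenInts) {
            case 44: return 10; // AES-128
            case 52: return 12; // AES-192
            case 60: return 14; // AES-256
            default:
                throw new IllegalArgumentException(
                        "unexpected expanded key length: " + keylenInts);
        }
    }

    // Number of cmpw/br pairs from the quoted fragment that execute
    // for a given key length (hypothetical helper for illustration).
    static int comparesInExcerpt(int keylenInts) {
        rounds(keylenInts);          // validate the input
        int compares = 1;            // cmpw(keylen, 44)
        if (keylenInts == 44) {
            return compares;         // branch to L_doLast taken here
        }
        compares++;                  // cmpw(keylen, 52)
        return compares;             // 52 takes the branch, 60 falls through
    }

    public static void main(String[] args) {
        for (int keylen : new int[] {44, 52, 60}) {
            System.out.println("keylen=" + keylen
                    + " rounds=" + rounds(keylen)
                    + " compares=" + comparesInExcerpt(keylen));
        }
    }
}
```

Under this model only keys longer than 128 bits reach the comparison with 52, which is the instruction perf attributes in the 132 bytes case.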

-prof perfnorm shows 7-14% more branch misses.
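
For reference, the Java-level path behind these profiles (CipherCore::doFinal ending up in the AES block stub when intrinsics are used) can be driven with a few lines of plain JCE code. This is only a sketch: the transformation string, key size and 16-byte payload are assumptions for illustration and not the AESBench settings, and the class name is hypothetical.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public final class SmallAesSketch {

    // Assumed transformation; the benchmark may use a different mode/padding.
    private static final String TRANSFORMATION = "AES/ECB/PKCS5Padding";

    static byte[] run(int mode, byte[] key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(mode, new SecretKeySpec(key, "AES"));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            // Wrapped so callers do not need a throws clause.
            throw new IllegalStateException(e);
        }
    }

    static byte[] encrypt(byte[] key, byte[] data) {
        return run(Cipher.ENCRYPT_MODE, key, data);
    }

    static byte[] decrypt(byte[] key, byte[] data) {
        return run(Cipher.DECRYPT_MODE, key, data);
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];   // all-zero 128-bit key, illustration only
        byte[] data = new byte[16];  // one AES block of input
        byte[] encrypted = encrypt(key, data);
        byte[] decrypted = decrypt(key, encrypted);
        System.out.println("ciphertext bytes: " + encrypted.length);
        System.out.println("round trip ok: " + Arrays.equals(data, decrypted));
    }
}
```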

> But the different percentages for the stub parts do. Note that the
> region percentage distribution after inlining looks the same, e.g.
>
> ....[Hottest Methods (after inlining)]..............................................................
>  83.67%        runtime stub  StubRoutines::aescrypt_encryptBlock
>   7.69%         c2, level 4  com.sun.crypto.provider.CipherCore::doFinal, version 868
>   4.34%         c2, level 4  org.openjdk.bench.javax.crypto.small.generated.AESBench_encrypt_jmhTest::encrypt_thrpt_jmhStub, version 889
>
> and
>
>  84.03%        runtime stub  StubRoutines::aescrypt_encryptBlock
>   7.85%         c2, level 4  com.sun.crypto.provider.CipherCore::doFinal, version 860
>   4.22%         c2, level 4  org.openjdk.bench.javax.crypto.small.generated.AESBench_encrypt_jmhTest::encrypt_thrpt_jmhStub, version 880
>
> -Dmitry
>
>>
>> Thanks,
>> Vladimir
>>
>> On 9/5/18 8:50 AM, Dmitry Chuyko wrote:
>>> I made a few runs on ThunderX2 (aarch64). Funnily enough, I see
>>> almost the reverse difference in small.AESBench.encrypt: a ~4%
>>> regression for both -XX:-UseSwitchProfiling and the patched version
>>> against current code. No difference for full.AESBench.encrypt.
>>>
>>> Stub code is the same and profiles differ slightly:
>>>
>>> Mainline
>>>   53.91%        runtime stub  StubRoutines::aescrypt_encryptBlock (128 bytes)
>>>   29.76%        runtime stub  StubRoutines::aescrypt_encryptBlock (40 bytes)
>>>    7.64%         c2, level 4  com.sun.crypto.provider.CipherCore::doFinal, version 868 (356 bytes)
>>>
>>> -XX:+UnlockExperimentalVMOptions -XX:-UseSwitchProfiling
>>>   57.08%        runtime stub  StubRoutines::aescrypt_encryptBlock (132 bytes)
>>>   26.95%        runtime stub  StubRoutines::aescrypt_encryptBlock (40 bytes)
>>>    7.85%         c2, level 4  com.sun.crypto.provider.CipherCore::doFinal, version 860 (384 bytes)
>>>
>>> Patched
>>>   58.15%        runtime stub  StubRoutines::aescrypt_encryptBlock (132 bytes)
>>>   26.44%        runtime stub  StubRoutines::aescrypt_encryptBlock (40 bytes)
>>>    6.67%         c2, level 4  com.sun.crypto.provider.CipherCore::doFinal, version 866 (128 bytes)
>>>
>>> -Dmitry
>>>
>>> On 09/05/2018 11:05 AM, Roland Westrelin wrote:
>>>> Thanks for the review. Anyone else?
>>>>
>>>> Roland.
>>>
>



More information about the hotspot-compiler-dev mailing list