RFR(S): 8209544: AES encrypt performance regression in jdk11b11
Dmitry Chuyko
dmitry.chuyko at bell-sw.com
Wed Sep 5 17:11:44 UTC 2018
On 09/05/2018 07:24 PM, Dmitry Chuyko wrote:
> On 09/05/2018 07:00 PM, Vladimir Kozlov wrote:
>> Hi Dmitry,
>>
>> What are the (* bytes) values? Is it bytecode size? Why is it different?
> It is the distance between the first and last captured event addresses
> in a particular hot region.
> Perf attributes one more instruction (2 instrs down) in the 132-byte
> case; it is just a comparison with 52 (0.37%). The code is the same, so
> this doesn't look too suspicious to me.
Or it does :-) That may be a branch / branch miss inside the stub. And
then we may see both the extra instructions and the branch itself
attributed. The extra part of region 1 is:
__ cmpw(keylen, 44);
__ br(Assembler::EQ, L_doLast);
__ aese(v0, v1);
__ aesmc(v0, v0);
__ aese(v0, v2);
__ aesmc(v0, v0);
__ ld1(v1, v2, __ T16B, __ post(key, 32));
__ rev32(v1, __ T16B, v1);
__ rev32(v2, __ T16B, v2);
__ cmpw(keylen, 52);
__ br(Assembler::EQ, L_doLast);
Region 2 is what happens in L_doLast.
-prof perfnorm shows 7-14% more branch misses.
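
To make the two key-length checks in the fragment above concrete, here is a
minimal C++ sketch (not HotSpot code; the function names are made up for
illustration). It assumes keylen is the expanded key length in 32-bit words,
i.e. 4 * (rounds + 1), so 44, 52 and 60 correspond to AES-128, AES-192 and
AES-256. It also assumes that a 60-word key falls through both checks into
L_doLast, which the fragment does not show.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper, not part of HotSpot.
// Maps the expanded key length (in 32-bit words) to the AES round count,
// using keylen == 4 * (rounds + 1).
inline int aes_rounds_for_keylen(int keylen) {
    switch (keylen) {
        case 44: return 10;  // AES-128
        case 52: return 12;  // AES-192
        case 60: return 14;  // AES-256
        default: throw std::invalid_argument("unexpected expanded key length");
    }
}

// Hypothetical helper, not part of HotSpot.
// Counts the cmpw/br pairs from the quoted fragment that execute before
// control reaches L_doLast. Assumption: keylen 60 falls through both checks.
inline int key_length_checks(int keylen) {
    assert(keylen == 44 || keylen == 52 || keylen == 60);
    // keylen == 44: the first branch is taken, so only one check runs.
    // keylen == 52 or 60: the first branch is not taken, so the second runs too.
    return keylen == 44 ? 1 : 2;
}
```

Under these assumptions a 128-bit key executes only the first check and takes
its branch, while 192- and 256-bit keys execute both checks.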
> But the different percentages for the stub parts do. Note that the
> percentage distribution after inlining looks the same, e.g.
>
> ....[Hottest Methods (after inlining)]..............................................................
> 83.67% runtime stub StubRoutines::aescrypt_encryptBlock
> 7.69% c2, level 4 com.sun.crypto.provider.CipherCore::doFinal, version 868
> 4.34% c2, level 4 org.openjdk.bench.javax.crypto.small.generated.AESBench_encrypt_jmhTest::encrypt_thrpt_jmhStub, version 889
>
> and
>
> 84.03% runtime stub StubRoutines::aescrypt_encryptBlock
> 7.85% c2, level 4 com.sun.crypto.provider.CipherCore::doFinal, version 860
> 4.22% c2, level 4 org.openjdk.bench.javax.crypto.small.generated.AESBench_encrypt_jmhTest::encrypt_thrpt_jmhStub, version 880
>
> -Dmitry
>
>>
>> Thanks,
>> Vladimir
>>
>> On 9/5/18 8:50 AM, Dmitry Chuyko wrote:
>>> I made a few runs on ThunderX2 (aarch64). It is funny, but I see an
>>> almost reverse difference in small.AESBench.encrypt: a ~4% regression
>>> for both -XX:-UseSwitchProfiling and the patched version against the
>>> current code. No difference for full.AESBench.encrypt.
>>>
>>> Stub code is the same and profiles differ slightly:
>>>
>>> Mainline
>>> 53.91% runtime stub StubRoutines::aescrypt_encryptBlock (128 bytes)
>>> 29.76% runtime stub StubRoutines::aescrypt_encryptBlock (40 bytes)
>>> 7.64% c2, level 4 com.sun.crypto.provider.CipherCore::doFinal, version 868 (356 bytes)
>>>
>>> -XX:+UnlockExperimentalVMOptions -XX:-UseSwitchProfiling
>>> 57.08% runtime stub StubRoutines::aescrypt_encryptBlock (132 bytes)
>>> 26.95% runtime stub StubRoutines::aescrypt_encryptBlock (40 bytes)
>>> 7.85% c2, level 4 com.sun.crypto.provider.CipherCore::doFinal, version 860 (384 bytes)
>>>
>>> Patched
>>> 58.15% runtime stub StubRoutines::aescrypt_encryptBlock (132 bytes)
>>> 26.44% runtime stub StubRoutines::aescrypt_encryptBlock (40 bytes)
>>> 6.67% c2, level 4 com.sun.crypto.provider.CipherCore::doFinal, version 866 (128 bytes)
>>>
>>> -Dmitry
>>>
>>> On 09/05/2018 11:05 AM, Roland Westrelin wrote:
>>>> Thanks for the review. Anyone else?
>>>>
>>>> Roland.
>>>
>