[11] RFR(M) 8201193: Use XMM/YMM for objects initialization
Vladimir Kozlov
vladimir.kozlov at oracle.com
Mon Jun 11 20:57:00 UTC 2018
http://cr.openjdk.java.net/~kvn/8201193/webrev.02/
https://bugs.openjdk.java.net/browse/JDK-8201193
The changes were contributed by Rohit from AMD.
For a newly allocated Java object on x86, C2 generates either a simple
loop of stores from a general (EAX) register for small objects
(< InitArrayShortSize) or a 'rep stosq' instruction for big ones.
'rep stosq' is not the best choice for initializing arrays because it
has setup latency, but it is only one instruction (very compact). Modern
Intel processors have an enhanced 'rep stosb', which addresses this
issue and shows very good performance.
AMD processors don't have ERMS (Enhanced REP MOVSB/STOSB) support. To
get good performance there, it is suggested to use wide SSE/AVX register
stores instead of 'rep stosq'. This shows better performance on AMD CPUs.
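
To make the choice between the code shapes described above easier to
follow, here is a rough model in plain Java. It is only a sketch and not
the code in the webrev: the class, enum and method names are made up for
illustration, the threshold value in main() is arbitrary, and the order
in which the two flags are checked is an assumption.

```java
// Illustrative model only; the real selection happens in the x86 code
// generator, not in Java code. All names here are hypothetical.
public class InitStrategySketch {

    enum Strategy { SHORT_LOOP, REP_STOSB, WIDE_STORES, REP_STOSQ }

    // sizeInBytes        - size of the memory to zero
    // initArrayShortSize - stands in for the InitArrayShortSize threshold
    // useFastStosb       - stands in for UseFastStosb (ERMS available)
    // useXmmForObjInit   - stands in for UseXMMForObjInit
    static Strategy choose(long sizeInBytes, long initArrayShortSize,
                           boolean useFastStosb, boolean useXmmForObjInit) {
        if (sizeInBytes < initArrayShortSize) {
            return Strategy.SHORT_LOOP;   // simple loop of register stores
        }
        if (useFastStosb) {
            return Strategy.REP_STOSB;    // enhanced 'rep stosb' (ERMS)
        }
        if (useXmmForObjInit) {
            return Strategy.WIDE_STORES;  // SSE/AVX wide register stores
        }
        return Strategy.REP_STOSQ;        // compact, but has setup latency
    }

    public static void main(String[] args) {
        long shortSize = 64; // arbitrary value for the example
        System.out.println(choose(32, shortSize, false, true));    // SHORT_LOOP
        System.out.println(choose(4096, shortSize, true, false));  // REP_STOSB
        System.out.println(choose(4096, shortSize, false, true));  // WIDE_STORES
        System.out.println(choose(4096, shortSize, false, false)); // REP_STOSQ
    }
}
```

With both flags off the model falls back to 'rep stosq', which matches
the behavior described above for big objects without ERMS.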
Tested with jdk-tier1, hs-tier1, hs-tier2, hs-precheckin-comp (-Xcomp),
and hs-graal on all x86 OSs, and also with the '-XX:-UseFastStosb
-XX:+UseXMMForObjInit' flag combination to force generation of the new
code.
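
For a quick local sanity check of the forced code path, something like
the following could be run with the flag combination above. It is a
hypothetical stand-alone program, not one of the tests listed; the class
name and the array lengths are chosen only for illustration. It checks
that freshly allocated arrays read as all zeros, then dirties them so
that reused memory would expose missing initialization.

```java
// Hypothetical sanity check; not part of the test suites named above.
// Possible invocation on an x86 JVM that has these flags:
//   java -XX:-UseFastStosb -XX:+UseXMMForObjInit ZeroInitCheck
public class ZeroInitCheck {

    static boolean allZero(long[] a) {
        for (long v : a) {
            if (v != 0L) {
                return false;
            }
        }
        return true;
    }

    // Allocates arrays of several lengths (small and big) repeatedly
    // and returns false if any new array contains a non-zero element.
    static boolean run(int iterations) {
        int[] lengths = {1, 4, 8, 16, 64, 512, 4096};
        for (int i = 0; i < iterations; i++) {
            for (int len : lengths) {
                long[] a = new long[len];
                if (!allZero(a)) {
                    return false;
                }
                // Dirty the array so that recycled heap memory is non-zero.
                a[len - 1] = -1L;
                a[0] = -1L;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Enough iterations to give the JIT a chance to compile run().
        boolean ok = run(20_000);
        System.out.println(ok ? "all arrays zero-initialized" : "FAILED");
    }
}
```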
--
Thanks,
Vladimir