Suboptimal code generation in C2?
Vitaly Davidovich
vitalyd at gmail.com
Wed Sep 23 17:49:10 UTC 2015
Hi,
Consider this simple method:
    private static int test(final int ops, int start) {
        for (int i = 0; i < ops; ++i) {
            start++;
        }
        return start;
    }
8u40 C2 nicely eliminates the loop, but performs a bit of a dance before
arriving at the correct return value (I'm ignoring
prologue/epilogue/safepoint poll):
# parm0: rsi = int
# parm1: rdx = int
# [sp+0x20] (sp of caller)
0x00007f395f591100: sub $0x18,%rsp
0x00007f395f591107: mov %rbp,0x10(%rsp) ;*synchronization entry
0x00007f395f59110c: test %esi,%esi
0x00007f395f59110e: jle 0x00007f395f59112f
0x00007f395f591110: mov $0x1,%r10d
0x00007f395f591116: cmp %esi,%r10d
0x00007f395f591119: jge 0x00007f395f591133 ;*if_icmplt
0x00007f395f59111b: dec %esi
0x00007f395f59111d: add %edx,%esi
0x00007f395f59111f: mov %esi,%eax
0x00007f395f591121: inc %eax ;*iload_1
0x00007f395f591123: add $0x10,%rsp
0x00007f395f591127: pop %rbp
0x00007f395f591128: test %eax,0x5e1eed2(%rip) # 0x00007f39653b0000 ; {poll_return}
0x00007f395f59112e: retq
0x00007f395f59112f: mov %edx,%eax
0x00007f395f591131: jmp 0x00007f395f591123
0x00007f395f591133: xor %esi,%esi
0x00007f395f591135: jmp 0x00007f395f59111d
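Read back into Java, the instruction sequence above appears to do the following. This is a hand transliteration for illustration only, not compiler output, and the names (`C2Shape`, `asCompiled`, `loop`) are made up for this sketch:

```java
// Hypothetical names; a hand transliteration of the listing above, not C2 output.
final class C2Shape {
    // The original loop, kept as the reference behaviour.
    static int loop(final int ops, int start) {
        for (int i = 0; i < ops; ++i) {
            start++;
        }
        return start;
    }

    // Mirrors the branches in the disassembly, one comment per instruction group.
    static int asCompiled(int ops, int start) {
        if (ops <= 0) {          // test %esi,%esi ; jle
            return start;        // mov %edx,%eax
        }
        int t;
        if (1 >= ops) {          // mov $0x1,%r10d ; cmp %esi,%r10d ; jge
            t = 0;               // xor %esi,%esi
        } else {
            t = ops - 1;         // dec %esi
        }
        t += start;              // add %edx,%esi
        return t + 1;            // mov %esi,%eax ; inc %eax
    }

    public static void main(String[] args) {
        for (int ops = -2; ops <= 5; ops++) {
            if (loop(ops, 10) != asCompiled(ops, 10)) {
                throw new AssertionError("mismatch at ops=" + ops);
            }
        }
        System.out.println("transliteration matches the loop for ops in [-2, 5]");
    }
}
```

Note that the compare against 1 does not change the result: when ops == 1, ops - 1 is already 0, so both paths feed the same value into the final add and inc.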
For comparison, here's GCC 5.2 -O3 (O2 is the same) output for effectively
the same code:
    int test(int ops, int start) {
        for (int i = 0; i < ops; ++i) {
            start++;
        }
        return start;
    }

test(int, int):
    leal (%rsi,%rdi), %edx
    movl %esi, %eax
    testl %edi, %edi
    cmovg %edx, %eax
    ret
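In Java terms, the GCC output amounts to an unconditional add followed by a select. A minimal sketch of that shape, with made-up names (`ClosedForm`, `closedForm`, `loop`), and with no claim that C2 would actually turn the ternary into a cmov:

```java
// Hypothetical names; illustrates the shape of the GCC output, nothing more.
final class ClosedForm {
    // The original loop, kept as the reference behaviour.
    static int loop(final int ops, int start) {
        for (int i = 0; i < ops; ++i) {
            start++;
        }
        return start;
    }

    // Add first, then pick between the sum and the untouched start value.
    static int closedForm(int ops, int start) {
        int sum = start + ops;          // leal (%rsi,%rdi), %edx
        return ops > 0 ? sum : start;   // testl %edi,%edi ; cmovg %edx,%eax
    }

    public static void main(String[] args) {
        int[] starts = {0, 10, -4, Integer.MAX_VALUE};
        for (int start : starts) {
            for (int ops = -2; ops <= 5; ops++) {
                if (loop(ops, start) != closedForm(ops, start)) {
                    throw new AssertionError("mismatch at ops=" + ops + ", start=" + start);
                }
            }
        }
        System.out.println("closed form matches the loop on the sampled inputs");
    }
}
```

Java int arithmetic wraps on overflow, so the single add agrees with the repeated increments even when start is near Integer.MAX_VALUE.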
Any ideas on why C2 doesn't generate roughly the same thing? Clang 3.7 and
ICC 13.0.1 are almost the same as the GCC output above.
Thanks