RFR 8222717 [lworld] Calling convention - repair C1 stack
Ioi Lam
ioi.lam at oracle.com
Tue Apr 23 01:02:25 UTC 2019
Hi Tobias,
Here's an example of the code that I am generating for JDK-8222717,
"[lworld] Calling convention - repair C1 stack" [1].
The C2 stack repair code relies on the reserved slots in the VEP calling
convention (see [2], page 14) to preserve the caller's return address.
However, I haven't quite figured out how to do the same thing for
the VVEP calling convention (as doing so will also recursively affect
the VEP convention).
So for now, I have decided on a simpler approach for C1: directly
manipulating the return address on the stack. See line 165 in the
following dump.
Actually, I am not quite sure how the C2 code uses the RA pushed by
line 31, but it turns out to be very handy for C1 :-)
Here's the webrev:
http://cr.openjdk.java.net/~iklam/valhalla/8222717-c1-stack-repair.v01/
With C1, the frequency of stack extension is much lower (only when you
have scalarized floating-point fields), so even though the code is not
as efficient as the C2 stack repair code, maybe it's still OK?
What do you think?
Thanks
- Ioi
public class Foo {
    static value class U {
        // three float fields, matching parm0-parm2 in the dump below
        float a = 0, b = 0, c = 0;
    }

    // C1 extends stack (1 extra stack word)
    static int test(U u1, int a1, int a2, int a3, int a4, int a5, int a6) {
        return a1 + a2 + a6;
    }

    public static void main(String args[]) {
        U u = new U();
        System.out.println("Hello: " + test(u, 1, 2, 3, 4, 5, 6));
    }
}
----------------------------------------------------------------------
Foo.test(QFoo$U;IIIIII)I [0x00007fc7a0e361e0, 0x00007fc7a0e36398] 440 bytes
[Disassembling for mach='i386:x86-64']
# {method} {0x00007fc78b2f7808} 'test' '(QFoo$U;IIIIII)I' in 'Foo'
[Entry Point]
[Verified Entry Point]
[Verified Value Entry Point (RO)]
# parm0: xmm0 = float
# parm1: xmm1 = float
# parm2: xmm2 = float
# parm3: rsi = int
# parm4: rdx = int
# parm5: rcx = int
# parm6: r8 = int
# parm7: r9 = int
# parm8: rdi = int
# [sp+0x40] (sp of caller)
;; block B1 [0, 0]
0: push %rbp
1: sub $0x30,%rsp
5: mov $0x7fc78b2f7808,%rbx
15: callq 0x00007fc7a09d8ac0 ; {runtime_call buffer_value_args Runtime1 stub}
20: pop %rbp
// extend stack
21: add $0x30,%rsp
25: pop %r13
27: sub $0x10,%rsp
31: push %r13 ; << RA saved by stack extension code
33: mov %rdi,0x8(%rsp)
38: mov %r9,%rdi
41: mov %r8,%r9
44: mov %rcx,%r8
47: mov %rdx,%rcx
50: mov %rsi,%rdx
53: mov 0x10(%rax),%esi
56: vmovss %xmm0,0x10(%rsi)
61: vmovss %xmm1,0x14(%rsi)
66: vmovss %xmm2,0x18(%rsi)
71: mov %eax,-0x16000(%rsp)
78: push %rbp ; << now RA is just one word below saved rbp
79: sub $0x30,%rsp
83: movq $0x50,0x8(%rsp)
92: jmpq L_1 (149) v
[Verified Value Entry Point]
# parm0: rsi:rsi = 'java/lang/Object'
# parm1: rdx = int
# parm2: rcx = int
# parm3: r8 = int
# parm4: r9 = int
# parm5: rdi = int
# parm6: [sp+0x40] = int (sp of caller)
128: mov %eax,-0x16000(%rsp)
135: push %rbp ; << RA is one word below saved rbp
136: sub $0x30,%rsp
140: movq $0x40,0x8(%rsp)
L_1
149: mov 0x40(%rsp),%eax
;; block B0 [0, 6]
153: add %ecx,%edx
155: add %eax,%edx
157: mov %rdx,%rax
// stack repair
160: mov 0x38(%rsp),%r13 ; get saved RA
165: mov 0x30(%rsp),%rbp ; restore saved rbp
170: add 0x8(%rsp),%rsp
175: push %r13 ; push RA, so stack would look the
; same as @ line 135
// stack repair - end
177: mov 0x128(%r15),%r10
184: test %eax,(%r10) ; {poll_return}
187: retq ; return to caller
----------------------------------------------------------------------
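To make the stack-pointer arithmetic in the dump easier to check, here is a
minimal sketch that replays the sp adjustments of both entry points. The class
and method names are hypothetical and not part of the webrev; the constants
(0x30 frame, 0x10 extension, 0x40/0x50 increments) are read off the dump, and
the model starts after the temporary frame around the buffer_value_args call
has been torn down (i.e. at line 25).

```java
public class StackRepairSketch {
    static final long WORD = 8;         // one stack word on x86-64
    static final long FRAME = 0x30;     // "sub $0x30,%rsp" (lines 79 and 136)
    static final long EXTENSION = 0x10; // "sub $0x10,%rsp" (line 27)

    // sp after the prologue, i.e. at label L_1 (line 149).
    // callerSp is the caller's sp before the call pushed the return address.
    static long spAfterPrologue(long callerSp, boolean extended) {
        long sp = callerSp - WORD;      // on entry, sp points at the RA
        if (extended) {
            sp += WORD;                 // 25: pop %r13 (RA)
            sp -= EXTENSION;            // 27: sub $0x10,%rsp
            sp -= WORD;                 // 31: push %r13 (RA copy)
        }
        sp -= WORD;                     // 78 / 135: push %rbp
        sp -= FRAME;                    // 79 / 136: sub $0x30,%rsp
        return sp;
    }

    // Value stored at 0x8(%rsp) by line 83 (extended) or line 140 (not extended).
    static long savedIncrement(boolean extended) {
        return extended ? 0x50 : 0x40;
    }

    // Address read by "mov 0x40(%rsp),%eax" at line 149 (argument a6).
    static long a6Slot(long callerSp, boolean extended) {
        return spAfterPrologue(callerSp, extended) + 0x40;
    }

    // sp just before retq (line 187), after the stack repair sequence.
    static long spBeforeRet(long callerSp, boolean extended) {
        long sp = spAfterPrologue(callerSp, extended);
        sp += savedIncrement(extended); // 170: add 0x8(%rsp),%rsp
        sp -= WORD;                     // 175: push %r13 (RA)
        return sp;
    }

    public static void main(String[] args) {
        long callerSp = 0x10000;        // arbitrary illustrative address
        for (boolean extended : new boolean[] { false, true }) {
            System.out.println("extended=" + extended
                + " a6 slot=0x" + Long.toHexString(a6Slot(callerSp, extended))
                + " sp before ret=0x" + Long.toHexString(spBeforeRet(callerSp, extended)));
        }
    }
}
```

In this model both entry points reach retq with sp one word below the caller's
sp, so the caller sees the same stack either way. The 0x40(%rsp) load reads the
caller's first stack argument on the non-extended path and the slot written by
line 33 on the extended path.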
[1] https://bugs.openjdk.java.net/browse/JDK-8222717
[2] http://cr.openjdk.java.net/~thartmann/talks/2019-ValueType_Optimizations.pdf