RFR 8222717 [lworld] Calling convention - repair C1 stack
Tobias Hartmann
tobias.hartmann at oracle.com
Wed Apr 24 13:21:06 UTC 2019
Hi Ioi,
On 23.04.19 03:02, Ioi Lam wrote:
> Here's an example of the code that I am generating for JDK-8222717 [lworld] Calling
> convention - repair C1 stack [1]
>
> The C2 stack repair code relies on the reserved slots in the VEP calling convention
> (see [2], page 14) to preserve the caller's return address.
>
> However, I haven't quite figured out how to do the same thing for
> the VVEP calling convention (as doing so would also recursively affect the VEP convention).
> So for now, I have decided to take a simpler approach for C1 by directly manipulating the
> return address on the stack. See lines 160-175 in the following dump.
Yes, that's reasonable.
> Actually, I am not quite sure how the C2 code uses the RA pushed by line 31, but it
> turns out to be very handy for C1 :-)
C2 currently does not use it; it just pushes it there for consistency (in case someone looks at the
frame). I'm planning to revisit and benchmark the implementation once everything is done, and maybe
get rid of the reserved-entry complexity by restoring the RA in a way similar to what you are doing
for C1 (this requires two additional mov instructions at the end).
> Here's the webrev:
>
> http://cr.openjdk.java.net/~iklam/valhalla/8222717-c1-stack-repair.v01/
>
> With C1, the frequency of stack extension is much lower (only when you have scalarized
> floating-point fields), so even though the code is not as efficient as the C2 stack
> repair code, maybe it's still OK?
>
> What do you think?
Yes, that looks good to me.
Thanks,
Tobias
>
> public class Foo {
>     static value class U {
>         // three float fields, matching parm0..parm2 (xmm0..xmm2) in the dump below
>         float a = 0, b = 0, c = 0;
>     }
>     // C1 extends stack (1 extra stack word)
>     static int test(U u1, int a1, int a2, int a3, int a4, int a5, int a6) {
>         return a1 + a2 + a6;
>     }
>     public static void main(String args[]) {
>         U u = new U();
>         System.out.println("Hello: " + test(u, 1, 2, 3, 4, 5, 6));
>     }
> }
>
> ----------------------------------------------------------------------
>
> Foo.test(QFoo$U;IIIIII)I [0x00007fc7a0e361e0, 0x00007fc7a0e36398] 440 bytes
> [Disassembling for mach='i386:x86-64']
> # {method} {0x00007fc78b2f7808} 'test' '(QFoo$U;IIIIII)I' in 'Foo'
> [Entry Point]
> [Verified Entry Point]
> [Verified Value Entry Point (RO)]
> # parm0: xmm0 = float
> # parm1: xmm1 = float
> # parm2: xmm2 = float
> # parm3: rsi = int
> # parm4: rdx = int
> # parm5: rcx = int
> # parm6: r8 = int
> # parm7: r9 = int
> # parm8: rdi = int
> # [sp+0x40] (sp of caller)
> ;; block B1 [0, 0]
>
> 0: push %rbp
> 1: sub $0x30,%rsp
> 5: mov $0x7fc78b2f7808,%rbx
> 15: callq 0x00007fc7a09d8ac0 ; {runtime_call buffer_value_args Runtime1 stub}
> 20: pop %rbp
>
> // extend stack
> 21: add $0x30,%rsp
> 25: pop %r13
> 27: sub $0x10,%rsp
> 31: push %r13 ; << RA saved by stack extension code
>
> 33: mov %rdi,0x8(%rsp)
> 38: mov %r9,%rdi
> 41: mov %r8,%r9
> 44: mov %rcx,%r8
> 47: mov %rdx,%rcx
> 50: mov %rsi,%rdx
> 53: mov 0x10(%rax),%esi
> 56: vmovss %xmm0,0x10(%rsi)
> 61: vmovss %xmm1,0x14(%rsi)
> 66: vmovss %xmm2,0x18(%rsi)
>
> 71: mov %eax,-0x16000(%rsp)
>
> 78: push %rbp ; << now RA is just one word below saved rbp
> 79: sub $0x30,%rsp
> 83: movq $0x50,0x8(%rsp)
> 92: jmpq L_1 (149) v
>
> [Verified Value Entry Point]
> # parm0: rsi:rsi = 'java/lang/Object'
> # parm1: rdx = int
> # parm2: rcx = int
> # parm3: r8 = int
> # parm4: r9 = int
> # parm5: rdi = int
> # parm6: [sp+0x40] = int (sp of caller)
> 128: mov %eax,-0x16000(%rsp)
> 135: push %rbp ; << RA is one word below saved rbp
> 136: sub $0x30,%rsp
> 140: movq $0x40,0x8(%rsp)
>
> L_1
> 149: mov 0x40(%rsp),%eax
> ;; block B0 [0, 6]
>
> 153: add %ecx,%edx
> 155: add %eax,%edx
> 157: mov %rdx,%rax
>
> // stack repair
> 160: mov 0x38(%rsp),%r13 ; get saved RA
> 165: mov 0x30(%rsp),%rbp ; restore saved rbp
> 170: add 0x8(%rsp),%rsp
> 175: push %r13 ; push RA, so stack would look the
> ; same as @ line 135
> // stack repair - end
>
> 177: mov 0x128(%r15),%r10
> 184: test %eax,(%r10) ; {poll_return}
> 187: retq ; return to caller
>
>
> ----------------------------------------------------------------------
> [1] https://bugs.openjdk.java.net/browse/JDK-8222717
> [2] http://cr.openjdk.java.net/~thartmann/talks/2019-ValueType_Optimizations.pdf
>
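
For anyone who wants to check the offsets in the dump above, here is a minimal Java sketch that
replays only the stack-pointer arithmetic of the two entry paths and the shared stack repair
sequence. It is not HotSpot code: the class StackRepairSketch, its helpers, and the sample
addresses are hypothetical, and the method body is reduced to "clobber rbp". The numbers in the
comments are the offsets of the corresponding instructions in the dump.

```java
import java.util.HashMap;
import java.util.Map;

public class StackRepairSketch {
    // Simulated state: a sparse, byte-addressed stack of 8-byte words plus three registers.
    static final Map<Long, Long> mem = new HashMap<>();
    static long rsp, rbp, r13;

    static void push(long v) { rsp -= 8; mem.put(rsp, v); }
    static long pop() { long v = mem.get(rsp); rsp += 8; return v; }

    /**
     * Replays entry and stack repair. With extend == true this follows the
     * "Verified Value Entry Point (RO)" path, otherwise the "Verified Value Entry Point" path.
     * Returns { rsp just before retq, the word at that rsp, rbp }.
     */
    static long[] run(boolean extend, long entryRsp, long ra, long callerRbp) {
        mem.clear();
        rsp = entryRsp;
        rbp = callerRbp;
        mem.put(rsp, ra);              // the caller's call instruction left the RA here

        long increment = 0x40;         // RA + saved rbp + 0x30 bytes of frame
        if (extend) {
            r13 = pop();               // 25: pop  %r13
            rsp -= 0x10;               // 27: sub  $0x10,%rsp
            push(r13);                 // 31: push %r13   (RA copied below the new slots)
            increment = 0x50;          // frame plus the 0x10-byte extension
        }

        push(rbp);                     // 78 / 135: push %rbp
        rsp -= 0x30;                   // 79 / 136: sub  $0x30,%rsp
        mem.put(rsp + 0x8, increment); // 83 / 140: movq $0x50 or $0x40,0x8(%rsp)

        rbp = 0xDEADL;                 // stand-in for the method body clobbering rbp

        // stack repair
        r13 = mem.get(rsp + 0x38);     // 160: get saved RA
        rbp = mem.get(rsp + 0x30);     // 165: restore saved rbp
        rsp += mem.get(rsp + 0x8);     // 170: add 0x8(%rsp),%rsp
        push(r13);                     // 175: push RA back where retq expects it
        return new long[] { rsp, mem.get(rsp), rbp };
    }

    public static void main(String[] args) {
        long entry = 0x7000L, ra = 0x1234L, callerRbp = 0x5678L;
        for (boolean extend : new boolean[] { true, false }) {
            long[] r = run(extend, entry, ra, callerRbp);
            System.out.printf("extend=%b rsp=0x%x [rsp]=0x%x rbp=0x%x%n", extend, r[0], r[1], r[2]);
        }
    }
}
```

On both paths the sketch ends with rsp back at its entry value, the RA at the top of the stack,
and rbp restored. That is the property the value stored at 0x8(%rsp) (0x40 or 0x50) is there to
provide.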