[aarch64-port-dev ] Code for frame setup is a bit ... odd?

Fri Nov 22 03:57:20 PST 2013

This is a simple routine:

  # {method} {0x00007fffea369050} 'rotateRight' '(JI)J' in 'java/lang/Long'
  # parm0:    c_rarg1:c_rarg1   = long
  # parm1:    c_rarg2   = int
  #           [sp+0x20]  (sp of caller)

  0x00007fffed2e35a0: nop
  0x00007fffed2e35a4: stp	xfp, xlr, [sp,#-16]!
  0x00007fffed2e35a8: sub	sp, sp, #0x10
  0x00007fffed2e35ac: notify	entry           ;*synchronization entry
                                                ; - java.lang.Long::rotateRight at -1 (line 1524)

  0x00007fffed2e35b0: ror	x0, x1, x2      ;*lor  ; - java.lang.Long::rotateRight at 7 (line 1524)

  0x00007fffed2e35b4: add	sp, sp, #0x10
  0x00007fffed2e35b8: ldp	xfp, xlr, [sp],#16
  0x00007fffed2e35bc: notify	reentry
  0x00007fffed2e35c0: adrp	xscratch1, 0x00007ffff7ffb000
                                                ;   {poll_return}
  0x00007fffed2e35c4: ldr	wzr, [xscratch1,#256]  ;   {poll_return}
  0x00007fffed2e35c8: ret
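
For reference, the Java being compiled here is essentially a one-liner. The
sketch below reproduces it as I understand the JDK source; the class name
RotateRightSketch and the main method are scaffolding added for illustration,
not part of the JDK. The lor at bci 7 in the listing corresponds to the | in
the return expression, which C2 has collapsed into a single ror.

```java
// Sketch only: mirrors what java.lang.Long.rotateRight is understood to do.
// RotateRightSketch and main() are hypothetical scaffolding.
class RotateRightSketch {
    // Shift right, then OR in the bits that fell off the low end.
    // Java masks long shift counts to their low 6 bits, so (i << -distance)
    // is a left shift by (64 - distance) mod 64.
    static long rotateRight(long i, int distance) {
        return (i >>> distance) | (i << -distance);
    }

    public static void main(String[] args) {
        long v = 0x0123456789abcdefL;
        // Cross-check against the real library method over a range of distances.
        for (int d = -70; d <= 70; d++) {
            if (rotateRight(v, d) != Long.rotateRight(v, d)) {
                throw new AssertionError("mismatch at distance " + d);
            }
        }
        System.out.println(Long.toHexString(rotateRight(v, 4))); // prints f0123456789abcde
    }
}
```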

Note that we're allocating an extra 16 bytes of stack frame (the sub sp, sp,
#0x10) that are never used.  I wonder what's going on here.  x86 seems to do
the same thing:

  # {method} {0x00007fc0b68fd8e0} 'rotateRight' '(JI)J' in 'java/lang/Long'
  # parm0:    rsi:rsi   = long
  # parm1:    rdx       = int
  #           [sp+0x20]  (sp of caller)

  0x00007fc0cd28f780: sub    $0x18,%rsp
  0x00007fc0cd28f787: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                ; - java.lang.Long::rotateRight at -1 (line 1524)

  0x00007fc0cd28f78c: mov    %rsi,%rax
  0x00007fc0cd28f78f: mov    %edx,%ecx
  0x00007fc0cd28f791: ror    %cl,%rax           ;*lor  ; - java.lang.Long::rotateRight at 7 (line 1524)

  0x00007fc0cd28f794: add    $0x10,%rsp
  0x00007fc0cd28f798: pop    %rbp
  0x00007fc0cd28f799: test   %eax,0xacb2861(%rip)        # 0x00007fc0d7f42000
                                                ;   {poll_return}
  0x00007fc0cd28f79f: retq

There isn't a stack overflow probe because we don't need one: the caller's
stack is big enough.  Maybe no-one cares, because simple leaf routines get
inlined eventually anyway.  Maybe C2 is making sure that the stack is always
32-byte aligned.
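
As a quick check on the sizes involved, the sketch below just adds up the
stack adjustments visible in the two listings. The class and method names are
made up for illustration, and the x86 total assumes the 8-byte return address
pushed by the call counts toward the frame. Both totals come to 0x20, matching
the "[sp+0x20]  (sp of caller)" line in each header. That shows the frame size
is a multiple of 32, not that sp itself is 32-byte aligned.

```java
// Sketch only: sums the stack adjustments seen in the disassembly above.
// FrameSizeSketch and its method names are hypothetical.
class FrameSizeSketch {
    // AArch64: stp xfp, xlr, [sp,#-16]! pushes 16 bytes,
    // then sub sp, sp, #0x10 reserves 16 more.
    static int aarch64FrameBytes() {
        int savedFpLr = 16;
        int extra = 0x10;
        return savedFpLr + extra;
    }

    // x86-64: the call pushed an 8-byte return address (assumed to count),
    // then sub $0x18,%rsp; rbp is stored inside that area at 0x10(%rsp).
    static int x86FrameBytes() {
        int returnAddress = 8;
        int subRsp = 0x18;
        return returnAddress + subRsp;
    }

    public static void main(String[] args) {
        System.out.println(aarch64FrameBytes()); // prints 32
        System.out.println(x86FrameBytes());     // prints 32
    }
}
```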

Andrew.