RFR: 8277948: AArch64: Print the correct stack if -XX:+PreserveFramePointer when crash

Denghui Dong ddong at openjdk.java.net
Mon Nov 29 17:47:18 UTC 2021


Hi,

I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing.

The following steps can quick reproduce the problem:

1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc)

  index 39e99bdd5ed..4fc768e94aa 100644
  --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
  +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
  @@ -3558,6 +3558,7 @@ void TemplateTable::_new() {
       __ store_klass_gap(r0, zr);  // zero klass gap for compressed oops
       __ store_klass(r0, r4);      // store klass last

  +/**
       {
         SkipIfEqual skip(_masm, &DTraceAllocProbes, false);
         // Trigger dtrace event for fastpath
  @@ -3567,6 +3568,7 @@ void TemplateTable::_new() {
         __ pop(atos); // restore the return value

       }
  +*/
       __ b(done);
     }

  diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp
  index 19530b7c57c..15b0509da4c 100644
  --- a/src/hotspot/cpu/x86/templateTable_x86.cpp
  +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp
  @@ -4033,6 +4033,7 @@ void TemplateTable::_new() {
       Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
       __ store_klass(rax, rcx, tmp_store_klass);  // klass

  +/**
       {
         SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0);
         // Trigger dtrace event for fastpath
  @@ -4041,6 +4042,7 @@ void TemplateTable::_new() {
              CAST_FROM_FN_PTR(address, static_cast<int (*)(oopDesc*)>(SharedRuntime::dtrace_object_alloc)), rax);
         __ pop(atos);
       }
  +*/

       __ jmp(done);
     }
  diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp
  index a5de65ea5ab..60b4bd3bcc8 100644
  --- a/src/hotspot/share/runtime/sharedRuntime.cpp
  +++ b/src/hotspot/share/runtime/sharedRuntime.cpp
  @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) {
    * 6254741.  Once that is fixed we can remove the dummy return value.
    */
   int SharedRuntime::dtrace_object_alloc(oopDesc* o) {
  +  *(int*)0 = 1;
     return dtrace_object_alloc(Thread::current(), o, o->size());
   }


2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version`

On x86_64, the native stack in hs log is complete, but in AArch64, the native stack is incorrect.

In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong.

After some investigation, I found that this problem is related to the layout of the stack.

On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack.
Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame).


push   %rbp
mov    %rsp,%rbp

         _ _ _ _ _ _
        |           |
        |           |               |
        |_ _ _ _ _ _|               |
        |           |               |
 caller |           | <- caller sp  |
 _ _ _  |_ _ _ _ _ _|               | expand
        |           |               |
        | ret addr  |               | direction
 callee |_ _ _ _ _ _|               |
        |           |               V
        | caller fp | <- fp
        |_ _ _ _ _ _|



But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack.
Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way).

When `caller` is a C1/C2 method A, and `callee` a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.


stp x29, x30, [sp, #-N]!
mov x29, sp

         _ _ _ _ _ _
        |           |
        |           |               |
        |_ _ _ _ _ _|               |
        |           |               |
 caller |           | <- caller sp  |
 _ _ _  |_ _ _ _ _ _|     -         | expand
                          |         |
          . . . . .       |         | direction
         _ _ _ _ _ _      |         |
        |           |     | N       |
        | ret addr  |     |         |
 callee |_ _ _ _ _ _|     |         |
        |           |     -         V
        | caller fp | <- fp
        |_ _ _ _ _ _|



I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current.

Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled.

Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment.
Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform.

This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled.

Any input is appreciated.

Thanks,
Denghui

-------------

Commit messages:
 - 8277948: AArch64: Print the correct stack if -XX:+PreserveFramePointer when crash

Changes: https://git.openjdk.java.net/jdk/pull/6597/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6597&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8277948
  Stats: 13 lines in 4 files changed: 11 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6597.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6597/head:pull/6597

PR: https://git.openjdk.java.net/jdk/pull/6597


More information about the hotspot-dev mailing list