RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v2]
Andrew Haley
aph at openjdk.java.net
Mon Jan 17 18:15:21 UTC 2022
On Fri, 14 Jan 2022 16:15:06 GMT, Denghui Dong <ddong at openjdk.org> wrote:
>> Hi,
>>
>> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing.
>>
>> The following steps can quick reproduce the problem:
>>
>> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc)
>>
>> index 39e99bdd5ed..4fc768e94aa 100644
>> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
>> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
>> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() {
>> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops
>> __ store_klass(r0, r4); // store klass last
>>
>> +/**
>> {
>> SkipIfEqual skip(_masm, &DTraceAllocProbes, false);
>> // Trigger dtrace event for fastpath
>> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() {
>> __ pop(atos); // restore the return value
>>
>> }
>> +*/
>> __ b(done);
>> }
>>
>> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp
>> index 19530b7c57c..15b0509da4c 100644
>> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp
>> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp
>> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() {
>> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
>> __ store_klass(rax, rcx, tmp_store_klass); // klass
>>
>> +/**
>> {
>> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0);
>> // Trigger dtrace event for fastpath
>> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() {
>> CAST_FROM_FN_PTR(address, static_cast<int (*)(oopDesc*)>(SharedRuntime::dtrace_object_alloc)), rax);
>> __ pop(atos);
>> }
>> +*/
>>
>> __ jmp(done);
>> }
>> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp
>> index a5de65ea5ab..60b4bd3bcc8 100644
>> --- a/src/hotspot/share/runtime/sharedRuntime.cpp
>> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp
>> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) {
>> * 6254741. Once that is fixed we can remove the dummy return value.
>> */
>> int SharedRuntime::dtrace_object_alloc(oopDesc* o) {
>> + *(int*)0 = 1;
>> return dtrace_object_alloc(Thread::current(), o, o->size());
>> }
>>
>>
>> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version`
>>
>> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect.
>>
>> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong.
>>
>> After some investigation, I found that this problem is related to the layout of the stack.
>>
>> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack.
>> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame).
>>
>>
>> push %rbp
>> mov %rsp,%rbp
>>
>> _ _ _ _ _ _
>> | |
>> | | |
>> |_ _ _ _ _ _| |
>> | | |
>> caller | | <- caller sp |
>> _ _ _ |_ _ _ _ _ _| | expand
>> | | |
>> | ret addr | | direction
>> callee |_ _ _ _ _ _| |
>> | | V
>> | caller fp | <- fp
>> |_ _ _ _ _ _|
>>
>>
>>
>> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack.
>> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way).
>>
>> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.
>>
>>
>> stp x29, x30, [sp, #-N]!
>> mov x29, sp
>>
>> _ _ _ _ _ _
>> | |
>> | | |
>> |_ _ _ _ _ _| |
>> | | |
>> caller | | <- caller sp |
>> _ _ _ |_ _ _ _ _ _| - | expand
>> | |
>> . . . . . | | direction
>> _ _ _ _ _ _ | |
>> | | | N |
>> | ret addr | | |
>> callee |_ _ _ _ _ _| | |
>> | | - V
>> | caller fp | <- fp
>> |_ _ _ _ _ _|
>>
>>
>>
>> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current.
>>
>> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled.
>>
>> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment.
>> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform.
>>
>> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled.
>>
>> Any input is appreciated.
>>
>> Thanks,
>> Denghui
>
> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision:
>
> fix pfl() crash problem and rename from_thread to from_anchor
Great! This works much better.
I've been digging deeper to try to understand the issue, in order not to break anything: the unwinder is an important and delicate piece of code.
The code at the very root of this problem is, as far as I can see, this:
frame os::get_sender_for_C_frame(frame* fr) {
return frame(fr->link(), fr->link(), fr->sender_pc());
}
in which the sender's SP is set to be the same as the sender's FP, which is almost certainly wrong. As you have observed, we can't determine the sender's SP when native code is called from Java code, because all we have is a chain of frame pointers. However, if we know that a frame was created from the frame anchor, we can trust both the SP and the FP in that frame. When we're merely printing a backtrace we don't need stack pointer at all, so chasing frame pointers is fine.
I'm grateful for your patience. This work is very much worth doing, but we need to be very careful. I'll dig a little more tomorrow.
-------------
PR: https://git.openjdk.java.net/jdk/pull/6597
More information about the hotspot-dev
mailing list