RFR: 8277948: AArch64: Print the correct native stack if -XX:+PreserveFramePointer when crash [v5]
Andrew Dinn
adinn at openjdk.java.net
Thu Jan 27 17:28:35 UTC 2022
On Thu, 27 Jan 2022 15:15:15 GMT, Denghui Dong <ddong at openjdk.org> wrote:
>> Hi,
>>
>> I found that the native stack frames in the hs log are not accurate sometimes on AArch64, not sure if this is a known issue or an issue worth fixing.
>>
>> The following steps can quick reproduce the problem:
>>
>> 1. apply the diff(comment the dtrace_object_alloc call in interpreter and make a crash on SharedRuntime::dtrace_object_alloc)
>>
>> index 39e99bdd5ed..4fc768e94aa 100644
>> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
>> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
>> @@ -3558,6 +3558,7 @@ void TemplateTable::_new() {
>> __ store_klass_gap(r0, zr); // zero klass gap for compressed oops
>> __ store_klass(r0, r4); // store klass last
>>
>> +/**
>> {
>> SkipIfEqual skip(_masm, &DTraceAllocProbes, false);
>> // Trigger dtrace event for fastpath
>> @@ -3567,6 +3568,7 @@ void TemplateTable::_new() {
>> __ pop(atos); // restore the return value
>>
>> }
>> +*/
>> __ b(done);
>> }
>>
>> diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp
>> index 19530b7c57c..15b0509da4c 100644
>> --- a/src/hotspot/cpu/x86/templateTable_x86.cpp
>> +++ b/src/hotspot/cpu/x86/templateTable_x86.cpp
>> @@ -4033,6 +4033,7 @@ void TemplateTable::_new() {
>> Register tmp_store_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
>> __ store_klass(rax, rcx, tmp_store_klass); // klass
>>
>> +/**
>> {
>> SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0);
>> // Trigger dtrace event for fastpath
>> @@ -4041,6 +4042,7 @@ void TemplateTable::_new() {
>> CAST_FROM_FN_PTR(address, static_cast<int (*)(oopDesc*)>(SharedRuntime::dtrace_object_alloc)), rax);
>> __ pop(atos);
>> }
>> +*/
>>
>> __ jmp(done);
>> }
>> diff --git a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp
>> index a5de65ea5ab..60b4bd3bcc8 100644
>> --- a/src/hotspot/share/runtime/sharedRuntime.cpp
>> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp
>> @@ -1002,6 +1002,7 @@ jlong SharedRuntime::get_java_tid(Thread* thread) {
>> * 6254741. Once that is fixed we can remove the dummy return value.
>> */
>> int SharedRuntime::dtrace_object_alloc(oopDesc* o) {
>> + *(int*)0 = 1;
>> return dtrace_object_alloc(Thread::current(), o, o->size());
>> }
>>
>>
>> 2. `java -XX:+DTraceAllocProbes -Xcomp -XX:-PreserveFramePointer -version`
>>
>> On x86_64, the native stack in hs log is complete, but on AArch64, the native stack is incorrect.
>>
>> In the beginning, I thought it might be the influence of PreserveFramePointer. Later, I found that no matter whether PreserveFramePointer is enabled or not, in the hs log of x86_64, the native stack is always correct, and aarch64 is wrong.
>>
>> After some investigation, I found that this problem is related to the layout of the stack.
>>
>> On x86_64, whether it is C/C++, interpreter, or JIT, `callee` will always put the `return address` and `fp` of the `caller` at the bottom of the stack.
>> Hence, `callee` can always get the `caller sp`(aka `sender sp`) by `fp + 2`, and if `caller` is a compiled method, `caller sp` is the key to getting the `caller`'s `caller` since `caller fp` may be invalid.(see frame::sender_for_compiled_frame).
>>
>>
>> push %rbp
>> mov %rsp,%rbp
>>
>> _ _ _ _ _ _
>> | |
>> | | |
>> |_ _ _ _ _ _| |
>> | | |
>> caller | | <- caller sp |
>> _ _ _ |_ _ _ _ _ _| | expand
>> | | |
>> | ret addr | | direction
>> callee |_ _ _ _ _ _| |
>> | | V
>> | caller fp | <- fp
>> |_ _ _ _ _ _|
>>
>>
>>
>> But for AArch64, the C/C++ code doesn't put the `return address` and `fp` of the `caller` at the bottom of the stack.
>> Hence, we cannot use `fp + 2` to calculate the proper `caller sp`(although it is still implemented this way).
>>
>> When `caller` is a C1/C2 method A, and `callee` is a C/C++ method B, we cannot get the `caller` of A since we cannot get the proper sp value of it.
>>
>>
>> stp x29, x30, [sp, #-N]!
>> mov x29, sp
>>
>> _ _ _ _ _ _
>> | |
>> | | |
>> |_ _ _ _ _ _| |
>> | | |
>> caller | | <- caller sp |
>> _ _ _ |_ _ _ _ _ _| - | expand
>> | |
>> . . . . . | | direction
>> _ _ _ _ _ _ | |
>> | | | N |
>> | ret addr | | |
>> callee |_ _ _ _ _ _| | |
>> | | - V
>> | caller fp | <- fp
>> |_ _ _ _ _ _|
>>
>>
>>
>> I am not very familiar with AArch64 and have no idea how to fix this issue perfectly at current.
>>
>> Based on my understanding of the implementation, we can get the correct stack trace when PreserveFramePointer is enabled.
>>
>> Although PreserveFramePointer is disabled by default, I found that some real applications will enable it in the production environment.
>> Therefore, in my opinion, this fix can help troubleshoot crash issues in applications that enable PreserveFramePointer on AArch64 platform.
>>
>> This patch changes the logic of l_sender_sp calculation, uses sender_sp() as the value of l_sender_sp when PreserveFramePointer is enabled.
>>
>> Any input is appreciated.
>>
>> Thanks,
>> Denghui
>
> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision:
>
> add comment
Marked as reviewed by adinn (Reviewer).
-------------
PR: https://git.openjdk.java.net/jdk/pull/6597
More information about the hotspot-dev
mailing list