RFR: 8316309: AArch64: VMError::print_native_stack() crashes on Java native method frame

David Holmes dholmes at openjdk.org
Fri Sep 29 04:32:17 UTC 2023


On Thu, 28 Sep 2023 21:07:09 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:

> Please review the following patch. As explained in the bug comments the problem is that os::get_sender_for_C_frame() always constructs a frame as if the sender is also a native C/C++ frame. Setting a correct value for _unextended_sp is important to avoid crashes if this value is later used to get that frame's caller, which will happen if we end up calling frame::sender_for_compiled_frame().
> 
> The issue exists on aarch64 for both linux and macos but the fix for linux is different. The "Procedure Call Standard for the Arm 64-bit Architecture" doesn't specify a location for the frame record within a stack frame (6.4.6), and gcc happens to choose to save it the top of the frame (lowest address) rather than the bottom. This means that changing fr->link() for fr->sender_sp() won't work. The fix is to use the value of fr->link() but adjusted using the code blob frame size before setting it as the _unextended_sp of the sender frame. While working on this fix I realized the issue is not only when the sender is a native nmethod but with all frames associated with a CodeBlob with a frame size > 0 (runtime stub, safepoint stub, etc) so the check takes that into account. I also made a small fix to next_frame() since these mentioned frames should also use frame::sender().
> 
> I created a new test to verify that walking the stack over a native nmethod or runtime stub now works okay. I'll try to add a reliable test case for walking over a safepoint stub too. I tested the fix by running the new test and also running tiers1-4 in mach5. I'll run the upper tiers too.
> 
> Thanks,
> Patricio

src/hotspot/share/utilities/vmError.cpp line 434:

> 432:       return invalid;
> 433:     }
> 434:     if (fr.is_interpreted_frame() || (fr.cb() != nullptr && fr.cb()->frame_size() > 0)) {

This part of the fix is unclear to me. How do the old conditions relate to the new ones?

test/hotspot/jtreg/runtime/ErrorHandling/StackWalkNativeToJava.java line 84:

> 82:         public void callNativeMethod() throws Exception {
> 83:             Object obj = new Object();
> 84:             obj.wait();

Just to be clear, the aim here is to call a native method that will complete by throwing an exception, so you can abort the VM. A comment to that affect would be good. Thanks

test/hotspot/jtreg/runtime/ErrorHandling/StackWalkNativeToJava.java line 116:

> 114:         }
> 115: 
> 116:         public void callVMMethod() throws Exception {

Again a comment outlining how you expect this to abort the VM would be good.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15972#discussion_r1340877413
PR Review Comment: https://git.openjdk.org/jdk/pull/15972#discussion_r1340876167
PR Review Comment: https://git.openjdk.org/jdk/pull/15972#discussion_r1340876631


More information about the hotspot-dev mailing list