RFR: 8209413: AArch64: NPE in clhsdb jstack command

Nick Gasson (Arm Technology China) Nick.Gasson at arm.com
Fri Feb 1 09:22:39 UTC 2019


Hi all,

Please review this patch to fix a crash in the clhsdb "jstack -v" 
command on AArch64:

Bug: https://bugs.openjdk.java.net/browse/JDK-8209413
Webrev: http://cr.openjdk.java.net/~ngasson/8209413/webrev.01/

On AArch64, if you use "jhsdb clhsdb --pid=..." to attach a debugger to 
a Java process, and then run "jstack -v" while any thread is executing a 
native method, you will get a NullPointerException like this:

java.lang.NullPointerException
	at jdk.hotspot.agent/sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:83)
	at jdk.hotspot.agent/sun.jvm.hotspot.CommandProcessor$24.doit(CommandProcessor.java:1066)

The problem is this constructor of AARCH64Frame, which is called when 
trying to construct the frame for the native method (wrapper):

   AARCH64Frame(Address raw_sp, Address raw_fp)

It takes the last-known Java SP and FP, and tries to find the PC 
following the BL instruction that jumped to the native code. At the 
moment it assumes this is at SP[-1]:

   this.pc = raw_sp.getAddressAt(-1 * VM.getVM().getAddressSize());

I think this comes from x86, where the CALL instruction pushes the 
return address there. On AArch64, BL puts the return address in LR 
rather than pushing it on the stack, so nothing guarantees that SP[-1] 
holds the PC. The fix in this patch is to scan the stack of the native 
(C++) function looking for a pointer back to the frame of the native 
wrapper. If we find it, the LR on entry should be saved in the stack 
slot above. This fails if the native function's stack frame is too big, 
or if it's a leaf function that didn't push FP/LR. In that case this.pc 
is set to null, which makes the frame object look invalid, so the frame 
is skipped rather than crashing the command.
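
To make the scan concrete, here is a toy model of the idea. It is only a 
sketch, not the code in the webrev: the class name, the map-based stack, 
the MAX_SLOTS limit and all addresses are made up for illustration. It 
assumes the native function saved FP and LR as an adjacent pair, with LR 
one word above the saved FP.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a toy model of the scan, not code from the webrev.
class FrameRecordScan {
    static final long WORD = 8;       // AArch64 address size in bytes
    static final int MAX_SLOTS = 32;  // hypothetical scan limit

    // The stack is modelled as address -> 64-bit value. Returns the saved
    // LR if a slot holding rawFp is found below rawSp, otherwise null.
    static Long findReturnPc(Map<Long, Long> stack, long rawSp, long rawFp) {
        // Start two words down so the slot above is still below rawSp.
        for (int i = 2; i <= MAX_SLOTS; i++) {
            long slot = rawSp - i * WORD;
            Long value = stack.get(slot);
            if (value != null && value == rawFp) {
                // FP and LR are stored as a pair: LR sits one word above FP.
                return stack.get(slot + WORD);
            }
        }
        // Frame too big for the scan, or a leaf function with no FP/LR pair.
        return null;
    }

    public static void main(String[] args) {
        // Made-up addresses for the last Java SP/FP and the return address.
        long rawSp = 0x1000L, rawFp = 0x1040L, lr = 0x4000L;
        Map<Long, Long> stack = new HashMap<>();
        stack.put(rawSp - 4 * WORD, rawFp); // saved FP of the native wrapper
        stack.put(rawSp - 3 * WORD, lr);    // saved LR in the slot above
        Long pc = findReturnPc(stack, rawSp, rawFp);
        System.out.println(pc == null ? "pc not found"
                                      : "pc=0x" + Long.toHexString(pc));
    }
}
```

A null result here corresponds to the case above where this.pc is left 
as null and the frame is treated as invalid.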

Sample output from the ClhsdbJstack.java jtreg test, which now passes 
with this patch:

  - java.lang.Thread.sleep(long) @bci=0, pc=0x0000ffffa05bb68c, Method*=0x0000000800143410 (Compiled frame; information may be imprecise)
  - jdk.test.lib.apps.LingeredApp.main(java.lang.String[]) @bci=53, line=502, pc=0x0000ffff9892572c, Method*=0x0000ffff2e870d48 (Interpreted frame)

An alternative to scanning the native function's stack might be to use 
the last-known Java PC saved by set_last_Java_frame. This is off by a 
few instructions, but I don't think that matters. However, this 
constructor potentially has other users, so we'd have to fix it anyway.

Thanks,
Nick

