RFR(S): 8058345: Refactor native stack printing from vmError.cpp to debug.cpp to make it available in gdb as well

Volker Simonis volker.simonis at gmail.com
Wed Sep 17 18:29:49 UTC 2014


On Wed, Sep 17, 2014 at 12:10 AM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:
> On 9/16/14 12:21 PM, Volker Simonis wrote:
>>
>> Hi Vladimir,
>>
>> thanks for looking at the change.
>>
>> 'make_frame' is only intended to be used from within the debugger to
>> simplify the usage of the new 'pns()' (i.e. "print native stack")
>> helper. It can be used as follows:
>>
>> (gdb) call pns(make_frame($sp, $rbp, $pc))
>
>
> It is strange way to use pns(). Why not pass (sp, fp, pc) to pns() and let
> it call make_frame()? To have make_frame() only on ppc and x86 will not
> allow to use pns() on other platforms.
>
> Would be nice to have pns() version (names different) without input
> parameters. Can we use os::current_frame() inside for that?
>
> Add pns() description to help() output.
>
>>
>> "Executing pns"
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
>> code)
>> C  [libpthread.so.0+0xc0fe]  pthread_cond_timedwait+0x13e
>> V  [libjvm.so+0x96c4c1]  os::sleep(Thread*, long, bool)+0x1a1
>> V  [libjvm.so+0x75f442]  JVM_Sleep+0x312
>> j  java.lang.Thread.sleep(J)V+0
>> j  CrashNative.crashIt(Lsun/misc/Unsafe;I)V+10
>> j  CrashNative.doIt()V+45
>> v  ~StubRoutines::call_stub
>> V  [libjvm.so+0x71599f]
>> JavaCalls::call_helper(JavaValue*,methodHandle*, JavaCallArguments*,
>> Thread*)+0xf8f
>>
>> What about the two fixesin in 'print_native_stack()' - do you think they
>> are OK?
>
>
> What about is_runtime_frame()? It is wrapper for runtime calls from compiled
> code.
>

Yes, but I don't see how this could help here, because the native
wrapper which makes problems here is a nmethod and not a runtime stub.

Maybe you mean to additionally add is_runtime_frame() to the check?
Yes, I've just realized that that's indeed needed on amd64 to walk
runtime stubs. SPARC is more graceful and works without these changes,
but on amd64 we need them (on both Solaris and Linux) and on Sparc
they don't hurt.

I've written a small test program which should be similar to the one
you used for 8035983:

import java.util.Hashtable;

public class StackTraceTest {
  static Hashtable ht;
  static {
    ht = new Hashtable();
    ht.put("one", "one");
  }

  public static void foo() {
    bar();
  }

  public static void bar() {
    ht.get("one");
  }

  public static void main(String args[]) {
    for (int i = 0; i < 5; i++) {
      new Thread() {
        public void run() {
          while(true) {
            foo();
          }
        }
      }.start();
    }
  }
}

If I run it with "-XX:-Inline -XX:+PrintCompilation
-XX:-TieredCompilation StackTraceTest" inside the debugger and crash
one of the Java threads in native code, I get the correct stack traces
on SPARC. But on amd64, I only get the following without my changes:

Stack: [0xfffffd7da16f9000,0xfffffd7da17f9000],
sp=0xfffffd7da17f7c60,  free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.1+0xc207f]  _lwp_cond_wait+0x1f
V  [libjvm.so+0x171443b]  int
os::Solaris::cond_wait(_lwp_cond*,_lwp_mutex*)+0x2b
V  [libjvm.so+0x171181a]  void os::PlatformEvent::park()+0x1fa
V  [libjvm.so+0x16e09c1]  void ObjectMonitor::EnterI(Thread*)+0x6f1
V  [libjvm.so+0x16dfc8f]  void ObjectMonitor::enter(Thread*)+0x7cf
V  [libjvm.so+0x18cdd00]  void
ObjectSynchronizer::slow_enter(Handle,BasicLock*,Thread*)+0x2a0
V  [libjvm.so+0x18cd6a7]  void
ObjectSynchronizer::fast_enter(Handle,BasicLock*,bool,Thread*)+0x157
V  [libjvm.so+0x182f39e]  void
SharedRuntime::complete_monitor_locking_C(oopDesc*,BasicLock*,JavaThread*)+0x23e
v  ~RuntimeStub::_complete_monitor_locking_Java
C  0x2aad1dd1000016d8

With the changes (and the additional check for is_runtime_frame()) I
get full stack traces on amd64 as well. So I think the changes should
be at least an improvement:)

> You need to check what fr.real_fp() returns on all platforms for the very
> first frame (_lwp_start). That is what this check about - stop walking when
> it reaches the first frame. fr.sender_sp() returns bogus value which is not
> stack pointer for the first frame. From 8035983 review:
>
> "It seems using fr.sender_sp() in the check work on x86 and sparc.
> On x86 it return stack_base value on sparc it returns STACK_BIAS."
>
> Also on other our platforms it could return 0 or small integer value.
>
> If you can suggest an other way to determine the first frame, please, tell.
>

So the initial problem in 8035983 was that we used
os::is_first_C_frame(&fr) for native frames where the sender was a
compiled frame. That didn't work reliably because,
os::is_first_C_frame(&fr) uses fr->link() to get the frame pointer of
the sender and that doesn't work for compiled senders.

So you replaced os::is_first_C_frame(&fr) by
!on_local_stack((address)(fr.sender_sp() + 1)) but that uses addr_at()
internally which in turn uses fp() so it won't work for frames which
have a bogus frame pointer like native wrappers.

I think using fr.real_fp() should be safe because as far as I can see
it is always fr.sender_sp() - 2 on amd64 and equal to fr.sender_sp()
on SPARC. On Linux/amd64 both, the sp and fp of the first frame will
be 0 (still have to check on SPARC). But the example above works fine
with my changes on both, Linux/amd64 and Solaris/SPARC and
Solaris/amd64.

I'll prepare a new webrev tomorrow which will have the documentation
for "pns" and a version of make_frame() for SPARC.

Regards,
Volker

>> Should I move 'print_native_stack()' to vmError.cpp as suggested by David?
>
>
> I am fine with both places.
>
> Thanks,
> Vladimir
>
>
>>
>> Thank you and best regards,
>> Volker
>>
>> On Tue, Sep 16, 2014 at 8:11 PM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com> wrote:
>>>
>>> Thank you for fixing frame walk.
>>> I don't see where make_frame() is used.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>> On 9/16/14 9:35 AM, Volker Simonis wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> while testing my change, I found two other small problems with native
>>>> stack traces:
>>>>
>>>> 1. we can not walk native wrappers on (at least not on Linux/amd64)
>>>> because they are treated as native "C" frames. However, if the native
>>>> wrapper was called from a compiled frame which had no valid frame
>>>> pointer (i.e. %rbp) os::get_sender_for_C_frame(&fr) will produce a bad
>>>> frame. This can be easily fixed by treating native wrappers like java
>>>> frames.
>>>>
>>>> 2. the fix for "8035983: Fix "Native frames:" in crash report (hs_err
>>>> file)" introduced a similar problem. If we walk tha stack from a
>>>> native wrapper down to a compiled frame, we will have a frame with an
>>>> invalid frame pointer. In that case, the newly introduced check from
>>>> change 8035983 will fail, because fr.sender_sp() depends on a valid
>>>> fp. I'll propose to replace fr.sender_sp() by fr.real_fp() which
>>>> should do the same but also works for compiled frames with invalid fp.
>>>>
>>>> Here's the new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~simonis/webrevs/8058345.v1/
>>>>
>>>> What dou you think?
>>>>
>>>> Thank you and best regards,
>>>> Volker
>>>>
>>>>
>>>> On Tue, Sep 16, 2014 at 3:48 PM, Volker Simonis
>>>> <volker.simonis at gmail.com> wrote:
>>>>>
>>>>>
>>>>> 'print_native_stack()' must be visible in both vmError.cpp and
>>>>> debug.cpp. Initially I saw that vmError.cpp already included debug.hpp
>>>>> so I decided to declare it in debug.hpp. But now I realized that also
>>>>> debug.cpp includes vmError.hpp so I could just as well declare
>>>>> 'print_native_stack()' in vmError.hpp and leave the implementation in
>>>>> vmError.cpp. Do you want me to change that?
>>>>>
>>>>> Thank you and best regards,
>>>>> Volker
>>>>>
>>>>>
>>>>> On Tue, Sep 16, 2014 at 8:51 AM, David Holmes <david.holmes at oracle.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Volker,
>>>>>>
>>>>>> On 13/09/2014 5:15 AM, Volker Simonis wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> could you please review and sponsor the following small change which
>>>>>>> should make debugging a little more comfortabel (at least on Linux
>>>>>>> for
>>>>>>> now):
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/8058345/
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058345
>>>>>>>
>>>>>>> In the hs_err files we have a nice mixed stack trace which contains
>>>>>>> both, Java and native frames.
>>>>>>> It would be nice if we could make this functionality available from
>>>>>>> within gdb during debugging sessions (until now we can only print the
>>>>>>> pure Java stack with the "ps()" helper function from debug.cpp).
>>>>>>>
>>>>>>> This new feature can be easily achieved by refactoring the
>>>>>>> corresponding stack printing code from VMError::report() in
>>>>>>> vmError.cpp into its own method in debug.cpp. This change extracts
>>>>>>> that code into the new function 'print_native_stack()' in debug.cpp
>>>>>>> without changing anything of the functionality.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Why does it need to move to debug.cpp to allow this ?
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>
>>>>>>> It also adds some helper functions which make it easy to call the new
>>>>>>> 'print_native_stack()' method from within gdb. There's the new helper
>>>>>>> function 'pns(frame f)'  which takes a frame argument and calls
>>>>>>> 'print_native_stack()'. We need the frame argument because gdb
>>>>>>> inserts
>>>>>>> a dummy frame for every call and we can't easily walk over this dummy
>>>>>>> frame from our stack printing routine.
>>>>>>>
>>>>>>> To simplify the creation of the frame object, I've added the helper
>>>>>>> functions:
>>>>>>>
>>>>>>> extern "C" frame make_frame(intptr_t* sp, intptr_t* fp, address pc) {
>>>>>>>      return frame(sp, fp, pc);
>>>>>>> }
>>>>>>>
>>>>>>> for x86 (in frame_x86.cpp) and
>>>>>>>
>>>>>>> extern "C" frame make_frame(intptr_t* sp, address pc) {
>>>>>>>      return frame(sp, pc);
>>>>>>> }
>>>>>>>
>>>>>>> for ppc64 in frame_ppc.cpp. With these helper functions we can now
>>>>>>> easily get a mixed stack trace of a Java thread in gdb (see below).
>>>>>>>
>>>>>>> All the helper functions are protected by '#ifndef PRODUCT'
>>>>>>>
>>>>>>> Thank you and best regards,
>>>>>>> Volker
>>>>>>>
>>>>>>>
>>>>>>> (gdb) call pns(make_frame($sp, $rbp, $pc))
>>>>>>>
>>>>>>> "Executing pns"
>>>>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
>>>>>>> C=native
>>>>>>> code)
>>>>>>> C  [libpthread.so.0+0xc0fe]  pthread_cond_timedwait+0x13e
>>>>>>> V  [libjvm.so+0x96c4c1]  os::sleep(Thread*, long, bool)+0x1a1
>>>>>>> V  [libjvm.so+0x75f442]  JVM_Sleep+0x312
>>>>>>> j  java.lang.Thread.sleep(J)V+0
>>>>>>> j  CrashNative.crashIt(Lsun/misc/Unsafe;I)V+10
>>>>>>> j  CrashNative.doIt()V+45
>>>>>>> v  ~StubRoutines::call_stub
>>>>>>> V  [libjvm.so+0x71599f]  JavaCalls::call_helper(JavaValue*,
>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>> V  [libjvm.so+0x9eab75]  Reflection::invoke(instanceKlassHandle,
>>>>>>> methodHandle, Handle, bool, objArrayHandle, BasicType,
>>>>>>> objArrayHandle,
>>>>>>> bool, Thread*) [clone .constprop.218]+0xa25
>>>>>>> V  [libjvm.so+0x9eb838]  Reflection::invoke_method(oopDesc*, Handle,
>>>>>>> objArrayHandle, Thread*)+0x1c8
>>>>>>> V  [libjvm.so+0x7637ae]  JVM_InvokeMethod+0xfe
>>>>>>> j
>>>>>>>
>>>>>>>
>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+0
>>>>>>> j
>>>>>>>
>>>>>>>
>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+100
>>>>>>> j
>>>>>>>
>>>>>>>
>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+6
>>>>>>> j
>>>>>>>
>>>>>>>
>>>>>>> java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+56
>>>>>>> j  CrashNative.mainJava()V+32
>>>>>>> v  ~StubRoutines::call_stub
>>>>>>> V  [libjvm.so+0x71599f]  JavaCalls::call_helper(JavaValue*,
>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>> V  [libjvm.so+0x7384f5]  jni_invoke_static(JNIEnv_*, JavaValue*,
>>>>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)
>>>>>>> [clone .isra.238] [clone .constprop.250]+0x385
>>>>>>> V  [libjvm.so+0x73b3d7]  jni_CallStaticVoidMethodV+0xe7
>>>>>>> C  [libCrashNative.so+0x9a9]  JNIEnv_::CallStaticVoidMethod(_jclass*,
>>>>>>> _jmethodID*, ...)+0xb9
>>>>>>> C  [libCrashNative.so+0xa10]  step3(JNIEnv_*, _jobject*)+0x65
>>>>>>> C  [libCrashNative.so+0xa69]  step2(JNIEnv_*, _jobject*)+0x57
>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>> C  [libCrashNative.so+0xa8e]  step1(JNIEnv_*, _jobject*)+0x23
>>>>>>> C  [libCrashNative.so+0x87f]  Java_CrashNative_nativeMethod+0x23
>>>>>>> j  CrashNative.nativeMethod()V+0
>>>>>>> j  CrashNative.main([Ljava/lang/String;)V+9
>>>>>>> v  ~StubRoutines::call_stub
>>>>>>> V  [libjvm.so+0x71599f]  JavaCalls::call_helper(JavaValue*,
>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>> V  [libjvm.so+0x7384f5]  jni_invoke_static(JNIEnv_*, JavaValue*,
>>>>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)
>>>>>>> [clone .isra.238] [clone .constprop.250]+0x385
>>>>>>> V  [libjvm.so+0x73b2b0]  jni_CallStaticVoidMethod+0x170
>>>>>>> C  [libjli.so+0x742a]  JavaMain+0x65a
>>>>>>> C  [libpthread.so.0+0x7e9a]  start_thread+0xda
>>>>>>>
>>>>>>
>>>
>


More information about the hotspot-dev mailing list