RFR(S): 8058345: Refactor native stack printing from vmError.cpp to debug.cpp to make it available in gdb as well

Volker Simonis volker.simonis at gmail.com
Fri Sep 19 18:55:50 UTC 2014


Hi,

so here's my new version:

 - documented the "pns" command with examples
 - removed the clumsy "make_frame" generators and introduced a genreic
frame constructor on all platforms which can now be called from pns()
 - pns() must now be called with three arguments (usually registers
like pns($sp, $fp, $pc) but some arguments may be '0' on some
platforms (see the examples in the documentation of pns())
 - tested on Linux (x86, x64, ppc64) and Solaris (SPARC, x64)
 - added additional "Summary" section to the change which mentions
that the change also fixes stack traces on x86 to enable walking of
runtime stubs and native wrappers.

http://cr.openjdk.java.net/~simonis/webrevs/8058345.v2/

Notice that the current version requires trivial changes in your
closed ports (i.e. adding the generic frame constructor) but I'd need
a sponsor anyway:)

Regards,
Volker

On Wed, Sep 17, 2014 at 9:49 PM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:
> On 9/17/14 11:29 AM, Volker Simonis wrote:
>>
>> On Wed, Sep 17, 2014 at 12:10 AM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com> wrote:
>>>
>>> On 9/16/14 12:21 PM, Volker Simonis wrote:
>>>>
>>>>
>>>> Hi Vladimir,
>>>>
>>>> thanks for looking at the change.
>>>>
>>>> 'make_frame' is only intended to be used from within the debugger to
>>>> simplify the usage of the new 'pns()' (i.e. "print native stack")
>>>> helper. It can be used as follows:
>>>>
>>>> (gdb) call pns(make_frame($sp, $rbp, $pc))
>>>
>>>
>>>
>>> It is strange way to use pns(). Why not pass (sp, fp, pc) to pns() and
>>> let
>>> it call make_frame()? To have make_frame() only on ppc and x86 will not
>>> allow to use pns() on other platforms.
>>>
>>> Would be nice to have pns() version (names different) without input
>>> parameters. Can we use os::current_frame() inside for that?
>>>
>>> Add pns() description to help() output.
>>>
>>>>
>>>> "Executing pns"
>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
>>>> C=native
>>>> code)
>>>> C  [libpthread.so.0+0xc0fe]  pthread_cond_timedwait+0x13e
>>>> V  [libjvm.so+0x96c4c1]  os::sleep(Thread*, long, bool)+0x1a1
>>>> V  [libjvm.so+0x75f442]  JVM_Sleep+0x312
>>>> j  java.lang.Thread.sleep(J)V+0
>>>> j  CrashNative.crashIt(Lsun/misc/Unsafe;I)V+10
>>>> j  CrashNative.doIt()V+45
>>>> v  ~StubRoutines::call_stub
>>>> V  [libjvm.so+0x71599f]
>>>> JavaCalls::call_helper(JavaValue*,methodHandle*, JavaCallArguments*,
>>>> Thread*)+0xf8f
>>>>
>>>> What about the two fixesin in 'print_native_stack()' - do you think they
>>>> are OK?
>>>
>>>
>>>
>>> What about is_runtime_frame()? It is wrapper for runtime calls from
>>> compiled
>>> code.
>>>
>>
>> Yes, but I don't see how this could help here, because the native
>> wrapper which makes problems here is a nmethod and not a runtime stub.
>>
>> Maybe you mean to additionally add is_runtime_frame() to the check?
>
>
> Yes, that is what I meant.
>
> Thanks,
> Vladimir
>
>
>> Yes, I've just realized that that's indeed needed on amd64 to walk
>> runtime stubs. SPARC is more graceful and works without these changes,
>> but on amd64 we need them (on both Solaris and Linux) and on Sparc
>> they don't hurt.
>>
>> I've written a small test program which should be similar to the one
>> you used for 8035983:
>>
>> import java.util.Hashtable;
>>
>> public class StackTraceTest {
>>    static Hashtable ht;
>>    static {
>>      ht = new Hashtable();
>>      ht.put("one", "one");
>>    }
>>
>>    public static void foo() {
>>      bar();
>>    }
>>
>>    public static void bar() {
>>      ht.get("one");
>>    }
>>
>>    public static void main(String args[]) {
>>      for (int i = 0; i < 5; i++) {
>>        new Thread() {
>>          public void run() {
>>            while(true) {
>>              foo();
>>            }
>>          }
>>        }.start();
>>      }
>>    }
>> }
>>
>> If I run it with "-XX:-Inline -XX:+PrintCompilation
>> -XX:-TieredCompilation StackTraceTest" inside the debugger and crash
>> one of the Java threads in native code, I get the correct stack traces
>> on SPARC. But on amd64, I only get the following without my changes:
>>
>> Stack: [0xfffffd7da16f9000,0xfffffd7da17f9000],
>> sp=0xfffffd7da17f7c60,  free space=1019k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
>> code)
>> C  [libc.so.1+0xc207f]  _lwp_cond_wait+0x1f
>> V  [libjvm.so+0x171443b]  int
>> os::Solaris::cond_wait(_lwp_cond*,_lwp_mutex*)+0x2b
>> V  [libjvm.so+0x171181a]  void os::PlatformEvent::park()+0x1fa
>> V  [libjvm.so+0x16e09c1]  void ObjectMonitor::EnterI(Thread*)+0x6f1
>> V  [libjvm.so+0x16dfc8f]  void ObjectMonitor::enter(Thread*)+0x7cf
>> V  [libjvm.so+0x18cdd00]  void
>> ObjectSynchronizer::slow_enter(Handle,BasicLock*,Thread*)+0x2a0
>> V  [libjvm.so+0x18cd6a7]  void
>> ObjectSynchronizer::fast_enter(Handle,BasicLock*,bool,Thread*)+0x157
>> V  [libjvm.so+0x182f39e]  void
>>
>> SharedRuntime::complete_monitor_locking_C(oopDesc*,BasicLock*,JavaThread*)+0x23e
>> v  ~RuntimeStub::_complete_monitor_locking_Java
>> C  0x2aad1dd1000016d8
>>
>> With the changes (and the additional check for is_runtime_frame()) I
>> get full stack traces on amd64 as well. So I think the changes should
>> be at least an improvement:)
>
>
> Good!
>
>
>>
>>> You need to check what fr.real_fp() returns on all platforms for the very
>>> first frame (_lwp_start). That is what this check about - stop walking
>>> when
>>> it reaches the first frame. fr.sender_sp() returns bogus value which is
>>> not
>>> stack pointer for the first frame. From 8035983 review:
>>>
>>> "It seems using fr.sender_sp() in the check work on x86 and sparc.
>>> On x86 it return stack_base value on sparc it returns STACK_BIAS."
>>>
>>> Also on other our platforms it could return 0 or small integer value.
>>>
>>> If you can suggest an other way to determine the first frame, please,
>>> tell.
>>>
>>
>> So the initial problem in 8035983 was that we used
>> os::is_first_C_frame(&fr) for native frames where the sender was a
>> compiled frame. That didn't work reliably because,
>> os::is_first_C_frame(&fr) uses fr->link() to get the frame pointer of
>> the sender and that doesn't work for compiled senders.
>>
>> So you replaced os::is_first_C_frame(&fr) by
>> !on_local_stack((address)(fr.sender_sp() + 1)) but that uses addr_at()
>> internally which in turn uses fp() so it won't work for frames which
>> have a bogus frame pointer like native wrappers.
>>
>> I think using fr.real_fp() should be safe because as far as I can see
>> it is always fr.sender_sp() - 2 on amd64 and equal to fr.sender_sp()
>> on SPARC. On Linux/amd64 both, the sp and fp of the first frame will
>> be 0 (still have to check on SPARC). But the example above works fine
>> with my changes on both, Linux/amd64 and Solaris/SPARC and
>> Solaris/amd64.
>>
>> I'll prepare a new webrev tomorrow which will have the documentation
>> for "pns" and a version of make_frame() for SPARC.
>>
>> Regards,
>> Volker
>>
>>>> Should I move 'print_native_stack()' to vmError.cpp as suggested by
>>>> David?
>>>
>>>
>>>
>>> I am fine with both places.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>>>
>>>> Thank you and best regards,
>>>> Volker
>>>>
>>>> On Tue, Sep 16, 2014 at 8:11 PM, Vladimir Kozlov
>>>> <vladimir.kozlov at oracle.com> wrote:
>>>>>
>>>>>
>>>>> Thank you for fixing frame walk.
>>>>> I don't see where make_frame() is used.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>
>>>>> On 9/16/14 9:35 AM, Volker Simonis wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> while testing my change, I found two other small problems with native
>>>>>> stack traces:
>>>>>>
>>>>>> 1. we can not walk native wrappers on (at least not on Linux/amd64)
>>>>>> because they are treated as native "C" frames. However, if the native
>>>>>> wrapper was called from a compiled frame which had no valid frame
>>>>>> pointer (i.e. %rbp) os::get_sender_for_C_frame(&fr) will produce a bad
>>>>>> frame. This can be easily fixed by treating native wrappers like java
>>>>>> frames.
>>>>>>
>>>>>> 2. the fix for "8035983: Fix "Native frames:" in crash report (hs_err
>>>>>> file)" introduced a similar problem. If we walk tha stack from a
>>>>>> native wrapper down to a compiled frame, we will have a frame with an
>>>>>> invalid frame pointer. In that case, the newly introduced check from
>>>>>> change 8035983 will fail, because fr.sender_sp() depends on a valid
>>>>>> fp. I'll propose to replace fr.sender_sp() by fr.real_fp() which
>>>>>> should do the same but also works for compiled frames with invalid fp.
>>>>>>
>>>>>> Here's the new webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/8058345.v1/
>>>>>>
>>>>>> What dou you think?
>>>>>>
>>>>>> Thank you and best regards,
>>>>>> Volker
>>>>>>
>>>>>>
>>>>>> On Tue, Sep 16, 2014 at 3:48 PM, Volker Simonis
>>>>>> <volker.simonis at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 'print_native_stack()' must be visible in both vmError.cpp and
>>>>>>> debug.cpp. Initially I saw that vmError.cpp already included
>>>>>>> debug.hpp
>>>>>>> so I decided to declare it in debug.hpp. But now I realized that also
>>>>>>> debug.cpp includes vmError.hpp so I could just as well declare
>>>>>>> 'print_native_stack()' in vmError.hpp and leave the implementation in
>>>>>>> vmError.cpp. Do you want me to change that?
>>>>>>>
>>>>>>> Thank you and best regards,
>>>>>>> Volker
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 16, 2014 at 8:51 AM, David Holmes
>>>>>>> <david.holmes at oracle.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Volker,
>>>>>>>>
>>>>>>>> On 13/09/2014 5:15 AM, Volker Simonis wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> could you please review and sponsor the following small change
>>>>>>>>> which
>>>>>>>>> should make debugging a little more comfortabel (at least on Linux
>>>>>>>>> for
>>>>>>>>> now):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/8058345/
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058345
>>>>>>>>>
>>>>>>>>> In the hs_err files we have a nice mixed stack trace which contains
>>>>>>>>> both, Java and native frames.
>>>>>>>>> It would be nice if we could make this functionality available from
>>>>>>>>> within gdb during debugging sessions (until now we can only print
>>>>>>>>> the
>>>>>>>>> pure Java stack with the "ps()" helper function from debug.cpp).
>>>>>>>>>
>>>>>>>>> This new feature can be easily achieved by refactoring the
>>>>>>>>> corresponding stack printing code from VMError::report() in
>>>>>>>>> vmError.cpp into its own method in debug.cpp. This change extracts
>>>>>>>>> that code into the new function 'print_native_stack()' in debug.cpp
>>>>>>>>> without changing anything of the functionality.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Why does it need to move to debug.cpp to allow this ?
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>
>>>>>>>>> It also adds some helper functions which make it easy to call the
>>>>>>>>> new
>>>>>>>>> 'print_native_stack()' method from within gdb. There's the new
>>>>>>>>> helper
>>>>>>>>> function 'pns(frame f)'  which takes a frame argument and calls
>>>>>>>>> 'print_native_stack()'. We need the frame argument because gdb
>>>>>>>>> inserts
>>>>>>>>> a dummy frame for every call and we can't easily walk over this
>>>>>>>>> dummy
>>>>>>>>> frame from our stack printing routine.
>>>>>>>>>
>>>>>>>>> To simplify the creation of the frame object, I've added the helper
>>>>>>>>> functions:
>>>>>>>>>
>>>>>>>>> extern "C" frame make_frame(intptr_t* sp, intptr_t* fp, address pc)
>>>>>>>>> {
>>>>>>>>>       return frame(sp, fp, pc);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> for x86 (in frame_x86.cpp) and
>>>>>>>>>
>>>>>>>>> extern "C" frame make_frame(intptr_t* sp, address pc) {
>>>>>>>>>       return frame(sp, pc);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> for ppc64 in frame_ppc.cpp. With these helper functions we can now
>>>>>>>>> easily get a mixed stack trace of a Java thread in gdb (see below).
>>>>>>>>>
>>>>>>>>> All the helper functions are protected by '#ifndef PRODUCT'
>>>>>>>>>
>>>>>>>>> Thank you and best regards,
>>>>>>>>> Volker
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> (gdb) call pns(make_frame($sp, $rbp, $pc))
>>>>>>>>>
>>>>>>>>> "Executing pns"
>>>>>>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
>>>>>>>>> C=native
>>>>>>>>> code)
>>>>>>>>> C  [libpthread.so.0+0xc0fe]  pthread_cond_timedwait+0x13e
>>>>>>>>> V  [libjvm.so+0x96c4c1]  os::sleep(Thread*, long, bool)+0x1a1
>>>>>>>>> V  [libjvm.so+0x75f442]  JVM_Sleep+0x312
>>>>>>>>> j  java.lang.Thread.sleep(J)V+0
>>>>>>>>> j  CrashNative.crashIt(Lsun/misc/Unsafe;I)V+10
>>>>>>>>> j  CrashNative.doIt()V+45
>>>>>>>>> v  ~StubRoutines::call_stub
>>>>>>>>> V  [libjvm.so+0x71599f]  JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>>>> V  [libjvm.so+0x9eab75]  Reflection::invoke(instanceKlassHandle,
>>>>>>>>> methodHandle, Handle, bool, objArrayHandle, BasicType,
>>>>>>>>> objArrayHandle,
>>>>>>>>> bool, Thread*) [clone .constprop.218]+0xa25
>>>>>>>>> V  [libjvm.so+0x9eb838]  Reflection::invoke_method(oopDesc*,
>>>>>>>>> Handle,
>>>>>>>>> objArrayHandle, Thread*)+0x1c8
>>>>>>>>> V  [libjvm.so+0x7637ae]  JVM_InvokeMethod+0xfe
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+0
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+100
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+6
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+56
>>>>>>>>> j  CrashNative.mainJava()V+32
>>>>>>>>> v  ~StubRoutines::call_stub
>>>>>>>>> V  [libjvm.so+0x71599f]  JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>>>> V  [libjvm.so+0x7384f5]  jni_invoke_static(JNIEnv_*, JavaValue*,
>>>>>>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)
>>>>>>>>> [clone .isra.238] [clone .constprop.250]+0x385
>>>>>>>>> V  [libjvm.so+0x73b3d7]  jni_CallStaticVoidMethodV+0xe7
>>>>>>>>> C  [libCrashNative.so+0x9a9]
>>>>>>>>> JNIEnv_::CallStaticVoidMethod(_jclass*,
>>>>>>>>> _jmethodID*, ...)+0xb9
>>>>>>>>> C  [libCrashNative.so+0xa10]  step3(JNIEnv_*, _jobject*)+0x65
>>>>>>>>> C  [libCrashNative.so+0xa69]  step2(JNIEnv_*, _jobject*)+0x57
>>>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C  [libCrashNative.so+0xa37]  step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C  [libCrashNative.so+0xa8e]  step1(JNIEnv_*, _jobject*)+0x23
>>>>>>>>> C  [libCrashNative.so+0x87f]  Java_CrashNative_nativeMethod+0x23
>>>>>>>>> j  CrashNative.nativeMethod()V+0
>>>>>>>>> j  CrashNative.main([Ljava/lang/String;)V+9
>>>>>>>>> v  ~StubRoutines::call_stub
>>>>>>>>> V  [libjvm.so+0x71599f]  JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>>>> V  [libjvm.so+0x7384f5]  jni_invoke_static(JNIEnv_*, JavaValue*,
>>>>>>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)
>>>>>>>>> [clone .isra.238] [clone .constprop.250]+0x385
>>>>>>>>> V  [libjvm.so+0x73b2b0]  jni_CallStaticVoidMethod+0x170
>>>>>>>>> C  [libjli.so+0x742a]  JavaMain+0x65a
>>>>>>>>> C  [libpthread.so.0+0x7e9a]  start_thread+0xda
>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>


More information about the hotspot-dev mailing list