RFR(S): 8058345: Refactor native stack printing from vmError.cpp to debug.cpp to make it available in gdb as well
Volker Simonis
volker.simonis at gmail.com
Fri Sep 19 18:55:50 UTC 2014
Hi,
so here's my new version:
- documented the "pns" command with examples
- removed the clumsy "make_frame" generators and introduced a genreic
frame constructor on all platforms which can now be called from pns()
- pns() must now be called with three arguments (usually registers
like pns($sp, $fp, $pc) but some arguments may be '0' on some
platforms (see the examples in the documentation of pns())
- tested on Linux (x86, x64, ppc64) and Solaris (SPARC, x64)
- added additional "Summary" section to the change which mentions
that the change also fixes stack traces on x86 to enable walking of
runtime stubs and native wrappers.
http://cr.openjdk.java.net/~simonis/webrevs/8058345.v2/
Notice that the current version requires trivial changes in your
closed ports (i.e. adding the generic frame constructor) but I'd need
a sponsor anyway:)
Regards,
Volker
On Wed, Sep 17, 2014 at 9:49 PM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:
> On 9/17/14 11:29 AM, Volker Simonis wrote:
>>
>> On Wed, Sep 17, 2014 at 12:10 AM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com> wrote:
>>>
>>> On 9/16/14 12:21 PM, Volker Simonis wrote:
>>>>
>>>>
>>>> Hi Vladimir,
>>>>
>>>> thanks for looking at the change.
>>>>
>>>> 'make_frame' is only intended to be used from within the debugger to
>>>> simplify the usage of the new 'pns()' (i.e. "print native stack")
>>>> helper. It can be used as follows:
>>>>
>>>> (gdb) call pns(make_frame($sp, $rbp, $pc))
>>>
>>>
>>>
>>> It is strange way to use pns(). Why not pass (sp, fp, pc) to pns() and
>>> let
>>> it call make_frame()? To have make_frame() only on ppc and x86 will not
>>> allow to use pns() on other platforms.
>>>
>>> Would be nice to have pns() version (names different) without input
>>> parameters. Can we use os::current_frame() inside for that?
>>>
>>> Add pns() description to help() output.
>>>
>>>>
>>>> "Executing pns"
>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
>>>> C=native
>>>> code)
>>>> C [libpthread.so.0+0xc0fe] pthread_cond_timedwait+0x13e
>>>> V [libjvm.so+0x96c4c1] os::sleep(Thread*, long, bool)+0x1a1
>>>> V [libjvm.so+0x75f442] JVM_Sleep+0x312
>>>> j java.lang.Thread.sleep(J)V+0
>>>> j CrashNative.crashIt(Lsun/misc/Unsafe;I)V+10
>>>> j CrashNative.doIt()V+45
>>>> v ~StubRoutines::call_stub
>>>> V [libjvm.so+0x71599f]
>>>> JavaCalls::call_helper(JavaValue*,methodHandle*, JavaCallArguments*,
>>>> Thread*)+0xf8f
>>>>
>>>> What about the two fixesin in 'print_native_stack()' - do you think they
>>>> are OK?
>>>
>>>
>>>
>>> What about is_runtime_frame()? It is wrapper for runtime calls from
>>> compiled
>>> code.
>>>
>>
>> Yes, but I don't see how this could help here, because the native
>> wrapper which makes problems here is a nmethod and not a runtime stub.
>>
>> Maybe you mean to additionally add is_runtime_frame() to the check?
>
>
> Yes, that is what I meant.
>
> Thanks,
> Vladimir
>
>
>> Yes, I've just realized that that's indeed needed on amd64 to walk
>> runtime stubs. SPARC is more graceful and works without these changes,
>> but on amd64 we need them (on both Solaris and Linux) and on Sparc
>> they don't hurt.
>>
>> I've written a small test program which should be similar to the one
>> you used for 8035983:
>>
>> import java.util.Hashtable;
>>
>> public class StackTraceTest {
>> static Hashtable ht;
>> static {
>> ht = new Hashtable();
>> ht.put("one", "one");
>> }
>>
>> public static void foo() {
>> bar();
>> }
>>
>> public static void bar() {
>> ht.get("one");
>> }
>>
>> public static void main(String args[]) {
>> for (int i = 0; i < 5; i++) {
>> new Thread() {
>> public void run() {
>> while(true) {
>> foo();
>> }
>> }
>> }.start();
>> }
>> }
>> }
>>
>> If I run it with "-XX:-Inline -XX:+PrintCompilation
>> -XX:-TieredCompilation StackTraceTest" inside the debugger and crash
>> one of the Java threads in native code, I get the correct stack traces
>> on SPARC. But on amd64, I only get the following without my changes:
>>
>> Stack: [0xfffffd7da16f9000,0xfffffd7da17f9000],
>> sp=0xfffffd7da17f7c60, free space=1019k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
>> code)
>> C [libc.so.1+0xc207f] _lwp_cond_wait+0x1f
>> V [libjvm.so+0x171443b] int
>> os::Solaris::cond_wait(_lwp_cond*,_lwp_mutex*)+0x2b
>> V [libjvm.so+0x171181a] void os::PlatformEvent::park()+0x1fa
>> V [libjvm.so+0x16e09c1] void ObjectMonitor::EnterI(Thread*)+0x6f1
>> V [libjvm.so+0x16dfc8f] void ObjectMonitor::enter(Thread*)+0x7cf
>> V [libjvm.so+0x18cdd00] void
>> ObjectSynchronizer::slow_enter(Handle,BasicLock*,Thread*)+0x2a0
>> V [libjvm.so+0x18cd6a7] void
>> ObjectSynchronizer::fast_enter(Handle,BasicLock*,bool,Thread*)+0x157
>> V [libjvm.so+0x182f39e] void
>>
>> SharedRuntime::complete_monitor_locking_C(oopDesc*,BasicLock*,JavaThread*)+0x23e
>> v ~RuntimeStub::_complete_monitor_locking_Java
>> C 0x2aad1dd1000016d8
>>
>> With the changes (and the additional check for is_runtime_frame()) I
>> get full stack traces on amd64 as well. So I think the changes should
>> be at least an improvement:)
>
>
> Good!
>
>
>>
>>> You need to check what fr.real_fp() returns on all platforms for the very
>>> first frame (_lwp_start). That is what this check about - stop walking
>>> when
>>> it reaches the first frame. fr.sender_sp() returns bogus value which is
>>> not
>>> stack pointer for the first frame. From 8035983 review:
>>>
>>> "It seems using fr.sender_sp() in the check work on x86 and sparc.
>>> On x86 it return stack_base value on sparc it returns STACK_BIAS."
>>>
>>> Also on other our platforms it could return 0 or small integer value.
>>>
>>> If you can suggest an other way to determine the first frame, please,
>>> tell.
>>>
>>
>> So the initial problem in 8035983 was that we used
>> os::is_first_C_frame(&fr) for native frames where the sender was a
>> compiled frame. That didn't work reliably because,
>> os::is_first_C_frame(&fr) uses fr->link() to get the frame pointer of
>> the sender and that doesn't work for compiled senders.
>>
>> So you replaced os::is_first_C_frame(&fr) by
>> !on_local_stack((address)(fr.sender_sp() + 1)) but that uses addr_at()
>> internally which in turn uses fp() so it won't work for frames which
>> have a bogus frame pointer like native wrappers.
>>
>> I think using fr.real_fp() should be safe because as far as I can see
>> it is always fr.sender_sp() - 2 on amd64 and equal to fr.sender_sp()
>> on SPARC. On Linux/amd64 both, the sp and fp of the first frame will
>> be 0 (still have to check on SPARC). But the example above works fine
>> with my changes on both, Linux/amd64 and Solaris/SPARC and
>> Solaris/amd64.
>>
>> I'll prepare a new webrev tomorrow which will have the documentation
>> for "pns" and a version of make_frame() for SPARC.
>>
>> Regards,
>> Volker
>>
>>>> Should I move 'print_native_stack()' to vmError.cpp as suggested by
>>>> David?
>>>
>>>
>>>
>>> I am fine with both places.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>>>
>>>> Thank you and best regards,
>>>> Volker
>>>>
>>>> On Tue, Sep 16, 2014 at 8:11 PM, Vladimir Kozlov
>>>> <vladimir.kozlov at oracle.com> wrote:
>>>>>
>>>>>
>>>>> Thank you for fixing frame walk.
>>>>> I don't see where make_frame() is used.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>
>>>>> On 9/16/14 9:35 AM, Volker Simonis wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> while testing my change, I found two other small problems with native
>>>>>> stack traces:
>>>>>>
>>>>>> 1. we can not walk native wrappers on (at least not on Linux/amd64)
>>>>>> because they are treated as native "C" frames. However, if the native
>>>>>> wrapper was called from a compiled frame which had no valid frame
>>>>>> pointer (i.e. %rbp) os::get_sender_for_C_frame(&fr) will produce a bad
>>>>>> frame. This can be easily fixed by treating native wrappers like java
>>>>>> frames.
>>>>>>
>>>>>> 2. the fix for "8035983: Fix "Native frames:" in crash report (hs_err
>>>>>> file)" introduced a similar problem. If we walk tha stack from a
>>>>>> native wrapper down to a compiled frame, we will have a frame with an
>>>>>> invalid frame pointer. In that case, the newly introduced check from
>>>>>> change 8035983 will fail, because fr.sender_sp() depends on a valid
>>>>>> fp. I'll propose to replace fr.sender_sp() by fr.real_fp() which
>>>>>> should do the same but also works for compiled frames with invalid fp.
>>>>>>
>>>>>> Here's the new webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/8058345.v1/
>>>>>>
>>>>>> What dou you think?
>>>>>>
>>>>>> Thank you and best regards,
>>>>>> Volker
>>>>>>
>>>>>>
>>>>>> On Tue, Sep 16, 2014 at 3:48 PM, Volker Simonis
>>>>>> <volker.simonis at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 'print_native_stack()' must be visible in both vmError.cpp and
>>>>>>> debug.cpp. Initially I saw that vmError.cpp already included
>>>>>>> debug.hpp
>>>>>>> so I decided to declare it in debug.hpp. But now I realized that also
>>>>>>> debug.cpp includes vmError.hpp so I could just as well declare
>>>>>>> 'print_native_stack()' in vmError.hpp and leave the implementation in
>>>>>>> vmError.cpp. Do you want me to change that?
>>>>>>>
>>>>>>> Thank you and best regards,
>>>>>>> Volker
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 16, 2014 at 8:51 AM, David Holmes
>>>>>>> <david.holmes at oracle.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Volker,
>>>>>>>>
>>>>>>>> On 13/09/2014 5:15 AM, Volker Simonis wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> could you please review and sponsor the following small change
>>>>>>>>> which
>>>>>>>>> should make debugging a little more comfortabel (at least on Linux
>>>>>>>>> for
>>>>>>>>> now):
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/8058345/
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8058345
>>>>>>>>>
>>>>>>>>> In the hs_err files we have a nice mixed stack trace which contains
>>>>>>>>> both, Java and native frames.
>>>>>>>>> It would be nice if we could make this functionality available from
>>>>>>>>> within gdb during debugging sessions (until now we can only print
>>>>>>>>> the
>>>>>>>>> pure Java stack with the "ps()" helper function from debug.cpp).
>>>>>>>>>
>>>>>>>>> This new feature can be easily achieved by refactoring the
>>>>>>>>> corresponding stack printing code from VMError::report() in
>>>>>>>>> vmError.cpp into its own method in debug.cpp. This change extracts
>>>>>>>>> that code into the new function 'print_native_stack()' in debug.cpp
>>>>>>>>> without changing anything of the functionality.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Why does it need to move to debug.cpp to allow this ?
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>
>>>>>>>>> It also adds some helper functions which make it easy to call the
>>>>>>>>> new
>>>>>>>>> 'print_native_stack()' method from within gdb. There's the new
>>>>>>>>> helper
>>>>>>>>> function 'pns(frame f)' which takes a frame argument and calls
>>>>>>>>> 'print_native_stack()'. We need the frame argument because gdb
>>>>>>>>> inserts
>>>>>>>>> a dummy frame for every call and we can't easily walk over this
>>>>>>>>> dummy
>>>>>>>>> frame from our stack printing routine.
>>>>>>>>>
>>>>>>>>> To simplify the creation of the frame object, I've added the helper
>>>>>>>>> functions:
>>>>>>>>>
>>>>>>>>> extern "C" frame make_frame(intptr_t* sp, intptr_t* fp, address pc)
>>>>>>>>> {
>>>>>>>>> return frame(sp, fp, pc);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> for x86 (in frame_x86.cpp) and
>>>>>>>>>
>>>>>>>>> extern "C" frame make_frame(intptr_t* sp, address pc) {
>>>>>>>>> return frame(sp, pc);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> for ppc64 in frame_ppc.cpp. With these helper functions we can now
>>>>>>>>> easily get a mixed stack trace of a Java thread in gdb (see below).
>>>>>>>>>
>>>>>>>>> All the helper functions are protected by '#ifndef PRODUCT'
>>>>>>>>>
>>>>>>>>> Thank you and best regards,
>>>>>>>>> Volker
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> (gdb) call pns(make_frame($sp, $rbp, $pc))
>>>>>>>>>
>>>>>>>>> "Executing pns"
>>>>>>>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
>>>>>>>>> C=native
>>>>>>>>> code)
>>>>>>>>> C [libpthread.so.0+0xc0fe] pthread_cond_timedwait+0x13e
>>>>>>>>> V [libjvm.so+0x96c4c1] os::sleep(Thread*, long, bool)+0x1a1
>>>>>>>>> V [libjvm.so+0x75f442] JVM_Sleep+0x312
>>>>>>>>> j java.lang.Thread.sleep(J)V+0
>>>>>>>>> j CrashNative.crashIt(Lsun/misc/Unsafe;I)V+10
>>>>>>>>> j CrashNative.doIt()V+45
>>>>>>>>> v ~StubRoutines::call_stub
>>>>>>>>> V [libjvm.so+0x71599f] JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>>>> V [libjvm.so+0x9eab75] Reflection::invoke(instanceKlassHandle,
>>>>>>>>> methodHandle, Handle, bool, objArrayHandle, BasicType,
>>>>>>>>> objArrayHandle,
>>>>>>>>> bool, Thread*) [clone .constprop.218]+0xa25
>>>>>>>>> V [libjvm.so+0x9eb838] Reflection::invoke_method(oopDesc*,
>>>>>>>>> Handle,
>>>>>>>>> objArrayHandle, Thread*)+0x1c8
>>>>>>>>> V [libjvm.so+0x7637ae] JVM_InvokeMethod+0xfe
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+0
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+100
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+6
>>>>>>>>> j
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+56
>>>>>>>>> j CrashNative.mainJava()V+32
>>>>>>>>> v ~StubRoutines::call_stub
>>>>>>>>> V [libjvm.so+0x71599f] JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>>>> V [libjvm.so+0x7384f5] jni_invoke_static(JNIEnv_*, JavaValue*,
>>>>>>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)
>>>>>>>>> [clone .isra.238] [clone .constprop.250]+0x385
>>>>>>>>> V [libjvm.so+0x73b3d7] jni_CallStaticVoidMethodV+0xe7
>>>>>>>>> C [libCrashNative.so+0x9a9]
>>>>>>>>> JNIEnv_::CallStaticVoidMethod(_jclass*,
>>>>>>>>> _jmethodID*, ...)+0xb9
>>>>>>>>> C [libCrashNative.so+0xa10] step3(JNIEnv_*, _jobject*)+0x65
>>>>>>>>> C [libCrashNative.so+0xa69] step2(JNIEnv_*, _jobject*)+0x57
>>>>>>>>> C [libCrashNative.so+0xa37] step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C [libCrashNative.so+0xa37] step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C [libCrashNative.so+0xa37] step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C [libCrashNative.so+0xa37] step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C [libCrashNative.so+0xa37] step2(JNIEnv_*, _jobject*)+0x25
>>>>>>>>> C [libCrashNative.so+0xa8e] step1(JNIEnv_*, _jobject*)+0x23
>>>>>>>>> C [libCrashNative.so+0x87f] Java_CrashNative_nativeMethod+0x23
>>>>>>>>> j CrashNative.nativeMethod()V+0
>>>>>>>>> j CrashNative.main([Ljava/lang/String;)V+9
>>>>>>>>> v ~StubRoutines::call_stub
>>>>>>>>> V [libjvm.so+0x71599f] JavaCalls::call_helper(JavaValue*,
>>>>>>>>> methodHandle*, JavaCallArguments*, Thread*)+0xf8f
>>>>>>>>> V [libjvm.so+0x7384f5] jni_invoke_static(JNIEnv_*, JavaValue*,
>>>>>>>>> _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)
>>>>>>>>> [clone .isra.238] [clone .constprop.250]+0x385
>>>>>>>>> V [libjvm.so+0x73b2b0] jni_CallStaticVoidMethod+0x170
>>>>>>>>> C [libjli.so+0x742a] JavaMain+0x65a
>>>>>>>>> C [libpthread.so.0+0x7e9a] start_thread+0xda
>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>
More information about the hotspot-dev
mailing list