[PATCH RFC 0/2] Add linux/ppc64 support for Hotspot serviceability agent to read core files

Volker Simonis volker.simonis at gmail.com
Wed Nov 19 18:33:15 UTC 2014


Hi Maynard,

I just wanted to let you know that I'm still working on fixing the
bogus entries in the stack trace. I'm pretty sure they are related to
inlining. If you run your test program with "-XX:+PrintCompilation
-XX:+PrintInlining -XX:CICompilerCount=1" you'll get the following
output:

..
  10954    5 %           test::run_test @ 59 (99 bytes)
                            @ 75   java.lang.String::getChars (62
bytes)   inline (hot)
                              @ 58   java.lang.System::arraycopy (0
bytes)   (intrinsic)
                            @ 87   test::get_my_chars (86 bytes)   inline (hot)
                              @ 6   java.lang.String::getChars (62
bytes)   inline (hot)
                                @ 58   java.lang.System::arraycopy (0
bytes)   (intrinsic)
                              @ 38   java.lang.String::<init> (62
bytes)   inline (hot)
                                @ 1   java.lang.Object::<init> (1
bytes)   inline (hot)
                                @ 55   java.util.Arrays::copyOfRange
(63 bytes)   too big
                              @ 47   java.lang.StringBuilder::<init>
(7 bytes)   inline (hot)
                              @ 52   java.lang.StringBuilder::append
(8 bytes)   executed < MinInliningThreshold times
                              @ 57   java.lang.StringBuilder::append
(8 bytes)   executed < MinInliningThreshold times
                              @ 62   java.lang.StringBuilder::append
(8 bytes)   executed < MinInliningThreshold times
                              @ 65   java.lang.StringBuilder::toString
(17 bytes)   executed < MinInliningThreshold times
                              @ 79   java.lang.String::length (6
bytes)   inline (hot)
                              @ 82   java.io.OutputStreamWriter::write
(11 bytes)   executed < MinInliningThreshold times
..

The stack trace I get from jstack looks as follows:

Thread 4448: (state = IN_JAVA)
 - test.get_my_chars(int, int, char[], java.lang.String, long)
@bci=43, line=15 (Compiled frame; information may be imprecise)
 - test.run_test() @bci=87, line=35 (Compiled frame)
 - java.lang.String.getChars(int, int, char[], int) @bci=58, line=814
(Compiled frame)
 - test.run_test() @bci=75, line=34 (Compiled frame)

>From a system perspective 'test::run_test' is one native frame,
because 'test::run_test' inlines all the other functions reported
above. HotSpot has special functionality to detect and walk these
inlined methods (so called "virtual frames" or "vframe"s). For some
reason this vframe walking doesn't seem to work in the agent. In gdb,
when calling "ps()" at the same point where I created the above core
file I'll get the following stack trace:

(gdb) call ps()

"Executing ps"
 for thread: "main" #1 prio=5 os_prio=0 tid=0x00003fffb0010800
nid=0x1160 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   JavaThread state: _thread_in_Java
Thread: 0x00003fffb0010800  [0x1160] State: _running _has_called_back
0 _at_poll_safepoint 0
   JavaThread state: _thread_in_Java

(guessing starting frame id=0x3fffb66ddc60 based on current fp)
C frame (sp=0x00003fffb66ddad0 unextended sp=0x00003fffb66ddad0,
fp=0x00003fffb66ddc60, real_fp=0x00003fffb66ddc60,
pc=0x000000001000067c)
 1 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60,
fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
test.get_my_chars(test.java:16)
 2 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60,
fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
test.run_test(test.java:35)
 3 - frame( sp=0x00003fffb66ddd60, unextended_sp=0x00003fffb66ddd60,
fp=0x00003fffb66dde60, pc=0x00003fffa000f518)
test.main(test.java:56)

I'll keep you informed once I fixed the problem (I'll also look into
the .opd issue afterwards).

Regards,
Volker


On Tue, Nov 18, 2014 at 12:20 AM, Maynard Johnson <maynardj at us.ibm.com> wrote:
> On 11/17/2014 01:21 PM, Volker Simonis wrote:
>> On Mon, Nov 17, 2014 at 6:59 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>> On 11/17/2014 10:20 AM, Volker Simonis wrote:
>>>> Hi Maynard,
>>>>
>>>> I'm currently looking at your changes. At first glance they look good.
>>>>
>>>> I could open a simple core file which contained both, interpreted and
>>>> compiled frames:
>>>>
>>>> $ jstack ./images/j2sdk-image/bin/java core.7034
>>>> ...
>>>> Thread 7035: (state = IN_VM)
>>>>  - sun.misc.Unsafe.putAddress(long, long) @bci=0 (Interpreted frame)
>>>>  - Crash.crashIt(sun.misc.Unsafe, int) @bci=10, line=8 (Interpreted frame)
>>>>  - Crash.doIt() @bci=45, line=23 (Compiled frame)
>>>>  - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method,
>>>> java.lang.Object, java.lang.Object[]) @bci=0 (Interpreted frame)
>>>>  - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object,
>>>> java.lang.Object[]) @bci=100, line=62 (Interpreted frame)
>>>>  - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object,
>>>> java.lang.Object[]) @bci=6, line=43 (Interpreted frame)
>>>>  - java.lang.reflect.Method.invoke(java.lang.Object,
>>>> java.lang.Object[]) @bci=56, line=498 (Interpreted frame)
>>>>  - Crash.main(java.lang.String[]) @bci=32, line=31 (Interpreted frame)
>>>>
>>>> The one thing that doesn't currently work is "jstack -m" (i.e. "mixed
>>>> mode" for java and native frames). Are you aware of this?
>>> Hi, Volker,
>>> Yeah, I knew about this problem and forgot to mention it in my patch posting. I started
>>> looking at it this morning, and so far, I have at least fixed the UnmappedAddressException.
>>> But now I'm getting different results on little endian vs big endian ppc64 systems.
>>> On BE, I either get no symbol names (i.e., "?????") or wrong symbol names.  On LE,
>>> I seem to get correct symbol names for the first symbol (either __pthread_cond_wait
>>> or __pthread_cond_timedwait) and the last symbol (start_thread) of each stack, but
>>> everything in between is "?????".
>>>
>>
>> Maybe this is related to the fact that we have function descriptors on
>> BE and simple function pointers on LE. You may have a look at the
>> elf-decoder for ppc64 to find some more information.
>
> Yes, indeed.  With the following patch, the mixed mode option works fine on ppc64 little endian,
> but not on big endian:
>
> Index: jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
> ===================================================================
> --- jdk9-dev.orig/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
> +++ jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
> @@ -60,14 +60,15 @@ final public class LinuxPPC64CFrame exte
>        return null;
>      }
>
> -    Address nextSP = sp.getAddressAt( PPC64ThreadContext.SP * address_size + PPC64_STACK_BIAS);
> +    Address nextSP = sp.getAddressAt(0);
>      if (nextSP == null) {
>        return null;
>      }
> -    Address nextPC  = sp.getAddressAt(PPC64ThreadContext.PC * address_size + PPC64_STACK_BIAS);
> +    Address nextPC  = sp.getAddressAt(2 * address_size);
>      if (nextPC == null) {
>        return null;
>      }
> +
>      return new LinuxPPC64CFrame(dbg, nextSP, nextPC,address_size);
>    }
>
> -------------------------------------------------
>
> I see that ppc64 fixups were made in the hotspot utilities (by you) about a year ago
> (http://cr.openjdk.java.net/~simonis/webrevs/8019929.v3/).  We obviously need something
> similar in the hotspot agent native code that implements the JNI call 'lookupByAddress0'.
> I hacked the build_symtab_internal() function in hotspot/agent/src/os/linux/symtab.c and
> see that the symbol "offset" we're getting is really the address of the symbol's opd.
> I'm not sure where to start to fix this, so if you have any suggestions, I'm all ears.  :-)
>
> Thanks.
> -Maynard
>>
>>>>
>>>> Regarding your "test.java" example - how do you use it?
>>>>
>>>> If I just attach with jstack to the Java process which runs
>>>> "test.java" I get the correct stack trace of all threads. But I think
>>>> that's actual no SA-functionality but a VM-feature (the same that can
>>>> be triggered by sending kill -SIGQUIT to  java process).
>>>>
>>>> If I attach with "jstack -F" I see the problems you mentioned. First I
>>>> didn't saw any frame at all which confused me but then I also saw the
>>>> two cases mentioned by you. I'll need to have a closer look what
>>>> happens.
>>>
>>> I was just running the 'test' java app and, in another session, killing it with SIGSEGV.
>>> To be honest, I wasn't aware of the 'jstack -F' option.
>>>
>>
>> Another possibility I've just found out is to create a core from gdb
>> with the 'generate-core-file' command. You can than still inspect the
>> original program in gdb while debugging how jstack is working on the
>> core file.
>>
>>> -Maynard
>>>
>>>>
>>>> Regards,
>>>> Volker
>>>>
>>>>
>>>>
>>>> On Fri, Nov 14, 2014 at 7:09 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>>>> When Hotspot SA tools jmap, jstack, and jsadebugd are run against a core file, they fail with the following runtime exception:
>>>>>
>>>>>      OS/CPU combination linux/ppc64 not yet supported
>>>>>
>>>>> I will post a patch set that adds this support.  The patch set consists of the following patches:
>>>>>
>>>>> PATCH 1/2: Updates to non-Java files to support linux/ppc64 Hotspot SA with core files
>>>>>
>>>>> PATCH 2/2: New PPC64 class files (and updates to generic files) to support linux/ppc64 Hotspot SA with core files
>>>>>
>>>>> These two patches apply cleanly to a November 13 pull of the jdk9-dev upstream sources.
>>>>>
>>>>> ------------
>>>>> Open issues:
>>>>> ------------
>>>>>   1) The jstack tool does not print a stack entry for the 'main()' method of the Java
>>>>>      workload (attached) under test.  For example:
>>>>>
>>>>>      (Note:  Addresses and method signatures elided for brevity.)
>>>>>
>>>>>        Thread 24358: (state = IN_JAVA, current Java SP = null
>>>>>        )
>>>>>         - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>         - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>      ==> (Expect an entry for test.main() here)
>>>>>
>>>>>   2) The jstack tool sometimes prints what appears to be two complete stacks for the Java workload. For example:
>>>>>
>>>>>        Thread 24779: (state = IN_JAVA, current Java SP = null
>>>>>        )
>>>>>         - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>         - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>         - test.get_my_chars(...) @bci=39, line=15, pc=..., Method*=... (Compiled frame)
>>>>>         - test.run_test() @bci=92, line=34, pc=..., Method*=... (Compiled frame)
>>>>>
>>>>>        Again, the 'test.main' method is missing, but there's also the anomaly of the
>>>>>        test.run_test' method showing up twice in the stack, implying that it is called
>>>>>        by 'test.get_my_chars' at line 15.  But that that is not accurate. In fact, run_test
>>>>>        does call String.getChars at line 33 *and* it calls test.get_my_chars at line 34 --
>>>>>        but these are totally distinct call graphs.  Somehow, we are seeing these two distinct
>>>>>        stacks in the core file, which seems impossible.
>>>>>
>>>>> ---------
>>>>>
>>>>> Any help offered on these two open issues would be greatly appreciated.
>>>>>
>>>>> -Maynard
>>>>
>>>
>>
>


More information about the serviceability-dev mailing list