[PATCH RFC 0/2] Add linux/ppc64 support for Hotspot serviceability agent to read core files
Volker Simonis
volker.simonis at gmail.com
Thu Nov 20 11:03:14 UTC 2014
Hy Maynard,
I've just realized that in your patch the two directory patterns
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_ppc64/*.java and
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/ppc64/*.java are absent from
"make/sa.files". This of course breaks incremental builds.
Regards,
Volker
diff -r 6f35dca1949c make/sa.files
--- a/make/sa.files Mon Nov 17 14:47:41 2014 +0100
+++ b/make/sa.files Thu Nov 20 11:58:52 2014 +0100
@@ -94,12 +94,14 @@
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_amd64/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_x86/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_sparc/*.java \
+$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_ppc64/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/posix/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_amd64/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_sparc/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_x86/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/sparc/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/x86/*.java \
+$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/ppc64/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/jcore/*.java \
$(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/soql/*.java \
On Wed, Nov 19, 2014 at 7:33 PM, Volker Simonis
<volker.simonis at gmail.com> wrote:
> Hi Maynard,
>
> I just wanted to let you know that I'm still working on fixing the
> bogus entries in the stack trace. I'm pretty sure they are related to
> inlining. If you run your test program with "-XX:+PrintCompilation
> -XX:+PrintInlining -XX:CICompilerCount=1" you'll get the following
> output:
>
> ..
> 10954 5 % test::run_test @ 59 (99 bytes)
> @ 75 java.lang.String::getChars (62
> bytes) inline (hot)
> @ 58 java.lang.System::arraycopy (0
> bytes) (intrinsic)
> @ 87 test::get_my_chars (86 bytes) inline (hot)
> @ 6 java.lang.String::getChars (62
> bytes) inline (hot)
> @ 58 java.lang.System::arraycopy (0
> bytes) (intrinsic)
> @ 38 java.lang.String::<init> (62
> bytes) inline (hot)
> @ 1 java.lang.Object::<init> (1
> bytes) inline (hot)
> @ 55 java.util.Arrays::copyOfRange
> (63 bytes) too big
> @ 47 java.lang.StringBuilder::<init>
> (7 bytes) inline (hot)
> @ 52 java.lang.StringBuilder::append
> (8 bytes) executed < MinInliningThreshold times
> @ 57 java.lang.StringBuilder::append
> (8 bytes) executed < MinInliningThreshold times
> @ 62 java.lang.StringBuilder::append
> (8 bytes) executed < MinInliningThreshold times
> @ 65 java.lang.StringBuilder::toString
> (17 bytes) executed < MinInliningThreshold times
> @ 79 java.lang.String::length (6
> bytes) inline (hot)
> @ 82 java.io.OutputStreamWriter::write
> (11 bytes) executed < MinInliningThreshold times
> ..
>
> The stack trace I get from jstack looks as follows:
>
> Thread 4448: (state = IN_JAVA)
> - test.get_my_chars(int, int, char[], java.lang.String, long)
> @bci=43, line=15 (Compiled frame; information may be imprecise)
> - test.run_test() @bci=87, line=35 (Compiled frame)
> - java.lang.String.getChars(int, int, char[], int) @bci=58, line=814
> (Compiled frame)
> - test.run_test() @bci=75, line=34 (Compiled frame)
>
> From a system perspective 'test::run_test' is one native frame,
> because 'test::run_test' inlines all the other functions reported
> above. HotSpot has special functionality to detect and walk these
> inlined methods (so called "virtual frames" or "vframe"s). For some
> reason this vframe walking doesn't seem to work in the agent. In gdb,
> when calling "ps()" at the same point where I created the above core
> file I'll get the following stack trace:
>
> (gdb) call ps()
>
> "Executing ps"
> for thread: "main" #1 prio=5 os_prio=0 tid=0x00003fffb0010800
> nid=0x1160 runnable [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> JavaThread state: _thread_in_Java
> Thread: 0x00003fffb0010800 [0x1160] State: _running _has_called_back
> 0 _at_poll_safepoint 0
> JavaThread state: _thread_in_Java
>
> (guessing starting frame id=0x3fffb66ddc60 based on current fp)
> C frame (sp=0x00003fffb66ddad0 unextended sp=0x00003fffb66ddad0,
> fp=0x00003fffb66ddc60, real_fp=0x00003fffb66ddc60,
> pc=0x000000001000067c)
> 1 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60,
> fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
> test.get_my_chars(test.java:16)
> 2 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60,
> fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
> test.run_test(test.java:35)
> 3 - frame( sp=0x00003fffb66ddd60, unextended_sp=0x00003fffb66ddd60,
> fp=0x00003fffb66dde60, pc=0x00003fffa000f518)
> test.main(test.java:56)
>
> I'll keep you informed once I fixed the problem (I'll also look into
> the .opd issue afterwards).
>
> Regards,
> Volker
>
>
> On Tue, Nov 18, 2014 at 12:20 AM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>> On 11/17/2014 01:21 PM, Volker Simonis wrote:
>>> On Mon, Nov 17, 2014 at 6:59 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>>> On 11/17/2014 10:20 AM, Volker Simonis wrote:
>>>>> Hi Maynard,
>>>>>
>>>>> I'm currently looking at your changes. At first glance they look good.
>>>>>
>>>>> I could open a simple core file which contained both, interpreted and
>>>>> compiled frames:
>>>>>
>>>>> $ jstack ./images/j2sdk-image/bin/java core.7034
>>>>> ...
>>>>> Thread 7035: (state = IN_VM)
>>>>> - sun.misc.Unsafe.putAddress(long, long) @bci=0 (Interpreted frame)
>>>>> - Crash.crashIt(sun.misc.Unsafe, int) @bci=10, line=8 (Interpreted frame)
>>>>> - Crash.doIt() @bci=45, line=23 (Compiled frame)
>>>>> - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method,
>>>>> java.lang.Object, java.lang.Object[]) @bci=0 (Interpreted frame)
>>>>> - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object,
>>>>> java.lang.Object[]) @bci=100, line=62 (Interpreted frame)
>>>>> - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object,
>>>>> java.lang.Object[]) @bci=6, line=43 (Interpreted frame)
>>>>> - java.lang.reflect.Method.invoke(java.lang.Object,
>>>>> java.lang.Object[]) @bci=56, line=498 (Interpreted frame)
>>>>> - Crash.main(java.lang.String[]) @bci=32, line=31 (Interpreted frame)
>>>>>
>>>>> The one thing that doesn't currently work is "jstack -m" (i.e. "mixed
>>>>> mode" for java and native frames). Are you aware of this?
>>>> Hi, Volker,
>>>> Yeah, I knew about this problem and forgot to mention it in my patch posting. I started
>>>> looking at it this morning, and so far, I have at least fixed the UnmappedAddressException.
>>>> But now I'm getting different results on little endian vs big endian ppc64 systems.
>>>> On BE, I either get no symbol names (i.e., "?????") or wrong symbol names. On LE,
>>>> I seem to get correct symbol names for the first symbol (either __pthread_cond_wait
>>>> or __pthread_cond_timedwait) and the last symbol (start_thread) of each stack, but
>>>> everything in between is "?????".
>>>>
>>>
>>> Maybe this is related to the fact that we have function descriptors on
>>> BE and simple function pointers on LE. You may have a look at the
>>> elf-decoder for ppc64 to find some more information.
>>
>> Yes, indeed. With the following patch, the mixed mode option works fine on ppc64 little endian,
>> but not on big endian:
>>
>> Index: jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>> ===================================================================
>> --- jdk9-dev.orig/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>> +++ jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>> @@ -60,14 +60,15 @@ final public class LinuxPPC64CFrame exte
>> return null;
>> }
>>
>> - Address nextSP = sp.getAddressAt( PPC64ThreadContext.SP * address_size + PPC64_STACK_BIAS);
>> + Address nextSP = sp.getAddressAt(0);
>> if (nextSP == null) {
>> return null;
>> }
>> - Address nextPC = sp.getAddressAt(PPC64ThreadContext.PC * address_size + PPC64_STACK_BIAS);
>> + Address nextPC = sp.getAddressAt(2 * address_size);
>> if (nextPC == null) {
>> return null;
>> }
>> +
>> return new LinuxPPC64CFrame(dbg, nextSP, nextPC,address_size);
>> }
>>
>> -------------------------------------------------
>>
>> I see that ppc64 fixups were made in the hotspot utilities (by you) about a year ago
>> (http://cr.openjdk.java.net/~simonis/webrevs/8019929.v3/). We obviously need something
>> similar in the hotspot agent native code that implements the JNI call 'lookupByAddress0'.
>> I hacked the build_symtab_internal() function in hotspot/agent/src/os/linux/symtab.c and
>> see that the symbol "offset" we're getting is really the address of the symbol's opd.
>> I'm not sure where to start to fix this, so if you have any suggestions, I'm all ears. :-)
>>
>> Thanks.
>> -Maynard
>>>
>>>>>
>>>>> Regarding your "test.java" example - how do you use it?
>>>>>
>>>>> If I just attach with jstack to the Java process which runs
>>>>> "test.java" I get the correct stack trace of all threads. But I think
>>>>> that's actual no SA-functionality but a VM-feature (the same that can
>>>>> be triggered by sending kill -SIGQUIT to java process).
>>>>>
>>>>> If I attach with "jstack -F" I see the problems you mentioned. First I
>>>>> didn't saw any frame at all which confused me but then I also saw the
>>>>> two cases mentioned by you. I'll need to have a closer look what
>>>>> happens.
>>>>
>>>> I was just running the 'test' java app and, in another session, killing it with SIGSEGV.
>>>> To be honest, I wasn't aware of the 'jstack -F' option.
>>>>
>>>
>>> Another possibility I've just found out is to create a core from gdb
>>> with the 'generate-core-file' command. You can than still inspect the
>>> original program in gdb while debugging how jstack is working on the
>>> core file.
>>>
>>>> -Maynard
>>>>
>>>>>
>>>>> Regards,
>>>>> Volker
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Nov 14, 2014 at 7:09 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>>>>> When Hotspot SA tools jmap, jstack, and jsadebugd are run against a core file, they fail with the following runtime exception:
>>>>>>
>>>>>> OS/CPU combination linux/ppc64 not yet supported
>>>>>>
>>>>>> I will post a patch set that adds this support. The patch set consists of the following patches:
>>>>>>
>>>>>> PATCH 1/2: Updates to non-Java files to support linux/ppc64 Hotspot SA with core files
>>>>>>
>>>>>> PATCH 2/2: New PPC64 class files (and updates to generic files) to support linux/ppc64 Hotspot SA with core files
>>>>>>
>>>>>> These two patches apply cleanly to a November 13 pull of the jdk9-dev upstream sources.
>>>>>>
>>>>>> ------------
>>>>>> Open issues:
>>>>>> ------------
>>>>>> 1) The jstack tool does not print a stack entry for the 'main()' method of the Java
>>>>>> workload (attached) under test. For example:
>>>>>>
>>>>>> (Note: Addresses and method signatures elided for brevity.)
>>>>>>
>>>>>> Thread 24358: (state = IN_JAVA, current Java SP = null
>>>>>> )
>>>>>> - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>> - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>> ==> (Expect an entry for test.main() here)
>>>>>>
>>>>>> 2) The jstack tool sometimes prints what appears to be two complete stacks for the Java workload. For example:
>>>>>>
>>>>>> Thread 24779: (state = IN_JAVA, current Java SP = null
>>>>>> )
>>>>>> - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>> - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>> - test.get_my_chars(...) @bci=39, line=15, pc=..., Method*=... (Compiled frame)
>>>>>> - test.run_test() @bci=92, line=34, pc=..., Method*=... (Compiled frame)
>>>>>>
>>>>>> Again, the 'test.main' method is missing, but there's also the anomaly of the
>>>>>> test.run_test' method showing up twice in the stack, implying that it is called
>>>>>> by 'test.get_my_chars' at line 15. But that that is not accurate. In fact, run_test
>>>>>> does call String.getChars at line 33 *and* it calls test.get_my_chars at line 34 --
>>>>>> but these are totally distinct call graphs. Somehow, we are seeing these two distinct
>>>>>> stacks in the core file, which seems impossible.
>>>>>>
>>>>>> ---------
>>>>>>
>>>>>> Any help offered on these two open issues would be greatly appreciated.
>>>>>>
>>>>>> -Maynard
>>>>>
>>>>
>>>
>>
More information about the serviceability-dev
mailing list