[PATCH RFC 0/2] Add linux/ppc64 support for Hotspot serviceability agent to read core files
Maynard Johnson
maynardj at us.ibm.com
Thu Nov 20 18:05:01 UTC 2014
On 11/20/2014 05:03 AM, Volker Simonis wrote:
> Hy Maynard,
>
> I've just realized that in your patch the two directory patterns
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_ppc64/*.java and
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/ppc64/*.java are absent from
> "make/sa.files". This of course breaks incremental builds.
Nice, thank you. I was wondering why incremental build didn't work.
I had resorted to using a build script that did the following:
rm build/linux-ppc64-normal-server-release/hotspot/linux_ppc64_compiler2/product/../generated/sa-jdi.jar
make all JOBS=24
make install
-Maynard
>
> Regards,
> Volker
>
>
> diff -r 6f35dca1949c make/sa.files
> --- a/make/sa.files Mon Nov 17 14:47:41 2014 +0100
> +++ b/make/sa.files Thu Nov 20 11:58:52 2014 +0100
> @@ -94,12 +94,14 @@
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_amd64/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_x86/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_sparc/*.java \
> +$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_ppc64/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/posix/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_amd64/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_sparc/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_x86/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/sparc/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/x86/*.java \
> +$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/ppc64/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/jcore/*.java \
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/soql/*.java \
>
> On Wed, Nov 19, 2014 at 7:33 PM, Volker Simonis
> <volker.simonis at gmail.com> wrote:
>> Hi Maynard,
>>
>> I just wanted to let you know that I'm still working on fixing the
>> bogus entries in the stack trace. I'm pretty sure they are related to
>> inlining. If you run your test program with "-XX:+PrintCompilation
>> -XX:+PrintInlining -XX:CICompilerCount=1" you'll get the following
>> output:
>>
>> ..
>> 10954 5 % test::run_test @ 59 (99 bytes)
>> @ 75 java.lang.String::getChars (62
>> bytes) inline (hot)
>> @ 58 java.lang.System::arraycopy (0
>> bytes) (intrinsic)
>> @ 87 test::get_my_chars (86 bytes) inline (hot)
>> @ 6 java.lang.String::getChars (62
>> bytes) inline (hot)
>> @ 58 java.lang.System::arraycopy (0
>> bytes) (intrinsic)
>> @ 38 java.lang.String::<init> (62
>> bytes) inline (hot)
>> @ 1 java.lang.Object::<init> (1
>> bytes) inline (hot)
>> @ 55 java.util.Arrays::copyOfRange
>> (63 bytes) too big
>> @ 47 java.lang.StringBuilder::<init>
>> (7 bytes) inline (hot)
>> @ 52 java.lang.StringBuilder::append
>> (8 bytes) executed < MinInliningThreshold times
>> @ 57 java.lang.StringBuilder::append
>> (8 bytes) executed < MinInliningThreshold times
>> @ 62 java.lang.StringBuilder::append
>> (8 bytes) executed < MinInliningThreshold times
>> @ 65 java.lang.StringBuilder::toString
>> (17 bytes) executed < MinInliningThreshold times
>> @ 79 java.lang.String::length (6
>> bytes) inline (hot)
>> @ 82 java.io.OutputStreamWriter::write
>> (11 bytes) executed < MinInliningThreshold times
>> ..
>>
>> The stack trace I get from jstack looks as follows:
>>
>> Thread 4448: (state = IN_JAVA)
>> - test.get_my_chars(int, int, char[], java.lang.String, long)
>> @bci=43, line=15 (Compiled frame; information may be imprecise)
>> - test.run_test() @bci=87, line=35 (Compiled frame)
>> - java.lang.String.getChars(int, int, char[], int) @bci=58, line=814
>> (Compiled frame)
>> - test.run_test() @bci=75, line=34 (Compiled frame)
>>
>> From a system perspective 'test::run_test' is one native frame,
>> because 'test::run_test' inlines all the other functions reported
>> above. HotSpot has special functionality to detect and walk these
>> inlined methods (so called "virtual frames" or "vframe"s). For some
>> reason this vframe walking doesn't seem to work in the agent. In gdb,
>> when calling "ps()" at the same point where I created the above core
>> file I'll get the following stack trace:
>>
>> (gdb) call ps()
>>
>> "Executing ps"
>> for thread: "main" #1 prio=5 os_prio=0 tid=0x00003fffb0010800
>> nid=0x1160 runnable [0x0000000000000000]
>> java.lang.Thread.State: RUNNABLE
>> JavaThread state: _thread_in_Java
>> Thread: 0x00003fffb0010800 [0x1160] State: _running _has_called_back
>> 0 _at_poll_safepoint 0
>> JavaThread state: _thread_in_Java
>>
>> (guessing starting frame id=0x3fffb66ddc60 based on current fp)
>> C frame (sp=0x00003fffb66ddad0 unextended sp=0x00003fffb66ddad0,
>> fp=0x00003fffb66ddc60, real_fp=0x00003fffb66ddc60,
>> pc=0x000000001000067c)
>> 1 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60,
>> fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
>> test.get_my_chars(test.java:16)
>> 2 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60,
>> fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
>> test.run_test(test.java:35)
>> 3 - frame( sp=0x00003fffb66ddd60, unextended_sp=0x00003fffb66ddd60,
>> fp=0x00003fffb66dde60, pc=0x00003fffa000f518)
>> test.main(test.java:56)
>>
>> I'll keep you informed once I fixed the problem (I'll also look into
>> the .opd issue afterwards).
>>
>> Regards,
>> Volker
>>
>>
>> On Tue, Nov 18, 2014 at 12:20 AM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>> On 11/17/2014 01:21 PM, Volker Simonis wrote:
>>>> On Mon, Nov 17, 2014 at 6:59 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>>>> On 11/17/2014 10:20 AM, Volker Simonis wrote:
>>>>>> Hi Maynard,
>>>>>>
>>>>>> I'm currently looking at your changes. At first glance they look good.
>>>>>>
>>>>>> I could open a simple core file which contained both, interpreted and
>>>>>> compiled frames:
>>>>>>
>>>>>> $ jstack ./images/j2sdk-image/bin/java core.7034
>>>>>> ...
>>>>>> Thread 7035: (state = IN_VM)
>>>>>> - sun.misc.Unsafe.putAddress(long, long) @bci=0 (Interpreted frame)
>>>>>> - Crash.crashIt(sun.misc.Unsafe, int) @bci=10, line=8 (Interpreted frame)
>>>>>> - Crash.doIt() @bci=45, line=23 (Compiled frame)
>>>>>> - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method,
>>>>>> java.lang.Object, java.lang.Object[]) @bci=0 (Interpreted frame)
>>>>>> - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object,
>>>>>> java.lang.Object[]) @bci=100, line=62 (Interpreted frame)
>>>>>> - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object,
>>>>>> java.lang.Object[]) @bci=6, line=43 (Interpreted frame)
>>>>>> - java.lang.reflect.Method.invoke(java.lang.Object,
>>>>>> java.lang.Object[]) @bci=56, line=498 (Interpreted frame)
>>>>>> - Crash.main(java.lang.String[]) @bci=32, line=31 (Interpreted frame)
>>>>>>
>>>>>> The one thing that doesn't currently work is "jstack -m" (i.e. "mixed
>>>>>> mode" for java and native frames). Are you aware of this?
>>>>> Hi, Volker,
>>>>> Yeah, I knew about this problem and forgot to mention it in my patch posting. I started
>>>>> looking at it this morning, and so far, I have at least fixed the UnmappedAddressException.
>>>>> But now I'm getting different results on little endian vs big endian ppc64 systems.
>>>>> On BE, I either get no symbol names (i.e., "?????") or wrong symbol names. On LE,
>>>>> I seem to get correct symbol names for the first symbol (either __pthread_cond_wait
>>>>> or __pthread_cond_timedwait) and the last symbol (start_thread) of each stack, but
>>>>> everything in between is "?????".
>>>>>
>>>>
>>>> Maybe this is related to the fact that we have function descriptors on
>>>> BE and simple function pointers on LE. You may have a look at the
>>>> elf-decoder for ppc64 to find some more information.
>>>
>>> Yes, indeed. With the following patch, the mixed mode option works fine on ppc64 little endian,
>>> but not on big endian:
>>>
>>> Index: jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>>> ===================================================================
>>> --- jdk9-dev.orig/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>>> +++ jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>>> @@ -60,14 +60,15 @@ final public class LinuxPPC64CFrame exte
>>> return null;
>>> }
>>>
>>> - Address nextSP = sp.getAddressAt( PPC64ThreadContext.SP * address_size + PPC64_STACK_BIAS);
>>> + Address nextSP = sp.getAddressAt(0);
>>> if (nextSP == null) {
>>> return null;
>>> }
>>> - Address nextPC = sp.getAddressAt(PPC64ThreadContext.PC * address_size + PPC64_STACK_BIAS);
>>> + Address nextPC = sp.getAddressAt(2 * address_size);
>>> if (nextPC == null) {
>>> return null;
>>> }
>>> +
>>> return new LinuxPPC64CFrame(dbg, nextSP, nextPC,address_size);
>>> }
>>>
>>> -------------------------------------------------
>>>
>>> I see that ppc64 fixups were made in the hotspot utilities (by you) about a year ago
>>> (http://cr.openjdk.java.net/~simonis/webrevs/8019929.v3/). We obviously need something
>>> similar in the hotspot agent native code that implements the JNI call 'lookupByAddress0'.
>>> I hacked the build_symtab_internal() function in hotspot/agent/src/os/linux/symtab.c and
>>> see that the symbol "offset" we're getting is really the address of the symbol's opd.
>>> I'm not sure where to start to fix this, so if you have any suggestions, I'm all ears. :-)
>>>
>>> Thanks.
>>> -Maynard
>>>>
>>>>>>
>>>>>> Regarding your "test.java" example - how do you use it?
>>>>>>
>>>>>> If I just attach with jstack to the Java process which runs
>>>>>> "test.java" I get the correct stack trace of all threads. But I think
>>>>>> that's actual no SA-functionality but a VM-feature (the same that can
>>>>>> be triggered by sending kill -SIGQUIT to java process).
>>>>>>
>>>>>> If I attach with "jstack -F" I see the problems you mentioned. First I
>>>>>> didn't saw any frame at all which confused me but then I also saw the
>>>>>> two cases mentioned by you. I'll need to have a closer look what
>>>>>> happens.
>>>>>
>>>>> I was just running the 'test' java app and, in another session, killing it with SIGSEGV.
>>>>> To be honest, I wasn't aware of the 'jstack -F' option.
>>>>>
>>>>
>>>> Another possibility I've just found out is to create a core from gdb
>>>> with the 'generate-core-file' command. You can than still inspect the
>>>> original program in gdb while debugging how jstack is working on the
>>>> core file.
>>>>
>>>>> -Maynard
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Volker
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 14, 2014 at 7:09 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>>>>>> When Hotspot SA tools jmap, jstack, and jsadebugd are run against a core file, they fail with the following runtime exception:
>>>>>>>
>>>>>>> OS/CPU combination linux/ppc64 not yet supported
>>>>>>>
>>>>>>> I will post a patch set that adds this support. The patch set consists of the following patches:
>>>>>>>
>>>>>>> PATCH 1/2: Updates to non-Java files to support linux/ppc64 Hotspot SA with core files
>>>>>>>
>>>>>>> PATCH 2/2: New PPC64 class files (and updates to generic files) to support linux/ppc64 Hotspot SA with core files
>>>>>>>
>>>>>>> These two patches apply cleanly to a November 13 pull of the jdk9-dev upstream sources.
>>>>>>>
>>>>>>> ------------
>>>>>>> Open issues:
>>>>>>> ------------
>>>>>>> 1) The jstack tool does not print a stack entry for the 'main()' method of the Java
>>>>>>> workload (attached) under test. For example:
>>>>>>>
>>>>>>> (Note: Addresses and method signatures elided for brevity.)
>>>>>>>
>>>>>>> Thread 24358: (state = IN_JAVA, current Java SP = null
>>>>>>> )
>>>>>>> - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>>> - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>>> ==> (Expect an entry for test.main() here)
>>>>>>>
>>>>>>> 2) The jstack tool sometimes prints what appears to be two complete stacks for the Java workload. For example:
>>>>>>>
>>>>>>> Thread 24779: (state = IN_JAVA, current Java SP = null
>>>>>>> )
>>>>>>> - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>>> - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>>> - test.get_my_chars(...) @bci=39, line=15, pc=..., Method*=... (Compiled frame)
>>>>>>> - test.run_test() @bci=92, line=34, pc=..., Method*=... (Compiled frame)
>>>>>>>
>>>>>>> Again, the 'test.main' method is missing, but there's also the anomaly of the
>>>>>>> test.run_test' method showing up twice in the stack, implying that it is called
>>>>>>> by 'test.get_my_chars' at line 15. But that that is not accurate. In fact, run_test
>>>>>>> does call String.getChars at line 33 *and* it calls test.get_my_chars at line 34 --
>>>>>>> but these are totally distinct call graphs. Somehow, we are seeing these two distinct
>>>>>>> stacks in the core file, which seems impossible.
>>>>>>>
>>>>>>> ---------
>>>>>>>
>>>>>>> Any help offered on these two open issues would be greatly appreciated.
>>>>>>>
>>>>>>> -Maynard
>>>>>>
>>>>>
>>>>
>>>
>
More information about the serviceability-dev
mailing list