[PATCH RFC 0/2] Add linux/ppc64 support for Hotspot serviceability agent to read core files

Maynard Johnson maynardj at us.ibm.com
Thu Nov 20 18:05:01 UTC 2014


On 11/20/2014 05:03 AM, Volker Simonis wrote:
> Hi Maynard,
> 
> I've just realized that in your patch the two directory patterns
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_ppc64/*.java and
> $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/ppc64/*.java are absent from
> "make/sa.files". This of course breaks incremental builds.
Nice, thank you.  I was wondering why incremental builds didn't work.
As a workaround, I had resorted to a build script that did the following:
   rm build/linux-ppc64-normal-server-release/hotspot/linux_ppc64_compiler2/product/../generated/sa-jdi.jar
   make all JOBS=24
   make install
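
A missing pattern like this could be caught mechanically. Below is a minimal
sketch, assuming the patterns all have the "<dir>/*.java" shape used in
make/sa.files. The class and method names are made up for illustration and are
not part of the HotSpot build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the HotSpot build system.
final class SaFilesCheck {
    private static final String SUFFIX = "/*.java";

    // Returns the source directories that no "<dir>/*.java" pattern covers.
    static List<String> uncovered(List<String> sourceDirs, List<String> patterns) {
        Set<String> covered = new HashSet<>();
        for (String p : patterns) {
            if (p.endsWith(SUFFIX)) {
                covered.add(p.substring(0, p.length() - SUFFIX.length()));
            }
        }
        List<String> missing = new ArrayList<>();
        for (String dir : sourceDirs) {
            if (!covered.contains(dir)) {
                missing.add(dir);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "sun/jvm/hotspot/runtime/x86",
            "sun/jvm/hotspot/runtime/ppc64",
            "sun/jvm/hotspot/runtime/linux_ppc64");
        List<String> patterns = Arrays.asList(
            "sun/jvm/hotspot/runtime/x86/*.java");
        // Lists the two ppc64 directories that have no pattern.
        System.out.println(uncovered(dirs, patterns));
    }
}
```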

-Maynard

> 
> Regards,
> Volker
> 
> 
> diff -r 6f35dca1949c make/sa.files
> --- a/make/sa.files     Mon Nov 17 14:47:41 2014 +0100
> +++ b/make/sa.files     Thu Nov 20 11:58:52 2014 +0100
> @@ -94,12 +94,14 @@
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_amd64/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_x86/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_sparc/*.java \
> +$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/linux_ppc64/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/posix/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_amd64/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_sparc/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/solaris_x86/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/sparc/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/x86/*.java \
> +$(AGENT_SRC_DIR)/sun/jvm/hotspot/runtime/ppc64/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/jcore/*.java \
>  $(AGENT_SRC_DIR)/sun/jvm/hotspot/tools/soql/*.java \
> 
> On Wed, Nov 19, 2014 at 7:33 PM, Volker Simonis
> <volker.simonis at gmail.com> wrote:
>> Hi Maynard,
>>
>> I just wanted to let you know that I'm still working on fixing the
>> bogus entries in the stack trace. I'm pretty sure they are related to
>> inlining. If you run your test program with "-XX:+PrintCompilation
>> -XX:+PrintInlining -XX:CICompilerCount=1" you'll get the following
>> output:
>>
>> ..
>>   10954    5 %           test::run_test @ 59 (99 bytes)
>>                             @ 75   java.lang.String::getChars (62 bytes)   inline (hot)
>>                               @ 58   java.lang.System::arraycopy (0 bytes)   (intrinsic)
>>                             @ 87   test::get_my_chars (86 bytes)   inline (hot)
>>                               @ 6   java.lang.String::getChars (62 bytes)   inline (hot)
>>                                 @ 58   java.lang.System::arraycopy (0 bytes)   (intrinsic)
>>                               @ 38   java.lang.String::<init> (62 bytes)   inline (hot)
>>                                 @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
>>                                 @ 55   java.util.Arrays::copyOfRange (63 bytes)   too big
>>                               @ 47   java.lang.StringBuilder::<init> (7 bytes)   inline (hot)
>>                               @ 52   java.lang.StringBuilder::append (8 bytes)   executed < MinInliningThreshold times
>>                               @ 57   java.lang.StringBuilder::append (8 bytes)   executed < MinInliningThreshold times
>>                               @ 62   java.lang.StringBuilder::append (8 bytes)   executed < MinInliningThreshold times
>>                               @ 65   java.lang.StringBuilder::toString (17 bytes)   executed < MinInliningThreshold times
>>                               @ 79   java.lang.String::length (6 bytes)   inline (hot)
>>                               @ 82   java.io.OutputStreamWriter::write (11 bytes)   executed < MinInliningThreshold times
>> ..
>>
>> The stack trace I get from jstack looks as follows:
>>
>> Thread 4448: (state = IN_JAVA)
>>  - test.get_my_chars(int, int, char[], java.lang.String, long) @bci=43, line=15 (Compiled frame; information may be imprecise)
>>  - test.run_test() @bci=87, line=35 (Compiled frame)
>>  - java.lang.String.getChars(int, int, char[], int) @bci=58, line=814 (Compiled frame)
>>  - test.run_test() @bci=75, line=34 (Compiled frame)
>>
>> From a system perspective, 'test::run_test' is one native frame,
>> because 'test::run_test' inlines all the other functions reported
>> above. HotSpot has special functionality to detect and walk these
>> inlined methods (so-called "virtual frames" or "vframes"). For some
>> reason this vframe walking doesn't seem to work in the agent. In gdb,
>> when I call "ps()" at the same point where I created the above core
>> file, I get the following stack trace:
>>
>> (gdb) call ps()
>>
>> "Executing ps"
>>  for thread: "main" #1 prio=5 os_prio=0 tid=0x00003fffb0010800 nid=0x1160 runnable [0x0000000000000000]
>>    java.lang.Thread.State: RUNNABLE
>>    JavaThread state: _thread_in_Java
>> Thread: 0x00003fffb0010800  [0x1160] State: _running _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_in_Java
>>
>> (guessing starting frame id=0x3fffb66ddc60 based on current fp)
>> C frame (sp=0x00003fffb66ddad0 unextended sp=0x00003fffb66ddad0, fp=0x00003fffb66ddc60, real_fp=0x00003fffb66ddc60, pc=0x000000001000067c)
>>  1 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60, fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
>> test.get_my_chars(test.java:16)
>>  2 - frame( sp=0x00003fffb66ddc60, unextended_sp=0x00003fffb66ddc60, fp=0x00003fffb66ddd60, pc=0x00003fffa0159d58)
>> test.run_test(test.java:35)
>>  3 - frame( sp=0x00003fffb66ddd60, unextended_sp=0x00003fffb66ddd60, fp=0x00003fffb66dde60, pc=0x00003fffa000f518)
>> test.main(test.java:56)
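
The gdb output above shows entries 1 and 2 sharing the same sp and pc: two
virtual frames on one physical frame. As a rough illustration of that idea,
here is a toy model in which each pc maps to a chain of inlining scopes that
is walked from the innermost method outwards. Scope and expand() are
hypothetical names, not HotSpot or SA classes, and the bci for test.main is
made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only; these are not the real HotSpot or SA data structures.
final class VFrameSketch {

    // One inlining scope: a method at a bytecode index, plus the scope it was inlined into.
    static final class Scope {
        final String method;
        final int bci;
        final Scope caller; // null for the outermost (non-inlined) method

        Scope(String method, int bci, Scope caller) {
            this.method = method;
            this.bci = bci;
            this.caller = caller;
        }
    }

    // Expands each physical frame (identified by its pc) into its virtual frames.
    static List<String> expand(long[] physicalPcs, Map<Long, Scope> debugInfo) {
        List<String> out = new ArrayList<>();
        for (long pc : physicalPcs) {
            Scope scope = debugInfo.get(pc);
            if (scope == null) {
                out.add("<no debug info for pc 0x" + Long.toHexString(pc) + ">");
                continue;
            }
            // Innermost scope first, then every method it was inlined into.
            for (; scope != null; scope = scope.caller) {
                out.add(scope.method + " @bci=" + scope.bci);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Scope runTest = new Scope("test.run_test", 87, null);
        Scope getMyChars = new Scope("test.get_my_chars", 43, runTest);
        Scope main = new Scope("test.main", 0, null); // bci is a placeholder

        Map<Long, Scope> debugInfo = new HashMap<>();
        debugInfo.put(0x3fffa0159d58L, getMyChars);
        debugInfo.put(0x3fffa000f518L, main);

        long[] pcs = {0x3fffa0159d58L, 0x3fffa000f518L};
        for (String line : expand(pcs, debugInfo)) {
            System.out.println(" - " + line);
        }
    }
}
```

In this model two physical frames yield three stack entries, which is the
shape of the gdb trace rather than of the jstack trace.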
>>
>> I'll keep you informed once I've fixed the problem (I'll also look into
>> the .opd issue afterwards).
>>
>> Regards,
>> Volker
>>
>>
>> On Tue, Nov 18, 2014 at 12:20 AM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>> On 11/17/2014 01:21 PM, Volker Simonis wrote:
>>>> On Mon, Nov 17, 2014 at 6:59 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>>>> On 11/17/2014 10:20 AM, Volker Simonis wrote:
>>>>>> Hi Maynard,
>>>>>>
>>>>>> I'm currently looking at your changes. At first glance they look good.
>>>>>>
>>>>>> I could open a simple core file which contained both interpreted and
>>>>>> compiled frames:
>>>>>>
>>>>>> $ jstack ./images/j2sdk-image/bin/java core.7034
>>>>>> ...
>>>>>> Thread 7035: (state = IN_VM)
>>>>>>  - sun.misc.Unsafe.putAddress(long, long) @bci=0 (Interpreted frame)
>>>>>>  - Crash.crashIt(sun.misc.Unsafe, int) @bci=10, line=8 (Interpreted frame)
>>>>>>  - Crash.doIt() @bci=45, line=23 (Compiled frame)
>>>>>>  - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method, java.lang.Object, java.lang.Object[]) @bci=0 (Interpreted frame)
>>>>>>  - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=100, line=62 (Interpreted frame)
>>>>>>  - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Interpreted frame)
>>>>>>  - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Interpreted frame)
>>>>>>  - Crash.main(java.lang.String[]) @bci=32, line=31 (Interpreted frame)
>>>>>>
>>>>>> The one thing that doesn't currently work is "jstack -m" (i.e. "mixed
>>>>>> mode" for java and native frames). Are you aware of this?
>>>>> Hi, Volker,
>>>>> Yeah, I knew about this problem and forgot to mention it in my patch posting. I started
>>>>> looking at it this morning, and so far, I have at least fixed the UnmappedAddressException.
>>>>> But now I'm getting different results on little endian vs big endian ppc64 systems.
>>>>> On BE, I either get no symbol names (i.e., "?????") or wrong symbol names.  On LE,
>>>>> I seem to get correct symbol names for the first symbol (either __pthread_cond_wait
>>>>> or __pthread_cond_timedwait) and the last symbol (start_thread) of each stack, but
>>>>> everything in between is "?????".
>>>>>
>>>>
>>>> Maybe this is related to the fact that we have function descriptors on
>>>> BE and simple function pointers on LE. You may have a look at the
>>>> elf-decoder for ppc64 to find some more information.
>>>
>>> Yes, indeed.  With the following patch, the mixed mode option works fine on ppc64 little endian,
>>> but not on big endian:
>>>
>>> Index: jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>>> ===================================================================
>>> --- jdk9-dev.orig/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>>> +++ jdk9-dev/hotspot/agent/src/share/classes/sun/jvm/hotspot/debugger/linux/ppc64/LinuxPPC64CFrame.java
>>> @@ -60,14 +60,15 @@ final public class LinuxPPC64CFrame exte
>>>        return null;
>>>      }
>>>
>>> -    Address nextSP = sp.getAddressAt( PPC64ThreadContext.SP * address_size + PPC64_STACK_BIAS);
>>> +    Address nextSP = sp.getAddressAt(0);
>>>      if (nextSP == null) {
>>>        return null;
>>>      }
>>> -    Address nextPC  = sp.getAddressAt(PPC64ThreadContext.PC * address_size + PPC64_STACK_BIAS);
>>> +    Address nextPC  = sp.getAddressAt(2 * address_size);
>>>      if (nextPC == null) {
>>>        return null;
>>>      }
>>> +
>>>      return new LinuxPPC64CFrame(dbg, nextSP, nextPC,address_size);
>>>    }
>>>
>>> -------------------------------------------------
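
For reference, the two reads in the patched code above match fixed slots in
the ppc64 ELF ABI frame header: the back chain at offset 0 and the LR save
doubleword at offset 2 * address_size. Below is a minimal sketch of that walk
over a simulated memory map. The class name, the Map-based "memory" and the
depth limit are assumptions for illustration; the real code reads through the
SA Address API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the SA frame walk; memory is a plain address -> word map.
final class BackChainWalk {
    private static final int MAX_DEPTH = 1024; // guard against corrupt or cyclic chains

    // Returns {sp, pc} pairs, starting with the given frame and following the back chain.
    static List<long[]> walk(Map<Long, Long> memory, long sp, long pc, int addressSize) {
        List<long[]> frames = new ArrayList<>();
        frames.add(new long[] {sp, pc});
        while (frames.size() < MAX_DEPTH) {
            Long nextSP = memory.get(sp);                    // back chain word at offset 0
            Long nextPC = memory.get(sp + 2L * addressSize); // LR save doubleword
            if (nextSP == null || nextSP == 0L || nextPC == null) {
                break; // unreadable memory or end of chain
            }
            frames.add(new long[] {nextSP, nextPC});
            sp = nextSP;
        }
        return frames;
    }

    public static void main(String[] args) {
        Map<Long, Long> memory = new HashMap<>();
        memory.put(0x1000L, 0x2000L); // frame at 0x1000 chains to 0x2000
        memory.put(0x1010L, 0xAAAAL); // saved LR in the frame at 0x1000
        memory.put(0x2000L, 0L);      // back chain of 0 ends the walk
        memory.put(0x2010L, 0xBBBBL);

        for (long[] f : walk(memory, 0x1000L, 0x1234L, 8)) {
            System.out.println("sp=0x" + Long.toHexString(f[0]) + " pc=0x" + Long.toHexString(f[1]));
        }
    }
}
```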
>>>
>>> I see that ppc64 fixups were made in the hotspot utilities (by you) about a year ago
>>> (http://cr.openjdk.java.net/~simonis/webrevs/8019929.v3/).  We obviously need something
>>> similar in the hotspot agent native code that implements the JNI call 'lookupByAddress0'.
>>> I hacked the build_symtab_internal() function in hotspot/agent/src/os/linux/symtab.c and
>>> see that the symbol "offset" we're getting is really the address of the symbol's opd.
>>> I'm not sure where to start to fix this, so if you have any suggestions, I'm all ears.  :-)
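
As background for the opd problem described above: under the big-endian ppc64
ELF ABI (ELFv1), a function symbol's value is the address of a function
descriptor in the .opd section, and the first doubleword of that descriptor
holds the code entry address. A minimal sketch of that translation follows,
assuming the .opd contents and load address are already at hand. OpdResolver
is a hypothetical name, and this is Java for illustration only; the actual
fix would belong in the native symtab.c code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not the SA implementation.
final class OpdResolver {
    private final long opdStart;   // address at which .opd is mapped
    private final ByteBuffer opd;  // raw contents of the .opd section

    OpdResolver(long opdStart, ByteBuffer opd) {
        this.opdStart = opdStart;
        this.opd = opd.duplicate().order(ByteOrder.BIG_ENDIAN);
    }

    // Maps a symbol value to a code address. Values outside .opd pass through unchanged.
    long resolve(long symbolValue) {
        long offset = symbolValue - opdStart;
        if (offset < 0 || offset + Long.BYTES > opd.limit()) {
            return symbolValue; // not a descriptor address
        }
        // A descriptor is three doublewords: entry address, TOC base, environment.
        return opd.getLong((int) offset);
    }

    public static void main(String[] args) {
        ByteBuffer opd = ByteBuffer.allocate(24).order(ByteOrder.BIG_ENDIAN);
        opd.putLong(0, 0x10000400L);  // entry address (made-up value)
        opd.putLong(8, 0x10018000L);  // TOC base (made-up value)
        opd.putLong(16, 0L);          // environment pointer

        OpdResolver resolver = new OpdResolver(0x10020000L, opd);
        System.out.println(Long.toHexString(resolver.resolve(0x10020000L)));
    }
}
```

Symbol lookup by address would then need to compare the pc against the
resolved code addresses rather than the raw symbol values.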
>>>
>>> Thanks.
>>> -Maynard
>>>>
>>>>>>
>>>>>> Regarding your "test.java" example - how do you use it?
>>>>>>
>>>>>> If I just attach with jstack to the Java process which runs
>>>>>> "test.java" I get the correct stack trace of all threads. But I think
>>>>>> that's actually not SA functionality but a VM feature (the same one
>>>>>> that can be triggered by sending kill -SIGQUIT to the java process).
>>>>>>
>>>>>> If I attach with "jstack -F" I see the problems you mentioned. At first
>>>>>> I didn't see any frames at all, which confused me, but then I also saw
>>>>>> the two cases you mentioned. I'll need to take a closer look at what
>>>>>> happens.
>>>>>
>>>>> I was just running the 'test' java app and, in another session, killing it with SIGSEGV.
>>>>> To be honest, I wasn't aware of the 'jstack -F' option.
>>>>>
>>>>
>>>> Another possibility I've just found is to create a core from gdb
>>>> with the 'generate-core-file' command. You can then still inspect the
>>>> original program in gdb while debugging how jstack is working on the
>>>> core file.
>>>>
>>>>> -Maynard
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Volker
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 14, 2014 at 7:09 PM, Maynard Johnson <maynardj at us.ibm.com> wrote:
>>>>>>> When Hotspot SA tools jmap, jstack, and jsadebugd are run against a core file, they fail with the following runtime exception:
>>>>>>>
>>>>>>>      OS/CPU combination linux/ppc64 not yet supported
>>>>>>>
>>>>>>> I will post a patch set that adds this support.  The patch set consists of the following patches:
>>>>>>>
>>>>>>> PATCH 1/2: Updates to non-Java files to support linux/ppc64 Hotspot SA with core files
>>>>>>>
>>>>>>> PATCH 2/2: New PPC64 class files (and updates to generic files) to support linux/ppc64 Hotspot SA with core files
>>>>>>>
>>>>>>> These two patches apply cleanly to a November 13 pull of the jdk9-dev upstream sources.
>>>>>>>
>>>>>>> ------------
>>>>>>> Open issues:
>>>>>>> ------------
>>>>>>>   1) The jstack tool does not print a stack entry for the 'main()' method of the Java
>>>>>>>      workload (attached) under test.  For example:
>>>>>>>
>>>>>>>      (Note:  Addresses and method signatures elided for brevity.)
>>>>>>>
>>>>>>>        Thread 24358: (state = IN_JAVA, current Java SP = null
>>>>>>>        )
>>>>>>>         - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>>>         - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>>>      ==> (Expect an entry for test.main() here)
>>>>>>>
>>>>>>>   2) The jstack tool sometimes prints what appears to be two complete stacks for the Java workload. For example:
>>>>>>>
>>>>>>>        Thread 24779: (state = IN_JAVA, current Java SP = null
>>>>>>>        )
>>>>>>>         - java.lang.String.getChars(...) @bci=58, line=814, pc=..., Method*=... (Compiled frame; ... imprecise)
>>>>>>>         - test.run_test() @bci=80, line=33, pc=..., Method*=... (Compiled frame)
>>>>>>>         - test.get_my_chars(...) @bci=39, line=15, pc=..., Method*=... (Compiled frame)
>>>>>>>         - test.run_test() @bci=92, line=34, pc=..., Method*=... (Compiled frame)
>>>>>>>
>>>>>>>        Again, the 'test.main' method is missing, but there's also the anomaly of the
>>>>>>>        'test.run_test' method showing up twice in the stack, implying that it is called
>>>>>>>        by 'test.get_my_chars' at line 15.  But that is not accurate. In fact, run_test
>>>>>>>        does call String.getChars at line 33 *and* it calls test.get_my_chars at line 34,
>>>>>>>        but these are totally distinct call graphs.  Somehow, we are seeing these two distinct
>>>>>>>        stacks in the core file, which seems impossible.
>>>>>>>
>>>>>>> ---------
>>>>>>>
>>>>>>> Any help offered on these two open issues would be greatly appreciated.
>>>>>>>
>>>>>>> -Maynard
>>>>>>
>>>>>
>>>>
>>>
> 



More information about the serviceability-dev mailing list