: PING: RFR: 8234624: jstack mixed mode should refer DWARF

Kevin Walls kevin.walls at oracle.com
Tue Mar 10 23:53:21 UTC 2020


Hi -

In testing I wasn't seeing any of the Dwarf code triggered.

With LIBSAPROC_DEBUG set I'm getting the "Could not find executable 
section in" for lots of / maybe all the libraries...

src/jdk.hotspot.agent/linux/native/libsaproc/libproc_impl.c

    if (fill_instr_info(newlib)) {
      if (!read_eh_frame(ph, newlib)) {

fill_instr_info is failing, and we never get to read_eh_frame().

output like:

libsaproc DEBUG: [0] vaddr = 0x0, memsz = 0xaba4, filesz = 0xaba4
libsaproc DEBUG: Could not find executable section in 
/lib/x86_64-linux-gnu/libnss_nis-2.27.so

(similar for all libraries).

fill_instr fails if:

  if ((lib->exec_start == 0L) || (lib->exec_end == 0L))

...but isn't exec_start relative to the library address? It's the value 
of ph->vaddr and it is often zero.

I added some booleans and did:

185       if ((lib->exec_start == 0L) || (lib->exec_start > ph->p_vaddr)) {
186         lib->exec_start = ph->p_vaddr;
187         found_start =true;
188       }

(similarly for end) and only failed if:

201   if (!found_start || !found_end) {
202     return false;

...and now it's better.   I go from:

----------------- 3306 -----------------
0x00007f75824acd2d      __GI___pthread_timedjoin_ex + 0x17d

to:

----------------- 31127 -----------------
0x00007fa284d78d2d      __GI___pthread_timedjoin_ex + 0x17d
0x00007fa2857aaf2d      CallJavaMainInNewThread + 0xad
0x00007fa2857a74ed      ContinueInNewThread + 0x4d
0x00007fa2857a8c49      JLI_Launch + 0x1529
0x000055af1b78db1c      main + 0x11c


Thanks
Kevin




On 10/03/2020 12:36, Yasumasa Suenaga wrote:

> Hi Kevin,
>
> Thanks for your comment!
>
> On 2020/03/10 18:58, Kevin Walls wrote:
>> Hi Yasumasa ,
>>
>> The changes build OK for me in the latest jdk, and things still work.
>> I have not yet seen the dwarf usage in action: I've tried a couple of 
>> different systems and so far have not reproduced the problem, i.e. 
>> jstack has not failed on native frames.
>>
>> I may need more recent basic libraries, will look again for somewhere 
>> where the problem happens and get back to you as I really want to run 
>> the changes.
>
> You can see the problem with JShell.
> Some Java frames would not be seen in mixed jstack.
>
>
>> I have mostly minor other comments which don't need a new webrev, 
>> some just comments for the future:
>>
>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>>
>> DW_CFA_nop - shouldn't this continue instead of return?
>> (It may "never" happen, but a nop could appear within some other 
>> instructions?)
>
> DW_CFA_nop is used for padding, so we can ignore (return immediately) it.
>
>
>> DW_CFA_remember_state: a minor typo in the comment, 
>> "DW_CFA_remenber_state".
>
> I will fix it.
>
>
>> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not 
>> DW_CFA_advance_loc4.  I thought that was odd, but maybe addresses in 
>> these tables never increase by 4-byte amounts, would this mean a lot 
>> of code on one line. 8-)
>> So maybe it's never used in practice, if you think it's unnecessary 
>> no problem, maybe a comment, or add it for robustness.
>
> I will add DW_CFA_advance_loc4.
>
>
>> General-purpose methods like read_leb128(), get_entry_length(), 
>> get_decoded_value() specifically update the _buf pointer in this 
>> DwarfParser.
>>
>> DwarfParser::process_dwarf() moves _buf.
>> It calls process_cie() which reads, moves _buf and restores it to the 
>> original position, then we read augmentation_length from where _buf is.
>> I'm not sure if that's wrong, or if I just need to read again about 
>> the CIE/etc layout.
>>
>> I don't really want to suggest making the code pass around a current 
>> _buf for the invocation of these general purpose methods, but just 
>> wanted to comment that if these get used more widely that might 
>> become necessary.
>
> I saw GDB and binutils source for creating this patch.
> They seems to process similar code because we need to calculate DWARF 
> instructions one-by-one to get the value which relates to specified PC.
>
>
>> Similarly in future, if this DWARF support code became used more 
>> widely, it might want to move to an
>> OS-neutral directory?  It's odd to label it as Linux-specific.
>
> Windows does not use DWARF at least, it uses another feature.
>
> https://urldefense.com/v3/__https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaTtALtpRg$ 
>
> I'm not sure other platforms (Solaris, macOS) uses DWARF.
> If DWARF is used in them, I can move DWARF related code to posix 
> directory.
>
>
>> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
>> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>>
>>
>> These are just comments to mainly say it looks good, and somebody 
>> else out there has read it.
>> I will look for a system that shows the problem, and get back to you 
>> again!
>
>
> Thanks,
>
> Yasumasa
>
>
>> Many thanks
>> Kevin
>>
>> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 
>>> 8239462 changes (they updated copyright year).
>>> So I modified webrev (only copyright year changes) to be able to 
>>> apply to current jdk/jdk.
>>> Could you review it?
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>>
>>> I need one more reviewer to push.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>>> PING: Could you review it?
>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>    webrev: 
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>
>>>> This change has been already reviewed by Serguei.
>>>> I need one more reviewer to push.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>>> PING: Could you reveiw this change?
>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>    webrev: 
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>
>>>>> I believe this change helps troubleshooter to fight to postmortem 
>>>>> analysis.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>>> PING: Could you review it?
>>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>    webrev: 
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>>
>>>>>> I updated webrev. I discussed with Serguei in off list, and I 
>>>>>> refactored webrev.02 .
>>>>>> It has passed tests on submit repo 
>>>>>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> Thanks for your comment!
>>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as 
>>>>>>> Dmitry said.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>>
>>>>>>> This change has been passed all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> This is nice move in general.
>>>>>>>> Thank you for working on this!
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html 
>>>>>>>>
>>>>>>>>
>>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 
>>>>>>>> 0L) { // Java frame 98 Address rbp = 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if 
>>>>>>>> (rbp == null) { 100 return null; 101 } 102 return new 
>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native 
>>>>>>>> frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new 
>>>>>>>> DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 
>>>>>>>> Address rbp = 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if 
>>>>>>>> (rbp == null) { 110 return null; 111 } 112 return new 
>>>>>>>> LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 
>>>>>>>> dwarf.processDwarf(pc); 115 Address cfa = 
>>>>>>>> ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 
>>>>>>>> !dwarf.isBPOffsetAvailable()) 117 ? 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : 
>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister()) 119 
>>>>>>>> .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 
>>>>>>>> return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, 
>>>>>>>> pc, dwarf); 124 }
>>>>>>>>
>>>>>>>>
>>>>>>>> I'd suggest to simplify the logic by refactoring to something 
>>>>>>>> like below:
>>>>>>>>
>>>>>>>>            long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>>>            Address cfa = 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java 
>>>>>>>> frame
>>>>>>>>            DwarfParser dwarf = null;
>>>>>>>>
>>>>>>>>            if (libptr != 0L) { // Native frame
>>>>>>>>              try {
>>>>>>>>                dwarf = new DwarfParser(libptr);
>>>>>>>>                dwarf.processDwarf(pc);
>>>>>>>>                Address cfa = ((dwarf.getCFARegister() == 
>>>>>>>> AMD64ThreadContext.RBP) &&
>>>>>>>> !dwarf.isBPOffsetAvailable())
>>>>>>>>                                  ? 
>>>>>>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>>>                                  : 
>>>>>>>> context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>>> .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>>
>>>>>>>>             } catch (DebuggerException e) { // bail out to Java 
>>>>>>>> frame case
>>>>>>>>             }
>>>>>>>>           }
>>>>>>>>           if (cfa == null) {
>>>>>>>>             return null;
>>>>>>>>           }
>>>>>>>>           return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>>>>>>
>>>>>>>>
>>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>>
>>>>>>>>    Better to rename 'ofs' => 'offs'.
>>>>>>>>
>>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- 
>>>>>>>> nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>>
>>>>>>>>    Extra space after '-' sign.
>>>>>>>>
>>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, 
>>>>>>>> ThreadContext context) {
>>>>>>>>
>>>>>>>>    It feels like the logic has to be somehow 
>>>>>>>> refactored/simplified as
>>>>>>>>    several typical fragments appears in slightly different 
>>>>>>>> contexts.
>>>>>>>>    But it is not easy to understand what it is.
>>>>>>>>    Could you, please, add some comments to key places 
>>>>>>>> explaining this logic.
>>>>>>>>    Then I'll check if it is possible to make it a little bit 
>>>>>>>> simpler.
>>>>>>>>
>>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 
>>>>>>>> Address nextCFA; 111 Address nextPC; 112 113 nextPC = 
>>>>>>>> getNextPC(false); 114 if (nextPC == null) { 115 return null; 
>>>>>>>> 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = 
>>>>>>>> dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // 
>>>>>>>> Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 
>>>>>>>> 123 } catch (DebuggerException e) { 124 nextCFA = 
>>>>>>>> getNextCFA(null, context); 125 return (nextCFA == null) ? null 
>>>>>>>> : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 
>>>>>>>> nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = 
>>>>>>>> getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? 
>>>>>>>> null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, 
>>>>>>>> nextDwarf); 133 }
>>>>>>>>
>>>>>>>>   The above can be simplified if a DebuggerException can not be 
>>>>>>>> thrown from processDwarf(nextPC):
>>>>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>         if (nextPC == null) {
>>>>>>>>           return null;
>>>>>>>>         }
>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>
>>>>>>>>         if (libptr != 0L) { // Native frame
>>>>>>>>           try {
>>>>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>>>>           } catch (DebuggerException e) { // Bail out to Java 
>>>>>>>> frame
>>>>>>>>           }
>>>>>>>>         }
>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>         return (nextCFA == null) ? null : new 
>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>       }
>>>>>>>>
>>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 
>>>>>>>> ThreadContext context = thread.getContext(); 137 138 if (dwarf 
>>>>>>>> == null) { // Java frame 139 return javaSender(context); 140 } 
>>>>>>>> 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == 
>>>>>>>> null) { 144 return null; 145 } 146 147 Address nextCFA; 148 
>>>>>>>> DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 
>>>>>>>> 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if 
>>>>>>>> (libptr == 0L) { 152 // Next frame might be Java frame 153 
>>>>>>>> nextCFA = getNextCFA(null, context); 154 return (nextCFA == 
>>>>>>>> null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, 
>>>>>>>> null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 
>>>>>>>> 158 } catch (DebuggerException e) { 159 nextCFA = 
>>>>>>>> getNextCFA(null, context); 160 return (nextCFA == null) ? null 
>>>>>>>> : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 
>>>>>>>> 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = 
>>>>>>>> getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? 
>>>>>>>> null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 
>>>>>>>> 167 }
>>>>>>>>
>>>>>>>>   This one can be also simplified a little:
>>>>>>>>
>>>>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>
>>>>>>>>         if (dwarf == null) { // Java frame
>>>>>>>>           return javaSender(context);
>>>>>>>>         }
>>>>>>>>         Address nextPC = getNextPC(true);
>>>>>>>>         if (nextPC == null) {
>>>>>>>>           return null;
>>>>>>>>         }
>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>           if (libptr != 0L) {
>>>>>>>>             try {
>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java 
>>>>>>>> frame
>>>>>>>>             }
>>>>>>>>           }
>>>>>>>>         }
>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>         return (nextCFA == null) ? null : new 
>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>       }
>>>>>>>>
>>>>>>>> Finally, it looks like just one method could replace both
>>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>>
>>>>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>>>>         ThreadContext context = thread.getContext();
>>>>>>>>         Address nextPC = getNextPC(false);
>>>>>>>>         if (nextPC == null) {
>>>>>>>>           return null;
>>>>>>>>         }
>>>>>>>>         DwarfParser nextDwarf = null;
>>>>>>>>
>>>>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>>>           if (libptr != 0L) {
>>>>>>>>             try {
>>>>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>>>>             } catch (DebuggerException e) { // Bail out to Java 
>>>>>>>> frame
>>>>>>>>             }
>>>>>>>>           }
>>>>>>>>         }
>>>>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>>>         return (nextCFA == null) ? null : new 
>>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>>>       }
>>>>>>>>
>>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in 
>>>>>>>>> serviceability/sa tests and
>>>>>>>>> all tests on submit repo 
>>>>>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>>> Could you review new webrev?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>>
>>>>>>>>> The diff from previous webrev is here:
>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review this change:
>>>>>>>>>>
>>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>>>    webrev: 
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V 
>>>>>>>>>> Application Binary Interface AMD64
>>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF 
>>>>>>>>>> in .eh_frame or .debug_frame
>>>>>>>>>> for stack unwinding.
>>>>>>>>>>
>>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default 
>>>>>>>>>> since GCC 4.6, so system
>>>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>>>
>>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base 
>>>>>>>>>> pointer register (RBP).
>>>>>>>>>> So it might be lack of stack frames.
>>>>>>>>>>
>>>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] 
>>>>>>>>>> https://urldefense.com/v3/__https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf__;!!GqivPVa7Brio!J801oKj34Q7f-4SzAWGKL67e6Xq2yMlV6f01eqp_fqqhqgKktCBiUi2RUaQusmjOqA$ 
>>>>>>>>>
>>>>>>>>


More information about the serviceability-dev mailing list