: PING: RFR: 8234624: jstack mixed mode should refer DWARF
Yasumasa Suenaga
suenaga at oss.nttdata.com
Tue Mar 10 12:36:38 UTC 2020
Hi Kevin,
Thanks for your comment!
On 2020/03/10 18:58, Kevin Walls wrote:
> Hi Yasumasa ,
>
> The changes build OK for me in the latest jdk, and things still work.
> I have not yet seen the dwarf usage in action: I've tried a couple of different systems and so far have not reproduced the problem, i.e. jstack has not failed on native frames.
>
> I may need more recent basic libraries, will look again for somewhere where the problem happens and get back to you as I really want to run the changes.
You can see the problem with JShell.
Some Java frames would not be seen in mixed jstack.
> I have mostly minor other comments which don't need a new webrev, some just comments for the future:
>
> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.cpp:
>
> DW_CFA_nop - shouldn't this continue instead of return?
> (It may "never" happen, but a nop could appear within some other instructions?)
DW_CFA_nop is used for padding, so we can ignore (return immediately) it.
> DW_CFA_remember_state: a minor typo in the comment, "DW_CFA_remenber_state".
I will fix it.
> We handle DW_CFA_advance_loc, and _loc1 and _loc2, but not DW_CFA_advance_loc4. I thought that was odd, but maybe addresses in these tables never increase by 4-byte amounts, would this mean a lot of code on one line. 8-)
> So maybe it's never used in practice, if you think it's unnecessary no problem, maybe a comment, or add it for robustness.
I will add DW_CFA_advance_loc4.
> General-purpose methods like read_leb128(), get_entry_length(), get_decoded_value() specifically update the _buf pointer in this DwarfParser.
>
> DwarfParser::process_dwarf() moves _buf.
> It calls process_cie() which reads, moves _buf and restores it to the original position, then we read augmentation_length from where _buf is.
> I'm not sure if that's wrong, or if I just need to read again about the CIE/etc layout.
>
> I don't really want to suggest making the code pass around a current _buf for the invocation of these general purpose methods, but just wanted to comment that if these get used more widely that might become necessary.
I saw GDB and binutils source for creating this patch.
They seems to process similar code because we need to calculate DWARF instructions one-by-one to get the value which relates to specified PC.
> Similarly in future, if this DWARF support code became used more widely, it might want to move to an
> OS-neutral directory? It's odd to label it as Linux-specific.
Windows does not use DWARF at least, it uses another feature.
https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
I'm not sure other platforms (Solaris, macOS) uses DWARF.
If DWARF is used in them, I can move DWARF related code to posix directory.
> src/jdk.hotspot.agent/linux/native/libsaproc/dwarf.hpp:
> Thanks for changing "can_parsable" which was in the earlier version. 8-)
>
>
> These are just comments to mainly say it looks good, and somebody else out there has read it.
> I will look for a system that shows the problem, and get back to you again!
Thanks,
Yasumasa
> Many thanks
> Kevin
>
> On 27/02/2020 05:13, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
>> So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.
>> Could you review it?
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/
>>
>> I need one more reviewer to push.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/02/17 13:07, Yasumasa Suenaga wrote:
>>> PING: Could you review it?
>>>
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>
>>> This change has been already reviewed by Serguei.
>>> I need one more reviewer to push.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>>>> PING: Could you reveiw this change?
>>>>
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>
>>>> I believe this change helps troubleshooter to fight to postmortem analysis.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>>>> PING: Could you review it?
>>>>>
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>>>
>>>>> I updated webrev. I discussed with Serguei in off list, and I refactored webrev.02 .
>>>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>>>> Hi Serguei,
>>>>>>
>>>>>> Thanks for your comment!
>>>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>>>
>>>>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> This is nice move in general.
>>>>>>> Thank you for working on this!
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>>>
>>>>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 }
>>>>>>>
>>>>>>>
>>>>>>> I'd suggest to simplify the logic by refactoring to something like below:
>>>>>>>
>>>>>>> long libptr = dbg.findLibPtrByAddress(pc);
>>>>>>> Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>>> DwarfParser dwarf = null;
>>>>>>>
>>>>>>> if (libptr != 0L) { // Native frame
>>>>>>> try {
>>>>>>> dwarf = new DwarfParser(libptr);
>>>>>>> dwarf.processDwarf(pc);
>>>>>>> Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>>> !dwarf.isBPOffsetAvailable())
>>>>>>> ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>>> : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>>> .addOffsetTo(dwarf.getCFAOffset());
>>>>>>>
>>>>>>> } catch (DebuggerException e) { // bail out to Java frame case
>>>>>>> }
>>>>>>> }
>>>>>>> if (cfa == null) {
>>>>>>> return null;
>>>>>>> }
>>>>>>> return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>>>
>>>>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>>>
>>>>>>> Better to rename 'ofs' => 'offs'.
>>>>>>>
>>>>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>>>
>>>>>>> Extra space after '-' sign.
>>>>>>>
>>>>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>>>
>>>>>>> It feels like the logic has to be somehow refactored/simplified as
>>>>>>> several typical fragments appears in slightly different contexts.
>>>>>>> But it is not easy to understand what it is.
>>>>>>> Could you, please, add some comments to key places explaining this logic.
>>>>>>> Then I'll check if it is possible to make it a little bit simpler.
>>>>>>>
>>>>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 }
>>>>>>>
>>>>>>> The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC):
>>>>>>> private CFrame javaSender(ThreadContext context) {
>>>>>>> Address nextPC = getNextPC(false);
>>>>>>> if (nextPC == null) {
>>>>>>> return null;
>>>>>>> }
>>>>>>> long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>> DwarfParser nextDwarf = null;
>>>>>>>
>>>>>>> if (libptr != 0L) { // Native frame
>>>>>>> try {
>>>>>>> nextDwarf = new DwarfParser(libptr);
>>>>>>> nextDwarf.processDwarf(nextPC);
>>>>>>> } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>> }
>>>>>>> }
>>>>>>> Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>> return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>> }
>>>>>>>
>>>>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new
>>>>>>> LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 167 }
>>>>>>>
>>>>>>> This one can be also simplified a little:
>>>>>>>
>>>>>>> public CFrame sender(ThreadProxy thread) {
>>>>>>> ThreadContext context = thread.getContext();
>>>>>>>
>>>>>>> if (dwarf == null) { // Java frame
>>>>>>> return javaSender(context);
>>>>>>> }
>>>>>>> Address nextPC = getNextPC(true);
>>>>>>> if (nextPC == null) {
>>>>>>> return null;
>>>>>>> }
>>>>>>> DwarfParser nextDwarf = null;
>>>>>>> if (!dwarf.isIn(nextPC)) {
>>>>>>> long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>> if (libptr != 0L) {
>>>>>>> try {
>>>>>>> nextDwarf = new DwarfParser(libptr);
>>>>>>> nextDwarf.processDwarf(nextPC);
>>>>>>> } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>> }
>>>>>>> }
>>>>>>> }
>>>>>>> Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>> return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>> }
>>>>>>>
>>>>>>> Finally, it looks like just one method could replace both
>>>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>>>
>>>>>>> private CFrame commonSender(ThreadProxy thread) {
>>>>>>> ThreadContext context = thread.getContext();
>>>>>>> Address nextPC = getNextPC(false);
>>>>>>> if (nextPC == null) {
>>>>>>> return null;
>>>>>>> }
>>>>>>> DwarfParser nextDwarf = null;
>>>>>>>
>>>>>>> long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>> if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>>> long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>>> if (libptr != 0L) {
>>>>>>> try {
>>>>>>> nextDwarf = new DwarfParser(libptr);
>>>>>>> nextDwarf.processDwarf(nextPC);
>>>>>>> } catch (DebuggerException e) { // Bail out to Java frame
>>>>>>> }
>>>>>>> }
>>>>>>> }
>>>>>>> Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>>> return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>>> }
>>>>>>>
>>>>>>> I'm still reviewing the dwarf parser files.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
>>>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>>>> Could you review new webrev?
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>>>
>>>>>>>> The diff from previous webrev is here:
>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this change:
>>>>>>>>>
>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>>>> for stack unwinding.
>>>>>>>>>
>>>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>>>
>>>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
>>>>>>>>> So it might be lack of stack frames.
>>>>>>>>>
>>>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>>>
More information about the serviceability-dev
mailing list