RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Tue Mar 24 16:39:34 UTC 2020


Hi Yasumasa,

I'm okay with this update.
My mach5 test run for this patch is passed.

Thanks,
Serguei


On 3/23/20 17:08, Yasumasa Suenaga wrote:
> Hi Serguei,
>
> Thanks for your comment!
> I uploaded new webrev:
>
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>
> Also I pushed it to submit repo:
>
>   http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>
> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> The mach5 tier5 testing looks good.
>> The serviceability/sa/ClhsdbPstack.java is failed without fix and is 
>> not failed with it.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> I looked at you changes.
>>> It is hard to understand if this fully solves the issue.
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html 
>>>
>>>
>>> @@ -34,10 +34,11 @@
>>>        public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg, 
>>> Address rip, ThreadContext context) {
>>>         Address libptr = dbg.findLibPtrByAddress(rip);
>>>         Address cfa = 
>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>         DwarfParser dwarf = null;
>>> + boolean unsupportedDwarf = false;
>>>           if (libptr != null) { // Native frame
>>>           try {
>>>             dwarf = new DwarfParser(libptr);
>>>             dwarf.processDwarf(rip);
>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
>>>
>>> @@ -45,24 +46,33 @@
>>>                    !dwarf.isBPOffsetAvailable())
>>>                       ? 
>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>                       : 
>>> context.getRegisterAsAddress(dwarf.getCFARegister())
>>> .addOffsetTo(dwarf.getCFAOffset());
>>>           } catch (DebuggerException e) {
>>> - // Bail out to Java frame case
>>> + if (dwarf != null) {
>>> + // DWARF processing should succeed when the frame is native
>>> + // but it might fail if CIE has language personality routine
>>> + // and/or LSDA.
>>> + dwarf = null;
>>> + unsupportedDwarf = true;
>>> + } else {
>>> + throw e;
>>> + }
>>>           }
>>>         }
>>>           return (cfa == null) ? null
>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>>      }
>>>
>>> @@ -121,13 +131,25 @@
>>>        }
>>>          return isValidFrame(nextCFA, context) ? nextCFA : null;
>>>      }
>>>   - private DwarfParser getNextDwarf(Address nextPC) {
>>> - DwarfParser nextDwarf = null;
>>> + @Override
>>> + public CFrame sender(ThreadProxy thread) {
>>> + if (!possibleNext) {
>>> + return null;
>>> + }
>>> +
>>> + ThreadContext context = thread.getContext();
>>> +
>>> + Address nextPC = getNextPC(dwarf != null);
>>> + if (nextPC == null) {
>>> + return null;
>>> + }
>>>   + DwarfParser nextDwarf = null;
>>> + boolean unsupportedDwarf = false;
>>>        if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>>          nextDwarf = dwarf;
>>>        } else {
>>>          Address libptr = dbg.findLibPtrByAddress(nextPC);
>>>          if (libptr != null) {
>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
>>>
>>> @@ -138,33 +160,29 @@
>>>            }
>>>          }
>>>        }
>>>          if (nextDwarf != null) {
>>> + try {
>>>          nextDwarf.processDwarf(nextPC);
>>> + } catch (DebuggerException e) {
>>> + // DWARF processing should succeed when the frame is native
>>> + // but it might fail if CIE has language personality routine
>>> + // and/or LSDA.
>>> + nextDwarf = null;
>>> + unsupportedDwarf = true;
>>>        }
>>>
>>> This fix looks like a hack.
>>> Should we just propagate the Debugging exception instead of trying 
>>> to maintain unsupportedDwarf flag?
>
> DwarfParser::processDwarf would throw DebuggerException if it cannot 
> find DWARF which relates to PC.
> PC at this point is for next frame. So current frame (`this` object) 
> is valid, and it should be processed.
>
>
>>> Also, I don't like that DWARF-specific abbreviations (like CIE, 
>>> IDE,LSDA, etc.) are used without any comments explaining them.
>>> The code has to be generally readable without looking into the DWARF 
>>> spec each time.
>
> I added comments for them in this webrev.
>
>
> Thanks,
>
> Yasumasa
>
>
>>> I'm submitting mach5 jobs to make sure the issue has been resolved 
>>> with your fix.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>> Thanks Chris!
>>>> I'm waiting for reviewers for this change.
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> The failure is due to JDK-8231634, so not something you need to 
>>>>> worry about.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> I uploaded new webrev which includes reverting change for 
>>>>>> ProblemList:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>
>>>>>> I tested it on submit repo 
>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>> However I think it is not caused by this change because 
>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed 
>>>>>> mode, it would not parse DWARF.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> The test has been problem listed so please add undoing this to 
>>>>>>> your webrev. Here's the diff that problem listed it:
>>>>>>>
>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt 
>>>>>>> b/test/hotspot/jtreg/ProblemList.txt
>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>> @@ -115,7 +115,7 @@
>>>>>>>   serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>>   serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>>   serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956 
>>>>>>> solaris-all,linux-all
>>>>>>>   serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 
>>>>>>> 8193639 solaris-all
>>>>>>>   serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731 
>>>>>>> solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>>   serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> This webrev has passed submit repo 
>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and 
>>>>>>>> additional tests.
>>>>>>>> So please review it:
>>>>>>>>
>>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>   webrev: 
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>> Thank you so much, David!
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>
>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit 
>>>>>>>>>>>> repo.
>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>
>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> Test job resubmitted. Will advise results if it completes 
>>>>>>>>>>> before I go to bed :)
>>>>>>>>>>
>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime 
>>>>>>>>>>>>> Environment:
>>>>>>>>>>>>> #
>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702, 
>>>>>>>>>>>>> tid=13704
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0) 
>>>>>>>>>>>>> (fastdebug build 
>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 
>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed 
>>>>>>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] 
>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>> #
>>>>>>>>>>>>>
>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then 
>>>>>>>>>>>>>>>> go and run additional internal tests (and even more 
>>>>>>>>>>>>>>>> builds) using that job.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet 
>>>>>>>>>>>>>>> received the result.
>>>>>>>>>>>>>>> I will share you when I get job ID.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to 
>>>>>>>>>>>>>> complete before submitting the additional tests.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when 
>>>>>>>>>>>>>>>>> DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/ 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the 
>>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our 
>>>>>>>>>>>>>>>>>>>> internal testing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java 
>>>>>>>>>>>>>>>>>>> Runtime Environment:
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> #  SIGSEGV (0xb) at pc=0x00007fcdd403894e, 
>>>>>>>>>>>>>>>>>>> pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment 
>>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build 
>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM 
>>>>>>>>>>>>>>>>>>> (fastdebug 
>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev, 
>>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc, 
>>>>>>>>>>>>>>>>>>> linux-amd64)
>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>> # C  [libsaproc.so+0x494e] 
>>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always 
>>>>>>>>>>>>>>>>>>> crash now.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs 
>>>>>>>>>>>>>>>>>> of the test in linux-x64. I don't see a pattern as to 
>>>>>>>>>>>>>>>>>> where it fails versus passes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    JBS: 
>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>>    webrev: 
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/ 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for 
>>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>> However some error has seen intermittently after 
>>>>>>>>>>>>>>>>>>>>> that.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I investigated the cause of this, I found two 
>>>>>>>>>>>>>>>>>>>>> concerns:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>    A: lack of buffer (.eh_frame section data) 
>>>>>>>>>>>>>>>>>>>>> range check
>>>>>>>>>>>>>>>>>>>>>    B: Language personality routine and Language 
>>>>>>>>>>>>>>>>>>>>> Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and 
>>>>>>>>>>>>>>>>>>>>> ignore personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is 
>>>>>>>>>>>>>>>>>>>>> failed due to these concerns.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo 
>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671), 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle 
>>>>>>>>>>>>>>>>>>>>> Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>




More information about the serviceability-dev mailing list