RFR: 8240956: SEGV in DwarfParser::process_dwarf after JDK-8234624
serguei.spitsyn at oracle.com
serguei.spitsyn at oracle.com
Tue Mar 24 16:39:34 UTC 2020
Hi Yasumasa,
I'm okay with this update.
My mach5 test run for this patch is passed.
Thanks,
Serguei
On 3/23/20 17:08, Yasumasa Suenaga wrote:
> Hi Serguei,
>
> Thanks for your comment!
> I uploaded new webrev:
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.04/
>
> Also I pushed it to submit repo:
>
> http://hg.openjdk.java.net/jdk/submit/rev/fade6a949bd1
>
> On 2020/03/24 7:39, serguei.spitsyn at oracle.com wrote:
>> Hi Yasumasa,
>>
>> The mach5 tier5 testing looks good.
>> The serviceability/sa/ClhsdbPstack.java is failed without fix and is
>> not failed with it.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 3/23/20 10:18, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> I looked at you changes.
>>> It is hard to understand if this fully solves the issue.
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>
>>>
>>> @@ -34,10 +34,11 @@
>>> public static LinuxAMD64CFrame getTopFrame(LinuxDebugger dbg,
>>> Address rip, ThreadContext context) {
>>> Address libptr = dbg.findLibPtrByAddress(rip);
>>> Address cfa =
>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>> DwarfParser dwarf = null;
>>> + boolean unsupportedDwarf = false;
>>> if (libptr != null) { // Native frame
>>> try {
>>> dwarf = new DwarfParser(libptr);
>>> dwarf.processDwarf(rip);
>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>
>>> @@ -45,24 +46,33 @@
>>> !dwarf.isBPOffsetAvailable())
>>> ?
>>> context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>> :
>>> context.getRegisterAsAddress(dwarf.getCFARegister())
>>> .addOffsetTo(dwarf.getCFAOffset());
>>> } catch (DebuggerException e) {
>>> - // Bail out to Java frame case
>>> + if (dwarf != null) {
>>> + // DWARF processing should succeed when the frame is native
>>> + // but it might fail if CIE has language personality routine
>>> + // and/or LSDA.
>>> + dwarf = null;
>>> + unsupportedDwarf = true;
>>> + } else {
>>> + throw e;
>>> + }
>>> }
>>> }
>>> return (cfa == null) ? null
>>> - : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf);
>>> + : new LinuxAMD64CFrame(dbg, cfa, rip, dwarf, !unsupportedDwarf);
>>> }
>>>
>>> @@ -121,13 +131,25 @@
>>> }
>>> return isValidFrame(nextCFA, context) ? nextCFA : null;
>>> }
>>> - private DwarfParser getNextDwarf(Address nextPC) {
>>> - DwarfParser nextDwarf = null;
>>> + @Override
>>> + public CFrame sender(ThreadProxy thread) {
>>> + if (!possibleNext) {
>>> + return null;
>>> + }
>>> +
>>> + ThreadContext context = thread.getContext();
>>> +
>>> + Address nextPC = getNextPC(dwarf != null);
>>> + if (nextPC == null) {
>>> + return null;
>>> + }
>>> + DwarfParser nextDwarf = null;
>>> + boolean unsupportedDwarf = false;
>>> if ((dwarf != null) && dwarf.isIn(nextPC)) {
>>> nextDwarf = dwarf;
>>> } else {
>>> Address libptr = dbg.findLibPtrByAddress(nextPC);
>>> if (libptr != null) {
>>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>
>>> @@ -138,33 +160,29 @@
>>> }
>>> }
>>> }
>>> if (nextDwarf != null) {
>>> + try {
>>> nextDwarf.processDwarf(nextPC);
>>> + } catch (DebuggerException e) {
>>> + // DWARF processing should succeed when the frame is native
>>> + // but it might fail if CIE has language personality routine
>>> + // and/or LSDA.
>>> + nextDwarf = null;
>>> + unsupportedDwarf = true;
>>> }
>>>
>>> This fix looks like a hack.
>>> Should we just propagate the Debugging exception instead of trying
>>> to maintain unsupportedDwarf flag?
>
> DwarfParser::processDwarf would throw DebuggerException if it cannot
> find DWARF which relates to PC.
> PC at this point is for next frame. So current frame (`this` object)
> is valid, and it should be processed.
>
>
>>> Also, I don't like that DWARF-specific abbreviations (like CIE,
>>> IDE,LSDA, etc.) are used without any comments explaining them.
>>> The code has to be generally readable without looking into the DWARF
>>> spec each time.
>
> I added comments for them in this webrev.
>
>
> Thanks,
>
> Yasumasa
>
>
>>> I'm submitting mach5 jobs to make sure the issue has been resolved
>>> with your fix.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 3/20/20 17:55, Yasumasa Suenaga wrote:
>>>> Thanks Chris!
>>>> I'm waiting for reviewers for this change.
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/21 4:23, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> The failure is due to JDK-8231634, so not something you need to
>>>>> worry about.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 3/20/20 2:45 AM, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> I uploaded new webrev which includes reverting change for
>>>>>> ProblemList:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.03/
>>>>>>
>>>>>> I tested it on submit repo
>>>>>> (mach5-one-ysuenaga-JDK-8240956-2-20200320-0814-9586301),
>>>>>> but it has failed in ClhsdbJstackXcompStress.java.
>>>>>> However I think it is not caused by this change because
>>>>>> ClhsdbJstackXcompStress.java tests `jhsdb jstack`, not mixed
>>>>>> mode, it would not parse DWARF.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/20 13:55, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> The test has been problem listed so please add undoing this to
>>>>>>> your webrev. Here's the diff that problem listed it:
>>>>>>>
>>>>>>> diff --git a/test/hotspot/jtreg/ProblemList.txt
>>>>>>> b/test/hotspot/jtreg/ProblemList.txt
>>>>>>> --- a/test/hotspot/jtreg/ProblemList.txt
>>>>>>> +++ b/test/hotspot/jtreg/ProblemList.txt
>>>>>>> @@ -115,7 +115,7 @@
>>>>>>> serviceability/sa/ClhsdbPrintAll.java 8193639 solaris-all
>>>>>>> serviceability/sa/ClhsdbPrintAs.java 8193639 solaris-all
>>>>>>> serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris-all
>>>>>>> -serviceability/sa/ClhsdbPstack.java 8193639 solaris-all
>>>>>>> +serviceability/sa/ClhsdbPstack.java 8193639,8240956
>>>>>>> solaris-all,linux-all
>>>>>>> serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java
>>>>>>> 8193639 solaris-all
>>>>>>> serviceability/sa/ClhsdbScanOops.java 8193639,8235220,8230731
>>>>>>> solaris-all,linux-x64,macosx-x64,windows-x64
>>>>>>> serviceability/sa/ClhsdbSource.java 8193639 solaris-all
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/16/20 5:07 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> This webrev has passed submit repo
>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200316-0924-9487169) and
>>>>>>>> additional tests.
>>>>>>>> So please review it:
>>>>>>>>
>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>> webrev:
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020/03/16 21:03, Yasumasa Suenaga wrote:
>>>>>>>>> Thank you so much, David!
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2020/03/16 21:01, David Holmes wrote:
>>>>>>>>>> On 16/03/2020 9:46 pm, David Holmes wrote:
>>>>>>>>>>> On 16/03/2020 7:20 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>
>>>>>>>>>>>> I missed loop condition, so I fixed it and pushed to submit
>>>>>>>>>>>> repo.
>>>>>>>>>>>> Could you try again?
>>>>>>>>>>>>
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/9c148df17f23
>>>>>>>>>>>>
>>>>>>>>>>>> webrev is here:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.02/
>>>>>>>>>>>
>>>>>>>>>>> Test job resubmitted. Will advise results if it completes
>>>>>>>>>>> before I go to bed :)
>>>>>>>>>>
>>>>>>>>>> Seems to have passed okay.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks a lot!
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2020/03/16 16:17, David Holmes wrote:
>>>>>>>>>>>>> Sorry it is still crashing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # A fatal error has been detected by the Java Runtime
>>>>>>>>>>>>> Environment:
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # SIGSEGV (0xb) at pc=0x00007f98ec01e94e, pid=13702,
>>>>>>>>>>>>> tid=13704
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment (15.0)
>>>>>>>>>>>>> (fastdebug build
>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source)
>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug
>>>>>>>>>>>>> 15-internal+0-2020-03-16-0640217.suenaga.source, mixed
>>>>>>>>>>>>> mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>> # C [libsaproc.so+0x494e]
>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>> #
>>>>>>>>>>>>>
>>>>>>>>>>>>> Same as before.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 16/03/2020 4:57 pm, David Holmes wrote:
>>>>>>>>>>>>>> On 16/03/2020 4:51 pm, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> On 2020/03/16 15:43, Chris Plummer wrote:
>>>>>>>>>>>>>>>> BTW, if you submit it to the submit repo, we can then
>>>>>>>>>>>>>>>> go and run additional internal tests (and even more
>>>>>>>>>>>>>>>> builds) using that job.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for that tip Chris!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've pushed the change to submit repo, but I've not yet
>>>>>>>>>>>>>>> received the result.
>>>>>>>>>>>>>>> I will share you when I get job ID.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We can see the id. Just need to wait for the builds to
>>>>>>>>>>>>>> complete before submitting the additional tests.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 3/15/20 11:36 PM, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for testing it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I updated webrev to avoid bailout to Java frame when
>>>>>>>>>>>>>>>>> DWARF has language personality routine or LSDA.
>>>>>>>>>>>>>>>>> Could you try it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.01/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It works well on my Fedora 31 and Oracle Linux 7.7 .
>>>>>>>>>>>>>>>>> I've pushed it to submit repo.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Diff from webrev.00 is here:
>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/6f11cd275652
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2020/03/16 13:12, David Holmes wrote:
>>>>>>>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 16/03/2020 12:53 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> On 16/03/2020 12:17 pm, David Holmes wrote:
>>>>>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I can't review this as I know nothing about the
>>>>>>>>>>>>>>>>>>>> code, but I'm putting the patch through our
>>>>>>>>>>>>>>>>>>>> internal testing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sorry but the crashes still exist:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # A fatal error has been detected by the Java
>>>>>>>>>>>>>>>>>>> Runtime Environment:
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # SIGSEGV (0xb) at pc=0x00007fcdd403894e,
>>>>>>>>>>>>>>>>>>> pid=16948, tid=16949
>>>>>>>>>>>>>>>>>>> #
>>>>>>>>>>>>>>>>>>> # JRE version: Java(TM) SE Runtime Environment
>>>>>>>>>>>>>>>>>>> (15.0) (fastdebug build
>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev)
>>>>>>>>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM
>>>>>>>>>>>>>>>>>>> (fastdebug
>>>>>>>>>>>>>>>>>>> 15-internal+0-2020-03-16-0214474.david.holmes.jdk-dev,
>>>>>>>>>>>>>>>>>>> mixed mode, sharing, tiered, compressed oops, g1 gc,
>>>>>>>>>>>>>>>>>>> linux-amd64)
>>>>>>>>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>>>>>>>>> # C [libsaproc.so+0x494e]
>>>>>>>>>>>>>>>>>>> DwarfParser::process_dwarf(unsigned long)+0x4e
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> in fact they seem worse as the test seems to always
>>>>>>>>>>>>>>>>>>> crash now.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Not worse - sorry. I see 6 failures out of 119 runs
>>>>>>>>>>>>>>>>>> of the test in linux-x64. I don't see a pattern as to
>>>>>>>>>>>>>>>>>> where it fails versus passes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It doesn't fail for me locally.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 14/03/2020 11:35 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> JBS:
>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240956
>>>>>>>>>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240956/webrev.00/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> JDK-8234624 introduced DWARF parser in SA for
>>>>>>>>>>>>>>>>>>>>> unwinding native frames in jstack mixed mode.
>>>>>>>>>>>>>>>>>>>>> However some error has seen intermittently after
>>>>>>>>>>>>>>>>>>>>> that.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I investigated the cause of this, I found two
>>>>>>>>>>>>>>>>>>>>> concerns:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> A: lack of buffer (.eh_frame section data)
>>>>>>>>>>>>>>>>>>>>> range check
>>>>>>>>>>>>>>>>>>>>> B: Language personality routine and Language
>>>>>>>>>>>>>>>>>>>>> Specific Data Area (LSDA) are not considered
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I addd range check for .eh_frame processing, and
>>>>>>>>>>>>>>>>>>>>> ignore personality routine and LSDA in this webrev.
>>>>>>>>>>>>>>>>>>>>> Also I added bailout code if DWARF processing is
>>>>>>>>>>>>>>>>>>>>> failed due to these concerns.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This change has passed all tests on submit repo
>>>>>>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8240956-20200313-1518-9434671),
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> also I tested it on my Fedora 31 box and Oracle
>>>>>>>>>>>>>>>>>>>>> Linux 7.7 container.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>
More information about the serviceability-dev
mailing list