RFR(XXS): 8224793: os::die() does not honor CreateCoredumpOnCrash option
David Holmes
david.holmes at oracle.com
Sat Jun 1 02:28:01 UTC 2019
Hi Dan, Kim,
Glad this was resolved. Code changes look good.
I agree that consolidation into os_posix.cpp should be a separate RFE. I
also suspect that BSD comment has not been true for a long time.
Windows is as always a special case. None of Window's termination
mechanisms "dump core". If you want a minidump on windows then you have
to explicitly ask the debugging subsystem to produce one for you. The
logic we have for deciding when to do that on Windows is already complex
and I would not suggest trying to incorporate that into os::die() -
seems a lot of pain for little gain and just more things that could go
wrong as we try to terminate.
Thanks,
David
-----
On 1/06/2019 6:25 am, Daniel D. Daugherty wrote:
> Greetings,
>
> I've updated the fix based on the code review discussions with Kim and
> David H. I think it's easier to review using the full webrev, but if
> you're curious exactly how the code evolved, then there is also an
> incremental webrev.
>
> Full webrev:
> http://cr.openjdk.java.net/~dcubed/8224793-webrev/2-for-jdk-jdk13.full/
>
> Incremental webrev:
> http://cr.openjdk.java.net/~dcubed/8224793-webrev/2-for-jdk-jdk13.inc/
>
> I have Mach5 Tier[1-3] testing running right now and it is looking good.
>
> Dan
>
>
> On 5/29/19 3:29 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I've updated the fix from calling os::exit(1) to calling
>> os::signal_raise(SIGKILL). Here's the updated webrevs:
>>
>> Full webrev:
>> http://cr.openjdk.java.net/~dcubed/8224793-webrev/1-for-jdk-jdk13.full/
>>
>> Incremental webrev:
>> http://cr.openjdk.java.net/~dcubed/8224793-webrev/1-for-jdk-jdk13.inc/
>>
>> I have Mach5 Tier[1-3] testing running right now and it is looking good.
>> I'll follow with Mach5 Tier[456] testing. Manually running 'java' with
>> the options from the test in JDK-8188872 shows this snippet:
>>
>>> # Problematic frame:
>>> # V [libjvm.dylib+0xc06347] VMError::controlled_crash(int)+0xd5
>>> #
>>> # CreateCoredumpOnCrash turned off, no core file dumped
>>> #
>>> # An error report file with more information is saved as:
>>> # /work/shared/bug_hunt/jdk_jdk_exp/hs_err_pid79552.log
>>> Recording reporting_start_time for TestUnresponsiveErrorHandler.
>>> # [ timer expired, abort... ]
>>> Killed: 9
>>>
>>> $ echo $?
>>> 137
>>
>> The "Killed: 9" is a _nice_ addition that shows we had an abnormal
>> termination. The exit_code == 137 is also way better than my
>> original exit_code == 1.
>>
>> David, Kim and I are still hashing out whether -XX:-CreateCoredumpOnCrash
>> takes precedence or not.
>>
>> Dan
>>
>>
>>
>> On 5/28/19 7:37 PM, Daniel D. Daugherty wrote:
>>> Resending with the bug's JBS link included...
>>>
>>> Greetings,
>>>
>>> While working on another bug (JDK-8188872), I happened to notice that
>>> the test was generating core files even though the test is run with
>>> the '-XX:+CreateCoredumpOnCrash' option.
>>>
>>> After investigating, I discovered that the os::die() function does not
>>> honor the CreateCoredumpOnCrash option which was a surprise to me.
>>>
>>> Webrev URL:
>>> http://cr.openjdk.java.net/~dcubed/8224793-webrev/0-for-jdk-jdk13/
>>>
>>> Bug: JDK-8224793 os::die() does not honor CreateCoredumpOnCrash option
>>> https://bugs.openjdk.java.net/browse/JDK-8224793
>>>
>>> Testing: Mach5 Tier[1-5]
>>> Included the fix in my latest round of 8153224 testing
>>> on Solaris-X64 where this bug reproduces quite a bit.
>>>
>>> Thanks, in advance, for any comments, suggestions, or questions.
>>>
>>> Dan
>>>
>>>
>>>
>>
>>
>
More information about the hotspot-runtime-dev
mailing list