RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file

Daniel D. Daugherty daniel.daugherty at oracle.com
Mon Mar 30 19:51:48 UTC 2015


Re: report_coredump_status() is really record_coredump_status()

The report_coredump_status() function is designed to record two
things about core dumps for later reporting by the code that
generates the hs_err_pid file. That's why the original
report_coredump_status() didn't output anything to stderr.

By changing the Windows calls to report_coredump_status() into
jio_fprintf() calls you have:

- put output onto stderr that should go to the hs_err_pid file
- made the windows code paths work differently than the non-windows
   code paths

Dan


On 3/30/15 1:28 PM, Yumin Qi wrote:
> Thanks Dan
>
> On 3/30/2015 11:25 AM, Daniel D. Daugherty wrote:
>> On 3/25/15 12:11 PM, Yumin Qi wrote:
>>> Hi, all
>>>
>>>     I updated the webrev with a new change to test case:
>>>     test/runtime/Unsafe/RangeCheck.java
>>>
>>>     Add -XX:-CreateCoredumpOnCrash to test, no dump needed on this 
>>> case.
>>>
>>>     new  webrev: http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>
>> General nit - please update copyright lines to 2015 as needed
>>
>> src/share/vm/runtime/globals.hpp
>>     No comments.
>>
>> src/share/vm/runtime/os.hpp
>>     line 720:   // On Windows this will create an actual minidump, on 
>> Linux/Solaris it will simply check core dump limits
>>     line 721:   static void check_dump_limit(char* buffer, size_t 
>> bufferSize);
>>         The comment on line 720 no longer matches with the function
>>         rename from check_or_create_dump() to check_dump_limit().
>>
>>         You have this comment in vmError.cpp:
>>         943     // check core dump limits on Linux/Solaris, nothing 
>> on Windows
>>         944     os::check_dump_limit(buffer, sizeof(buffer));
>>
>>         so the two comments are out of sync.
>>
> Will convert them to be consistent.
>> src/share/vm/runtime/arguments.cpp
>>     line 3252: } else if (match_option(option, 
>> "-XX:+CreateMinidumpOnCrash", &tail)) {
>>     line 3256: } else if (match_option(option, 
>> "-XX:-CreateMinidumpOnCrash", &tail)) {
>>         These two checks should use the match_option() version
>>         without a '&tail' parameter.
>>
> Will use version w/o tail.
>> src/share/vm/utilities/vmError.cpp
>>     line 217: bool VMError::coredump_status = true;       // presume 
>> we can dump core file first
>>         I don't think you should init this to true. It confuses
>>         things with VMError::report_coredump_status(). It will also
>>         confuse this code:
>>
>>         526   STEP(63, "(printing core file information)")
>>         529       if (coredump_status) {
>>         530         st->print("Core dump written. Default location: 
>> %s", coredump_message);
>>
>>         because coredump_status won't accurately reflect whether
>>         the coredump_message field has been set.
>>
>>     line 943: // check core dump limits on Linux/Solaris, nothing on 
>> Windows
>>         This updated comment doesn't match the unmodified
>>         comment in os.hpp:
>>
>>         720:   // On Windows this will create an actual minidump, on 
>> Linux/Solaris it will simply check core dump limits
>>
> I will consider your comments, then revise with a new version. I 
> remember here there is a case, if status not set, it will output an 
> inconsistent message. Will check again.
>> src/os/aix/vm/os_aix.cpp
>>     No comments.
>>
>> src/os/bsd/vm/os_bsd.cpp
>>     No comments.
>>
>> src/os/linux/vm/os_linux.cpp
>>     No comments.
>>
>> src/os/posix/vm/os_posix.cpp
>>     No comments.
>>
>> src/os/solaris/vm/os_solaris.cpp
>>     No comments.
>>
>> src/os/windows/vm/os_windows.cpp
>>     line 1008: static char buffer[O_BUFLEN];
>>         Why switch from a buffer parameter to a static buffer?
>>         What happens with parallel calls to abort()? Will the static
>>         buffer get stomped by the competing threads?
> The original buffer is static too.  Carrying an buffer for abort seems 
> not right. This buffer in fact only for storing file name (based on 
> pid) here.
> abort() will dump the core file, should we prevent multi-thread 
> calling to this function? I did not check this part, will check if it 
> is MT-safe.
>
>>
>>     line 1015:   // If running on a client version of Windows and 
>> user has not explicitly enabled dumping
>>     line 1016:   if (!os::win32::is_windows_server() && 
>> !CreateCoredumpOnCrash) {
>>         The default for CreateCoredumpOnCrash is now 'true' so the
>>         comment and logic are no longer correct here. The Windows
>>         Client VM will generate minidumps by default.
>>
>>     old line 1007: VMError::report_coredump_status("Minidumps are not 
>> enabled by default on client versions of Windows", false);
>>     new line 1017: jio_fprintf(stderr, "Minidumps are not enabled by 
>> default on client versions of Windows.\n");
>>         There are a number of places where we change from
>>         VMError::report_coredump_status() to jio_fprintf()
>>         and I'm not quite seeing why this was done.
>>
>>         Update: VMError::report_coredump_status() is misnamed. It 
>> doesn't
>>         report anything in the sense that it doesn't print anything. It
>>         does record two pieces of information about core dumps so maybe
>>         it should be VMError::record_coredump_status().
>>
>>     line 1100: jio_fprintf(stderr, "%s.\n", buffer);
>>         At this point, buffer contains the path to the
>>         mini-dump file so the above line simply prints
>>         that on stderr. Why?
>>
>>         Yes, I see that the old code did something similar:
>>
>>         1090     VMError::report_coredump_status(buffer, true);
>>
>>         but report_coredump_status() didn't actually print
>>         anything. It just squirreled away these in vmError.cpp:
>>
>>         217 bool VMError::coredump_status;
>>         218 char VMError::coredump_message[O_BUFLEN];
>>
>>         See comments above about how report_coredump_status()
>>         is misnamed.
>>
>>     At this point, I'm convinced that all the changes from
>>     VMError::report_coredump_status() to jio_fprintf() are
>>     a bad idea.
>>
>
> Changing to jio_fprintf is because report_coredump_status did not 
> output anything, it is just a logging as you indicated. It is 
> reasonable we change it to record_coredump_status instead.  How about 
> add printout to report_coredump_status? So I do not need to use 
> jio_printf here.
>
> Other consideration of using jio_fprintf after call shutdown called, 
> the static output stream still exists, but not guaranteed to work as 
> expected.
>
>
> Thanks
> Yumin
>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>     No comments.
>>
>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>     No comments.
>>
>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>     No comments.
>>
>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>     No comments.
>>
>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>     No comments.
>>
>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>     No comments.
>>
>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>     No comments.
>>
>> test/runtime/Unsafe/RangeCheck.java
>>     No comments.
>>
>> test/runtime/memory/ReadFromNoaccessArea.java
>>     No comments.
>>
>>
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> On 3/23/2015 11:48 AM, Yumin Qi wrote:
>>>> Since Thomas is not a reviewer in open jdk, I need two volunteers 
>>>> (at least one "R"eviewer).
>>>>
>>>> Dan, since you already reviewed previous version, could you have a 
>>>> look?
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 3/20/2015 3:20 PM, Yumin Qi wrote:
>>>>> Thomas,
>>>>>
>>>>>  Thanks. Yes, it is webrev04. Also, I have updated webrev04 to add 
>>>>> another test case change: 
>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>>  (Thanks Dan's update)
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 3/20/2015 11:55 AM, Thomas Stüfe wrote:
>>>>>>
>>>>>> Hi Yumin,
>>>>>>
>>>>>> I think you meant to post webrev.04?
>>>>>>
>>>>>> I looked at 04, and it looks nice. Thank you!
>>>>>>
>>>>>> (still only reviewer with r)
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>> On Mar 20, 2015 6:20 PM, "Yumin Qi" <yumin.qi at oracle.com 
>>>>>> <mailto:yumin.qi at oracle.com>> wrote:
>>>>>>
>>>>>>     Hi, All
>>>>>>
>>>>>>     http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>>>
>>>>>>     This version is based on Thomas' suggestion.
>>>>>>     Tests passed vm.runtime.quick.testlist, JPRT and manual tests.
>>>>>>
>>>>>>     Thanks
>>>>>>     Yumin
>>>>>>
>>>>>>     On 3/18/2015 10:06 AM, Yumin Qi wrote:
>>>>>>
>>>>>>
>>>>>>         On 3/18/2015 4:26 AM, Thomas Stüfe wrote:
>>>>>>
>>>>>>             Hi Yumin,
>>>>>>
>>>>>>             - just aesthetics, but on all implementations of
>>>>>>             os_abort(), at least on the UNIX variants, maybe we 
>>>>>> could
>>>>>>             consistently rename the parameters to be "siginfo" and
>>>>>>             "context" instead of using Windows terms?
>>>>>>
>>>>>>         Sure.
>>>>>>
>>>>>>             - On check_create_dump_limit(), you do not need
>>>>>>             exceptionRecord, contextRecord parameters anymore. The
>>>>>>             only parameters now needed are buffer and buffer size,
>>>>>>             which is not even an output parameter, just a scratch
>>>>>>             buffer for this function to use for printf. For 
>>>>>> output, it
>>>>>>             calls VMError::report_coredump_status(...) at the end to
>>>>>>             provide information about the core file.
>>>>>>
>>>>>>             - I would rename that function from
>>>>>>             check_create_dump_limit() to check_dump_limit() - 
>>>>>> nothing
>>>>>>             is created anymore.
>>>>>>
>>>>>>         Yes, no need to carry those two parameters. I tried to keep
>>>>>>         less changes, but decided to change now.
>>>>>>
>>>>>>             - on Windows, in os::abort(), there is no point 
>>>>>> anymore in
>>>>>>             calling VMError::report_coredump_status(...) because 
>>>>>> that
>>>>>>             information is only used during error log writing which
>>>>>>             already happened. It only makes sense to call this
>>>>>>             function in check_create_dump_limit(), which happens
>>>>>>             before error log writing.
>>>>>>             Instead, maybe error messages like "Call to
>>>>>>             MiniDumpWriteDump() failed" should just be written to
>>>>>>             stderr? This would be consistent with Unix, where the OS
>>>>>>             writes a short message on stderr if core file writing 
>>>>>> fails.
>>>>>>
>>>>>>         Sure, will direct output to stderr.
>>>>>>
>>>>>>             - there is now a new test,
>>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>>>             - could you please also add 
>>>>>> "-XX:-CreateCoredumpOnCrash" ?
>>>>>>             I just added the test and did not want to add the option
>>>>>>             before it was available.
>>>>>>
>>>>>>         OK
>>>>>>
>>>>>>         Thanks
>>>>>>         Yumin
>>>>>>
>>>>>>             Thanks for your work!
>>>>>>
>>>>>>             Thomas
>>>>>>
>>>>>>
>>>>>>             On Tue, Mar 17, 2015 at 2:04 AM, Yumin Qi
>>>>>>             <yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>
>>>>>>             <mailto:yumin.qi at oracle.com 
>>>>>> <mailto:yumin.qi at oracle.com>>>
>>>>>>             wrote:
>>>>>>
>>>>>>                 Hi, Dan and all
>>>>>>
>>>>>>                   I have updated webrev at:
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>>>
>>>>>>                   1) bug changed synopsis
>>>>>>                        8074354: Make CreateMinidumpOnCrash a new 
>>>>>> name and
>>>>>>                 available on all platforms
>>>>>>                   2) tests: JPRT, vm.runtime.quick.testlist (on
>>>>>>             Windows), jtreg on
>>>>>>                 Windows and Linux/Unix.
>>>>>>
>>>>>>                   Thanks
>>>>>>                   Yumin
>>>>>>
>>>>>>
>>>>>>                 On 3/12/2015 12:59 PM, Daniel D. Daugherty wrote:
>>>>>>
>>>>>>                     I believe there are both JavaTest/JTREG tests in
>>>>>>             the hotspot repo
>>>>>>                     and there are VM/NSK tests from back when we did
>>>>>>             phone home...
>>>>>>
>>>>>>                     Check with Christian or Misha...
>>>>>>
>>>>>>                     Dan
>>>>>>
>>>>>>
>>>>>>                     On 3/12/15 1:50 PM, Yumin Qi wrote:
>>>>>>
>>>>>>                         Thanks, Dan
>>>>>>
>>>>>>                         Will look at and run those tests. Are they
>>>>>>             under nsk?
>>>>>>
>>>>>>                         Yumin
>>>>>>
>>>>>>                         On 3/12/2015 12:29 PM, Daniel D. 
>>>>>> Daugherty wrote:
>>>>>>
>>>>>>                             Yumin,
>>>>>>
>>>>>>                             There are some error handler tests. You
>>>>>>             should find
>>>>>>                             those and make
>>>>>>                             sure that you run them on Windows 
>>>>>> since you're
>>>>>>                             changing the order
>>>>>>                             there...
>>>>>>
>>>>>>                             Dan
>>>>>>
>>>>>>
>>>>>>                             On 3/12/15 12:42 PM, Yumin Qi wrote:
>>>>>>
>>>>>>                                 Hi, Thomas and all
>>>>>>
>>>>>>                                    The second version of the change:
>>>>>>                                    webrev:
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev02/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>>>
>>>>>>                                    In this version:
>>>>>>                                    1) switch order on Windows to 
>>>>>> first
>>>>>>             print log
>>>>>>                                 file as other platform, then dump 
>>>>>> core
>>>>>>             file if
>>>>>>                                 needed. When VMError created after
>>>>>>             crash, siginfo
>>>>>>                                 and context which are 
>>>>>> ExceptionRecord and
>>>>>>                                 ContextRecord on Windows are 
>>>>>> recorded,
>>>>>>             mini (or
>>>>>>                                 full, due to 'or' used so type 
>>>>>> will be
>>>>>>             mini
>>>>>>                                 anyway) dump creates dump file based
>>>>>>             on those two
>>>>>>                                 as before.
>>>>>>                                    2) to make os::abort to get above
>>>>>>             two pointers,
>>>>>>                                 added two more fields to the 
>>>>>> function
>>>>>>             parameters:
>>>>>>                                    3)  changes for
>>>>>> test/runtime/ErrorHandling/SecondaryError.java ---
>>>>>>                                 added "-XX:-CreateCoredumpOnCrash"
>>>>>>
>>>>>>                                     -  static void abort(bool
>>>>>>             dump_core = true);
>>>>>>                                    +  static void abort(bool 
>>>>>> dump_core
>>>>>>             = true,
>>>>>>                                 void *siginfo = NULL, void 
>>>>>> *context =
>>>>>>             NULL);
>>>>>>
>>>>>>                                    Tests: JPRT and manually.
>>>>>>
>>>>>>                                 Thanks
>>>>>>                                 Yumin
>>>>>>
>>>>>>                                 On 3/11/2015 3:47 AM, Thomas 
>>>>>> Stüfe wrote:
>>>>>>
>>>>>>                                     Hi Yumin,
>>>>>>
>>>>>>                                     There is also
>>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>>>                                     - could you please add
>>>>>> "-XX:-CreateCoredumpOnCrash" ?
>>>>>>             Thank you!
>>>>>>
>>>>>>                                     Beyond that, as I wrote in 
>>>>>> the bug
>>>>>>             report
>>>>>>                                     comments:
>>>>>>
>>>>>>                                     "This is also a problem on 
>>>>>> Windows -
>>>>>>                                     MiniDumpWriteDump() may hang
>>>>>>             infinitly. And on
>>>>>>                                     Windows this is worse than under
>>>>>>             UNIX because
>>>>>>                                     we create the Dump before 
>>>>>> writing
>>>>>>             the hs-err
>>>>>>                                     file, so if the Dump hangs, 
>>>>>> we get
>>>>>>             no error
>>>>>>                                     log. I would like to revert the
>>>>>>             order: create
>>>>>>                                     the minidump after writing the
>>>>>>             error log, the
>>>>>>                                     same way Unix does it. We did 
>>>>>> this
>>>>>>             in our JVM
>>>>>>                                     (SAP) because for us, error logs
>>>>>>             are more
>>>>>>                                     useful than minidumps. "
>>>>>>
>>>>>>                                     So, I would like to see 
>>>>>> os::abort
>>>>>>             on Windows
>>>>>>                                     call MiniDumpWriteDump(), and 
>>>>>> thus
>>>>>>             the mini
>>>>>>                                     dump writing moved after the 
>>>>>> error log
>>>>>>                                     writing. This would also make 
>>>>>> the
>>>>>>             code way
>>>>>>                                     cleaner because the control flow
>>>>>>             would be the
>>>>>>                                     same on all platforms.
>>>>>>
>>>>>>                                     I understand that this may be 
>>>>>> out
>>>>>>             of scope for
>>>>>>                                     your change, but I would like to
>>>>>>             know what
>>>>>>                                     others think about this.
>>>>>>
>>>>>>                                     Kind regards, Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                                     On Wed, Mar 11, 2015 at 8:02 AM,
>>>>>>             Yumin Qi
>>>>>> <yumin.qi at oracle.com
>>>>>>             <mailto:yumin.qi at oracle.com>
>>>>>> <mailto:yumin.qi at oracle.com
>>>>>>             <mailto:yumin.qi at oracle.com>>
>>>>>> <mailto:yumin.qi at oracle.com
>>>>>>             <mailto:yumin.qi at oracle.com>
>>>>>>             <mailto:yumin.qi at oracle.com
>>>>>> <mailto:yumin.qi at oracle.com>>>> wrote:
>>>>>>
>>>>>>                                         Please review:
>>>>>>
>>>>>>                                         bug:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8074354
>>>>>>                                         webrev:
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev01/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>>>
>>>>>>                                         Summary: Tests timed out 
>>>>>> when
>>>>>>             VM crashes
>>>>>>                                     and dumping core file
>>>>>>                                         which in the test case is 
>>>>>> not
>>>>>>             needed. To
>>>>>>                                     make core not created,
>>>>>>                                         the fix changed
>>>>>>             CreateMinidumpOnCrash to
>>>>>> CreateCoredumpOnCrash,
>>>>>>                                         the former is only used on
>>>>>>             Windows and the
>>>>>>                                     latter for all
>>>>>>                                         platforms. When VM 
>>>>>> crashes on
>>>>>>             non Windows,
>>>>>>                                     core file generated as
>>>>>>                                         default if OS sets core dump
>>>>>>             allowed.
>>>>>>                                     Default value of
>>>>>> CreateCoredumpOnCrash set to
>>>>>>             'true' to
>>>>>>                                     keep same behavior on non
>>>>>>                                         Windows platforms, but 
>>>>>> changed
>>>>>>             for Windows
>>>>>>                                     --- original is false,
>>>>>>                                         not create minidump on
>>>>>>             Windows. With
>>>>>> CreateCoredumpOnCrash turned
>>>>>>                                         off, no core file will be
>>>>>>             generated.
>>>>>> CreateMinidumpOnCrash still
>>>>>>                                         can be used on 
>>>>>> commandline but
>>>>>>             only as
>>>>>>                                     alias for the new flag.
>>>>>>
>>>>>>                                         Tests: JPRT, manual tests 
>>>>>> setting
>>>>>> CreateMinidumpOnCrash on
>>>>>>                                         commandline to verify flag
>>>>>>             change as alias.
>>>>>>
>>>>>>                                         Thanks
>>>>>>                                         Yumin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list