RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Apr 9 13:03:37 UTC 2015


Yumin,

I'm going to hold off on reviewing this webrev until you resolve
Thomas' comments from 2015-04-08 and 2015-04-09. If your resolution
is to move forward with webrev07 I'll review it then.

Dan


On 4/8/15 9:16 PM, Yumin Qi wrote:
> Hi, Dan
>
>   New webrev at
> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>
>   To make code clean, add os::abort_internal which is only used for 
> Windows.
>   Since passing VMError::coredump_message and its size as parameters 
> across multiple functions makes function have a bigger signature, I 
> change the way to access them -- add two functions in VMError to 
> access them: coredump_msg_buffer() and coredump_msg_buffer_size(). 
> They are static members, so can save stack space for passing around.
>
>   in os_windows.cpp, abort(bool, void*, void*):
>   if error encountered, exit with error messages.
>   If no error, VMError::coredump_message contains the dump path/name. 
> I just use them so no need to have extra space for name allocation.
>
>   Passed JPRT and manual tests.
>
>   Thanks
>   Yumin
>
> On 4/7/2015 9:23 AM, Daniel D. Daugherty wrote:
>> On 4/1/15 10:57 AM, Yumin Qi wrote:
>>> Hi, Dan and Thomas,
>>>
>>>   I have a new webrev at
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>>
>>
>> src/share/vm/runtime/globals.hpp
>>     Sorry, I didn't notice this before:
>>
>>     L939:   product(bool, CreateCoredumpOnCrash, 
>> true,                                \
>>     L940:           "Create minidump on VM fatal 
>> error")                              \
>>
>>         The "minidump" on L940 should change also. Don't know if you
>>         want to say "core/mini dump" or something else.
>>
>>
>> src/share/vm/runtime/os.hpp
>>     L719:   // On Linux/Solaris it will simply check core dump limits 
>> while on Windows will do nothing.
>>         grammar: 'on Windows will do nothing'
>>               -> 'on Windows it will do nothing'
>>
>>
>> src/share/vm/runtime/arguments.cpp
>>     No comments.
>>
>> src/share/vm/utilities/vmError.hpp
>>     No comments.
>>
>> src/share/vm/utilities/vmError.cpp
>>     No comments.
>>
>> src/os/aix/vm/os_aix.cpp
>>     No comments.
>>
>> src/os/bsd/vm/os_bsd.cpp
>>     No comments.
>>
>> src/os/linux/vm/os_linux.cpp
>>     No comments.
>>
>> src/os/posix/vm/os_posix.cpp
>>     No comments.
>>
>> src/os/solaris/vm/os_solaris.cpp
>>     No comments.
>>
>> src/os/windows/vm/os_windows.cpp
>>     L92: jio_snprintf(buffer, bufferSize, "%s\\hs_err_pid%u.mdmp", 
>> get_current_directory(NULL, 0), current_process_id());
>>         Other non-Windows calls to get_current_directory() do a NULL
>>         check before using the return value. The original Windows
>>         code that called get_current_directory() did not make this
>>         check, but I think it's better to be consistent and safe.
>>
>>     L1007:   static char buffer[O_BUFLEN];
>>     L1008:   char filename[FILENAME_MAX];
>>         Two more potentially large stack variables here. Will we
>>         run into stack overflow on Win* more frequently when what
>>         we really wanted was the reason for the abort() call?
>>
>>         The reason the original code used 'buffer' to construct the
>>         dump file name was to save space. The original code was also
>>         careful to not need the file name in any error messages after
>>         the point at which the dump file was successfully created.
>>
>>         It would be better (more resilient) if you could pass in
>>         an already existing buffer to abort() like the original
>>         code passed into os::check_or_create_dump(). Yes, that
>>         would require visiting all existing calls to abort().
>>
>>     L1015:   assert(out->is_open(), "defaultStream should be open");
>>         Is it possible for os::abort() to be called early enough on
>>         Windows such that VMError::out is not actually setup? Yes,
>>         this would have to be absolutely catastrophic...
>>
>>         See the comment in os_solaris.cpp L1520-1522.
>>
>>     L1018: jio_snprintf(buffer, sizeof(buffer), "%s", "Either 
>> CreateCoredumpOnCrash is disabled from command"
>>     L1019                  " line or coredump is not available");
>>         When you are using jio_snprintf() as a replacement for:
>>
>>             strncpy(buffer, "my string", sizeof(buffer));
>>             buffer[sizeof(buffer) - 1] = '\0';
>>
>>         you don't need the "%s" format string. So this code would be:
>>
>>         jio_snprintf(buffer, sizeof(buffer),
>>                      "Either CreateCoredumpOnCrash is disabled from "
>>                      "command line or coredump is not available");
>>
>>         Same comment for: L1026, L1039
>>
>>     L1020: goto done;
>>         The use of 'goto' makes me cringe. You can refactor much
>>         of this logic into an abort_internal() function that
>>         abort() calls and keep all the original "return" logic
>>         to bail out.
>>
>>     L1045: // Older versions of dbghelp.h doesn't contain all
>>         grammar: 'doesn't' -> 'do not'
>>
>>     L1052:   cwd = get_current_directory(NULL, 0);
>>     L1053: jio_snprintf(filename, sizeof(filename), 
>> "%s\\hs_err_pid%u.mdmp", cwd, current_process_id());
>>         Other non-Windows calls to get_current_directory() do a NULL
>>         check before using the return value. The original Windows
>>         code that called get_current_directory() did not make this
>>         check, but I think it's better to be consistent and safe.
>>
>>     L1092:   // well done.
>>         Like a overcooked steak? :-) Not sure what you're trying
>>         to say here...
>>
>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>     No comments.
>>
>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>     No comments.
>>
>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>     No comments.
>>
>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>     No comments on these four.
>>
>>
>> Dan
>>
>>
>>>
>>>   What changes:
>>>
>>>    1) os_windows.cpp, os::abort(bool ....):
>>>       a) Since when abort() is called,  the last function called 
>>> before exit process,  VMError::log already closed, which is the file 
>>> logging. defaultStream still available to use, so use it for writing 
>>> error or success messages. VMError::out is the defaultStream, for 
>>> using it, added a function out_stream()  in VMError.
>>>       b) Delete the codes which decide to write core based on 
>>> client/server version with debug. The new code is if 
>>> CreateCoredumpOnCrash set or dump_core is on, the coredump will be 
>>> tried no matter which version it is. There is no special reason not 
>>> to dump core with client version I think. Any concern for this change?
>>>       c) Use of 'goto' in this function. I tried not to use it, but 
>>> using it makes code clear. I remember somewhere suggest not using 
>>> 'goto' in cplusplus.
>>>
>>>     2) changed comments to make them consistent as indicated by Dan.
>>>
>>>     3) Added/modified head version for files.
>>>
>>>     Tests: JPRT, manual test on local Windows.
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> On 3/31/2015 1:25 AM, Thomas Stüfe wrote:
>>>> Hi Daniel, Yumin,
>>>>
>>>> sorry, a lot of Yumins changes were based on suggestions from me, 
>>>> so here more background:
>>>>
>>>> Yumin reversed the order of error log writing and core dumping. The 
>>>> reason for that are explained in detail at the start of the mail 
>>>> thread, in short:
>>>>
>>>> - on Windows core file dumping may hang - its rare but it happens - 
>>>> preventing error logs. Depending on who you ask, error logs are 
>>>> more important than minidumps, so it makes sense to first write the 
>>>> erorr log.
>>>> - This also brings windows core paths closer to UNIX code paths. 
>>>> See below.
>>>>
>>>> About writing to stderr in os::abort():
>>>>
>>>> After this change, calling VmError::report_coredump_status() in 
>>>> os::abort() will not do anything because the error log is already 
>>>> written at that point. I suggested to Yumin writing to stderr 
>>>> instead in os::abort()  because that mimicks UNIX behaviour: On 
>>>> UNIX, you call abort(3), which writes a core and aborts. It writes 
>>>> a short message to stderr "Aborted (core dumped)" or just "Aborted".
>>>>
>>>> After Yumins change, on Windows os::abort also writes the core, 
>>>> then aborts. It made sense to me that this function should behave 
>>>> the same way as on UNIX: if core writing fails, write to stderr. 
>>>> There is no way otherwise to communicate a core dump writing 
>>>> failure. After writing the core, process stops.
>>>>
>>>> ---
>>>>
>>>> About the control flows:
>>>>
>>>> Before Yumins change:
>>>>
>>>> On Windows
>>>> 1) os::check_or_create_dump(): we wrote the core dump and then 
>>>> information about the dump (file location, success or error) was 
>>>> handed to the misnamed VmError::report_coredump_status(), which 
>>>> just stored that information.
>>>> 2) VmError::report(): Error log was written and in step 63 
>>>> information about the core (already written at that point) was 
>>>> added to error log.
>>>> 3) os::abort() just did an exit(3)
>>>>
>>>> On Unix,
>>>> 1) os::check_or_create_dump(): checks the limits to guess whether 
>>>> core dumping later will succeed. Again, that information is handed 
>>>> to VmError::report_coredump_status()
>>>> 2) VmError::report(): Error log is written and in step 63, 
>>>> information about the core's probable future location is added to 
>>>> error log. Here, messages use past tense, which is misleading, 
>>>> because the core is not yet written.
>>>> 3) os::abort() calls abort(3), which stops and writes a core.
>>>>
>>>> Yumin pushed core file writing on Windows down to os::abort(). This 
>>>> brings the control flow much closer to Unix:
>>>>
>>>> (1) call os::check_dump_limit() to check the prerequisites for core 
>>>> file writing and gather a bit information to put in the error log: 
>>>> On Unix, limit checks, on windows, so far nothing much but 
>>>> precalculating the error file name.
>>>> (2) VmError::report(): Error log is written and information about 
>>>> the core's probable location is added to error log.
>>>> (3) os::abort() is called, which on all platforms:
>>>>  - dumps the core
>>>>  - if failing, writes a one-liner to stderr
>>>>  - stops the process.
>>>>
>>>> I think, if one agrees that reversing order of core dumping and 
>>>> error log writing on windows is a good thing, this change looks ok 
>>>> to me. Some improvements could make it clearer:
>>>>
>>>> - maybe rename "os::check_dump_limit()" (was my suggestion, sorry) 
>>>> to "os::check_core_prerequisites()" - that leaves room to flesh it 
>>>> out a bit more on Windows, where we do not have UNIX limits.
>>>> - make the messages in VMError::report() future sense: "Will write 
>>>> core file to location.....".
>>>> - Make the messages written in os::abort() on Windows clearer, and 
>>>> shorter, and we also do not need the buffer to be static here. 
>>>> Actually, if we want to be as on UNIX, just dumping "core file 
>>>> writing succeeded" or "failed" - maybe with an error number from 
>>>> GetLastError() - will be enough because the file location is 
>>>> already printed in the error log. Like on UNIX.
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>> On Mon, Mar 30, 2015 at 9:51 PM, Daniel D. Daugherty 
>>>> <daniel.daugherty at oracle.com <mailto:daniel.daugherty at oracle.com>> 
>>>> wrote:
>>>>
>>>>     Re: report_coredump_status() is really record_coredump_status()
>>>>
>>>>     The report_coredump_status() function is designed to record two
>>>>     things about core dumps for later reporting by the code that
>>>>     generates the hs_err_pid file. That's why the original
>>>>     report_coredump_status() didn't output anything to stderr.
>>>>
>>>>     By changing the Windows calls to report_coredump_status() into
>>>>     jio_fprintf() calls you have:
>>>>
>>>>     - put output onto stderr that should go to the hs_err_pid file
>>>>     - made the windows code paths work differently than the non-windows
>>>>       code paths
>>>>
>>>>
>>>>     Dan
>>>>
>>>>
>>>>
>>>>     On 3/30/15 1:28 PM, Yumin Qi wrote:
>>>>
>>>>         Thanks Dan
>>>>
>>>>         On 3/30/2015 11:25 AM, Daniel D. Daugherty wrote:
>>>>
>>>>             On 3/25/15 12:11 PM, Yumin Qi wrote:
>>>>
>>>>                 Hi, all
>>>>
>>>>                     I updated the webrev with a new change to test
>>>>                 case:
>>>>                     test/runtime/Unsafe/RangeCheck.java
>>>>
>>>>                     Add -XX:-CreateCoredumpOnCrash to test, no dump
>>>>                 needed on this case.
>>>>
>>>>                     new  webrev:
>>>>                 http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>>>                 <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>
>>>>
>>>>             General nit - please update copyright lines to 2015 as
>>>>             needed
>>>>
>>>>             src/share/vm/runtime/globals.hpp
>>>>                 No comments.
>>>>
>>>>             src/share/vm/runtime/os.hpp
>>>>                 line 720:   // On Windows this will create an
>>>>             actual minidump, on Linux/Solaris it will simply check
>>>>             core dump limits
>>>>                 line 721:   static void check_dump_limit(char*
>>>>             buffer, size_t bufferSize);
>>>>                     The comment on line 720 no longer matches with
>>>>             the function
>>>>                     rename from check_or_create_dump() to
>>>>             check_dump_limit().
>>>>
>>>>                     You have this comment in vmError.cpp:
>>>>                     943     // check core dump limits on
>>>>             Linux/Solaris, nothing on Windows
>>>>                     944     os::check_dump_limit(buffer,
>>>>             sizeof(buffer));
>>>>
>>>>                     so the two comments are out of sync.
>>>>
>>>>         Will convert them to be consistent.
>>>>
>>>>             src/share/vm/runtime/arguments.cpp
>>>>                 line 3252: } else if (match_option(option,
>>>>             "-XX:+CreateMinidumpOnCrash", &tail)) {
>>>>                 line 3256: } else if (match_option(option,
>>>>             "-XX:-CreateMinidumpOnCrash", &tail)) {
>>>>                     These two checks should use the match_option()
>>>>             version
>>>>                     without a '&tail' parameter.
>>>>
>>>>         Will use version w/o tail.
>>>>
>>>>             src/share/vm/utilities/vmError.cpp
>>>>                 line 217: bool VMError::coredump_status = true;   
>>>>                // presume we can dump core file first
>>>>                     I don't think you should init this to true. It
>>>>             confuses
>>>>                     things with VMError::report_coredump_status().
>>>>             It will also
>>>>                     confuse this code:
>>>>
>>>>                     526   STEP(63, "(printing core file information)")
>>>>                     529       if (coredump_status) {
>>>>                     530         st->print("Core dump written.
>>>>             Default location: %s", coredump_message);
>>>>
>>>>                     because coredump_status won't accurately
>>>>             reflect whether
>>>>                     the coredump_message field has been set.
>>>>
>>>>                 line 943: // check core dump limits on
>>>>             Linux/Solaris, nothing on Windows
>>>>                     This updated comment doesn't match the unmodified
>>>>                     comment in os.hpp:
>>>>
>>>>                     720:   // On Windows this will create an actual
>>>>             minidump, on Linux/Solaris it will simply check core
>>>>             dump limits
>>>>
>>>>         I will consider your comments, then revise with a new
>>>>         version. I remember here there is a case, if status not
>>>>         set, it will output an inconsistent message. Will check again.
>>>>
>>>>             src/os/aix/vm/os_aix.cpp
>>>>                 No comments.
>>>>
>>>>             src/os/bsd/vm/os_bsd.cpp
>>>>                 No comments.
>>>>
>>>>             src/os/linux/vm/os_linux.cpp
>>>>                 No comments.
>>>>
>>>>             src/os/posix/vm/os_posix.cpp
>>>>                 No comments.
>>>>
>>>>             src/os/solaris/vm/os_solaris.cpp
>>>>                 No comments.
>>>>
>>>>             src/os/windows/vm/os_windows.cpp
>>>>                 line 1008: static char buffer[O_BUFLEN];
>>>>                     Why switch from a buffer parameter to a static
>>>>             buffer?
>>>>                     What happens with parallel calls to abort()?
>>>>             Will the static
>>>>                     buffer get stomped by the competing threads?
>>>>
>>>>         The original buffer is static too.  Carrying an buffer for
>>>>         abort seems not right. This buffer in fact only for storing
>>>>         file name (based on pid) here.
>>>>         abort() will dump the core file, should we prevent
>>>>         multi-thread calling to this function? I did not check this
>>>>         part, will check if it is MT-safe.
>>>>
>>>>
>>>>                 line 1015:   // If running on a client version of
>>>>             Windows and user has not explicitly enabled dumping
>>>>                 line 1016:   if (!os::win32::is_windows_server() &&
>>>>             !CreateCoredumpOnCrash) {
>>>>                     The default for CreateCoredumpOnCrash is now
>>>>             'true' so the
>>>>                     comment and logic are no longer correct here.
>>>>             The Windows
>>>>                     Client VM will generate minidumps by default.
>>>>
>>>>                 old line 1007:
>>>>             VMError::report_coredump_status("Minidumps are not
>>>>             enabled by default on client versions of Windows", false);
>>>>                 new line 1017: jio_fprintf(stderr, "Minidumps are
>>>>             not enabled by default on client versions of Windows.\n");
>>>>                     There are a number of places where we change from
>>>>                     VMError::report_coredump_status() to jio_fprintf()
>>>>                     and I'm not quite seeing why this was done.
>>>>
>>>>                     Update: VMError::report_coredump_status() is
>>>>             misnamed. It doesn't
>>>>                     report anything in the sense that it doesn't
>>>>             print anything. It
>>>>                     does record two pieces of information about
>>>>             core dumps so maybe
>>>>                     it should be VMError::record_coredump_status().
>>>>
>>>>                 line 1100: jio_fprintf(stderr, "%s.\n", buffer);
>>>>                     At this point, buffer contains the path to the
>>>>                     mini-dump file so the above line simply prints
>>>>                     that on stderr. Why?
>>>>
>>>>                     Yes, I see that the old code did something similar:
>>>>
>>>>                     1090  VMError::report_coredump_status(buffer,
>>>>             true);
>>>>
>>>>                     but report_coredump_status() didn't actually print
>>>>                     anything. It just squirreled away these in
>>>>             vmError.cpp:
>>>>
>>>>                     217 bool VMError::coredump_status;
>>>>                     218 char VMError::coredump_message[O_BUFLEN];
>>>>
>>>>                     See comments above about how
>>>>             report_coredump_status()
>>>>                     is misnamed.
>>>>
>>>>                 At this point, I'm convinced that all the changes from
>>>>                 VMError::report_coredump_status() to jio_fprintf() are
>>>>                 a bad idea.
>>>>
>>>>
>>>>         Changing to jio_fprintf is because report_coredump_status
>>>>         did not output anything, it is just a logging as you
>>>>         indicated. It is reasonable we change it to
>>>>         record_coredump_status instead.  How about add printout to
>>>>         report_coredump_status? So I do not need to use jio_printf
>>>>         here.
>>>>
>>>>         Other consideration of using jio_fprintf after call
>>>>         shutdown called, the static output stream still exists, but
>>>>         not guaranteed to work as expected.
>>>>
>>>>
>>>>         Thanks
>>>>         Yumin
>>>>
>>>>             test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/Unsafe/RangeCheck.java
>>>>                 No comments.
>>>>
>>>>             test/runtime/memory/ReadFromNoaccessArea.java
>>>>                 No comments.
>>>>
>>>>
>>>>
>>>>                 Thanks
>>>>                 Yumin
>>>>
>>>>
>>>>                 On 3/23/2015 11:48 AM, Yumin Qi wrote:
>>>>
>>>>                     Since Thomas is not a reviewer in open jdk, I
>>>>                     need two volunteers (at least one "R"eviewer).
>>>>
>>>>                     Dan, since you already reviewed previous
>>>>                     version, could you have a look?
>>>>
>>>>                     Thanks
>>>>                     Yumin
>>>>
>>>>                     On 3/20/2015 3:20 PM, Yumin Qi wrote:
>>>>
>>>>                         Thomas,
>>>>
>>>>                          Thanks. Yes, it is webrev04. Also, I have
>>>>                         updated webrev04 to add another test case
>>>>                         change:
>>>>                         test/runtime/memory/ReadFromNoaccessArea.java
>>>>                          (Thanks Dan's update)
>>>>
>>>>                         Thanks
>>>>                         Yumin
>>>>
>>>>                         On 3/20/2015 11:55 AM, Thomas Stüfe wrote:
>>>>
>>>>
>>>>                             Hi Yumin,
>>>>
>>>>                             I think you meant to post webrev.04?
>>>>
>>>>                             I looked at 04, and it looks nice.
>>>>                             Thank you!
>>>>
>>>>                             (still only reviewer with r)
>>>>
>>>>                             Thomas
>>>>
>>>>                             On Mar 20, 2015 6:20 PM, "Yumin Qi"
>>>>                             <yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>>> wrote:
>>>>
>>>>                                 Hi, All
>>>>
>>>>                             http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>
>>>>                                 This version is based on Thomas'
>>>>                             suggestion.
>>>>                                 Tests passed
>>>>                             vm.runtime.quick.testlist, JPRT and
>>>>                             manual tests.
>>>>
>>>>                                 Thanks
>>>>                                 Yumin
>>>>
>>>>                                 On 3/18/2015 10:06 AM, Yumin Qi wrote:
>>>>
>>>>
>>>>                                     On 3/18/2015 4:26 AM, Thomas
>>>>                             Stüfe wrote:
>>>>
>>>>                                         Hi Yumin,
>>>>
>>>>                                         - just aesthetics, but on
>>>>                             all implementations of
>>>>                                         os_abort(), at least on the
>>>>                             UNIX variants, maybe we could
>>>>                                         consistently rename the
>>>>                             parameters to be "siginfo" and
>>>>                                         "context" instead of using
>>>>                             Windows terms?
>>>>
>>>>                                     Sure.
>>>>
>>>>                                         - On
>>>>                             check_create_dump_limit(), you do not need
>>>>                                         exceptionRecord,
>>>>                             contextRecord parameters anymore. The
>>>>                                         only parameters now needed
>>>>                             are buffer and buffer size,
>>>>                                         which is not even an output
>>>>                             parameter, just a scratch
>>>>                                         buffer for this function to
>>>>                             use for printf. For output, it
>>>>                                         calls
>>>>                             VMError::report_coredump_status(...) at
>>>>                             the end to
>>>>                                         provide information about
>>>>                             the core file.
>>>>
>>>>                                         - I would rename that
>>>>                             function from
>>>>                             check_create_dump_limit() to
>>>>                             check_dump_limit() - nothing
>>>>                                         is created anymore.
>>>>
>>>>                                     Yes, no need to carry those two
>>>>                             parameters. I tried to keep
>>>>                                     less changes, but decided to
>>>>                             change now.
>>>>
>>>>                                         - on Windows, in
>>>>                             os::abort(), there is no point anymore in
>>>>                                         calling
>>>>                             VMError::report_coredump_status(...)
>>>>                             because that
>>>>                                         information is only used
>>>>                             during error log writing which
>>>>                                         already happened. It only
>>>>                             makes sense to call this
>>>>                                         function in
>>>>                             check_create_dump_limit(), which happens
>>>>                                         before error log writing.
>>>>                                         Instead, maybe error
>>>>                             messages like "Call to
>>>>                                         MiniDumpWriteDump() failed"
>>>>                             should just be written to
>>>>                                         stderr? This would be
>>>>                             consistent with Unix, where the OS
>>>>                                         writes a short message on
>>>>                             stderr if core file writing fails.
>>>>
>>>>                                     Sure, will direct output to stderr.
>>>>
>>>>                                         - there is now a new test,
>>>>                             test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>                                         - could you please also add
>>>>                             "-XX:-CreateCoredumpOnCrash" ?
>>>>                                         I just added the test and
>>>>                             did not want to add the option
>>>>                                         before it was available.
>>>>
>>>>                                     OK
>>>>
>>>>                                     Thanks
>>>>                                     Yumin
>>>>
>>>>                                         Thanks for your work!
>>>>
>>>>                                         Thomas
>>>>
>>>>
>>>>                                         On Tue, Mar 17, 2015 at
>>>>                             2:04 AM, Yumin Qi
>>>>                                         <yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>>
>>>>                                         <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>>>>
>>>>                                         wrote:
>>>>
>>>>                                             Hi, Dan and all
>>>>
>>>>                                               I have updated webrev at:
>>>>                             http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>
>>>>                                               1) bug changed synopsis
>>>>                                                    8074354: Make
>>>>                             CreateMinidumpOnCrash a new name and
>>>>                                             available on all platforms
>>>>                                               2) tests: JPRT,
>>>>                             vm.runtime.quick.testlist (on
>>>>                                         Windows), jtreg on
>>>>                                             Windows and Linux/Unix.
>>>>
>>>>                                               Thanks
>>>>                                               Yumin
>>>>
>>>>
>>>>                                             On 3/12/2015 12:59 PM,
>>>>                             Daniel D. Daugherty wrote:
>>>>
>>>>                                                 I believe there are
>>>>                             both JavaTest/JTREG tests in
>>>>                                         the hotspot repo
>>>>                                                 and there are
>>>>                             VM/NSK tests from back when we did
>>>>                                         phone home...
>>>>
>>>>                                                 Check with
>>>>                             Christian or Misha...
>>>>
>>>>                                                 Dan
>>>>
>>>>
>>>>                                                 On 3/12/15 1:50 PM,
>>>>                             Yumin Qi wrote:
>>>>
>>>>                                                     Thanks, Dan
>>>>
>>>>                                                     Will look at
>>>>                             and run those tests. Are they
>>>>                                         under nsk?
>>>>
>>>>                                                     Yumin
>>>>
>>>>                                                     On 3/12/2015
>>>>                             12:29 PM, Daniel D. Daugherty wrote:
>>>>
>>>>                                                         Yumin,
>>>>
>>>>                                                         There are
>>>>                             some error handler tests. You
>>>>                                         should find
>>>>                                                         those and make
>>>>                                                         sure that
>>>>                             you run them on Windows since you're
>>>>                                                         changing
>>>>                             the order
>>>>                                                         there...
>>>>
>>>>                                                         Dan
>>>>
>>>>
>>>>                                                         On 3/12/15
>>>>                             12:42 PM, Yumin Qi wrote:
>>>>
>>>>                                                             Hi,
>>>>                             Thomas and all
>>>>
>>>>                              The second version of the change:
>>>>                              webrev:
>>>>                             http://cr.openjdk.java.net/~minqi/8074354/webrev02/
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>
>>>>                              In this version:
>>>>                              1) switch order on Windows to first
>>>>                                         print log
>>>>                                                             file as
>>>>                             other platform, then dump core
>>>>                                         file if
>>>>                             needed. When VMError created after
>>>>                                         crash, siginfo
>>>>                                                             and
>>>>                             context which are ExceptionRecord and
>>>>                             ContextRecord on Windows are recorded,
>>>>                                         mini (or
>>>>                             full, due to 'or' used so type will be
>>>>                                         mini
>>>>                             anyway) dump creates dump file based
>>>>                                         on those two
>>>>                                                             as before.
>>>>                              2) to make os::abort to get above
>>>>                                         two pointers,
>>>>                             added two more fields to the function
>>>>                                         parameters:
>>>>                              3)  changes for
>>>>                             test/runtime/ErrorHandling/SecondaryError.java
>>>>                             ---
>>>>                             added "-XX:-CreateCoredumpOnCrash"
>>>>
>>>>                             -  static void abort(bool
>>>>                                         dump_core = true);
>>>>                              +  static void abort(bool dump_core
>>>>                                         = true,
>>>>                                                             void
>>>>                             *siginfo = NULL, void *context =
>>>>                                         NULL);
>>>>
>>>>                              Tests: JPRT and manually.
>>>>
>>>>                             Thanks
>>>>                             Yumin
>>>>
>>>>                                                             On
>>>>                             3/11/2015 3:47 AM, Thomas Stüfe wrote:
>>>>
>>>>                             Hi Yumin,
>>>>
>>>>                             There is also
>>>>                             test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>                             - could you please add
>>>>                             "-XX:-CreateCoredumpOnCrash" ?
>>>>                                         Thank you!
>>>>
>>>>                             Beyond that, as I wrote in the bug
>>>>                                         report
>>>>                             comments:
>>>>
>>>>                             "This is also a problem on Windows -
>>>>                             MiniDumpWriteDump() may hang
>>>>                                         infinitly. And on
>>>>                             Windows this is worse than under
>>>>                                         UNIX because
>>>>                             we create the Dump before writing
>>>>                                         the hs-err
>>>>                             file, so if the Dump hangs, we get
>>>>                                         no error
>>>>                             log. I would like to revert the
>>>>                                         order: create
>>>>                             the minidump after writing the
>>>>                                         error log, the
>>>>                             same way Unix does it. We did this
>>>>                                         in our JVM
>>>>                             (SAP) because for us, error logs
>>>>                                         are more
>>>>                             useful than minidumps. "
>>>>
>>>>                             So, I would like to see os::abort
>>>>                                         on Windows
>>>>                             call MiniDumpWriteDump(), and thus
>>>>                                         the mini
>>>>                             dump writing moved after the error log
>>>>                             writing. This would also make the
>>>>                                         code way
>>>>                             cleaner because the control flow
>>>>                                         would be the
>>>>                             same on all platforms.
>>>>
>>>>                             I understand that this may be out
>>>>                                         of scope for
>>>>                             your change, but I would like to
>>>>                                         know what
>>>>                             others think about this.
>>>>
>>>>                             Kind regards, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                             On Wed, Mar 11, 2015 at 8:02 AM,
>>>>                                         Yumin Qi
>>>>                             <yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>
>>>>                                         <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>
>>>>                                         <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>>>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>
>>>>                                         <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>>
>>>>                                         <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>                             <mailto:yumin.qi at oracle.com>>>>> wrote:
>>>>
>>>>                                 Please review:
>>>>
>>>>                                 bug:
>>>>                             https://bugs.openjdk.java.net/browse/JDK-8074354
>>>>                                 webrev:
>>>>                             http://cr.openjdk.java.net/~minqi/8074354/webrev01/
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>
>>>>                                 Summary: Tests timed out when
>>>>                                         VM crashes
>>>>                             and dumping core file
>>>>                                 which in the test case is not
>>>>                                         needed. To
>>>>                             make core not created,
>>>>                                 the fix changed
>>>>                                         CreateMinidumpOnCrash to
>>>>                             CreateCoredumpOnCrash,
>>>>                                 the former is only used on
>>>>                                         Windows and the
>>>>                             latter for all
>>>>                                 platforms. When VM crashes on
>>>>                                         non Windows,
>>>>                             core file generated as
>>>>                                 default if OS sets core dump
>>>>                                         allowed.
>>>>                             Default value of
>>>>                             CreateCoredumpOnCrash set to
>>>>                                         'true' to
>>>>                             keep same behavior on non
>>>>                                 Windows platforms, but changed
>>>>                                         for Windows
>>>>                             --- original is false,
>>>>                                 not create minidump on
>>>>                                         Windows. With
>>>>                             CreateCoredumpOnCrash turned
>>>>                                 off, no core file will be
>>>>                                         generated.
>>>>                             CreateMinidumpOnCrash still
>>>>                                 can be used on commandline but
>>>>                                         only as
>>>>                             alias for the new flag.
>>>>
>>>>                                 Tests: JPRT, manual tests setting
>>>>                             CreateMinidumpOnCrash on
>>>>                                 commandline to verify flag
>>>>                                         change as alias.
>>>>
>>>>                                 Thanks
>>>>                                 Yumin
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list