RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Apr 14 02:04:32 UTC 2015


Yumin,

I would prefer that the handle is properly closed on all the
exit paths. Not closing the handle might even trigger a future
Microsoft sanity check; kind of like when Microsoft added checks
for duplicate close() calls...

Dan


On 4/13/15 4:05 PM, Yumin Qi wrote:
> Dan,
>
> On 4/13/2015 2:09 PM, Daniel D. Daugherty wrote:
>> On 4/13/15 10:58 AM, Yumin Qi wrote:
>>> Thomas,
>>>
>>>   Answers embeded below. The new web is just the same link updated (08):
>>>
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>
>> src/share/vm/runtime/globals.hpp
>>     No comments.
>>
>> src/share/vm/runtime/os.hpp
>>     L720: // ON Posix compatible OS it will
>>         typo: 'ON' -> 'On'
>>
>>     L724: ... or a short error message, depends
>>         grammar: 'depends' -> 'depending'
>>
>>     L725: // on checking result.
>>         grammar: 'on checking' -> 'on the checking'
>>
>> src/share/vm/runtime/arguments.cpp
>>     No comments.
>>
>> src/share/vm/utilities/vmError.hpp
>>     No comments.
>>
>> src/share/vm/utilities/vmError.cpp
>>     No comments.
>>
>> src/os/aix/vm/os_aix.cpp
>> src/os/bsd/vm/os_bsd.cpp
>> src/os/linux/vm/os_linux.cpp
>> src/os/posix/vm/os_posix.cpp
>> src/os/solaris/vm/os_solaris.cpp
>>     No comments on the above five files.
>>
>> src/os/windows/vm/os_windows.cpp
>>     L1010: jio_snprintf(buffer, buffsz, "Failed to create coredump 
>> file (0x%x).", GetLastError());
>>         Because this is a Win* specific file you can use "minidump"
>>         here instead of "coredump".
>>
>>         I made a similar comment in the previous round.
>>
> Yes, I will change 'coredump' into 'minidump'
>>     L1028: if (!dump_core || !dumpFile) {
>>         The "!dumpFile" part is an implied boolean which we don't
>>         like to use in HotSpot. Perhaps: dumpFile == NULL
>>
> Will change to Hotspot style.
>>     L1078: win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>         The "CloseHandle(dumpFile)" call is missing.
>>
> There is no need to call CloseHandle here explicitly, since we will 
> exit next and all resources will be reclaimed by OS. I remove it at 
> last since think it is not needed here. If you check previous calls to
>   win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>   I did not place CloseHandle(dumpFile) either.
>   If needed, I think we should put the call before all exit entry.
>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>     No comments on the above two files.
>>
>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>     No comments.
>>
>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>> test/runtime/Unsafe/RangeCheck.java
>> test/runtime/memory/ReadFromNoaccessArea.java
>>     No comments on the above six files.
> Thanks
> Yumin
>
>>
>>
>> Dan
>>
>>
>>>
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 4/13/2015 12:30 AM, Thomas Stüfe wrote:
>>>> Hi Yumin,
>>>>
>>>> some small things:
>>>>
>>>> os_windows.cpp:1012
>>>>
>>>> +    if (dumpFile == INVALID_HANDLE_VALUE) {
>>>> +      jio_snprintf(buffer, buffsz, "Failed to create coredump file");
>>>> +      status = false;
>>>> +    }
>>>>
>>>> Please print out the error number too:
>>>> +      jio_snprintf(buffer, buffsz, "Failed to create coredump file 
>>>> (0x%x).", ::GetLastError());
>>>>
>>>>
>>> Changed as above. No :: global scope operator needed here.
>>>>
>>>> os_windows.cpp:1036
>>>>
>>>> +  assert(dumpFile != NULL, "dumpFile should not be NULL");
>>>>
>>>> Please remove this assert. I would not do any asserts at this 
>>>> stage, we are dying anyway and a recursive assert would just 
>>>> confuse things. Moreover, CreateFile may have failed for "normal" 
>>>> reasons.
>>>>
>>> removed.
>>>> Instead, if dumpFile is NULL, just skip the whole minidump stuff in 
>>>> os::abort; just behave the same way as for os::abort(false):
>>>>
>>>> +  if (!dump_core || !dumpFile) {
>>>> +  win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>>>    }
>>>>
>>>>
>>> changed as expected.
>>>>
>>>> os_windows.cpp: 1076
>>>>
>>>> I would just skip the whole "FormatMessage" stuff. It is just 
>>>> sugar, but it will allocate buffers, which we may not want at this 
>>>> stage. I think the plain error code from GetLastError() is 
>>>> sufficient here. So I would just:
>>>>
>>>> 1076   // Older versions of dbghelp.dll (the one shipped with 
>>>> Win2003 for example) may not support all
>>>> 1077   // the dump types we really want. If first call fails, lets 
>>>> fall back to just use MiniDumpWithFullMemory then.
>>>> 1078   if (_MiniDumpWriteDump(hProcess, processId, dumpFile, 
>>>> dumpType, pmei, NULL, NULL) == false &&
>>>> 1079       _MiniDumpWriteDump(hProcess, processId, dumpFile, 
>>>> (MINIDUMP_TYPE)MiniDumpWithFullMemory, pmei, NULL, NULL) == false) {
>>>>          jio_fprintf(stderr, "Call to MiniDumpWriteDump() failed 
>>>> (Error 0x%x)\n", ::GetLastError());
>>>> 1089   } else {
>>>> 1090    ....
>>>>
>>>>
>>> Agree, this is the last step before exit, with output of last error 
>>> code, user can get what happened --- though a little effort to research.
>>>
>>>> Rest is fine.
>>>> Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Apr 13, 2015 at 7:59 AM, Yumin Qi <yumin.qi at oracle.com 
>>>> <mailto:yumin.qi at oracle.com>> wrote:
>>>>
>>>>     Thomas and Dan,
>>>>
>>>>       Please review the changes at
>>>>     http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>>>     <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev08/>
>>>>
>>>>     Thanks
>>>>     Yumin
>>>>
>>>>     On 4/10/2015 10:14 PM, Yumin Qi wrote:
>>>>
>>>>         Thomas,
>>>>
>>>>         Thanks for review again.
>>>>
>>>>         On 4/10/2015 4:20 AM, Thomas Stüfe wrote:
>>>>
>>>>             Hi Yumin,
>>>>
>>>>             Hi Yumin,
>>>>
>>>>             this looks mostly fine for me. I only was able to test
>>>>             it on Linux, but will test it on other OSes next week.
>>>>
>>>>             Here two remaining small points:
>>>>
>>>>             - in os::check_dump_limit, maybe a comment would be
>>>>             helpful, because the contract for this function is a
>>>>             bit difficult to understand. Proposal:
>>>>
>>>>             "Check or prepare a core dump to be taken at a later
>>>>             point in the same thread in os::abort(). Use the caller
>>>>             provided buffer as a scratch buffer. Output status
>>>>             message via VMError::report_cordedump_status() (on
>>>>             success, the core dump file location, on error, a short
>>>>             error message) - this status message which will be
>>>>             written into the error log."
>>>>
>>>>         I will take the comments, new comments:
>>>>
>>>>           // ON Posix compatible OS it will simply check core dump
>>>>         limits while on Windows
>>>>           // it will check if dump file can be created. Check or
>>>>         prepare a core dump to be
>>>>           // taken at a later point in the same thread in
>>>>         os::abort(). Use the caller
>>>>           // provided buffer as a scratch buffer. The status
>>>>         message which will be written
>>>>           // into the error log either is file location or a short
>>>>         error message, depends
>>>>           // on checking result.
>>>>           static void check_dump_limit(char* buffer, size_t
>>>>         bufferSize);
>>>>
>>>>             - It would make sense to improve the output in
>>>>             VMError.report() in STEP(63) a bit:
>>>>
>>>>             Problem 1: we use past-tense, which is misleading
>>>>             ("Core dump written", "Failed to write core"). We did
>>>>             not generate the core dump yet, we just plan to do it,
>>>>             and it still may fail for various reasons (e.g. even if
>>>>             the limits are fine, the current directory may be write
>>>>             protected. There is no way to catch all errors
>>>>             beforehand). If the user reads "Core dump written", he
>>>>             assumes the core dump was written successfully. If
>>>>             abort(3) later fails to write the core dump, user will
>>>>             be confused.
>>>>
>>>>         Will change the message to
>>>>               if (coredump_status) {
>>>>                 st->print("Core dump will be written. %s",
>>>>         coredump_message);
>>>>               } else {
>>>>                 st->print("No core dump will be written. %s",
>>>>         coredump_message);
>>>>               }
>>>>
>>>>         as your suggested.
>>>>         Whether the coredump can be successfully created depends on
>>>>         following core file generation procedure. User can trace
>>>>         error messages if failed to create.
>>>>
>>>>             Problem 2: this is more obscure: on Linux core dumps
>>>>             may not be written to the file system at all but be
>>>>             processed by some other program. On my Ubuntu 14.4
>>>>             machine I see this output:
>>>>
>>>>             #
>>>>             # Core dump written. Default location: Core dumps may
>>>>             be processed with "/usr/share/apport/apport %p %s %c
>>>>             %P" (or dumping to
>>>>             /shared/projects/openjdk/jdk9-hs-rt/output/core.26261)
>>>>             #
>>>>
>>>>             This output is a bit jarring.
>>>>
>>>>         I have not tested on Ubuntu so no comments.
>>>>
>>>>         I will post a new webrev based on your and Dan's comments.
>>>>         Thanks for your patience.
>>>>
>>>>         Thanks
>>>>         Yumin
>>>>
>>>>             My suggestion for better texts would be:
>>>>
>>>>             "Core dump will be written. %s"
>>>>             "No core dump will be written. %s"
>>>>
>>>>             which hopefully fits all cases better. But I don't
>>>>             know, other suggestions are welcome :-)
>>>>
>>>>             Otherwise, this looks fine.
>>>>
>>>>             Thanks for your work!
>>>>
>>>>             Thomas
>>>>
>>>>
>>>>             On Fri, Apr 10, 2015 at 5:18 AM, Yumin Qi
>>>>             <yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>
>>>>             <mailto:yumin.qi at oracle.com
>>>>             <mailto:yumin.qi at oracle.com>>> wrote:
>>>>
>>>>                 Hi, Thomas and Dan
>>>>
>>>>             http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>
>>>>                 On 4/9/2015 7:51 PM, Yumin Qi wrote:
>>>>
>>>>                     Please hold on, I missed the test cases:
>>>>
>>>>                     *test/runtime/Unsafe/RangeCheck.java
>>>>                     *
>>>>             *test/runtime/memory/ReadFromNoaccessArea.java
>>>>
>>>>                 Added.
>>>>
>>>>                     Also some file heads need to be updated with
>>>>             this year.
>>>>
>>>>                 Changed
>>>>
>>>>                 Thanks
>>>>                 Yumin
>>>>
>>>>                     Thanks
>>>>                     Yumin
>>>>
>>>>
>>>>                     *
>>>>                     On 4/9/2015 7:07 PM, Yumin Qi wrote:
>>>>
>>>>                         Hi, Thomas and Dan
>>>>
>>>>                           The same URL is updated with new version.
>>>>
>>>>             http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>
>>>>                           I remove the public access to VMError::out,
>>>>                         VMError::coredump_message and the size of
>>>>                         VMError::coredump_message.
>>>>                           Added private static HANDLE in
>>>>             os_windows.cpp for
>>>>                         dumpFile as suggested by Thomas.
>>>>
>>>>                           Tests passes JPRT and manual tests.
>>>>
>>>>                         Thanks
>>>>                         Yumin
>>>>
>>>>
>>>>                         On 4/9/2015 1:32 AM, Thomas Stüfe wrote:
>>>>
>>>>                             Hi Yumin,
>>>>
>>>>                             I do not think exposing
>>>>             VMError::coredump_status
>>>>                             buffer just for the sake of using it as
>>>>             a scratch
>>>>                             buffer in os::abort() is a good idea.
>>>>             It is very
>>>>                             confusing and has nothing to do with
>>>>             the purpose of
>>>>                             VMerror::coredump_status.
>>>>
>>>>                             Also, VMerror::coredump_status is
>>>>             static, so you do
>>>>                             not even gain anything over just using
>>>>             a global or
>>>>                             function scope static buffer. Either we
>>>>             are worried
>>>>                             about thread safety in os::abort(), or
>>>>             we aren't. If
>>>>                             we aren't, just use a local static
>>>>             buffer, it is much
>>>>                             clearer and no need to expose
>>>>             VMError::coredump_status
>>>>                             like this.
>>>>
>>>>                             Apart from that, see my remarks in
>>>>             yesterdays mail.
>>>>
>>>>                             Cheers, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>                             On Thu, Apr 9, 2015 at 5:16 AM, Yumin Qi
>>>>                             <yumin.qi at oracle.com
>>>>             <mailto:yumin.qi at oracle.com>
>>>>             <mailto:yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>             <mailto:yumin.qi at oracle.com>
>>>>                             <mailto:yumin.qi at oracle.com
>>>>             <mailto:yumin.qi at oracle.com>>>> wrote:
>>>>
>>>>                                 Hi, Dan
>>>>
>>>>                                   New webrev at
>>>>             http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>
>>>>                                   To make code clean, add
>>>>             os::abort_internal which
>>>>                             is only used
>>>>                                 for Windows.
>>>>                                   Since passing
>>>>             VMError::coredump_message and its
>>>>                             size as
>>>>                                 parameters across multiple
>>>>             functions makes
>>>>                             function have a bigger
>>>>                                 signature, I change the way to
>>>>             access them -- add
>>>>                             two functions in
>>>>                                 VMError to access them:
>>>>             coredump_msg_buffer() and
>>>>                                 coredump_msg_buffer_size(). They
>>>>             are static
>>>>                             members, so can save
>>>>                                 stack space for passing around.
>>>>
>>>>                                   in os_windows.cpp, abort(bool,
>>>>             void*, void*):
>>>>                                   if error encountered, exit with
>>>>             error messages.
>>>>                                   If no error,
>>>>             VMError::coredump_message contains
>>>>                             the dump
>>>>                                 path/name. I just use them so no
>>>>             need to have
>>>>                             extra space for name
>>>>                                 allocation.
>>>>
>>>>                                   Passed JPRT and manual tests.
>>>>
>>>>                                   Thanks
>>>>                                   Yumin
>>>>
>>>>
>>>>                                 On 4/7/2015 9:23 AM, Daniel D.
>>>>             Daugherty wrote:
>>>>
>>>>                                     On 4/1/15 10:57 AM, Yumin Qi wrote:
>>>>
>>>>                                       Hi, Dan and Thomas,
>>>>
>>>>                                           I have a new webrev at
>>>>             http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>
>>>>
>>>>
>>>>             src/share/vm/runtime/globals.hpp
>>>>                                         Sorry, I didn't notice this
>>>>             before:
>>>>
>>>>                                         L939:   product(bool,
>>>>             CreateCoredumpOnCrash,
>>>>                                     true,           \
>>>>                                         L940:  "Create minidump on
>>>>             VM fatal
>>>>                                     error")           \
>>>>
>>>>                                             The "minidump" on L940
>>>>             should change
>>>>                                 also. Don't know if you
>>>>                                             want to say "core/mini
>>>>             dump" or
>>>>                                 something else.
>>>>
>>>>
>>>>             src/share/vm/runtime/os.hpp
>>>>                                         L719:   // On Linux/Solaris
>>>>             it will simply
>>>>                                 check core dump
>>>>                                     limits while on Windows will do
>>>>             nothing.
>>>>                                             grammar: 'on Windows
>>>>             will do nothing'
>>>>                                                   -> 'on Windows it
>>>>             will do nothing'
>>>>
>>>>
>>>>             src/share/vm/runtime/arguments.cpp
>>>>                                         No comments.
>>>>
>>>>             src/share/vm/utilities/vmError.hpp
>>>>                                         No comments.
>>>>
>>>>             src/share/vm/utilities/vmError.cpp
>>>>                                         No comments.
>>>>
>>>>                                     src/os/aix/vm/os_aix.cpp
>>>>                                         No comments.
>>>>
>>>>                                     src/os/bsd/vm/os_bsd.cpp
>>>>                                         No comments.
>>>>
>>>>             src/os/linux/vm/os_linux.cpp
>>>>                                         No comments.
>>>>
>>>>             src/os/posix/vm/os_posix.cpp
>>>>                                         No comments.
>>>>
>>>>             src/os/solaris/vm/os_solaris.cpp
>>>>                                         No comments.
>>>>
>>>>             src/os/windows/vm/os_windows.cpp
>>>>                                         L92: jio_snprintf(buffer,
>>>>             bufferSize,
>>>>                                     "%s\\hs_err_pid%u.mdmp",
>>>>                                 get_current_directory(NULL, 0),
>>>>                                     current_process_id());
>>>>                                             Other non-Windows calls to
>>>>                                 get_current_directory() do a NULL
>>>>                                             check before using the
>>>>             return value.
>>>>                                 The original Windows
>>>>                                             code that called
>>>>                                 get_current_directory() did not
>>>>             make this
>>>>                                             check, but I think it's
>>>>             better to be
>>>>                                 consistent and safe.
>>>>
>>>>                                         L1007:   static char
>>>>             buffer[O_BUFLEN];
>>>>                                         L1008:   char
>>>>             filename[FILENAME_MAX];
>>>>                                             Two more potentially
>>>>             large stack
>>>>                                 variables here. Will we
>>>>                                             run into stack overflow
>>>>             on Win* more
>>>>                                 frequently when what
>>>>                                             we really wanted was
>>>>             the reason for
>>>>                                 the abort() call?
>>>>
>>>>                                             The reason the original
>>>>             code used
>>>>                                 'buffer' to construct the
>>>>                                             dump file name was to
>>>>             save space. The
>>>>                                 original code was also
>>>>                                             careful to not need the
>>>>             file name in
>>>>                                 any error messages after
>>>>                                             the point at which the
>>>>             dump file was
>>>>                                 successfully created.
>>>>
>>>>                                             It would be better
>>>>             (more resilient) if
>>>>                                 you could pass in
>>>>                                             an already existing
>>>>             buffer to abort()
>>>>                                 like the original
>>>>                                             code passed into
>>>>                                 os::check_or_create_dump(). Yes, that
>>>>                                             would require visiting
>>>>             all existing
>>>>                                 calls to abort().
>>>>
>>>>                                         L1015:  assert(out->is_open(),
>>>>                                 "defaultStream should be open");
>>>>                                             Is it possible for
>>>>             os::abort() to be
>>>>                                 called early enough on
>>>>                                             Windows such that
>>>>             VMError::out is not
>>>>                                 actually setup? Yes,
>>>>                                             this would have to be
>>>>             absolutely
>>>>                                 catastrophic...
>>>>
>>>>                                             See the comment in
>>>>             os_solaris.cpp
>>>>                                 L1520-1522.
>>>>
>>>>                                         L1018: jio_snprintf(buffer,
>>>>                                 sizeof(buffer), "%s", "Either
>>>>                                     CreateCoredumpOnCrash is
>>>>             disabled from command"
>>>>                                         L1019 " line or coredump
>>>>                                 is not available");
>>>>                                             When you are using
>>>>             jio_snprintf() as a
>>>>                                 replacement for:
>>>>
>>>>             strncpy(buffer, "my string",
>>>>                                 sizeof(buffer));
>>>>             buffer[sizeof(buffer) - 1] = '\0';
>>>>
>>>>                                             you don't need the "%s"
>>>>             format string.
>>>>                                 So this code would be:
>>>>
>>>>             jio_snprintf(buffer, sizeof(buffer),
>>>>              "Either
>>>>                                 CreateCoredumpOnCrash is disabled
>>>>             from "
>>>>              "command line or coredump
>>>>                                 is not available");
>>>>
>>>>                                             Same comment for:
>>>>             L1026, L1039
>>>>
>>>>                                         L1020: goto done;
>>>>                                             The use of 'goto' makes
>>>>             me cringe. You
>>>>                                 can refactor much
>>>>                                             of this logic into an
>>>>             abort_internal()
>>>>                                 function that
>>>>                                             abort() calls and keep
>>>>             all the
>>>>                                 original "return" logic
>>>>                                             to bail out.
>>>>
>>>>                                         L1045: // Older versions of
>>>>             dbghelp.h
>>>>                                 doesn't contain all
>>>>                                             grammar: 'doesn't' ->
>>>>             'do not'
>>>>
>>>>                                         L1052:   cwd =
>>>>             get_current_directory(NULL, 0);
>>>>                                         L1053: jio_snprintf(filename,
>>>>                                 sizeof(filename),
>>>>                                     "%s\\hs_err_pid%u.mdmp", cwd,
>>>>                                 current_process_id());
>>>>                                             Other non-Windows calls to
>>>>                                 get_current_directory() do a NULL
>>>>                                             check before using the
>>>>             return value.
>>>>                                 The original Windows
>>>>                                             code that called
>>>>                                 get_current_directory() did not
>>>>             make this
>>>>                                             check, but I think it's
>>>>             better to be
>>>>                                 consistent and safe.
>>>>
>>>>                                         L1092:   // well done.
>>>>                                             Like a overcooked
>>>>             steak? :-) Not sure
>>>>                                 what you're trying
>>>>                                             to say here...
>>>>
>>>>             test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>                                         No comments.
>>>>
>>>>             test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>                                         No comments.
>>>>
>>>>             test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>                                         No comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>                                         No comments on these four.
>>>>
>>>>
>>>>                                     Dan
>>>>
>>>>
>>>>
>>>>                                           What changes:
>>>>
>>>>                                            1) os_windows.cpp,
>>>>             os::abort(bool ....):
>>>>                                               a) Since when abort()
>>>>             is called, the
>>>>                                     last function     called before
>>>>             exit process,
>>>>                                     VMError::log already closed, which
>>>>                                         is the file logging.
>>>>             defaultStream still
>>>>                                     available to use, so
>>>>                                         use it for writing error or
>>>>             success
>>>>                                     messages. VMError::out is
>>>>                                         the defaultStream, for
>>>>             using it, added a
>>>>                                     function out_stream()  in VMError.
>>>>                                               b) Delete the codes
>>>>             which decide to
>>>>                                     write core based on
>>>>                                         client/server version with
>>>>             debug. The new
>>>>                                     code is if
>>>>                                         CreateCoredumpOnCrash set
>>>>             or dump_core is
>>>>                                     on, the coredump will
>>>>                                         be tried no matter which
>>>>             version it is.
>>>>                                     There is no special
>>>>                                         reason not to dump core
>>>>             with client
>>>>                                     version I think. Any concern
>>>>                                         for this change?
>>>>                                               c) Use of 'goto' in
>>>>             this function. I
>>>>                                     tried not to use it,
>>>>                                         but using it makes code
>>>>             clear. I remember
>>>>                                     somewhere suggest not
>>>>                                         using 'goto' in cplusplus.
>>>>
>>>>                                             2) changed comments to
>>>>             make them
>>>>                                     consistent as indicated by Dan.
>>>>
>>>>                                             3) Added/modified head
>>>>             version for files.
>>>>
>>>>                                             Tests: JPRT, manual
>>>>             test on local Windows.
>>>>
>>>>                                         Thanks
>>>>                                         Yumin
>>>>
>>>>
>>>>                                         On 3/31/2015 1:25 AM,
>>>>             Thomas Stüfe wrote:
>>>>
>>>>                                             Hi Daniel, Yumin,
>>>>
>>>>                                             sorry, a lot of Yumins
>>>>             changes were
>>>>                                         based on suggestions from
>>>>                                             me, so here more
>>>>             background:
>>>>
>>>>                                             Yumin reversed the
>>>>             order of error log
>>>>                                         writing and core dumping.
>>>>                                             The reason for that are
>>>>             explained in
>>>>                                         detail at the start of the
>>>>                                             mail thread, in short:
>>>>
>>>>                                             - on Windows core file
>>>>             dumping may
>>>>                                         hang - its rare but it
>>>>                                             happens - preventing
>>>>             error logs.
>>>>                                         Depending on who you ask,
>>>>                                             error logs are more
>>>>             important than
>>>>                                         minidumps, so it makes sense
>>>>                                             to first write the
>>>>             erorr log.
>>>>                                             - This also brings
>>>>             windows core paths
>>>>                                         closer to UNIX code
>>>>                                             paths. See below.
>>>>
>>>>                                             About writing to stderr
>>>>             in os::abort():
>>>>
>>>>                                             After this change, calling
>>>>             VmError::report_coredump_status() in
>>>>                                             os::abort() will not do
>>>>             anything
>>>>                                         because the error log is
>>>>                                             already written at that
>>>>             point. I
>>>>                                         suggested to Yumin writing to
>>>>                                             stderr instead in
>>>>             os::abort() because
>>>>                                         that mimicks UNIX
>>>>                                             behaviour: On UNIX, you
>>>>             call abort(3),
>>>>                                         which writes a core and
>>>>                                             aborts. It writes a
>>>>             short message to
>>>>                                         stderr "Aborted (core
>>>>                                             dumped)" or just "Aborted".
>>>>
>>>>                                             After Yumins change, on
>>>>             Windows
>>>>                                         os::abort also writes the core,
>>>>                                             then aborts. It made
>>>>             sense to me that
>>>>                                         this function should
>>>>                                             behave the same way as
>>>>             on UNIX: if
>>>>                                         core writing fails, write to
>>>>                                             stderr. There is no way
>>>>             otherwise to
>>>>                                         communicate a core dump
>>>>                                             writing failure. After
>>>>             writing the
>>>>                                         core, process stops.
>>>>
>>>>                                             ---
>>>>
>>>>                                             About the control flows:
>>>>
>>>>                                             Before Yumins change:
>>>>
>>>>                                             On Windows
>>>>                                             1)
>>>>             os::check_or_create_dump(): we
>>>>                                         wrote the core dump and then
>>>>                                             information about the
>>>>             dump (file
>>>>                                         location, success or error)
>>>>                                             was handed to the misnamed
>>>>             VmError::report_coredump_status(),
>>>>                                             which just stored that
>>>>             information.
>>>>                                             2) VmError::report():
>>>>             Error log was
>>>>                                         written and in step 63
>>>>                                             information about the
>>>>             core (already
>>>>                                         written at that point) was
>>>>                                             added to error log.
>>>>                                             3) os::abort() just did
>>>>             an exit(3)
>>>>
>>>>                                             On Unix,
>>>>                                             1)
>>>>             os::check_or_create_dump(): checks
>>>>                                         the limits to guess
>>>>                                             whether core dumping
>>>>             later will
>>>>                                         succeed. Again, that
>>>>                                             information is handed to
>>>>             VmError::report_coredump_status()
>>>>                                             2) VmError::report():
>>>>             Error log is
>>>>                                         written and in step 63,
>>>>                                             information about the
>>>>             core's probable
>>>>                                         future location is added
>>>>                                             to error log. Here,
>>>>             messages use past
>>>>                                         tense, which is
>>>>                                             misleading, because the
>>>>             core is not
>>>>                                         yet written.
>>>>                                             3) os::abort() calls
>>>>             abort(3), which
>>>>                                         stops and writes a core.
>>>>
>>>>                                             Yumin pushed core file
>>>>             writing on
>>>>                                         Windows down to os::abort().
>>>>                                             This brings the control
>>>>             flow much
>>>>                                         closer to Unix:
>>>>
>>>>                                             (1) call
>>>>             os::check_dump_limit() to
>>>>                                         check the prerequisites for
>>>>                                             core file writing and
>>>>             gather a bit
>>>>                                         information to put in the
>>>>                                             error log: On Unix,
>>>>             limit checks, on
>>>>                                         windows, so far nothing
>>>>                                             much but precalculating
>>>>             the error file
>>>>                                         name.
>>>>                                             (2) VmError::report():
>>>>             Error log is
>>>>                                         written and information
>>>>                                             about the core's
>>>>             probable location is
>>>>                                         added to error log.
>>>>                                             (3) os::abort() is
>>>>             called, which on
>>>>                                         all platforms:
>>>>                                              - dumps the core
>>>>                                              - if failing, writes a
>>>>             one-liner to
>>>>                                         stderr
>>>>                                              - stops the process.
>>>>
>>>>                                             I think, if one agrees
>>>>             that reversing
>>>>                                         order of core dumping and
>>>>                                             error log writing on
>>>>             windows is a good
>>>>                                         thing, this change looks
>>>>                                             ok to me. Some
>>>>             improvements could make
>>>>                                         it clearer:
>>>>
>>>>                                             - maybe rename
>>>>             "os::check_dump_limit()" (was my suggestion,
>>>>                                             sorry) to
>>>>             "os::check_core_prerequisites()" - that
>>>>                                         leaves room
>>>>                                             to flesh it out a bit
>>>>             more on Windows,
>>>>                                         where we do not have
>>>>                                             UNIX limits.
>>>>                                             - make the messages in
>>>>                                         VMError::report() future
>>>>             sense: "Will
>>>>                                             write core file to
>>>>             location.....".
>>>>                                             - Make the messages
>>>>             written in
>>>>                                         os::abort() on Windows clearer,
>>>>                                             and shorter, and we
>>>>             also do not need
>>>>                                         the buffer to be static
>>>>                                             here. Actually, if we
>>>>             want to be as on
>>>>                                         UNIX, just dumping "core
>>>>                                             file writing succeeded"
>>>>             or "failed" -
>>>>                                         maybe with an error
>>>>                                             number from
>>>>             GetLastError() - will be
>>>>                                         enough because the file
>>>>                                             location is already
>>>>             printed in the
>>>>                                         error log. Like on UNIX.
>>>>
>>>>                                             Kind Regards, Thomas
>>>>
>>>>
>>>>                                             On Mon, Mar 30, 2015 at
>>>>             9:51 PM,
>>>>                                         Daniel D. Daugherty
>>>>                                            
>>>>             <daniel.daugherty at oracle.com
>>>>             <mailto:daniel.daugherty at oracle.com>
>>>>             <mailto:daniel.daugherty at oracle.com
>>>>             <mailto:daniel.daugherty at oracle.com>>
>>>>             <mailto:daniel.daugherty at oracle.com
>>>>             <mailto:daniel.daugherty at oracle.com>
>>>>             <mailto:daniel.daugherty at oracle.com
>>>>             <mailto:daniel.daugherty at oracle.com>>>> wrote:
>>>>
>>>>                                                 Re:
>>>>             report_coredump_status() is
>>>>                                         really record_coredump_status()
>>>>
>>>>                                                 The
>>>>             report_coredump_status()
>>>>                                         function is designed to
>>>>             record two
>>>>                                                 things about core
>>>>             dumps for later
>>>>                                         reporting by the code that
>>>>                                                 generates the
>>>>             hs_err_pid file.
>>>>                                         That's why the original
>>>>             report_coredump_status() didn't
>>>>                                         output anything to stderr.
>>>>
>>>>                                                 By changing the
>>>>             Windows calls to
>>>>             report_coredump_status() into
>>>>                                                 jio_fprintf() calls
>>>>             you have:
>>>>
>>>>                                                 - put output onto
>>>>             stderr that
>>>>                                         should go to the hs_err_pid
>>>>             file
>>>>                                                 - made the windows
>>>>             code paths work
>>>>                                         differently than the
>>>>                                                 non-windows
>>>>                                                   code paths
>>>>
>>>>
>>>>                                                 Dan
>>>>
>>>>
>>>>
>>>>                                                 On 3/30/15 1:28 PM,
>>>>             Yumin Qi wrote:
>>>>
>>>>                                                     Thanks Dan
>>>>
>>>>                                                     On 3/30/2015
>>>>             11:25 AM, Daniel
>>>>                                         D. Daugherty wrote:
>>>>
>>>>                                                         On 3/25/15
>>>>             12:11 PM, Yumin
>>>>                                         Qi wrote:
>>>>
>>>>             Hi, all
>>>>
>>>>               I updated the
>>>>                                         webrev with a new change to
>>>>             test case:
>>>>             test/runtime/Unsafe/RangeCheck.java
>>>>
>>>>               Add
>>>>             -XX:-CreateCoredumpOnCrash to test, no
>>>>             dump needed on this case.
>>>>
>>>>               new  webrev:
>>>>             http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>
>>>>
>>>>             General nit - please
>>>>                                         update copyright lines to 2015
>>>>                                                         as needed
>>>>
>>>>             src/share/vm/runtime/globals.hpp
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             src/share/vm/runtime/os.hpp
>>>>             line 720:   // On
>>>>                                         Windows this will create an
>>>>                                                         actual
>>>>             minidump, on
>>>>                                         Linux/Solaris it will simply
>>>>                                                         check core
>>>>             dump limits
>>>>             line 721: static
>>>>                                         void check_dump_limit(char*
>>>>             buffer, size_t bufferSize);
>>>>               The comment on
>>>>                                         line 720 no longer matches
>>>>                                                         with the
>>>>             function
>>>>               rename from
>>>>                                         check_or_create_dump() to
>>>>             check_dump_limit().
>>>>
>>>>               You have this
>>>>                                         comment in vmError.cpp:
>>>>               943     // check
>>>>                                         core dump limits on
>>>>             Linux/Solaris, nothing on
>>>>                                         Windows
>>>>               944
>>>>             os::check_dump_limit(buffer,
>>>>             sizeof(buffer));
>>>>
>>>>               so the two
>>>>                                         comments are out of sync.
>>>>
>>>>                                                     Will convert
>>>>             them to be
>>>>                                         consistent.
>>>>
>>>>             src/share/vm/runtime/arguments.cpp
>>>>             line 3252: } else if
>>>>                                         (match_option(option,
>>>>             "-XX:+CreateMinidumpOnCrash", &tail)) {
>>>>             line 3256: } else if
>>>>                                         (match_option(option,
>>>>             "-XX:-CreateMinidumpOnCrash", &tail)) {
>>>>               These two checks
>>>>                                         should use the
>>>>             match_option() version
>>>>               without a '&tail'
>>>>                                         parameter.
>>>>
>>>>                                                     Will use
>>>>             version w/o tail.
>>>>
>>>>             src/share/vm/utilities/vmError.cpp
>>>>             line 217: bool
>>>>             VMError::coredump_status =
>>>>                                                         true;    
>>>>              // presume we
>>>>                                         can dump core file first
>>>>               I don't think you
>>>>                                         should init this to true.
>>>>                                                         It confuses
>>>>               things with
>>>>             VMError::report_coredump_status(). It will
>>>>                                         also
>>>>               confuse this code:
>>>>
>>>>               526 STEP(63,
>>>>                                         "(printing core file
>>>>             information)")
>>>>               529       if
>>>>                                         (coredump_status) {
>>>>               530
>>>>                                         st->print("Core dump
>>>>             written. Default
>>>>             location: %s",
>>>>                                         coredump_message);
>>>>
>>>>               because
>>>>                                         coredump_status won't
>>>>             accurately
>>>>             reflect whether
>>>>               the
>>>>                                         coredump_message field has
>>>>             been set.
>>>>
>>>>             line 943: // check
>>>>                                         core dump limits on
>>>>             Linux/Solaris, nothing on
>>>>                                         Windows
>>>>               This updated
>>>>                                         comment doesn't match the
>>>>             unmodified
>>>>               comment in os.hpp:
>>>>
>>>>               720:   // On
>>>>                                         Windows this will create an
>>>>                                                         actual
>>>>             minidump, on
>>>>                                         Linux/Solaris it will simply
>>>>                                                         check core
>>>>             dump limits
>>>>
>>>>                                                     I will consider
>>>>             your comments,
>>>>                                         then revise with a new
>>>>                                                     version. I
>>>>             remember here there
>>>>                                         is a case, if status not
>>>>                                                     set, it will
>>>>             output an
>>>>                                         inconsistent message. Will
>>>>             check
>>>>                                                     again.
>>>>
>>>>             src/os/aix/vm/os_aix.cpp
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             src/os/bsd/vm/os_bsd.cpp
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             src/os/linux/vm/os_linux.cpp
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             src/os/posix/vm/os_posix.cpp
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             src/os/solaris/vm/os_solaris.cpp
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             src/os/windows/vm/os_windows.cpp
>>>>             line 1008: static char
>>>>                                         buffer[O_BUFLEN];
>>>>               Why switch from a
>>>>                                         buffer parameter to a
>>>>                                                         static buffer?
>>>>               What happens with
>>>>                                         parallel calls to
>>>>             abort()? Will the static
>>>>               buffer get stomped
>>>>                                         by the competing threads?
>>>>
>>>>                                                     The original
>>>>             buffer is static
>>>>                                         too.  Carrying an buffer
>>>>                                                     for abort seems
>>>>             not right.
>>>>                                         This buffer in fact only for
>>>>                                                     storing file
>>>>             name (based on
>>>>                                         pid) here.
>>>>                                                     abort() will
>>>>             dump the core
>>>>                                         file, should we prevent
>>>>             multi-thread calling to this
>>>>                                         function? I did not check
>>>>                                                     this part, will
>>>>             check if it is
>>>>                                         MT-safe.
>>>>
>>>>
>>>>             line 1015:   // If
>>>>                                         running on a client version
>>>>                                                         of Windows
>>>>             and user has
>>>>                                         not explicitly enabled dumping
>>>>             line 1016:   if
>>>>             (!os::win32::is_windows_server() &&
>>>>             !CreateCoredumpOnCrash) {
>>>>               The default for
>>>>                                         CreateCoredumpOnCrash is
>>>>                                                         now 'true'
>>>>             so the
>>>>               comment and logic
>>>>                                         are no longer correct
>>>>                                                         here. The
>>>>             Windows
>>>>               Client VM will
>>>>                                         generate minidumps by default.
>>>>
>>>>             old line 1007:
>>>>             VMError::report_coredump_status("Minidumps
>>>>                                         are not
>>>>             enabled by default on
>>>>                                         client versions of Windows",
>>>>             false);
>>>>             new line 1017:
>>>>                                         jio_fprintf(stderr, "Minidumps
>>>>                                                         are not
>>>>             enabled by default
>>>>                                         on client versions of
>>>>             Windows.\n");
>>>>               There are a number
>>>>                                         of places where we
>>>>                                                         change from
>>>>             VMError::report_coredump_status() to
>>>>                                         jio_fprintf()
>>>>               and I'm not quite
>>>>                                         seeing why this was done.
>>>>
>>>>               Update:
>>>>             VMError::report_coredump_status()
>>>>                                                         is
>>>>             misnamed. It doesn't
>>>>               report anything in
>>>>                                         the sense that it
>>>>             doesn't print anything. It
>>>>               does record two
>>>>                                         pieces of information about
>>>>                                                         core dumps
>>>>             so maybe
>>>>               it should be
>>>>             VMError::record_coredump_status().
>>>>
>>>>             line 1100:
>>>>                                         jio_fprintf(stderr,
>>>>             "%s.\n", buffer);
>>>>               At this point,
>>>>                                         buffer contains the path to the
>>>>               mini-dump file so
>>>>                                         the above line simply prints
>>>>               that on stderr. Why?
>>>>
>>>>               Yes, I see that
>>>>                                         the old code did something
>>>>             similar:
>>>>
>>>>               1090
>>>>             VMError::report_coredump_status(buffer,
>>>>                                         true);
>>>>
>>>>               but
>>>>             report_coredump_status() didn't
>>>>             actually print
>>>>               anything. It just
>>>>                                         squirreled away these in
>>>>             vmError.cpp:
>>>>
>>>>               217 bool
>>>>             VMError::coredump_status;
>>>>               218 char
>>>>             VMError::coredump_message[O_BUFLEN];
>>>>
>>>>               See comments above
>>>>                                         about how
>>>>             report_coredump_status()
>>>>               is misnamed.
>>>>
>>>>                                                             At this
>>>>             point, I'm
>>>>                                         convinced that all the
>>>>             changes from
>>>>             VMError::report_coredump_status() to
>>>>                                         jio_fprintf() are
>>>>                                                             a bad idea.
>>>>
>>>>
>>>>                                                     Changing to
>>>>             jio_fprintf is because
>>>>             report_coredump_status did not
>>>>                                         output anything, it is
>>>>                                                     just a logging
>>>>             as you
>>>>                                         indicated. It is reasonable we
>>>>                                                     change it to
>>>>                                         record_coredump_status
>>>>             instead. How about
>>>>                                                     add printout to
>>>>             report_coredump_status? So I do not
>>>>                                                     need to use
>>>>             jio_printf here.
>>>>
>>>>                                                     Other
>>>>             consideration of using
>>>>                                         jio_fprintf after call
>>>>                                                     shutdown
>>>>             called, the static
>>>>                                         output stream still exists,
>>>>                                                     but not
>>>>             guaranteed to work as
>>>>                                         expected.
>>>>
>>>>
>>>>                                                     Thanks
>>>>                                                     Yumin
>>>>
>>>>             test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/Unsafe/RangeCheck.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>             test/runtime/memory/ReadFromNoaccessArea.java
>>>>                                                             No
>>>>             comments.
>>>>
>>>>
>>>>
>>>>             Thanks
>>>>             Yumin
>>>>
>>>>
>>>>                                                             On
>>>>             3/23/2015 11:48 AM,
>>>>                                         Yumin Qi wrote:
>>>>
>>>>               Since Thomas is
>>>>                                         not a reviewer in open jdk,
>>>>               I need two
>>>>                                         volunteers (at least one
>>>>               "R"eviewer).
>>>>
>>>>               Dan, since you
>>>>                                         already reviewed previous
>>>>               version, could you
>>>>                                         have a look?
>>>>
>>>>               Thanks
>>>>               Yumin
>>>>
>>>>               On 3/20/2015 3:20
>>>>                                         PM, Yumin Qi wrote:
>>>>
>>>>                   Thomas,
>>>>
>>>>                    Thanks. Yes,
>>>>                                         it is webrev04. Also, I
>>>>                   have updated
>>>>                                         webrev04 to add another
>>>>                   test case change:
>>>>             test/runtime/memory/ReadFromNoaccessArea.java
>>>>                    (Thanks Dan's
>>>>                                         update)
>>>>
>>>>                   Thanks
>>>>                   Yumin
>>>>
>>>>                   On 3/20/2015
>>>>                                         11:55 AM, Thomas Stüfe wrote:
>>>>
>>>>
>>>>                       Hi Yumin,
>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list