RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file

Yumin Qi yumin.qi at oracle.com
Sat Apr 11 05:14:36 UTC 2015


Thomas,

Thanks for review again.

On 4/10/2015 4:20 AM, Thomas Stüfe wrote:
> Hi Yumin,
>
> Hi Yumin,
>
> this looks mostly fine for me. I only was able to test it on Linux, 
> but will test it on other OSes next week.
>
> Here two remaining small points:
>
> - in os::check_dump_limit, maybe a comment would be helpful, because 
> the contract for this function is a bit difficult to understand. 
> Proposal:
>
> "Check or prepare a core dump to be taken at a later point in the same 
> thread in os::abort(). Use the caller provided buffer as a scratch 
> buffer. Output status message via VMError::report_cordedump_status() 
> (on success, the core dump file location, on error, a short error 
> message) - this status message which will be written into the error log."
>
I will take the comments, new comments:

   // ON Posix compatible OS it will simply check core dump limits while 
on Windows
   // it will check if dump file can be created. Check or prepare a core 
dump to be
   // taken at a later point in the same thread in os::abort(). Use the 
caller
   // provided buffer as a scratch buffer. The status message which will 
be written
   // into the error log either is file location or a short error 
message, depends
   // on checking result.
   static void check_dump_limit(char* buffer, size_t bufferSize);

> - It would make sense to improve the output in VMError.report() in 
> STEP(63) a bit:
>
> Problem 1: we use past-tense, which is misleading ("Core dump 
> written", "Failed to write core"). We did not generate the core dump 
> yet, we just plan to do it, and it still may fail for various reasons 
> (e.g. even if the limits are fine, the current directory may be write 
> protected. There is no way to catch all errors beforehand). If the 
> user reads "Core dump written", he assumes the core dump was written 
> successfully. If abort(3) later fails to write the core dump, user 
> will be confused.
>
Will change the message to
       if (coredump_status) {
         st->print("Core dump will be written. %s", coredump_message);
       } else {
         st->print("No core dump will be written. %s", coredump_message);
       }

as your suggested.
Whether the coredump can be successfully created depends on following 
core file generation procedure. User can trace error messages if failed 
to create.
> Problem 2: this is more obscure: on Linux core dumps may not be 
> written to the file system at all but be processed by some other 
> program. On my Ubuntu 14.4 machine I see this output:
>
> #
> # Core dump written. Default location: Core dumps may be processed 
> with "/usr/share/apport/apport %p %s %c %P" (or dumping to 
> /shared/projects/openjdk/jdk9-hs-rt/output/core.26261)
> #
>
> This output is a bit jarring.
>
I have not tested on Ubuntu so no comments.

I will post a new webrev based on your and Dan's comments. Thanks for 
your patience.

Thanks
Yumin
> My suggestion for better texts would be:
>
> "Core dump will be written. %s"
> "No core dump will be written. %s"
>
> which hopefully fits all cases better. But I don't know, other 
> suggestions are welcome :-)
>
> Otherwise, this looks fine.
>
> Thanks for your work!
>
> Thomas
>
>
> On Fri, Apr 10, 2015 at 5:18 AM, Yumin Qi <yumin.qi at oracle.com 
> <mailto:yumin.qi at oracle.com>> wrote:
>
>     Hi, Thomas and Dan
>
>     http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>     <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>
>     On 4/9/2015 7:51 PM, Yumin Qi wrote:
>
>         Please hold on, I missed the test cases:
>
>         *test/runtime/Unsafe/RangeCheck.java
>         *
>         *test/runtime/memory/ReadFromNoaccessArea.java
>
>     Added.
>
>         Also some file heads need to be updated with this year.
>
>     Changed
>
>     Thanks
>     Yumin
>
>         Thanks
>         Yumin
>
>
>         *
>         On 4/9/2015 7:07 PM, Yumin Qi wrote:
>
>             Hi, Thomas and Dan
>
>               The same URL is updated with new version.
>
>             http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>
>               I remove the public access to VMError::out,
>             VMError::coredump_message and the size of
>             VMError::coredump_message.
>               Added private static HANDLE in os_windows.cpp for
>             dumpFile as suggested by Thomas.
>
>               Tests passes JPRT and manual tests.
>
>             Thanks
>             Yumin
>
>
>             On 4/9/2015 1:32 AM, Thomas Stüfe wrote:
>
>                 Hi Yumin,
>
>                 I do not think exposing VMError::coredump_status
>                 buffer just for the sake of using it as a scratch
>                 buffer in os::abort() is a good idea. It is very
>                 confusing and has nothing to do with the purpose of
>                 VMerror::coredump_status.
>
>                 Also, VMerror::coredump_status is static, so you do
>                 not even gain anything over just using a global or
>                 function scope static buffer. Either we are worried
>                 about thread safety in os::abort(), or we aren't. If
>                 we aren't, just use a local static buffer, it is much
>                 clearer and no need to expose VMError::coredump_status
>                 like this.
>
>                 Apart from that, see my remarks in yesterdays mail.
>
>                 Cheers, Thomas
>
>
>
>
>                 On Thu, Apr 9, 2015 at 5:16 AM, Yumin Qi
>                 <yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>
>                 <mailto:yumin.qi at oracle.com
>                 <mailto:yumin.qi at oracle.com>>> wrote:
>
>                     Hi, Dan
>
>                       New webrev at
>                 http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>                 <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>                 <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>
>                       To make code clean, add os::abort_internal which
>                 is only used
>                     for Windows.
>                       Since passing VMError::coredump_message and its
>                 size as
>                     parameters across multiple functions makes
>                 function have a bigger
>                     signature, I change the way to access them -- add
>                 two functions in
>                     VMError to access them: coredump_msg_buffer() and
>                     coredump_msg_buffer_size(). They are static
>                 members, so can save
>                     stack space for passing around.
>
>                       in os_windows.cpp, abort(bool, void*, void*):
>                       if error encountered, exit with error messages.
>                       If no error, VMError::coredump_message contains
>                 the dump
>                     path/name. I just use them so no need to have
>                 extra space for name
>                     allocation.
>
>                       Passed JPRT and manual tests.
>
>                       Thanks
>                       Yumin
>
>
>                     On 4/7/2015 9:23 AM, Daniel D. Daugherty wrote:
>
>                         On 4/1/15 10:57 AM, Yumin Qi wrote:
>
>                           Hi, Dan and Thomas,
>
>                               I have a new webrev at
>                         http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>                         <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>                         <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>
>
>
>                         src/share/vm/runtime/globals.hpp
>                             Sorry, I didn't notice this before:
>
>                             L939:   product(bool, CreateCoredumpOnCrash,
>                         true,                                \
>                             L940:           "Create minidump on VM fatal
>                         error")                              \
>
>                                 The "minidump" on L940 should change
>                     also. Don't know if you
>                                 want to say "core/mini dump" or
>                     something else.
>
>
>                         src/share/vm/runtime/os.hpp
>                             L719:   // On Linux/Solaris it will simply
>                     check core dump
>                         limits while on Windows will do nothing.
>                                 grammar: 'on Windows will do nothing'
>                                       -> 'on Windows it will do nothing'
>
>
>                         src/share/vm/runtime/arguments.cpp
>                             No comments.
>
>                         src/share/vm/utilities/vmError.hpp
>                             No comments.
>
>                         src/share/vm/utilities/vmError.cpp
>                             No comments.
>
>                         src/os/aix/vm/os_aix.cpp
>                             No comments.
>
>                         src/os/bsd/vm/os_bsd.cpp
>                             No comments.
>
>                         src/os/linux/vm/os_linux.cpp
>                             No comments.
>
>                         src/os/posix/vm/os_posix.cpp
>                             No comments.
>
>                         src/os/solaris/vm/os_solaris.cpp
>                             No comments.
>
>                         src/os/windows/vm/os_windows.cpp
>                             L92: jio_snprintf(buffer, bufferSize,
>                         "%s\\hs_err_pid%u.mdmp",
>                     get_current_directory(NULL, 0),
>                         current_process_id());
>                                 Other non-Windows calls to
>                     get_current_directory() do a NULL
>                                 check before using the return value.
>                     The original Windows
>                                 code that called
>                     get_current_directory() did not make this
>                                 check, but I think it's better to be
>                     consistent and safe.
>
>                             L1007:   static char buffer[O_BUFLEN];
>                             L1008:   char filename[FILENAME_MAX];
>                                 Two more potentially large stack
>                     variables here. Will we
>                                 run into stack overflow on Win* more
>                     frequently when what
>                                 we really wanted was the reason for
>                     the abort() call?
>
>                                 The reason the original code used
>                     'buffer' to construct the
>                                 dump file name was to save space. The
>                     original code was also
>                                 careful to not need the file name in
>                     any error messages after
>                                 the point at which the dump file was
>                     successfully created.
>
>                                 It would be better (more resilient) if
>                     you could pass in
>                                 an already existing buffer to abort()
>                     like the original
>                                 code passed into
>                     os::check_or_create_dump(). Yes, that
>                                 would require visiting all existing
>                     calls to abort().
>
>                             L1015:   assert(out->is_open(),
>                     "defaultStream should be open");
>                                 Is it possible for os::abort() to be
>                     called early enough on
>                                 Windows such that VMError::out is not
>                     actually setup? Yes,
>                                 this would have to be absolutely
>                     catastrophic...
>
>                                 See the comment in os_solaris.cpp
>                     L1520-1522.
>
>                             L1018: jio_snprintf(buffer,
>                     sizeof(buffer), "%s", "Either
>                         CreateCoredumpOnCrash is disabled from command"
>                             L1019                  " line or coredump
>                     is not available");
>                                 When you are using jio_snprintf() as a
>                     replacement for:
>
>                                     strncpy(buffer, "my string",
>                     sizeof(buffer));
>                                     buffer[sizeof(buffer) - 1] = '\0';
>
>                                 you don't need the "%s" format string.
>                     So this code would be:
>
>                                 jio_snprintf(buffer, sizeof(buffer),
>                                              "Either
>                     CreateCoredumpOnCrash is disabled from "
>                                              "command line or coredump
>                     is not available");
>
>                                 Same comment for: L1026, L1039
>
>                             L1020: goto done;
>                                 The use of 'goto' makes me cringe. You
>                     can refactor much
>                                 of this logic into an abort_internal()
>                     function that
>                                 abort() calls and keep all the
>                     original "return" logic
>                                 to bail out.
>
>                             L1045: // Older versions of dbghelp.h
>                     doesn't contain all
>                                 grammar: 'doesn't' -> 'do not'
>
>                             L1052:   cwd = get_current_directory(NULL, 0);
>                             L1053: jio_snprintf(filename,
>                     sizeof(filename),
>                         "%s\\hs_err_pid%u.mdmp", cwd,
>                     current_process_id());
>                                 Other non-Windows calls to
>                     get_current_directory() do a NULL
>                                 check before using the return value.
>                     The original Windows
>                                 code that called
>                     get_current_directory() did not make this
>                                 check, but I think it's better to be
>                     consistent and safe.
>
>                             L1092:   // well done.
>                                 Like a overcooked steak? :-) Not sure
>                     what you're trying
>                                 to say here...
>
>                        
>                     test/runtime/ErrorHandling/ProblematicFrameTest.java
>                             No comments.
>
>                     test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>                             No comments.
>
>                         test/runtime/ErrorHandling/SecondaryErrorTest.java
>                             No comments.
>
>                     test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>                     test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>                     test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>                     test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>                             No comments on these four.
>
>
>                         Dan
>
>
>
>                               What changes:
>
>                                1) os_windows.cpp, os::abort(bool ....):
>                                   a) Since when abort() is called, the
>                         last function     called before exit process,
>                         VMError::log already closed, which
>                             is the file logging. defaultStream still
>                         available to use, so
>                             use it for writing error or success
>                         messages. VMError::out is
>                             the defaultStream, for using it, added a
>                         function out_stream()     in VMError.
>                                   b) Delete the codes which decide to
>                         write core based on
>                             client/server version with debug. The new
>                         code is if
>                             CreateCoredumpOnCrash set or dump_core is
>                         on, the coredump will
>                             be tried no matter which version it is.
>                         There is no special
>                             reason not to dump core with client
>                         version I think. Any concern
>                             for this change?
>                                   c) Use of 'goto' in this function. I
>                         tried not to use it,
>                             but using it makes code clear. I remember
>                         somewhere suggest not
>                             using 'goto' in cplusplus.
>
>                                 2) changed comments to make them
>                         consistent as indicated by Dan.
>
>                                 3) Added/modified head version for files.
>
>                                 Tests: JPRT, manual test on local Windows.
>
>                             Thanks
>                             Yumin
>
>
>                             On 3/31/2015 1:25 AM, Thomas Stüfe wrote:
>
>                                 Hi Daniel, Yumin,
>
>                                 sorry, a lot of Yumins changes were
>                             based on suggestions from
>                                 me, so here more background:
>
>                                 Yumin reversed the order of error log
>                             writing and core dumping.
>                                 The reason for that are explained in
>                             detail at the start of the
>                                 mail thread, in short:
>
>                                 - on Windows core file dumping may
>                             hang - its rare but it
>                                 happens - preventing error logs.
>                             Depending on who you ask,
>                                 error logs are more important than
>                             minidumps, so it makes sense
>                                 to first write the erorr log.
>                                 - This also brings windows core paths
>                             closer to UNIX code
>                                 paths. See below.
>
>                                 About writing to stderr in os::abort():
>
>                                 After this change, calling
>                             VmError::report_coredump_status() in
>                                 os::abort() will not do anything
>                             because the error log is
>                                 already written at that point. I
>                             suggested to Yumin writing to
>                                 stderr instead in os::abort()  because
>                             that mimicks UNIX
>                                 behaviour: On UNIX, you call abort(3),
>                             which writes a core and
>                                 aborts. It writes a short message to
>                             stderr "Aborted (core
>                                 dumped)" or just "Aborted".
>
>                                 After Yumins change, on Windows
>                             os::abort also writes the core,
>                                 then aborts. It made sense to me that
>                             this function should
>                                 behave the same way as on UNIX: if
>                             core writing fails, write to
>                                 stderr. There is no way otherwise to
>                             communicate a core dump
>                                 writing failure. After writing the
>                             core, process stops.
>
>                                 ---
>
>                                 About the control flows:
>
>                                 Before Yumins change:
>
>                                 On Windows
>                                 1) os::check_or_create_dump(): we
>                             wrote the core dump and then
>                                 information about the dump (file
>                             location, success or error)
>                                 was handed to the misnamed
>                             VmError::report_coredump_status(),
>                                 which just stored that information.
>                                 2) VmError::report(): Error log was
>                             written and in step 63
>                                 information about the core (already
>                             written at that point) was
>                                 added to error log.
>                                 3) os::abort() just did an exit(3)
>
>                                 On Unix,
>                                 1) os::check_or_create_dump(): checks
>                             the limits to guess
>                                 whether core dumping later will
>                             succeed. Again, that
>                                 information is handed to
>                             VmError::report_coredump_status()
>                                 2) VmError::report(): Error log is
>                             written and in step 63,
>                                 information about the core's probable
>                             future location is added
>                                 to error log. Here, messages use past
>                             tense, which is
>                                 misleading, because the core is not
>                             yet written.
>                                 3) os::abort() calls abort(3), which
>                             stops and writes a core.
>
>                                 Yumin pushed core file writing on
>                             Windows down to os::abort().
>                                 This brings the control flow much
>                             closer to Unix:
>
>                                 (1) call os::check_dump_limit() to
>                             check the prerequisites for
>                                 core file writing and gather a bit
>                             information to put in the
>                                 error log: On Unix, limit checks, on
>                             windows, so far nothing
>                                 much but precalculating the error file
>                             name.
>                                 (2) VmError::report(): Error log is
>                             written and information
>                                 about the core's probable location is
>                             added to error log.
>                                 (3) os::abort() is called, which on
>                             all platforms:
>                                  - dumps the core
>                                  - if failing, writes a one-liner to
>                             stderr
>                                  - stops the process.
>
>                                 I think, if one agrees that reversing
>                             order of core dumping and
>                                 error log writing on windows is a good
>                             thing, this change looks
>                                 ok to me. Some improvements could make
>                             it clearer:
>
>                                 - maybe rename
>                             "os::check_dump_limit()" (was my suggestion,
>                                 sorry) to
>                             "os::check_core_prerequisites()" - that
>                             leaves room
>                                 to flesh it out a bit more on Windows,
>                             where we do not have
>                                 UNIX limits.
>                                 - make the messages in
>                             VMError::report() future sense: "Will
>                                 write core file to location.....".
>                                 - Make the messages written in
>                             os::abort() on Windows clearer,
>                                 and shorter, and we also do not need
>                             the buffer to be static
>                                 here. Actually, if we want to be as on
>                             UNIX, just dumping "core
>                                 file writing succeeded" or "failed" -
>                             maybe with an error
>                                 number from GetLastError() - will be
>                             enough because the file
>                                 location is already printed in the
>                             error log. Like on UNIX.
>
>                                 Kind Regards, Thomas
>
>
>                                 On Mon, Mar 30, 2015 at 9:51 PM,
>                             Daniel D. Daugherty
>                                 <daniel.daugherty at oracle.com
>                             <mailto:daniel.daugherty at oracle.com>
>                                 <mailto:daniel.daugherty at oracle.com
>                             <mailto:daniel.daugherty at oracle.com>>> wrote:
>
>                                     Re: report_coredump_status() is
>                             really record_coredump_status()
>
>                                     The report_coredump_status()
>                             function is designed to record two
>                                     things about core dumps for later
>                             reporting by the code that
>                                     generates the hs_err_pid file.
>                             That's why the original
>                                     report_coredump_status() didn't
>                             output anything to stderr.
>
>                                     By changing the Windows calls to
>                             report_coredump_status() into
>                                     jio_fprintf() calls you have:
>
>                                     - put output onto stderr that
>                             should go to the hs_err_pid file
>                                     - made the windows code paths work
>                             differently than the
>                                     non-windows
>                                       code paths
>
>
>                                     Dan
>
>
>
>                                     On 3/30/15 1:28 PM, Yumin Qi wrote:
>
>                                         Thanks Dan
>
>                                         On 3/30/2015 11:25 AM, Daniel
>                             D. Daugherty wrote:
>
>                                             On 3/25/15 12:11 PM, Yumin
>                             Qi wrote:
>
>                                                 Hi, all
>
>                                                     I updated the
>                             webrev with a new change to
>                                                 test case:
>                                                
>                             test/runtime/Unsafe/RangeCheck.java
>
>                                                     Add
>                             -XX:-CreateCoredumpOnCrash to test, no
>                                                 dump needed on this case.
>
>                                                     new  webrev:
>                             http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>
>
>                                             General nit - please
>                             update copyright lines to 2015
>                                             as needed
>
>                             src/share/vm/runtime/globals.hpp
>                                                 No comments.
>
>                             src/share/vm/runtime/os.hpp
>                                                 line 720:   // On
>                             Windows this will create an
>                                             actual minidump, on
>                             Linux/Solaris it will simply
>                                             check core dump limits
>                                                 line 721:   static
>                             void check_dump_limit(char*
>                                             buffer, size_t bufferSize);
>                                                     The comment on
>                             line 720 no longer matches
>                                             with the function
>                                                     rename from
>                             check_or_create_dump() to
>                                             check_dump_limit().
>
>                                                     You have this
>                             comment in vmError.cpp:
>                                                     943     // check
>                             core dump limits on
>                                             Linux/Solaris, nothing on
>                             Windows
>                                                     944
>                             os::check_dump_limit(buffer,
>                                             sizeof(buffer));
>
>                                                     so the two
>                             comments are out of sync.
>
>                                         Will convert them to be
>                             consistent.
>
>                                            
>                             src/share/vm/runtime/arguments.cpp
>                                                 line 3252: } else if
>                             (match_option(option,
>                             "-XX:+CreateMinidumpOnCrash", &tail)) {
>                                                 line 3256: } else if
>                             (match_option(option,
>                             "-XX:-CreateMinidumpOnCrash", &tail)) {
>                                                     These two checks
>                             should use the
>                                             match_option() version
>                                                     without a '&tail'
>                             parameter.
>
>                                         Will use version w/o tail.
>
>                                            
>                             src/share/vm/utilities/vmError.cpp
>                                                 line 217: bool
>                             VMError::coredump_status =
>                                             true;       // presume we
>                             can dump core file first
>                                                     I don't think you
>                             should init this to true.
>                                             It confuses
>                                                     things with
>                                            
>                             VMError::report_coredump_status(). It will
>                             also
>                                                     confuse this code:
>
>                                                     526   STEP(63,
>                             "(printing core file
>                                             information)")
>                                                     529       if
>                             (coredump_status) {
>                                                     530
>                             st->print("Core dump written. Default
>                                             location: %s",
>                             coredump_message);
>
>                                                     because
>                             coredump_status won't accurately
>                                             reflect whether
>                                                     the
>                             coredump_message field has been set.
>
>                                                 line 943: // check
>                             core dump limits on
>                                             Linux/Solaris, nothing on
>                             Windows
>                                                     This updated
>                             comment doesn't match the
>                                             unmodified
>                                                     comment in os.hpp:
>
>                                                     720:   // On
>                             Windows this will create an
>                                             actual minidump, on
>                             Linux/Solaris it will simply
>                                             check core dump limits
>
>                                         I will consider your comments,
>                             then revise with a new
>                                         version. I remember here there
>                             is a case, if status not
>                                         set, it will output an
>                             inconsistent message. Will check
>                                         again.
>
>                                             src/os/aix/vm/os_aix.cpp
>                                                 No comments.
>
>                                             src/os/bsd/vm/os_bsd.cpp
>                                                 No comments.
>
>                             src/os/linux/vm/os_linux.cpp
>                                                 No comments.
>
>                             src/os/posix/vm/os_posix.cpp
>                                                 No comments.
>
>                             src/os/solaris/vm/os_solaris.cpp
>                                                 No comments.
>
>                             src/os/windows/vm/os_windows.cpp
>                                                 line 1008: static char
>                             buffer[O_BUFLEN];
>                                                     Why switch from a
>                             buffer parameter to a
>                                             static buffer?
>                                                     What happens with
>                             parallel calls to
>                                             abort()? Will the static
>                                                     buffer get stomped
>                             by the competing threads?
>
>                                         The original buffer is static
>                             too.  Carrying an buffer
>                                         for abort seems not right.
>                             This buffer in fact only for
>                                         storing file name (based on
>                             pid) here.
>                                         abort() will dump the core
>                             file, should we prevent
>                                         multi-thread calling to this
>                             function? I did not check
>                                         this part, will check if it is
>                             MT-safe.
>
>
>                                                 line 1015:   // If
>                             running on a client version
>                                             of Windows and user has
>                             not explicitly enabled dumping
>                                                 line 1016:   if
>                                            
>                             (!os::win32::is_windows_server() &&
>                                             !CreateCoredumpOnCrash) {
>                                                     The default for
>                             CreateCoredumpOnCrash is
>                                             now 'true' so the
>                                                     comment and logic
>                             are no longer correct
>                                             here. The Windows
>                                                     Client VM will
>                             generate minidumps by default.
>
>                                                 old line 1007:
>                             VMError::report_coredump_status("Minidumps
>                             are not
>                                             enabled by default on
>                             client versions of Windows",
>                                             false);
>                                                 new line 1017:
>                             jio_fprintf(stderr, "Minidumps
>                                             are not enabled by default
>                             on client versions of
>                                             Windows.\n");
>                                                     There are a number
>                             of places where we
>                                             change from
>                                            
>                             VMError::report_coredump_status() to
>                             jio_fprintf()
>                                                     and I'm not quite
>                             seeing why this was done.
>
>                                                     Update:
>                             VMError::report_coredump_status()
>                                             is misnamed. It doesn't
>                                                     report anything in
>                             the sense that it
>                                             doesn't print anything. It
>                                                     does record two
>                             pieces of information about
>                                             core dumps so maybe
>                                                     it should be
>                             VMError::record_coredump_status().
>
>                                                 line 1100:
>                             jio_fprintf(stderr, "%s.\n", buffer);
>                                                     At this point,
>                             buffer contains the path to the
>                                                     mini-dump file so
>                             the above line simply prints
>                                                     that on stderr. Why?
>
>                                                     Yes, I see that
>                             the old code did something
>                                             similar:
>
>                                                     1090
>                                            
>                              VMError::report_coredump_status(buffer,
>                             true);
>
>                                                     but
>                             report_coredump_status() didn't
>                                             actually print
>                                                     anything. It just
>                             squirreled away these in
>                                             vmError.cpp:
>
>                                                     217 bool
>                             VMError::coredump_status;
>                                                     218 char
>                             VMError::coredump_message[O_BUFLEN];
>
>                                                     See comments above
>                             about how
>                                             report_coredump_status()
>                                                     is misnamed.
>
>                                                 At this point, I'm
>                             convinced that all the
>                                             changes from
>                                            
>                             VMError::report_coredump_status() to
>                             jio_fprintf() are
>                                                 a bad idea.
>
>
>                                         Changing to jio_fprintf is because
>                                         report_coredump_status did not
>                             output anything, it is
>                                         just a logging as you
>                             indicated. It is reasonable we
>                                         change it to
>                             record_coredump_status instead. How about
>                                         add printout to
>                             report_coredump_status? So I do not
>                                         need to use jio_printf here.
>
>                                         Other consideration of using
>                             jio_fprintf after call
>                                         shutdown called, the static
>                             output stream still exists,
>                                         but not guaranteed to work as
>                             expected.
>
>
>                                         Thanks
>                                         Yumin
>
>                             test/runtime/ErrorHandling/ProblematicFrameTest.java
>                                                 No comments.
>
>                             test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>                                                 No comments.
>
>                             test/runtime/ErrorHandling/SecondaryErrorTest.java
>                                                 No comments.
>
>                             test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>                                                 No comments.
>
>                             test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>                                                 No comments.
>
>                             test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>                                                 No comments.
>
>                             test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>                                                 No comments.
>
>                                            
>                             test/runtime/Unsafe/RangeCheck.java
>                                                 No comments.
>
>                             test/runtime/memory/ReadFromNoaccessArea.java
>                                                 No comments.
>
>
>
>                                                 Thanks
>                                                 Yumin
>
>
>                                                 On 3/23/2015 11:48 AM,
>                             Yumin Qi wrote:
>
>                                                     Since Thomas is
>                             not a reviewer in open jdk,
>                                                     I need two
>                             volunteers (at least one
>                                                     "R"eviewer).
>
>                                                     Dan, since you
>                             already reviewed previous
>                                                     version, could you
>                             have a look?
>
>                                                     Thanks
>                                                     Yumin
>
>                                                     On 3/20/2015 3:20
>                             PM, Yumin Qi wrote:
>
>                                                         Thomas,
>
>                                                          Thanks. Yes,
>                             it is webrev04. Also, I
>                                                         have updated
>                             webrev04 to add another
>                                                         test case change:
>                             test/runtime/memory/ReadFromNoaccessArea.java
>                                                          (Thanks Dan's
>                             update)
>
>                                                         Thanks
>                                                         Yumin
>
>                                                         On 3/20/2015
>                             11:55 AM, Thomas Stüfe wrote:
>
>
>                                                             Hi Yumin,
>
>                                                             I think
>                             you meant to post webrev.04?
>
>                                                             I looked
>                             at 04, and it looks nice.
>                                                             Thank you!
>
>                                                             (still
>                             only reviewer with r)
>
>                                                             Thomas
>
>                                                             On Mar 20,
>                             2015 6:20 PM, "Yumin Qi"
>                                                            
>                             <yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>>> wrote:
>
>                                                                 Hi, All
>
>                             http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>
>                                                                 This
>                             version is based on
>                                                             Thomas'
>                             suggestion.
>                                                                 Tests
>                             passed
>                             vm.runtime.quick.testlist, JPRT and
>                                                             manual tests.
>
>                                                                 Thanks
>                                                                 Yumin
>
>                                                                 On
>                             3/18/2015 10:06 AM, Yumin Qi
>                                                             wrote:
>
>
>                                                                     On
>                             3/18/2015 4:26 AM,
>                                                             Thomas
>                             Stüfe wrote:
>
>                               Hi Yumin,
>
>                               - just aesthetics, but
>                                                             on all
>                             implementations of
>                               os_abort(), at least on
>                                                             the UNIX
>                             variants, maybe we could
>                               consistently rename the
>                                                             parameters
>                             to be "siginfo" and
>                               "context" instead of
>                                                             using
>                             Windows terms?
>
>                             Sure.
>
>                               - On
>                             check_create_dump_limit(), you do
>                                                             not need
>                             exceptionRecord, contextRecord
>                                                             parameters
>                             anymore. The
>                               only parameters now
>                                                             needed are
>                             buffer and buffer size,
>                               which is not even an
>                                                             output
>                             parameter, just a scratch
>                               buffer for this
>                                                             function
>                             to use for printf. For
>                                                             output, it
>                               calls
>                             VMError::report_coredump_status(...) at
>                                                             the end to
>                               provide information
>                                                             about the
>                             core file.
>
>                               - I would rename that
>                                                             function from
>                             check_create_dump_limit() to
>                             check_dump_limit() - nothing
>                               is created anymore.
>
>                             Yes, no need to carry those
>                                                             two
>                             parameters. I tried to keep
>                             less changes, but decided
>                                                             to change now.
>
>                               - on Windows, in
>                             os::abort(), there is no point
>                                                             anymore in
>                               calling
>                             VMError::report_coredump_status(...) because
>                                                             that
>                               information is only
>                                                             used
>                             during error log writing which
>                               already happened. It
>                                                             only makes
>                             sense to call this
>                               function in
>                             check_create_dump_limit(), which
>                                                             happens
>                               before error log writing.
>                               Instead, maybe error
>                                                             messages
>                             like "Call to
>                             MiniDumpWriteDump() failed" should
>                                                             just be
>                             written to
>                               stderr? This would be
>                                                             consistent
>                             with Unix, where the OS
>                               writes a short message
>                                                             on stderr
>                             if core file writing fails.
>
>                             Sure, will direct output to
>                                                             stderr.
>
>                               - there is now a new test,
>                             test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>                               - could you please also
>                                                             add
>                             "-XX:-CreateCoredumpOnCrash" ?
>                               I just added the test
>                                                             and did
>                             not want to add the option
>                               before it was available.
>
>                                                                     OK
>
>                             Thanks
>                             Yumin
>
>                               Thanks for your work!
>
>                               Thomas
>
>
>                               On Tue, Mar 17, 2015 at
>                                                             2:04 AM,
>                             Yumin Qi
>                             <yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>>>>
>                               wrote:
>
>                                   Hi, Dan and all
>
>                                     I have updated
>                                                             webrev at:
>                             http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>
>                                     1) bug changed
>                                                             synopsis
>                                                              8074354: Make
>                             CreateMinidumpOnCrash a new name and
>                                                             available
>                             on all platforms
>                                     2) tests: JPRT,
>                             vm.runtime.quick.testlist (on
>                               Windows), jtreg on
>                                   Windows and Linux/Unix.
>
>                                     Thanks
>                                     Yumin
>
>
>                                   On 3/12/2015 12:59
>                                                             PM, Daniel
>                             D. Daugherty wrote:
>
>                                       I believe there
>                                                             are both
>                             JavaTest/JTREG tests in
>                               the hotspot repo
>                                       and there are
>                                                             VM/NSK
>                             tests from back when we did
>                               phone home...
>
>                                                             Check with
>                             Christian or Misha...
>
>                                       Dan
>
>
>                                       On 3/12/15 1:50
>                                                             PM, Yumin
>                             Qi wrote:
>
>                                                             Thanks, Dan
>
>                                                             Will look
>                             at and run those tests.
>                                                             Are they
>                               under nsk?
>
>                                                             Yumin
>
>                                                             On
>                             3/12/2015 12:29 PM, Daniel D.
>                                                             Daugherty
>                             wrote:
>
>                                                                 Yumin,
>
>                                                                 There
>                             are some error handler
>                                                             tests. You
>                               should find
>                                                                 those
>                             and make
>                                                                 sure
>                             that you run them on
>                                                             Windows
>                             since you're
>                             changing the order
>                             there...
>
>                                                                 Dan
>
>
>                                                                 On
>                             3/12/15 12:42 PM, Yumin Qi
>                                                             wrote:
>
>                             Hi, Thomas and all
>
>                              The second version of
>                                                             the change:
>                              webrev:
>                             http://cr.openjdk.java.net/~minqi/8074354/webrev02/
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>
>                              In this version:
>                              1) switch order on
>                                                             Windows to
>                             first
>                               print log
>                             file as other platform,
>                                                             then dump core
>                               file if
>                             needed. When VMError
>                                                             created after
>                               crash, siginfo
>                             and context which are
>                             ExceptionRecord and
>                             ContextRecord on Windows
>                                                             are recorded,
>                               mini (or
>                             full, due to 'or' used so
>                                                             type will be
>                               mini
>                             anyway) dump creates dump
>                                                             file based
>                               on those two
>                                                                     as
>                             before.
>                              2) to make os::abort to
>                                                             get above
>                               two pointers,
>                             added two more fields to
>                                                             the function
>                               parameters:
>                              3)  changes for
>                             test/runtime/ErrorHandling/SecondaryError.java
>                                                             ---
>                             added
>                             "-XX:-CreateCoredumpOnCrash"
>
>                               -  static void abort(bool
>                               dump_core = true);
>                              +  static void
>                                                             abort(bool
>                             dump_core
>                               = true,
>                             void *siginfo = NULL, void
>                                                             *context =
>                               NULL);
>
>                              Tests: JPRT and manually.
>
>                             Thanks
>                             Yumin
>
>                                                                     On
>                             3/11/2015 3:47 AM,
>                                                             Thomas
>                             Stüfe wrote:
>
>                               Hi Yumin,
>
>                               There is also
>                             test/runtime/ErrorHandling/SecondaryErrorTest.java
>                               - could you please add
>                             "-XX:-CreateCoredumpOnCrash" ?
>                               Thank you!
>
>                               Beyond that, as I wrote
>                                                             in the bug
>                               report
>                               comments:
>
>                               "This is also a problem
>                                                             on Windows -
>                             MiniDumpWriteDump() may hang
>                               infinitly. And on
>                               Windows this is worse
>                                                             than under
>                               UNIX because
>                               we create the Dump
>                                                             before writing
>                               the hs-err
>                               file, so if the Dump
>                                                             hangs, we get
>                               no error
>                               log. I would like to
>                                                             revert the
>                               order: create
>                               the minidump after
>                                                             writing the
>                               error log, the
>                               same way Unix does it.
>                                                             We did this
>                               in our JVM
>                               (SAP) because for us,
>                                                             error logs
>                               are more
>                               useful than minidumps. "
>
>                               So, I would like to see
>                                                             os::abort
>                               on Windows
>                               call
>                             MiniDumpWriteDump(), and thus
>                               the mini
>                               dump writing moved
>                                                             after the
>                             error log
>                               writing. This would
>                                                             also make the
>                               code way
>                               cleaner because the
>                                                             control flow
>                               would be the
>                               same on all platforms.
>
>                               I understand that this
>                                                             may be out
>                               of scope for
>                               your change, but I
>                                                             would like to
>                               know what
>                               others think about this.
>
>                               Kind regards, Thomas
>
>
>
>
>
>
>
>                               On Wed, Mar 11, 2015 at
>                                                             8:02 AM,
>                               Yumin Qi
>                                                            
>                             <yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>
>                             <mailto:yumin.qi at oracle.com
>                             <mailto:yumin.qi at oracle.com>>>>>> wrote:
>
>                                   Please review:
>
>                                   bug:
>                             https://bugs.openjdk.java.net/browse/JDK-8074354
>                                   webrev:
>                             http://cr.openjdk.java.net/~minqi/8074354/webrev01/
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>                             <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>
>                                   Summary: Tests
>                                                             timed out when
>                               VM crashes
>                               and dumping core file
>                                   which in the test
>                                                             case is not
>                               needed. To
>                               make core not created,
>                                   the fix changed
>                             CreateMinidumpOnCrash to
>                             CreateCoredumpOnCrash,
>                                   the former is only
>                                                             used on
>                               Windows and the
>                               latter for all
>                                                             platforms.
>                             When VM crashes on
>                               non Windows,
>                               core file generated as
>                                   default if OS sets
>                                                             core dump
>                               allowed.
>                               Default value of
>                             CreateCoredumpOnCrash set to
>                               'true' to
>                               keep same behavior on non
>                                   Windows platforms,
>                                                             but changed
>                               for Windows
>                               --- original is false,
>                                   not create minidump on
>                               Windows. With
>                             CreateCoredumpOnCrash turned
>                                   off, no core file
>                                                             will be
>                               generated.
>                             CreateMinidumpOnCrash still
>                                 can be used on
>                                                            
>                             commandline but
>                             only as
>                             alias for the new flag.
>
>                                 Tests: JPRT, manual
>                                                             tests setting
>                             CreateMinidumpOnCrash on
>                                                            
>                             commandline to verify flag
>                             change as alias.
>
>                                 Thanks
>                                 Yumin
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>



More information about the hotspot-runtime-dev mailing list