RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file
Yumin Qi
yumin.qi at oracle.com
Fri Apr 10 02:07:01 UTC 2015
Hi, Thomas and Dan
The same URL is updated with new version.
http://cr.openjdk.java.net/~minqi/8074354/webrev07/
I remove the public access to VMError::out, VMError::coredump_message
and the size of VMError::coredump_message.
Added private static HANDLE in os_windows.cpp for dumpFile as
suggested by Thomas.
Tests passes JPRT and manual tests.
Thanks
Yumin
On 4/9/2015 1:32 AM, Thomas Stüfe wrote:
> Hi Yumin,
>
> I do not think exposing VMError::coredump_status buffer just for the
> sake of using it as a scratch buffer in os::abort() is a good idea. It
> is very confusing and has nothing to do with the purpose of
> VMerror::coredump_status.
>
> Also, VMerror::coredump_status is static, so you do not even gain
> anything over just using a global or function scope static buffer.
> Either we are worried about thread safety in os::abort(), or we
> aren't. If we aren't, just use a local static buffer, it is much
> clearer and no need to expose VMError::coredump_status like this.
>
> Apart from that, see my remarks in yesterdays mail.
>
> Cheers, Thomas
>
>
>
>
> On Thu, Apr 9, 2015 at 5:16 AM, Yumin Qi <yumin.qi at oracle.com
> <mailto:yumin.qi at oracle.com>> wrote:
>
> Hi, Dan
>
> New webrev at
> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>
> To make code clean, add os::abort_internal which is only used
> for Windows.
> Since passing VMError::coredump_message and its size as
> parameters across multiple functions makes function have a bigger
> signature, I change the way to access them -- add two functions in
> VMError to access them: coredump_msg_buffer() and
> coredump_msg_buffer_size(). They are static members, so can save
> stack space for passing around.
>
> in os_windows.cpp, abort(bool, void*, void*):
> if error encountered, exit with error messages.
> If no error, VMError::coredump_message contains the dump
> path/name. I just use them so no need to have extra space for name
> allocation.
>
> Passed JPRT and manual tests.
>
> Thanks
> Yumin
>
>
> On 4/7/2015 9:23 AM, Daniel D. Daugherty wrote:
>> On 4/1/15 10:57 AM, Yumin Qi wrote:
>>> Hi, Dan and Thomas,
>>>
>>> I have a new webrev at
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>
>>
>> src/share/vm/runtime/globals.hpp
>> Sorry, I didn't notice this before:
>>
>> L939: product(bool, CreateCoredumpOnCrash,
>> true, \
>> L940: "Create minidump on VM fatal
>> error") \
>>
>> The "minidump" on L940 should change also. Don't know if you
>> want to say "core/mini dump" or something else.
>>
>>
>> src/share/vm/runtime/os.hpp
>> L719: // On Linux/Solaris it will simply check core dump
>> limits while on Windows will do nothing.
>> grammar: 'on Windows will do nothing'
>> -> 'on Windows it will do nothing'
>>
>>
>> src/share/vm/runtime/arguments.cpp
>> No comments.
>>
>> src/share/vm/utilities/vmError.hpp
>> No comments.
>>
>> src/share/vm/utilities/vmError.cpp
>> No comments.
>>
>> src/os/aix/vm/os_aix.cpp
>> No comments.
>>
>> src/os/bsd/vm/os_bsd.cpp
>> No comments.
>>
>> src/os/linux/vm/os_linux.cpp
>> No comments.
>>
>> src/os/posix/vm/os_posix.cpp
>> No comments.
>>
>> src/os/solaris/vm/os_solaris.cpp
>> No comments.
>>
>> src/os/windows/vm/os_windows.cpp
>> L92: jio_snprintf(buffer, bufferSize,
>> "%s\\hs_err_pid%u.mdmp", get_current_directory(NULL, 0),
>> current_process_id());
>> Other non-Windows calls to get_current_directory() do a NULL
>> check before using the return value. The original Windows
>> code that called get_current_directory() did not make this
>> check, but I think it's better to be consistent and safe.
>>
>> L1007: static char buffer[O_BUFLEN];
>> L1008: char filename[FILENAME_MAX];
>> Two more potentially large stack variables here. Will we
>> run into stack overflow on Win* more frequently when what
>> we really wanted was the reason for the abort() call?
>>
>> The reason the original code used 'buffer' to construct the
>> dump file name was to save space. The original code was also
>> careful to not need the file name in any error messages after
>> the point at which the dump file was successfully created.
>>
>> It would be better (more resilient) if you could pass in
>> an already existing buffer to abort() like the original
>> code passed into os::check_or_create_dump(). Yes, that
>> would require visiting all existing calls to abort().
>>
>> L1015: assert(out->is_open(), "defaultStream should be open");
>> Is it possible for os::abort() to be called early enough on
>> Windows such that VMError::out is not actually setup? Yes,
>> this would have to be absolutely catastrophic...
>>
>> See the comment in os_solaris.cpp L1520-1522.
>>
>> L1018: jio_snprintf(buffer, sizeof(buffer), "%s", "Either
>> CreateCoredumpOnCrash is disabled from command"
>> L1019 " line or coredump is not available");
>> When you are using jio_snprintf() as a replacement for:
>>
>> strncpy(buffer, "my string", sizeof(buffer));
>> buffer[sizeof(buffer) - 1] = '\0';
>>
>> you don't need the "%s" format string. So this code would be:
>>
>> jio_snprintf(buffer, sizeof(buffer),
>> "Either CreateCoredumpOnCrash is disabled from "
>> "command line or coredump is not available");
>>
>> Same comment for: L1026, L1039
>>
>> L1020: goto done;
>> The use of 'goto' makes me cringe. You can refactor much
>> of this logic into an abort_internal() function that
>> abort() calls and keep all the original "return" logic
>> to bail out.
>>
>> L1045: // Older versions of dbghelp.h doesn't contain all
>> grammar: 'doesn't' -> 'do not'
>>
>> L1052: cwd = get_current_directory(NULL, 0);
>> L1053: jio_snprintf(filename, sizeof(filename),
>> "%s\\hs_err_pid%u.mdmp", cwd, current_process_id());
>> Other non-Windows calls to get_current_directory() do a NULL
>> check before using the return value. The original Windows
>> code that called get_current_directory() did not make this
>> check, but I think it's better to be consistent and safe.
>>
>> L1092: // well done.
>> Like a overcooked steak? :-) Not sure what you're trying
>> to say here...
>>
>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>> No comments.
>>
>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>> No comments.
>>
>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>> No comments.
>>
>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>> No comments on these four.
>>
>>
>> Dan
>>
>>
>>>
>>> What changes:
>>>
>>> 1) os_windows.cpp, os::abort(bool ....):
>>> a) Since when abort() is called, the last function
>>> called before exit process, VMError::log already closed, which
>>> is the file logging. defaultStream still available to use, so
>>> use it for writing error or success messages. VMError::out is
>>> the defaultStream, for using it, added a function out_stream()
>>> in VMError.
>>> b) Delete the codes which decide to write core based on
>>> client/server version with debug. The new code is if
>>> CreateCoredumpOnCrash set or dump_core is on, the coredump will
>>> be tried no matter which version it is. There is no special
>>> reason not to dump core with client version I think. Any concern
>>> for this change?
>>> c) Use of 'goto' in this function. I tried not to use it,
>>> but using it makes code clear. I remember somewhere suggest not
>>> using 'goto' in cplusplus.
>>>
>>> 2) changed comments to make them consistent as indicated by Dan.
>>>
>>> 3) Added/modified head version for files.
>>>
>>> Tests: JPRT, manual test on local Windows.
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> On 3/31/2015 1:25 AM, Thomas Stüfe wrote:
>>>> Hi Daniel, Yumin,
>>>>
>>>> sorry, a lot of Yumins changes were based on suggestions from
>>>> me, so here more background:
>>>>
>>>> Yumin reversed the order of error log writing and core dumping.
>>>> The reason for that are explained in detail at the start of the
>>>> mail thread, in short:
>>>>
>>>> - on Windows core file dumping may hang - its rare but it
>>>> happens - preventing error logs. Depending on who you ask,
>>>> error logs are more important than minidumps, so it makes sense
>>>> to first write the erorr log.
>>>> - This also brings windows core paths closer to UNIX code
>>>> paths. See below.
>>>>
>>>> About writing to stderr in os::abort():
>>>>
>>>> After this change, calling VmError::report_coredump_status() in
>>>> os::abort() will not do anything because the error log is
>>>> already written at that point. I suggested to Yumin writing to
>>>> stderr instead in os::abort() because that mimicks UNIX
>>>> behaviour: On UNIX, you call abort(3), which writes a core and
>>>> aborts. It writes a short message to stderr "Aborted (core
>>>> dumped)" or just "Aborted".
>>>>
>>>> After Yumins change, on Windows os::abort also writes the core,
>>>> then aborts. It made sense to me that this function should
>>>> behave the same way as on UNIX: if core writing fails, write to
>>>> stderr. There is no way otherwise to communicate a core dump
>>>> writing failure. After writing the core, process stops.
>>>>
>>>> ---
>>>>
>>>> About the control flows:
>>>>
>>>> Before Yumins change:
>>>>
>>>> On Windows
>>>> 1) os::check_or_create_dump(): we wrote the core dump and then
>>>> information about the dump (file location, success or error)
>>>> was handed to the misnamed VmError::report_coredump_status(),
>>>> which just stored that information.
>>>> 2) VmError::report(): Error log was written and in step 63
>>>> information about the core (already written at that point) was
>>>> added to error log.
>>>> 3) os::abort() just did an exit(3)
>>>>
>>>> On Unix,
>>>> 1) os::check_or_create_dump(): checks the limits to guess
>>>> whether core dumping later will succeed. Again, that
>>>> information is handed to VmError::report_coredump_status()
>>>> 2) VmError::report(): Error log is written and in step 63,
>>>> information about the core's probable future location is added
>>>> to error log. Here, messages use past tense, which is
>>>> misleading, because the core is not yet written.
>>>> 3) os::abort() calls abort(3), which stops and writes a core.
>>>>
>>>> Yumin pushed core file writing on Windows down to os::abort().
>>>> This brings the control flow much closer to Unix:
>>>>
>>>> (1) call os::check_dump_limit() to check the prerequisites for
>>>> core file writing and gather a bit information to put in the
>>>> error log: On Unix, limit checks, on windows, so far nothing
>>>> much but precalculating the error file name.
>>>> (2) VmError::report(): Error log is written and information
>>>> about the core's probable location is added to error log.
>>>> (3) os::abort() is called, which on all platforms:
>>>> - dumps the core
>>>> - if failing, writes a one-liner to stderr
>>>> - stops the process.
>>>>
>>>> I think, if one agrees that reversing order of core dumping and
>>>> error log writing on windows is a good thing, this change looks
>>>> ok to me. Some improvements could make it clearer:
>>>>
>>>> - maybe rename "os::check_dump_limit()" (was my suggestion,
>>>> sorry) to "os::check_core_prerequisites()" - that leaves room
>>>> to flesh it out a bit more on Windows, where we do not have
>>>> UNIX limits.
>>>> - make the messages in VMError::report() future sense: "Will
>>>> write core file to location.....".
>>>> - Make the messages written in os::abort() on Windows clearer,
>>>> and shorter, and we also do not need the buffer to be static
>>>> here. Actually, if we want to be as on UNIX, just dumping "core
>>>> file writing succeeded" or "failed" - maybe with an error
>>>> number from GetLastError() - will be enough because the file
>>>> location is already printed in the error log. Like on UNIX.
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>> On Mon, Mar 30, 2015 at 9:51 PM, Daniel D. Daugherty
>>>> <daniel.daugherty at oracle.com
>>>> <mailto:daniel.daugherty at oracle.com>> wrote:
>>>>
>>>> Re: report_coredump_status() is really record_coredump_status()
>>>>
>>>> The report_coredump_status() function is designed to record two
>>>> things about core dumps for later reporting by the code that
>>>> generates the hs_err_pid file. That's why the original
>>>> report_coredump_status() didn't output anything to stderr.
>>>>
>>>> By changing the Windows calls to report_coredump_status() into
>>>> jio_fprintf() calls you have:
>>>>
>>>> - put output onto stderr that should go to the hs_err_pid file
>>>> - made the windows code paths work differently than the
>>>> non-windows
>>>> code paths
>>>>
>>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>> On 3/30/15 1:28 PM, Yumin Qi wrote:
>>>>
>>>> Thanks Dan
>>>>
>>>> On 3/30/2015 11:25 AM, Daniel D. Daugherty wrote:
>>>>
>>>> On 3/25/15 12:11 PM, Yumin Qi wrote:
>>>>
>>>> Hi, all
>>>>
>>>> I updated the webrev with a new change to
>>>> test case:
>>>> test/runtime/Unsafe/RangeCheck.java
>>>>
>>>> Add -XX:-CreateCoredumpOnCrash to test, no
>>>> dump needed on this case.
>>>>
>>>> new webrev:
>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>
>>>>
>>>> General nit - please update copyright lines to 2015
>>>> as needed
>>>>
>>>> src/share/vm/runtime/globals.hpp
>>>> No comments.
>>>>
>>>> src/share/vm/runtime/os.hpp
>>>> line 720: // On Windows this will create an
>>>> actual minidump, on Linux/Solaris it will simply
>>>> check core dump limits
>>>> line 721: static void check_dump_limit(char*
>>>> buffer, size_t bufferSize);
>>>> The comment on line 720 no longer matches
>>>> with the function
>>>> rename from check_or_create_dump() to
>>>> check_dump_limit().
>>>>
>>>> You have this comment in vmError.cpp:
>>>> 943 // check core dump limits on
>>>> Linux/Solaris, nothing on Windows
>>>> 944 os::check_dump_limit(buffer,
>>>> sizeof(buffer));
>>>>
>>>> so the two comments are out of sync.
>>>>
>>>> Will convert them to be consistent.
>>>>
>>>> src/share/vm/runtime/arguments.cpp
>>>> line 3252: } else if (match_option(option,
>>>> "-XX:+CreateMinidumpOnCrash", &tail)) {
>>>> line 3256: } else if (match_option(option,
>>>> "-XX:-CreateMinidumpOnCrash", &tail)) {
>>>> These two checks should use the
>>>> match_option() version
>>>> without a '&tail' parameter.
>>>>
>>>> Will use version w/o tail.
>>>>
>>>> src/share/vm/utilities/vmError.cpp
>>>> line 217: bool VMError::coredump_status =
>>>> true; // presume we can dump core file first
>>>> I don't think you should init this to true.
>>>> It confuses
>>>> things with
>>>> VMError::report_coredump_status(). It will also
>>>> confuse this code:
>>>>
>>>> 526 STEP(63, "(printing core file
>>>> information)")
>>>> 529 if (coredump_status) {
>>>> 530 st->print("Core dump written. Default
>>>> location: %s", coredump_message);
>>>>
>>>> because coredump_status won't accurately
>>>> reflect whether
>>>> the coredump_message field has been set.
>>>>
>>>> line 943: // check core dump limits on
>>>> Linux/Solaris, nothing on Windows
>>>> This updated comment doesn't match the
>>>> unmodified
>>>> comment in os.hpp:
>>>>
>>>> 720: // On Windows this will create an
>>>> actual minidump, on Linux/Solaris it will simply
>>>> check core dump limits
>>>>
>>>> I will consider your comments, then revise with a new
>>>> version. I remember here there is a case, if status not
>>>> set, it will output an inconsistent message. Will check
>>>> again.
>>>>
>>>> src/os/aix/vm/os_aix.cpp
>>>> No comments.
>>>>
>>>> src/os/bsd/vm/os_bsd.cpp
>>>> No comments.
>>>>
>>>> src/os/linux/vm/os_linux.cpp
>>>> No comments.
>>>>
>>>> src/os/posix/vm/os_posix.cpp
>>>> No comments.
>>>>
>>>> src/os/solaris/vm/os_solaris.cpp
>>>> No comments.
>>>>
>>>> src/os/windows/vm/os_windows.cpp
>>>> line 1008: static char buffer[O_BUFLEN];
>>>> Why switch from a buffer parameter to a
>>>> static buffer?
>>>> What happens with parallel calls to
>>>> abort()? Will the static
>>>> buffer get stomped by the competing threads?
>>>>
>>>> The original buffer is static too. Carrying an buffer
>>>> for abort seems not right. This buffer in fact only for
>>>> storing file name (based on pid) here.
>>>> abort() will dump the core file, should we prevent
>>>> multi-thread calling to this function? I did not check
>>>> this part, will check if it is MT-safe.
>>>>
>>>>
>>>> line 1015: // If running on a client version
>>>> of Windows and user has not explicitly enabled dumping
>>>> line 1016: if
>>>> (!os::win32::is_windows_server() &&
>>>> !CreateCoredumpOnCrash) {
>>>> The default for CreateCoredumpOnCrash is
>>>> now 'true' so the
>>>> comment and logic are no longer correct
>>>> here. The Windows
>>>> Client VM will generate minidumps by default.
>>>>
>>>> old line 1007:
>>>> VMError::report_coredump_status("Minidumps are not
>>>> enabled by default on client versions of Windows",
>>>> false);
>>>> new line 1017: jio_fprintf(stderr, "Minidumps
>>>> are not enabled by default on client versions of
>>>> Windows.\n");
>>>> There are a number of places where we
>>>> change from
>>>> VMError::report_coredump_status() to jio_fprintf()
>>>> and I'm not quite seeing why this was done.
>>>>
>>>> Update: VMError::report_coredump_status()
>>>> is misnamed. It doesn't
>>>> report anything in the sense that it
>>>> doesn't print anything. It
>>>> does record two pieces of information about
>>>> core dumps so maybe
>>>> it should be VMError::record_coredump_status().
>>>>
>>>> line 1100: jio_fprintf(stderr, "%s.\n", buffer);
>>>> At this point, buffer contains the path to the
>>>> mini-dump file so the above line simply prints
>>>> that on stderr. Why?
>>>>
>>>> Yes, I see that the old code did something
>>>> similar:
>>>>
>>>> 1090
>>>> VMError::report_coredump_status(buffer, true);
>>>>
>>>> but report_coredump_status() didn't
>>>> actually print
>>>> anything. It just squirreled away these in
>>>> vmError.cpp:
>>>>
>>>> 217 bool VMError::coredump_status;
>>>> 218 char VMError::coredump_message[O_BUFLEN];
>>>>
>>>> See comments above about how
>>>> report_coredump_status()
>>>> is misnamed.
>>>>
>>>> At this point, I'm convinced that all the
>>>> changes from
>>>> VMError::report_coredump_status() to jio_fprintf() are
>>>> a bad idea.
>>>>
>>>>
>>>> Changing to jio_fprintf is because
>>>> report_coredump_status did not output anything, it is
>>>> just a logging as you indicated. It is reasonable we
>>>> change it to record_coredump_status instead. How about
>>>> add printout to report_coredump_status? So I do not
>>>> need to use jio_printf here.
>>>>
>>>> Other consideration of using jio_fprintf after call
>>>> shutdown called, the static output stream still exists,
>>>> but not guaranteed to work as expected.
>>>>
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>> No comments.
>>>>
>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>> No comments.
>>>>
>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>> No comments.
>>>>
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>> No comments.
>>>>
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>> No comments.
>>>>
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>> No comments.
>>>>
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>> No comments.
>>>>
>>>> test/runtime/Unsafe/RangeCheck.java
>>>> No comments.
>>>>
>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>> No comments.
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>>
>>>> On 3/23/2015 11:48 AM, Yumin Qi wrote:
>>>>
>>>> Since Thomas is not a reviewer in open jdk,
>>>> I need two volunteers (at least one
>>>> "R"eviewer).
>>>>
>>>> Dan, since you already reviewed previous
>>>> version, could you have a look?
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 3/20/2015 3:20 PM, Yumin Qi wrote:
>>>>
>>>> Thomas,
>>>>
>>>> Thanks. Yes, it is webrev04. Also, I
>>>> have updated webrev04 to add another
>>>> test case change:
>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>> (Thanks Dan's update)
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 3/20/2015 11:55 AM, Thomas Stüfe wrote:
>>>>
>>>>
>>>> Hi Yumin,
>>>>
>>>> I think you meant to post webrev.04?
>>>>
>>>> I looked at 04, and it looks nice.
>>>> Thank you!
>>>>
>>>> (still only reviewer with r)
>>>>
>>>> Thomas
>>>>
>>>> On Mar 20, 2015 6:20 PM, "Yumin Qi"
>>>> <yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>>> wrote:
>>>>
>>>> Hi, All
>>>>
>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>
>>>> This version is based on
>>>> Thomas' suggestion.
>>>> Tests passed
>>>> vm.runtime.quick.testlist, JPRT and
>>>> manual tests.
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 3/18/2015 10:06 AM, Yumin Qi
>>>> wrote:
>>>>
>>>>
>>>> On 3/18/2015 4:26 AM,
>>>> Thomas Stüfe wrote:
>>>>
>>>> Hi Yumin,
>>>>
>>>> - just aesthetics, but
>>>> on all implementations of
>>>> os_abort(), at least on
>>>> the UNIX variants, maybe we could
>>>> consistently rename the
>>>> parameters to be "siginfo" and
>>>> "context" instead of
>>>> using Windows terms?
>>>>
>>>> Sure.
>>>>
>>>> - On
>>>> check_create_dump_limit(), you do
>>>> not need
>>>> exceptionRecord, contextRecord
>>>> parameters anymore. The
>>>> only parameters now
>>>> needed are buffer and buffer size,
>>>> which is not even an
>>>> output parameter, just a scratch
>>>> buffer for this
>>>> function to use for printf. For
>>>> output, it
>>>> calls
>>>> VMError::report_coredump_status(...) at
>>>> the end to
>>>> provide information
>>>> about the core file.
>>>>
>>>> - I would rename that
>>>> function from
>>>> check_create_dump_limit() to
>>>> check_dump_limit() - nothing
>>>> is created anymore.
>>>>
>>>> Yes, no need to carry those
>>>> two parameters. I tried to keep
>>>> less changes, but decided
>>>> to change now.
>>>>
>>>> - on Windows, in
>>>> os::abort(), there is no point
>>>> anymore in
>>>> calling
>>>> VMError::report_coredump_status(...) because
>>>> that
>>>> information is only
>>>> used during error log writing which
>>>> already happened. It
>>>> only makes sense to call this
>>>> function in
>>>> check_create_dump_limit(), which
>>>> happens
>>>> before error log writing.
>>>> Instead, maybe error
>>>> messages like "Call to
>>>> MiniDumpWriteDump() failed" should
>>>> just be written to
>>>> stderr? This would be
>>>> consistent with Unix, where the OS
>>>> writes a short message
>>>> on stderr if core file writing fails.
>>>>
>>>> Sure, will direct output to
>>>> stderr.
>>>>
>>>> - there is now a new test,
>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>> - could you please also
>>>> add "-XX:-CreateCoredumpOnCrash" ?
>>>> I just added the test
>>>> and did not want to add the option
>>>> before it was available.
>>>>
>>>> OK
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> Thanks for your work!
>>>>
>>>> Thomas
>>>>
>>>>
>>>> On Tue, Mar 17, 2015 at
>>>> 2:04 AM, Yumin Qi
>>>> <yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>>
>>>>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>>>>
>>>> wrote:
>>>>
>>>> Hi, Dan and all
>>>>
>>>> I have updated
>>>> webrev at:
>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>
>>>> 1) bug changed
>>>> synopsis
>>>> 8074354: Make
>>>> CreateMinidumpOnCrash a new name and
>>>> available on all platforms
>>>> 2) tests: JPRT,
>>>> vm.runtime.quick.testlist (on
>>>> Windows), jtreg on
>>>> Windows and Linux/Unix.
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>>
>>>> On 3/12/2015 12:59
>>>> PM, Daniel D. Daugherty wrote:
>>>>
>>>> I believe there
>>>> are both JavaTest/JTREG tests in
>>>> the hotspot repo
>>>> and there are
>>>> VM/NSK tests from back when we did
>>>> phone home...
>>>>
>>>> Check with Christian or Misha...
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 3/12/15 1:50
>>>> PM, Yumin Qi wrote:
>>>>
>>>> Thanks, Dan
>>>>
>>>> Will look at and run those tests.
>>>> Are they
>>>> under nsk?
>>>>
>>>> Yumin
>>>>
>>>> On 3/12/2015 12:29 PM, Daniel D.
>>>> Daugherty wrote:
>>>>
>>>> Yumin,
>>>>
>>>> There are some error handler
>>>> tests. You
>>>> should find
>>>> those and make
>>>> sure that you run them on
>>>> Windows since you're
>>>> changing the order
>>>> there...
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 3/12/15 12:42 PM, Yumin Qi
>>>> wrote:
>>>>
>>>> Hi, Thomas and all
>>>>
>>>> The second version of
>>>> the change:
>>>> webrev:
>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev02/
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>
>>>> In this version:
>>>> 1) switch order on
>>>> Windows to first
>>>> print log
>>>> file as other platform,
>>>> then dump core
>>>> file if
>>>> needed. When VMError
>>>> created after
>>>> crash, siginfo
>>>> and context which are
>>>> ExceptionRecord and
>>>> ContextRecord on Windows
>>>> are recorded,
>>>> mini (or
>>>> full, due to 'or' used so
>>>> type will be
>>>> mini
>>>> anyway) dump creates dump
>>>> file based
>>>> on those two
>>>> as before.
>>>> 2) to make os::abort to
>>>> get above
>>>> two pointers,
>>>> added two more fields to
>>>> the function
>>>> parameters:
>>>> 3) changes for
>>>> test/runtime/ErrorHandling/SecondaryError.java
>>>> ---
>>>> added
>>>> "-XX:-CreateCoredumpOnCrash"
>>>>
>>>> - static void abort(bool
>>>> dump_core = true);
>>>> + static void
>>>> abort(bool dump_core
>>>> = true,
>>>> void *siginfo = NULL, void
>>>> *context =
>>>> NULL);
>>>>
>>>> Tests: JPRT and manually.
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 3/11/2015 3:47 AM,
>>>> Thomas Stüfe wrote:
>>>>
>>>> Hi Yumin,
>>>>
>>>> There is also
>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>> - could you please add
>>>> "-XX:-CreateCoredumpOnCrash" ?
>>>> Thank you!
>>>>
>>>> Beyond that, as I wrote
>>>> in the bug
>>>> report
>>>> comments:
>>>>
>>>> "This is also a problem
>>>> on Windows -
>>>> MiniDumpWriteDump() may hang
>>>> infinitly. And on
>>>> Windows this is worse
>>>> than under
>>>> UNIX because
>>>> we create the Dump
>>>> before writing
>>>> the hs-err
>>>> file, so if the Dump
>>>> hangs, we get
>>>> no error
>>>> log. I would like to
>>>> revert the
>>>> order: create
>>>> the minidump after
>>>> writing the
>>>> error log, the
>>>> same way Unix does it.
>>>> We did this
>>>> in our JVM
>>>> (SAP) because for us,
>>>> error logs
>>>> are more
>>>> useful than minidumps. "
>>>>
>>>> So, I would like to see
>>>> os::abort
>>>> on Windows
>>>> call
>>>> MiniDumpWriteDump(), and thus
>>>> the mini
>>>> dump writing moved
>>>> after the error log
>>>> writing. This would
>>>> also make the
>>>> code way
>>>> cleaner because the
>>>> control flow
>>>> would be the
>>>> same on all platforms.
>>>>
>>>> I understand that this
>>>> may be out
>>>> of scope for
>>>> your change, but I
>>>> would like to
>>>> know what
>>>> others think about this.
>>>>
>>>> Kind regards, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Mar 11, 2015 at
>>>> 8:02 AM,
>>>> Yumin Qi
>>>> <yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>
>>>>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>
>>>>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>>>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>
>>>>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>>
>>>>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>
>>>> <mailto:yumin.qi at oracle.com
>>>> <mailto:yumin.qi at oracle.com>>>>> wrote:
>>>>
>>>> Please review:
>>>>
>>>> bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8074354
>>>> webrev:
>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev01/
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>
>>>> Summary: Tests
>>>> timed out when
>>>> VM crashes
>>>> and dumping core file
>>>> which in the test
>>>> case is not
>>>> needed. To
>>>> make core not created,
>>>> the fix changed
>>>> CreateMinidumpOnCrash to
>>>> CreateCoredumpOnCrash,
>>>> the former is only
>>>> used on
>>>> Windows and the
>>>> latter for all
>>>> platforms. When VM crashes on
>>>> non Windows,
>>>> core file generated as
>>>> default if OS sets
>>>> core dump
>>>> allowed.
>>>> Default value of
>>>> CreateCoredumpOnCrash set to
>>>> 'true' to
>>>> keep same behavior on non
>>>> Windows platforms,
>>>> but changed
>>>> for Windows
>>>> --- original is false,
>>>> not create minidump on
>>>> Windows. With
>>>> CreateCoredumpOnCrash turned
>>>> off, no core file
>>>> will be
>>>> generated.
>>>> CreateMinidumpOnCrash still
>>>> can be used on
>>>> commandline but
>>>> only as
>>>> alias for the new flag.
>>>>
>>>> Tests: JPRT, manual
>>>> tests setting
>>>> CreateMinidumpOnCrash on
>>>> commandline to verify flag
>>>> change as alias.
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
More information about the hotspot-runtime-dev
mailing list