RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file
Yumin Qi
yumin.qi at oracle.com
Fri Apr 10 02:51:11 UTC 2015
Please hold on, I missed the test cases:
*test/runtime/Unsafe/RangeCheck.java
*
*test/runtime/memory/ReadFromNoaccessArea.java
Also some file heads need to be updated with this year.
Thanks
Yumin
*
On 4/9/2015 7:07 PM, Yumin Qi wrote:
> Hi, Thomas and Dan
>
> The same URL is updated with new version.
>
> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>
> I remove the public access to VMError::out,
> VMError::coredump_message and the size of VMError::coredump_message.
> Added private static HANDLE in os_windows.cpp for dumpFile as
> suggested by Thomas.
>
> Tests passes JPRT and manual tests.
>
> Thanks
> Yumin
>
>
> On 4/9/2015 1:32 AM, Thomas Stüfe wrote:
>> Hi Yumin,
>>
>> I do not think exposing VMError::coredump_status buffer just for the
>> sake of using it as a scratch buffer in os::abort() is a good idea.
>> It is very confusing and has nothing to do with the purpose of
>> VMerror::coredump_status.
>>
>> Also, VMerror::coredump_status is static, so you do not even gain
>> anything over just using a global or function scope static buffer.
>> Either we are worried about thread safety in os::abort(), or we
>> aren't. If we aren't, just use a local static buffer, it is much
>> clearer and no need to expose VMError::coredump_status like this.
>>
>> Apart from that, see my remarks in yesterdays mail.
>>
>> Cheers, Thomas
>>
>>
>>
>>
>> On Thu, Apr 9, 2015 at 5:16 AM, Yumin Qi <yumin.qi at oracle.com
>> <mailto:yumin.qi at oracle.com>> wrote:
>>
>> Hi, Dan
>>
>> New webrev at
>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>
>> To make code clean, add os::abort_internal which is only used
>> for Windows.
>> Since passing VMError::coredump_message and its size as
>> parameters across multiple functions makes function have a bigger
>> signature, I change the way to access them -- add two functions in
>> VMError to access them: coredump_msg_buffer() and
>> coredump_msg_buffer_size(). They are static members, so can save
>> stack space for passing around.
>>
>> in os_windows.cpp, abort(bool, void*, void*):
>> if error encountered, exit with error messages.
>> If no error, VMError::coredump_message contains the dump
>> path/name. I just use them so no need to have extra space for name
>> allocation.
>>
>> Passed JPRT and manual tests.
>>
>> Thanks
>> Yumin
>>
>>
>> On 4/7/2015 9:23 AM, Daniel D. Daugherty wrote:
>>> On 4/1/15 10:57 AM, Yumin Qi wrote:
>>>> Hi, Dan and Thomas,
>>>>
>>>> I have a new webrev at
>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>
>>>
>>> src/share/vm/runtime/globals.hpp
>>> Sorry, I didn't notice this before:
>>>
>>> L939: product(bool, CreateCoredumpOnCrash,
>>> true, \
>>> L940: "Create minidump on VM fatal
>>> error") \
>>>
>>> The "minidump" on L940 should change also. Don't know if
>>> you
>>> want to say "core/mini dump" or something else.
>>>
>>>
>>> src/share/vm/runtime/os.hpp
>>> L719: // On Linux/Solaris it will simply check core dump
>>> limits while on Windows will do nothing.
>>> grammar: 'on Windows will do nothing'
>>> -> 'on Windows it will do nothing'
>>>
>>>
>>> src/share/vm/runtime/arguments.cpp
>>> No comments.
>>>
>>> src/share/vm/utilities/vmError.hpp
>>> No comments.
>>>
>>> src/share/vm/utilities/vmError.cpp
>>> No comments.
>>>
>>> src/os/aix/vm/os_aix.cpp
>>> No comments.
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>> No comments.
>>>
>>> src/os/linux/vm/os_linux.cpp
>>> No comments.
>>>
>>> src/os/posix/vm/os_posix.cpp
>>> No comments.
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>> No comments.
>>>
>>> src/os/windows/vm/os_windows.cpp
>>> L92: jio_snprintf(buffer, bufferSize,
>>> "%s\\hs_err_pid%u.mdmp", get_current_directory(NULL, 0),
>>> current_process_id());
>>> Other non-Windows calls to get_current_directory() do a
>>> NULL
>>> check before using the return value. The original Windows
>>> code that called get_current_directory() did not make this
>>> check, but I think it's better to be consistent and safe.
>>>
>>> L1007: static char buffer[O_BUFLEN];
>>> L1008: char filename[FILENAME_MAX];
>>> Two more potentially large stack variables here. Will we
>>> run into stack overflow on Win* more frequently when what
>>> we really wanted was the reason for the abort() call?
>>>
>>> The reason the original code used 'buffer' to construct the
>>> dump file name was to save space. The original code was
>>> also
>>> careful to not need the file name in any error messages
>>> after
>>> the point at which the dump file was successfully created.
>>>
>>> It would be better (more resilient) if you could pass in
>>> an already existing buffer to abort() like the original
>>> code passed into os::check_or_create_dump(). Yes, that
>>> would require visiting all existing calls to abort().
>>>
>>> L1015: assert(out->is_open(), "defaultStream should be
>>> open");
>>> Is it possible for os::abort() to be called early enough on
>>> Windows such that VMError::out is not actually setup? Yes,
>>> this would have to be absolutely catastrophic...
>>>
>>> See the comment in os_solaris.cpp L1520-1522.
>>>
>>> L1018: jio_snprintf(buffer, sizeof(buffer), "%s", "Either
>>> CreateCoredumpOnCrash is disabled from command"
>>> L1019 " line or coredump is not available");
>>> When you are using jio_snprintf() as a replacement for:
>>>
>>> strncpy(buffer, "my string", sizeof(buffer));
>>> buffer[sizeof(buffer) - 1] = '\0';
>>>
>>> you don't need the "%s" format string. So this code
>>> would be:
>>>
>>> jio_snprintf(buffer, sizeof(buffer),
>>> "Either CreateCoredumpOnCrash is disabled
>>> from "
>>> "command line or coredump is not available");
>>>
>>> Same comment for: L1026, L1039
>>>
>>> L1020: goto done;
>>> The use of 'goto' makes me cringe. You can refactor much
>>> of this logic into an abort_internal() function that
>>> abort() calls and keep all the original "return" logic
>>> to bail out.
>>>
>>> L1045: // Older versions of dbghelp.h doesn't contain all
>>> grammar: 'doesn't' -> 'do not'
>>>
>>> L1052: cwd = get_current_directory(NULL, 0);
>>> L1053: jio_snprintf(filename, sizeof(filename),
>>> "%s\\hs_err_pid%u.mdmp", cwd, current_process_id());
>>> Other non-Windows calls to get_current_directory() do a
>>> NULL
>>> check before using the return value. The original Windows
>>> code that called get_current_directory() did not make this
>>> check, but I think it's better to be consistent and safe.
>>>
>>> L1092: // well done.
>>> Like a overcooked steak? :-) Not sure what you're trying
>>> to say here...
>>>
>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>> No comments.
>>>
>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>> No comments.
>>>
>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>> No comments.
>>>
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>> No comments on these four.
>>>
>>>
>>> Dan
>>>
>>>
>>>>
>>>> What changes:
>>>>
>>>> 1) os_windows.cpp, os::abort(bool ....):
>>>> a) Since when abort() is called, the last function
>>>> called before exit process, VMError::log already closed, which
>>>> is the file logging. defaultStream still available to use, so
>>>> use it for writing error or success messages. VMError::out is
>>>> the defaultStream, for using it, added a function out_stream()
>>>> in VMError.
>>>> b) Delete the codes which decide to write core based on
>>>> client/server version with debug. The new code is if
>>>> CreateCoredumpOnCrash set or dump_core is on, the coredump will
>>>> be tried no matter which version it is. There is no special
>>>> reason not to dump core with client version I think. Any concern
>>>> for this change?
>>>> c) Use of 'goto' in this function. I tried not to use it,
>>>> but using it makes code clear. I remember somewhere suggest not
>>>> using 'goto' in cplusplus.
>>>>
>>>> 2) changed comments to make them consistent as indicated by
>>>> Dan.
>>>>
>>>> 3) Added/modified head version for files.
>>>>
>>>> Tests: JPRT, manual test on local Windows.
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>>
>>>> On 3/31/2015 1:25 AM, Thomas Stüfe wrote:
>>>>> Hi Daniel, Yumin,
>>>>>
>>>>> sorry, a lot of Yumins changes were based on suggestions from
>>>>> me, so here more background:
>>>>>
>>>>> Yumin reversed the order of error log writing and core dumping.
>>>>> The reason for that are explained in detail at the start of the
>>>>> mail thread, in short:
>>>>>
>>>>> - on Windows core file dumping may hang - its rare but it
>>>>> happens - preventing error logs. Depending on who you ask,
>>>>> error logs are more important than minidumps, so it makes sense
>>>>> to first write the erorr log.
>>>>> - This also brings windows core paths closer to UNIX code
>>>>> paths. See below.
>>>>>
>>>>> About writing to stderr in os::abort():
>>>>>
>>>>> After this change, calling VmError::report_coredump_status() in
>>>>> os::abort() will not do anything because the error log is
>>>>> already written at that point. I suggested to Yumin writing to
>>>>> stderr instead in os::abort() because that mimicks UNIX
>>>>> behaviour: On UNIX, you call abort(3), which writes a core and
>>>>> aborts. It writes a short message to stderr "Aborted (core
>>>>> dumped)" or just "Aborted".
>>>>>
>>>>> After Yumins change, on Windows os::abort also writes the core,
>>>>> then aborts. It made sense to me that this function should
>>>>> behave the same way as on UNIX: if core writing fails, write to
>>>>> stderr. There is no way otherwise to communicate a core dump
>>>>> writing failure. After writing the core, process stops.
>>>>>
>>>>> ---
>>>>>
>>>>> About the control flows:
>>>>>
>>>>> Before Yumins change:
>>>>>
>>>>> On Windows
>>>>> 1) os::check_or_create_dump(): we wrote the core dump and then
>>>>> information about the dump (file location, success or error)
>>>>> was handed to the misnamed VmError::report_coredump_status(),
>>>>> which just stored that information.
>>>>> 2) VmError::report(): Error log was written and in step 63
>>>>> information about the core (already written at that point) was
>>>>> added to error log.
>>>>> 3) os::abort() just did an exit(3)
>>>>>
>>>>> On Unix,
>>>>> 1) os::check_or_create_dump(): checks the limits to guess
>>>>> whether core dumping later will succeed. Again, that
>>>>> information is handed to VmError::report_coredump_status()
>>>>> 2) VmError::report(): Error log is written and in step 63,
>>>>> information about the core's probable future location is added
>>>>> to error log. Here, messages use past tense, which is
>>>>> misleading, because the core is not yet written.
>>>>> 3) os::abort() calls abort(3), which stops and writes a core.
>>>>>
>>>>> Yumin pushed core file writing on Windows down to os::abort().
>>>>> This brings the control flow much closer to Unix:
>>>>>
>>>>> (1) call os::check_dump_limit() to check the prerequisites for
>>>>> core file writing and gather a bit information to put in the
>>>>> error log: On Unix, limit checks, on windows, so far nothing
>>>>> much but precalculating the error file name.
>>>>> (2) VmError::report(): Error log is written and information
>>>>> about the core's probable location is added to error log.
>>>>> (3) os::abort() is called, which on all platforms:
>>>>> - dumps the core
>>>>> - if failing, writes a one-liner to stderr
>>>>> - stops the process.
>>>>>
>>>>> I think, if one agrees that reversing order of core dumping and
>>>>> error log writing on windows is a good thing, this change looks
>>>>> ok to me. Some improvements could make it clearer:
>>>>>
>>>>> - maybe rename "os::check_dump_limit()" (was my suggestion,
>>>>> sorry) to "os::check_core_prerequisites()" - that leaves room
>>>>> to flesh it out a bit more on Windows, where we do not have
>>>>> UNIX limits.
>>>>> - make the messages in VMError::report() future sense: "Will
>>>>> write core file to location.....".
>>>>> - Make the messages written in os::abort() on Windows clearer,
>>>>> and shorter, and we also do not need the buffer to be static
>>>>> here. Actually, if we want to be as on UNIX, just dumping "core
>>>>> file writing succeeded" or "failed" - maybe with an error
>>>>> number from GetLastError() - will be enough because the file
>>>>> location is already printed in the error log. Like on UNIX.
>>>>>
>>>>> Kind Regards, Thomas
>>>>>
>>>>>
>>>>> On Mon, Mar 30, 2015 at 9:51 PM, Daniel D. Daugherty
>>>>> <daniel.daugherty at oracle.com
>>>>> <mailto:daniel.daugherty at oracle.com>> wrote:
>>>>>
>>>>> Re: report_coredump_status() is really
>>>>> record_coredump_status()
>>>>>
>>>>> The report_coredump_status() function is designed to
>>>>> record two
>>>>> things about core dumps for later reporting by the code that
>>>>> generates the hs_err_pid file. That's why the original
>>>>> report_coredump_status() didn't output anything to stderr.
>>>>>
>>>>> By changing the Windows calls to report_coredump_status()
>>>>> into
>>>>> jio_fprintf() calls you have:
>>>>>
>>>>> - put output onto stderr that should go to the hs_err_pid
>>>>> file
>>>>> - made the windows code paths work differently than the
>>>>> non-windows
>>>>> code paths
>>>>>
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>
>>>>> On 3/30/15 1:28 PM, Yumin Qi wrote:
>>>>>
>>>>> Thanks Dan
>>>>>
>>>>> On 3/30/2015 11:25 AM, Daniel D. Daugherty wrote:
>>>>>
>>>>> On 3/25/15 12:11 PM, Yumin Qi wrote:
>>>>>
>>>>> Hi, all
>>>>>
>>>>> I updated the webrev with a new change to
>>>>> test case:
>>>>> test/runtime/Unsafe/RangeCheck.java
>>>>>
>>>>> Add -XX:-CreateCoredumpOnCrash to test, no
>>>>> dump needed on this case.
>>>>>
>>>>> new webrev:
>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>>
>>>>>
>>>>> General nit - please update copyright lines to 2015
>>>>> as needed
>>>>>
>>>>> src/share/vm/runtime/globals.hpp
>>>>> No comments.
>>>>>
>>>>> src/share/vm/runtime/os.hpp
>>>>> line 720: // On Windows this will create an
>>>>> actual minidump, on Linux/Solaris it will simply
>>>>> check core dump limits
>>>>> line 721: static void check_dump_limit(char*
>>>>> buffer, size_t bufferSize);
>>>>> The comment on line 720 no longer matches
>>>>> with the function
>>>>> rename from check_or_create_dump() to
>>>>> check_dump_limit().
>>>>>
>>>>> You have this comment in vmError.cpp:
>>>>> 943 // check core dump limits on
>>>>> Linux/Solaris, nothing on Windows
>>>>> 944 os::check_dump_limit(buffer,
>>>>> sizeof(buffer));
>>>>>
>>>>> so the two comments are out of sync.
>>>>>
>>>>> Will convert them to be consistent.
>>>>>
>>>>> src/share/vm/runtime/arguments.cpp
>>>>> line 3252: } else if (match_option(option,
>>>>> "-XX:+CreateMinidumpOnCrash", &tail)) {
>>>>> line 3256: } else if (match_option(option,
>>>>> "-XX:-CreateMinidumpOnCrash", &tail)) {
>>>>> These two checks should use the
>>>>> match_option() version
>>>>> without a '&tail' parameter.
>>>>>
>>>>> Will use version w/o tail.
>>>>>
>>>>> src/share/vm/utilities/vmError.cpp
>>>>> line 217: bool VMError::coredump_status =
>>>>> true; // presume we can dump core file first
>>>>> I don't think you should init this to true.
>>>>> It confuses
>>>>> things with
>>>>> VMError::report_coredump_status(). It will also
>>>>> confuse this code:
>>>>>
>>>>> 526 STEP(63, "(printing core file
>>>>> information)")
>>>>> 529 if (coredump_status) {
>>>>> 530 st->print("Core dump written. Default
>>>>> location: %s", coredump_message);
>>>>>
>>>>> because coredump_status won't accurately
>>>>> reflect whether
>>>>> the coredump_message field has been set.
>>>>>
>>>>> line 943: // check core dump limits on
>>>>> Linux/Solaris, nothing on Windows
>>>>> This updated comment doesn't match the
>>>>> unmodified
>>>>> comment in os.hpp:
>>>>>
>>>>> 720: // On Windows this will create an
>>>>> actual minidump, on Linux/Solaris it will simply
>>>>> check core dump limits
>>>>>
>>>>> I will consider your comments, then revise with a new
>>>>> version. I remember here there is a case, if status not
>>>>> set, it will output an inconsistent message. Will check
>>>>> again.
>>>>>
>>>>> src/os/aix/vm/os_aix.cpp
>>>>> No comments.
>>>>>
>>>>> src/os/bsd/vm/os_bsd.cpp
>>>>> No comments.
>>>>>
>>>>> src/os/linux/vm/os_linux.cpp
>>>>> No comments.
>>>>>
>>>>> src/os/posix/vm/os_posix.cpp
>>>>> No comments.
>>>>>
>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>> No comments.
>>>>>
>>>>> src/os/windows/vm/os_windows.cpp
>>>>> line 1008: static char buffer[O_BUFLEN];
>>>>> Why switch from a buffer parameter to a
>>>>> static buffer?
>>>>> What happens with parallel calls to
>>>>> abort()? Will the static
>>>>> buffer get stomped by the competing threads?
>>>>>
>>>>> The original buffer is static too. Carrying an buffer
>>>>> for abort seems not right. This buffer in fact only for
>>>>> storing file name (based on pid) here.
>>>>> abort() will dump the core file, should we prevent
>>>>> multi-thread calling to this function? I did not check
>>>>> this part, will check if it is MT-safe.
>>>>>
>>>>>
>>>>> line 1015: // If running on a client version
>>>>> of Windows and user has not explicitly enabled
>>>>> dumping
>>>>> line 1016: if
>>>>> (!os::win32::is_windows_server() &&
>>>>> !CreateCoredumpOnCrash) {
>>>>> The default for CreateCoredumpOnCrash is
>>>>> now 'true' so the
>>>>> comment and logic are no longer correct
>>>>> here. The Windows
>>>>> Client VM will generate minidumps by default.
>>>>>
>>>>> old line 1007:
>>>>> VMError::report_coredump_status("Minidumps are not
>>>>> enabled by default on client versions of Windows",
>>>>> false);
>>>>> new line 1017: jio_fprintf(stderr, "Minidumps
>>>>> are not enabled by default on client versions of
>>>>> Windows.\n");
>>>>> There are a number of places where we
>>>>> change from
>>>>> VMError::report_coredump_status() to jio_fprintf()
>>>>> and I'm not quite seeing why this was done.
>>>>>
>>>>> Update: VMError::report_coredump_status()
>>>>> is misnamed. It doesn't
>>>>> report anything in the sense that it
>>>>> doesn't print anything. It
>>>>> does record two pieces of information about
>>>>> core dumps so maybe
>>>>> it should be
>>>>> VMError::record_coredump_status().
>>>>>
>>>>> line 1100: jio_fprintf(stderr, "%s.\n", buffer);
>>>>> At this point, buffer contains the path to
>>>>> the
>>>>> mini-dump file so the above line simply
>>>>> prints
>>>>> that on stderr. Why?
>>>>>
>>>>> Yes, I see that the old code did something
>>>>> similar:
>>>>>
>>>>> 1090
>>>>> VMError::report_coredump_status(buffer, true);
>>>>>
>>>>> but report_coredump_status() didn't
>>>>> actually print
>>>>> anything. It just squirreled away these in
>>>>> vmError.cpp:
>>>>>
>>>>> 217 bool VMError::coredump_status;
>>>>> 218 char VMError::coredump_message[O_BUFLEN];
>>>>>
>>>>> See comments above about how
>>>>> report_coredump_status()
>>>>> is misnamed.
>>>>>
>>>>> At this point, I'm convinced that all the
>>>>> changes from
>>>>> VMError::report_coredump_status() to jio_fprintf()
>>>>> are
>>>>> a bad idea.
>>>>>
>>>>>
>>>>> Changing to jio_fprintf is because
>>>>> report_coredump_status did not output anything, it is
>>>>> just a logging as you indicated. It is reasonable we
>>>>> change it to record_coredump_status instead. How about
>>>>> add printout to report_coredump_status? So I do not
>>>>> need to use jio_printf here.
>>>>>
>>>>> Other consideration of using jio_fprintf after call
>>>>> shutdown called, the static output stream still exists,
>>>>> but not guaranteed to work as expected.
>>>>>
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/Unsafe/RangeCheck.java
>>>>> No comments.
>>>>>
>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>> No comments.
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>>
>>>>> On 3/23/2015 11:48 AM, Yumin Qi wrote:
>>>>>
>>>>> Since Thomas is not a reviewer in open jdk,
>>>>> I need two volunteers (at least one
>>>>> "R"eviewer).
>>>>>
>>>>> Dan, since you already reviewed previous
>>>>> version, could you have a look?
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 3/20/2015 3:20 PM, Yumin Qi wrote:
>>>>>
>>>>> Thomas,
>>>>>
>>>>> Thanks. Yes, it is webrev04. Also, I
>>>>> have updated webrev04 to add another
>>>>> test case change:
>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>> (Thanks Dan's update)
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 3/20/2015 11:55 AM, Thomas Stüfe
>>>>> wrote:
>>>>>
>>>>>
>>>>> Hi Yumin,
>>>>>
>>>>> I think you meant to post webrev.04?
>>>>>
>>>>> I looked at 04, and it looks nice.
>>>>> Thank you!
>>>>>
>>>>> (still only reviewer with r)
>>>>>
>>>>> Thomas
>>>>>
>>>>> On Mar 20, 2015 6:20 PM, "Yumin Qi"
>>>>> <yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>>> wrote:
>>>>>
>>>>> Hi, All
>>>>>
>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>>
>>>>> This version is based on
>>>>> Thomas' suggestion.
>>>>> Tests passed
>>>>> vm.runtime.quick.testlist, JPRT and
>>>>> manual tests.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 3/18/2015 10:06 AM, Yumin Qi
>>>>> wrote:
>>>>>
>>>>>
>>>>> On 3/18/2015 4:26 AM,
>>>>> Thomas Stüfe wrote:
>>>>>
>>>>> Hi Yumin,
>>>>>
>>>>> - just aesthetics, but
>>>>> on all implementations of
>>>>> os_abort(), at least on
>>>>> the UNIX variants, maybe we could
>>>>> consistently rename the
>>>>> parameters to be "siginfo" and
>>>>> "context" instead of
>>>>> using Windows terms?
>>>>>
>>>>> Sure.
>>>>>
>>>>> - On
>>>>> check_create_dump_limit(), you do
>>>>> not need
>>>>> exceptionRecord, contextRecord
>>>>> parameters anymore. The
>>>>> only parameters now
>>>>> needed are buffer and buffer size,
>>>>> which is not even an
>>>>> output parameter, just a scratch
>>>>> buffer for this
>>>>> function to use for printf. For
>>>>> output, it
>>>>> calls
>>>>> VMError::report_coredump_status(...) at
>>>>> the end to
>>>>> provide information
>>>>> about the core file.
>>>>>
>>>>> - I would rename that
>>>>> function from
>>>>> check_create_dump_limit() to
>>>>> check_dump_limit() - nothing
>>>>> is created anymore.
>>>>>
>>>>> Yes, no need to carry those
>>>>> two parameters. I tried to keep
>>>>> less changes, but decided
>>>>> to change now.
>>>>>
>>>>> - on Windows, in
>>>>> os::abort(), there is no point
>>>>> anymore in
>>>>> calling
>>>>> VMError::report_coredump_status(...) because
>>>>> that
>>>>> information is only
>>>>> used during error log writing which
>>>>> already happened. It
>>>>> only makes sense to call this
>>>>> function in
>>>>> check_create_dump_limit(), which
>>>>> happens
>>>>> before error log writing.
>>>>> Instead, maybe error
>>>>> messages like "Call to
>>>>> MiniDumpWriteDump() failed" should
>>>>> just be written to
>>>>> stderr? This would be
>>>>> consistent with Unix, where the OS
>>>>> writes a short message
>>>>> on stderr if core file writing fails.
>>>>>
>>>>> Sure, will direct output to
>>>>> stderr.
>>>>>
>>>>> - there is now a new
>>>>> test,
>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>> - could you please also
>>>>> add "-XX:-CreateCoredumpOnCrash" ?
>>>>> I just added the test
>>>>> and did not want to add the option
>>>>> before it was available.
>>>>>
>>>>> OK
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> Thanks for your work!
>>>>>
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Tue, Mar 17, 2015 at
>>>>> 2:04 AM, Yumin Qi
>>>>> <yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>>>>
>>>>> wrote:
>>>>>
>>>>> Hi, Dan and all
>>>>>
>>>>> I have updated
>>>>> webrev at:
>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev03/
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev03/>
>>>>>
>>>>> 1) bug changed
>>>>> synopsis
>>>>> 8074354: Make
>>>>> CreateMinidumpOnCrash a new name and
>>>>> available on all platforms
>>>>> 2) tests: JPRT,
>>>>> vm.runtime.quick.testlist (on
>>>>> Windows), jtreg on
>>>>> Windows and
>>>>> Linux/Unix.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>>
>>>>> On 3/12/2015 12:59
>>>>> PM, Daniel D. Daugherty wrote:
>>>>>
>>>>> I believe there
>>>>> are both JavaTest/JTREG tests in
>>>>> the hotspot repo
>>>>> and there are
>>>>> VM/NSK tests from back when we did
>>>>> phone home...
>>>>>
>>>>> Check with Christian or Misha...
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 3/12/15 1:50
>>>>> PM, Yumin Qi wrote:
>>>>>
>>>>> Thanks, Dan
>>>>>
>>>>> Will look at and run those tests.
>>>>> Are they
>>>>> under nsk?
>>>>>
>>>>> Yumin
>>>>>
>>>>> On 3/12/2015 12:29 PM, Daniel D.
>>>>> Daugherty wrote:
>>>>>
>>>>> Yumin,
>>>>>
>>>>> There are some error handler
>>>>> tests. You
>>>>> should find
>>>>> those and make
>>>>> sure that you run them on
>>>>> Windows since you're
>>>>> changing the order
>>>>> there...
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 3/12/15 12:42 PM, Yumin Qi
>>>>> wrote:
>>>>>
>>>>> Hi, Thomas and all
>>>>>
>>>>> The second version of
>>>>> the change:
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev02/
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev02/>
>>>>>
>>>>> In this version:
>>>>> 1) switch order on
>>>>> Windows to first
>>>>> print log
>>>>> file as other platform,
>>>>> then dump core
>>>>> file if
>>>>> needed. When VMError
>>>>> created after
>>>>> crash, siginfo
>>>>> and context which are
>>>>> ExceptionRecord and
>>>>> ContextRecord on Windows
>>>>> are recorded,
>>>>> mini (or
>>>>> full, due to 'or' used so
>>>>> type will be
>>>>> mini
>>>>> anyway) dump creates dump
>>>>> file based
>>>>> on those two
>>>>> as before.
>>>>> 2) to make os::abort to
>>>>> get above
>>>>> two pointers,
>>>>> added two more fields to
>>>>> the function
>>>>> parameters:
>>>>> 3) changes for
>>>>> test/runtime/ErrorHandling/SecondaryError.java
>>>>> ---
>>>>> added
>>>>> "-XX:-CreateCoredumpOnCrash"
>>>>>
>>>>> - static void abort(bool
>>>>> dump_core = true);
>>>>> + static void
>>>>> abort(bool dump_core
>>>>> = true,
>>>>> void *siginfo = NULL, void
>>>>> *context =
>>>>> NULL);
>>>>>
>>>>> Tests: JPRT and manually.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 3/11/2015 3:47 AM,
>>>>> Thomas Stüfe wrote:
>>>>>
>>>>> Hi Yumin,
>>>>>
>>>>> There is also
>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>> - could you please add
>>>>> "-XX:-CreateCoredumpOnCrash" ?
>>>>> Thank you!
>>>>>
>>>>> Beyond that, as I wrote
>>>>> in the bug
>>>>> report
>>>>> comments:
>>>>>
>>>>> "This is also a problem
>>>>> on Windows -
>>>>> MiniDumpWriteDump() may hang
>>>>> infinitly. And on
>>>>> Windows this is worse
>>>>> than under
>>>>> UNIX because
>>>>> we create the Dump
>>>>> before writing
>>>>> the hs-err
>>>>> file, so if the Dump
>>>>> hangs, we get
>>>>> no error
>>>>> log. I would like to
>>>>> revert the
>>>>> order: create
>>>>> the minidump after
>>>>> writing the
>>>>> error log, the
>>>>> same way Unix does it.
>>>>> We did this
>>>>> in our JVM
>>>>> (SAP) because for us,
>>>>> error logs
>>>>> are more
>>>>> useful than minidumps. "
>>>>>
>>>>> So, I would like to see
>>>>> os::abort
>>>>> on Windows
>>>>> call
>>>>> MiniDumpWriteDump(), and thus
>>>>> the mini
>>>>> dump writing moved
>>>>> after the error log
>>>>> writing. This would
>>>>> also make the
>>>>> code way
>>>>> cleaner because the
>>>>> control flow
>>>>> would be the
>>>>> same on all platforms.
>>>>>
>>>>> I understand that this
>>>>> may be out
>>>>> of scope for
>>>>> your change, but I
>>>>> would like to
>>>>> know what
>>>>> others think about this.
>>>>>
>>>>> Kind regards, Thomas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 11, 2015 at
>>>>> 8:02 AM,
>>>>> Yumin Qi
>>>>> <yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>>>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>
>>>>> <mailto:yumin.qi at oracle.com
>>>>> <mailto:yumin.qi at oracle.com>>>>> wrote:
>>>>>
>>>>> Please review:
>>>>>
>>>>> bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8074354
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev01/
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev01/>
>>>>>
>>>>> Summary: Tests
>>>>> timed out when
>>>>> VM crashes
>>>>> and dumping core file
>>>>> which in the test
>>>>> case is not
>>>>> needed. To
>>>>> make core not created,
>>>>> the fix changed
>>>>> CreateMinidumpOnCrash to
>>>>> CreateCoredumpOnCrash,
>>>>> the former is only
>>>>> used on
>>>>> Windows and the
>>>>> latter for all
>>>>> platforms. When VM crashes on
>>>>> non Windows,
>>>>> core file generated as
>>>>> default if OS sets
>>>>> core dump
>>>>> allowed.
>>>>> Default value of
>>>>> CreateCoredumpOnCrash set to
>>>>> 'true' to
>>>>> keep same behavior on non
>>>>> Windows platforms,
>>>>> but changed
>>>>> for Windows
>>>>> --- original is false,
>>>>> not create
>>>>> minidump on
>>>>> Windows. With
>>>>> CreateCoredumpOnCrash turned
>>>>> off, no core file
>>>>> will be
>>>>> generated.
>>>>> CreateMinidumpOnCrash still
>>>>> can be used on
>>>>> commandline but
>>>>> only as
>>>>> alias for the new flag.
>>>>>
>>>>> Tests: JPRT, manual
>>>>> tests setting
>>>>> CreateMinidumpOnCrash on
>>>>> commandline to verify flag
>>>>> change as alias.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
More information about the hotspot-runtime-dev
mailing list