RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file
Yumin Qi
yumin.qi at oracle.com
Tue Apr 14 14:14:30 UTC 2015
Dan,
I missed your comments, so will change again. Sorry for that.
Will update soon.
Thanks
Yumin
On 4/13/2015 10:13 PM, Yumin Qi wrote:
> Dan, Thanks.
>
> Changed to add CloseHandle(dumpFile) before all exit paths.
> Please review at same link:
>
> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>
> Yumin
>
>
> On 4/13/2015 7:04 PM, Daniel D. Daugherty wrote:
>> Yumin,
>>
>> I would prefer that the handle is properly closed on all the
>> exit paths. Not closing the handle might even trigger a future
>> Microsoft sanity check; kind of like when Microsoft added checks
>> for duplicate close() calls...
>>
>> Dan
>>
>>
>> On 4/13/15 4:05 PM, Yumin Qi wrote:
>>> Dan,
>>>
>>> On 4/13/2015 2:09 PM, Daniel D. Daugherty wrote:
>>>> On 4/13/15 10:58 AM, Yumin Qi wrote:
>>>>> Thomas,
>>>>>
>>>>> Answers embeded below. The new web is just the same link updated
>>>>> (08):
>>>>>
>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>>>
>>>> src/share/vm/runtime/globals.hpp
>>>> No comments.
>>>>
>>>> src/share/vm/runtime/os.hpp
>>>> L720: // ON Posix compatible OS it will
>>>> typo: 'ON' -> 'On'
>>>>
>>>> L724: ... or a short error message, depends
>>>> grammar: 'depends' -> 'depending'
>>>>
>>>> L725: // on checking result.
>>>> grammar: 'on checking' -> 'on the checking'
>>>>
>>>> src/share/vm/runtime/arguments.cpp
>>>> No comments.
>>>>
>>>> src/share/vm/utilities/vmError.hpp
>>>> No comments.
>>>>
>>>> src/share/vm/utilities/vmError.cpp
>>>> No comments.
>>>>
>>>> src/os/aix/vm/os_aix.cpp
>>>> src/os/bsd/vm/os_bsd.cpp
>>>> src/os/linux/vm/os_linux.cpp
>>>> src/os/posix/vm/os_posix.cpp
>>>> src/os/solaris/vm/os_solaris.cpp
>>>> No comments on the above five files.
>>>>
>>>> src/os/windows/vm/os_windows.cpp
>>>> L1010: jio_snprintf(buffer, buffsz, "Failed to create coredump
>>>> file (0x%x).", GetLastError());
>>>> Because this is a Win* specific file you can use "minidump"
>>>> here instead of "coredump".
>>>>
>>>> I made a similar comment in the previous round.
>>>>
>>> Yes, I will change 'coredump' into 'minidump'
>>>> L1028: if (!dump_core || !dumpFile) {
>>>> The "!dumpFile" part is an implied boolean which we don't
>>>> like to use in HotSpot. Perhaps: dumpFile == NULL
>>>>
>>> Will change to Hotspot style.
>>>> L1078: win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>>> The "CloseHandle(dumpFile)" call is missing.
>>>>
>>> There is no need to call CloseHandle here explicitly, since we will
>>> exit next and all resources will be reclaimed by OS. I remove it at
>>> last since think it is not needed here. If you check previous calls to
>>> win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>> I did not place CloseHandle(dumpFile) either.
>>> If needed, I think we should put the call before all exit entry.
>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>> No comments on the above two files.
>>>>
>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>> No comments.
>>>>
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>> test/runtime/Unsafe/RangeCheck.java
>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>> No comments on the above six files.
>>> Thanks
>>> Yumin
>>>
>>>>
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 4/13/2015 12:30 AM, Thomas Stüfe wrote:
>>>>>> Hi Yumin,
>>>>>>
>>>>>> some small things:
>>>>>>
>>>>>> os_windows.cpp:1012
>>>>>>
>>>>>> + if (dumpFile == INVALID_HANDLE_VALUE) {
>>>>>> + jio_snprintf(buffer, buffsz, "Failed to create coredump
>>>>>> file");
>>>>>> + status = false;
>>>>>> + }
>>>>>>
>>>>>> Please print out the error number too:
>>>>>> + jio_snprintf(buffer, buffsz, "Failed to create coredump
>>>>>> file (0x%x).", ::GetLastError());
>>>>>>
>>>>>>
>>>>> Changed as above. No :: global scope operator needed here.
>>>>>>
>>>>>> os_windows.cpp:1036
>>>>>>
>>>>>> + assert(dumpFile != NULL, "dumpFile should not be NULL");
>>>>>>
>>>>>> Please remove this assert. I would not do any asserts at this
>>>>>> stage, we are dying anyway and a recursive assert would just
>>>>>> confuse things. Moreover, CreateFile may have failed for "normal"
>>>>>> reasons.
>>>>>>
>>>>> removed.
>>>>>> Instead, if dumpFile is NULL, just skip the whole minidump stuff
>>>>>> in os::abort; just behave the same way as for os::abort(false):
>>>>>>
>>>>>> + if (!dump_core || !dumpFile) {
>>>>>> + win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>>>>> }
>>>>>>
>>>>>>
>>>>> changed as expected.
>>>>>>
>>>>>> os_windows.cpp: 1076
>>>>>>
>>>>>> I would just skip the whole "FormatMessage" stuff. It is just
>>>>>> sugar, but it will allocate buffers, which we may not want at
>>>>>> this stage. I think the plain error code from GetLastError() is
>>>>>> sufficient here. So I would just:
>>>>>>
>>>>>> 1076 // Older versions of dbghelp.dll (the one shipped with
>>>>>> Win2003 for example) may not support all
>>>>>> 1077 // the dump types we really want. If first call fails,
>>>>>> lets fall back to just use MiniDumpWithFullMemory then.
>>>>>> 1078 if (_MiniDumpWriteDump(hProcess, processId, dumpFile,
>>>>>> dumpType, pmei, NULL, NULL) == false &&
>>>>>> 1079 _MiniDumpWriteDump(hProcess, processId, dumpFile,
>>>>>> (MINIDUMP_TYPE)MiniDumpWithFullMemory, pmei, NULL, NULL) == false) {
>>>>>> jio_fprintf(stderr, "Call to MiniDumpWriteDump() failed
>>>>>> (Error 0x%x)\n", ::GetLastError());
>>>>>> 1089 } else {
>>>>>> 1090 ....
>>>>>>
>>>>>>
>>>>> Agree, this is the last step before exit, with output of last
>>>>> error code, user can get what happened --- though a little effort
>>>>> to research.
>>>>>
>>>>>> Rest is fine.
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 13, 2015 at 7:59 AM, Yumin Qi <yumin.qi at oracle.com
>>>>>> <mailto:yumin.qi at oracle.com>> wrote:
>>>>>>
>>>>>> Thomas and Dan,
>>>>>>
>>>>>> Please review the changes at
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev08/>
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> On 4/10/2015 10:14 PM, Yumin Qi wrote:
>>>>>>
>>>>>> Thomas,
>>>>>>
>>>>>> Thanks for review again.
>>>>>>
>>>>>> On 4/10/2015 4:20 AM, Thomas Stüfe wrote:
>>>>>>
>>>>>> Hi Yumin,
>>>>>>
>>>>>> Hi Yumin,
>>>>>>
>>>>>> this looks mostly fine for me. I only was able to test
>>>>>> it on Linux, but will test it on other OSes next week.
>>>>>>
>>>>>> Here two remaining small points:
>>>>>>
>>>>>> - in os::check_dump_limit, maybe a comment would be
>>>>>> helpful, because the contract for this function is a
>>>>>> bit difficult to understand. Proposal:
>>>>>>
>>>>>> "Check or prepare a core dump to be taken at a later
>>>>>> point in the same thread in os::abort(). Use the
>>>>>> caller provided buffer as a scratch buffer. Output
>>>>>> status message via VMError::report_cordedump_status()
>>>>>> (on success, the core dump file location, on error, a
>>>>>> short error message) - this status message which will
>>>>>> be written into the error log."
>>>>>>
>>>>>> I will take the comments, new comments:
>>>>>>
>>>>>> // ON Posix compatible OS it will simply check core dump
>>>>>> limits while on Windows
>>>>>> // it will check if dump file can be created. Check or
>>>>>> prepare a core dump to be
>>>>>> // taken at a later point in the same thread in
>>>>>> os::abort(). Use the caller
>>>>>> // provided buffer as a scratch buffer. The status
>>>>>> message which will be written
>>>>>> // into the error log either is file location or a short
>>>>>> error message, depends
>>>>>> // on checking result.
>>>>>> static void check_dump_limit(char* buffer, size_t
>>>>>> bufferSize);
>>>>>>
>>>>>> - It would make sense to improve the output in
>>>>>> VMError.report() in STEP(63) a bit:
>>>>>>
>>>>>> Problem 1: we use past-tense, which is misleading
>>>>>> ("Core dump written", "Failed to write core"). We did
>>>>>> not generate the core dump yet, we just plan to do it,
>>>>>> and it still may fail for various reasons (e.g. even
>>>>>> if the limits are fine, the current directory may be
>>>>>> write protected. There is no way to catch all errors
>>>>>> beforehand). If the user reads "Core dump written", he
>>>>>> assumes the core dump was written successfully. If
>>>>>> abort(3) later fails to write the core dump, user will
>>>>>> be confused.
>>>>>>
>>>>>> Will change the message to
>>>>>> if (coredump_status) {
>>>>>> st->print("Core dump will be written. %s",
>>>>>> coredump_message);
>>>>>> } else {
>>>>>> st->print("No core dump will be written. %s",
>>>>>> coredump_message);
>>>>>> }
>>>>>>
>>>>>> as your suggested.
>>>>>> Whether the coredump can be successfully created depends
>>>>>> on following core file generation procedure. User can
>>>>>> trace error messages if failed to create.
>>>>>>
>>>>>> Problem 2: this is more obscure: on Linux core dumps
>>>>>> may not be written to the file system at all but be
>>>>>> processed by some other program. On my Ubuntu 14.4
>>>>>> machine I see this output:
>>>>>>
>>>>>> #
>>>>>> # Core dump written. Default location: Core dumps may
>>>>>> be processed with "/usr/share/apport/apport %p %s %c
>>>>>> %P" (or dumping to
>>>>>> /shared/projects/openjdk/jdk9-hs-rt/output/core.26261)
>>>>>> #
>>>>>>
>>>>>> This output is a bit jarring.
>>>>>>
>>>>>> I have not tested on Ubuntu so no comments.
>>>>>>
>>>>>> I will post a new webrev based on your and Dan's comments.
>>>>>> Thanks for your patience.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> My suggestion for better texts would be:
>>>>>>
>>>>>> "Core dump will be written. %s"
>>>>>> "No core dump will be written. %s"
>>>>>>
>>>>>> which hopefully fits all cases better. But I don't
>>>>>> know, other suggestions are welcome :-)
>>>>>>
>>>>>> Otherwise, this looks fine.
>>>>>>
>>>>>> Thanks for your work!
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 10, 2015 at 5:18 AM, Yumin Qi
>>>>>> <yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>
>>>>>> <mailto:yumin.qi at oracle.com
>>>>>> <mailto:yumin.qi at oracle.com>>> wrote:
>>>>>>
>>>>>> Hi, Thomas and Dan
>>>>>>
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>
>>>>>> On 4/9/2015 7:51 PM, Yumin Qi wrote:
>>>>>>
>>>>>> Please hold on, I missed the test cases:
>>>>>>
>>>>>> *test/runtime/Unsafe/RangeCheck.java
>>>>>> *
>>>>>> *test/runtime/memory/ReadFromNoaccessArea.java
>>>>>>
>>>>>> Added.
>>>>>>
>>>>>> Also some file heads need to be updated with
>>>>>> this year.
>>>>>>
>>>>>> Changed
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>> *
>>>>>> On 4/9/2015 7:07 PM, Yumin Qi wrote:
>>>>>>
>>>>>> Hi, Thomas and Dan
>>>>>>
>>>>>> The same URL is updated with new version.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>
>>>>>> I remove the public access to
>>>>>> VMError::out,
>>>>>> VMError::coredump_message and the size of
>>>>>> VMError::coredump_message.
>>>>>> Added private static HANDLE in
>>>>>> os_windows.cpp for
>>>>>> dumpFile as suggested by Thomas.
>>>>>>
>>>>>> Tests passes JPRT and manual tests.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>> On 4/9/2015 1:32 AM, Thomas Stüfe wrote:
>>>>>>
>>>>>> Hi Yumin,
>>>>>>
>>>>>> I do not think exposing
>>>>>> VMError::coredump_status
>>>>>> buffer just for the sake of using it
>>>>>> as a scratch
>>>>>> buffer in os::abort() is a good idea.
>>>>>> It is very
>>>>>> confusing and has nothing to do with
>>>>>> the purpose of
>>>>>> VMerror::coredump_status.
>>>>>>
>>>>>> Also, VMerror::coredump_status is
>>>>>> static, so you do
>>>>>> not even gain anything over just using
>>>>>> a global or
>>>>>> function scope static buffer. Either
>>>>>> we are worried
>>>>>> about thread safety in os::abort(), or
>>>>>> we aren't. If
>>>>>> we aren't, just use a local static
>>>>>> buffer, it is much
>>>>>> clearer and no need to expose
>>>>>> VMError::coredump_status
>>>>>> like this.
>>>>>>
>>>>>> Apart from that, see my remarks in
>>>>>> yesterdays mail.
>>>>>>
>>>>>> Cheers, Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 9, 2015 at 5:16 AM, Yumin Qi
>>>>>> <yumin.qi at oracle.com
>>>>>> <mailto:yumin.qi at oracle.com>
>>>>>> <mailto:yumin.qi at oracle.com
>>>>>> <mailto:yumin.qi at oracle.com>>
>>>>>> <mailto:yumin.qi at oracle.com
>>>>>> <mailto:yumin.qi at oracle.com>
>>>>>> <mailto:yumin.qi at oracle.com
>>>>>> <mailto:yumin.qi at oracle.com>>>> wrote:
>>>>>>
>>>>>> Hi, Dan
>>>>>>
>>>>>> New webrev at
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>
>>>>>> To make code clean, add
>>>>>> os::abort_internal which
>>>>>> is only used
>>>>>> for Windows.
>>>>>> Since passing
>>>>>> VMError::coredump_message and its
>>>>>> size as
>>>>>> parameters across multiple
>>>>>> functions makes
>>>>>> function have a bigger
>>>>>> signature, I change the way to
>>>>>> access them -- add
>>>>>> two functions in
>>>>>> VMError to access them:
>>>>>> coredump_msg_buffer() and
>>>>>> coredump_msg_buffer_size(). They
>>>>>> are static
>>>>>> members, so can save
>>>>>> stack space for passing around.
>>>>>>
>>>>>> in os_windows.cpp, abort(bool,
>>>>>> void*, void*):
>>>>>> if error encountered, exit with
>>>>>> error messages.
>>>>>> If no error,
>>>>>> VMError::coredump_message contains
>>>>>> the dump
>>>>>> path/name. I just use them so no
>>>>>> need to have
>>>>>> extra space for name
>>>>>> allocation.
>>>>>>
>>>>>> Passed JPRT and manual tests.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>> On 4/7/2015 9:23 AM, Daniel D.
>>>>>> Daugherty wrote:
>>>>>>
>>>>>> On 4/1/15 10:57 AM, Yumin Qi
>>>>>> wrote:
>>>>>>
>>>>>> Hi, Dan and Thomas,
>>>>>>
>>>>>> I have a new webrev at
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>>>
>>>>>>
>>>>>>
>>>>>> src/share/vm/runtime/globals.hpp
>>>>>> Sorry, I didn't notice
>>>>>> this before:
>>>>>>
>>>>>> L939: product(bool,
>>>>>> CreateCoredumpOnCrash,
>>>>>> true, \
>>>>>> L940: "Create minidump on
>>>>>> VM fatal
>>>>>> error") \
>>>>>>
>>>>>> The "minidump" on L940
>>>>>> should change
>>>>>> also. Don't know if you
>>>>>> want to say "core/mini
>>>>>> dump" or
>>>>>> something else.
>>>>>>
>>>>>>
>>>>>> src/share/vm/runtime/os.hpp
>>>>>> L719: // On
>>>>>> Linux/Solaris it will simply
>>>>>> check core dump
>>>>>> limits while on Windows will
>>>>>> do nothing.
>>>>>> grammar: 'on Windows
>>>>>> will do nothing'
>>>>>> -> 'on Windows
>>>>>> it will do nothing'
>>>>>>
>>>>>>
>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/share/vm/utilities/vmError.hpp
>>>>>> No comments.
>>>>>>
>>>>>> src/share/vm/utilities/vmError.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/bsd/vm/os_bsd.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>> L92: jio_snprintf(buffer,
>>>>>> bufferSize,
>>>>>> "%s\\hs_err_pid%u.mdmp",
>>>>>> get_current_directory(NULL, 0),
>>>>>> current_process_id());
>>>>>> Other non-Windows
>>>>>> calls to
>>>>>> get_current_directory() do a NULL
>>>>>> check before using the
>>>>>> return value.
>>>>>> The original Windows
>>>>>> code that called
>>>>>> get_current_directory() did not
>>>>>> make this
>>>>>> check, but I think
>>>>>> it's better to be
>>>>>> consistent and safe.
>>>>>>
>>>>>> L1007: static char
>>>>>> buffer[O_BUFLEN];
>>>>>> L1008: char
>>>>>> filename[FILENAME_MAX];
>>>>>> Two more potentially
>>>>>> large stack
>>>>>> variables here. Will we
>>>>>> run into stack
>>>>>> overflow on Win* more
>>>>>> frequently when what
>>>>>> we really wanted was
>>>>>> the reason for
>>>>>> the abort() call?
>>>>>>
>>>>>> The reason the
>>>>>> original code used
>>>>>> 'buffer' to construct the
>>>>>> dump file name was to
>>>>>> save space. The
>>>>>> original code was also
>>>>>> careful to not need
>>>>>> the file name in
>>>>>> any error messages after
>>>>>> the point at which the
>>>>>> dump file was
>>>>>> successfully created.
>>>>>>
>>>>>> It would be better
>>>>>> (more resilient) if
>>>>>> you could pass in
>>>>>> an already existing
>>>>>> buffer to abort()
>>>>>> like the original
>>>>>> code passed into
>>>>>> os::check_or_create_dump(). Yes, that
>>>>>> would require visiting
>>>>>> all existing
>>>>>> calls to abort().
>>>>>>
>>>>>> L1015:
>>>>>> assert(out->is_open(),
>>>>>> "defaultStream should be open");
>>>>>> Is it possible for
>>>>>> os::abort() to be
>>>>>> called early enough on
>>>>>> Windows such that
>>>>>> VMError::out is not
>>>>>> actually setup? Yes,
>>>>>> this would have to be
>>>>>> absolutely
>>>>>> catastrophic...
>>>>>>
>>>>>> See the comment in
>>>>>> os_solaris.cpp
>>>>>> L1520-1522.
>>>>>>
>>>>>> L1018: jio_snprintf(buffer,
>>>>>> sizeof(buffer), "%s", "Either
>>>>>> CreateCoredumpOnCrash is
>>>>>> disabled from command"
>>>>>> L1019 " line or coredump
>>>>>> is not available");
>>>>>> When you are using
>>>>>> jio_snprintf() as a
>>>>>> replacement for:
>>>>>>
>>>>>> strncpy(buffer, "my string",
>>>>>> sizeof(buffer));
>>>>>> buffer[sizeof(buffer) - 1] = '\0';
>>>>>>
>>>>>> you don't need the
>>>>>> "%s" format string.
>>>>>> So this code would be:
>>>>>>
>>>>>> jio_snprintf(buffer, sizeof(buffer),
>>>>>> "Either
>>>>>> CreateCoredumpOnCrash is disabled
>>>>>> from "
>>>>>> "command line or coredump
>>>>>> is not available");
>>>>>>
>>>>>> Same comment for:
>>>>>> L1026, L1039
>>>>>>
>>>>>> L1020: goto done;
>>>>>> The use of 'goto'
>>>>>> makes me cringe. You
>>>>>> can refactor much
>>>>>> of this logic into an
>>>>>> abort_internal()
>>>>>> function that
>>>>>> abort() calls and keep
>>>>>> all the
>>>>>> original "return" logic
>>>>>> to bail out.
>>>>>>
>>>>>> L1045: // Older versions
>>>>>> of dbghelp.h
>>>>>> doesn't contain all
>>>>>> grammar: 'doesn't' ->
>>>>>> 'do not'
>>>>>>
>>>>>> L1052: cwd =
>>>>>> get_current_directory(NULL, 0);
>>>>>> L1053:
>>>>>> jio_snprintf(filename,
>>>>>> sizeof(filename),
>>>>>> "%s\\hs_err_pid%u.mdmp", cwd,
>>>>>> current_process_id());
>>>>>> Other non-Windows
>>>>>> calls to
>>>>>> get_current_directory() do a NULL
>>>>>> check before using the
>>>>>> return value.
>>>>>> The original Windows
>>>>>> code that called
>>>>>> get_current_directory() did not
>>>>>> make this
>>>>>> check, but I think
>>>>>> it's better to be
>>>>>> consistent and safe.
>>>>>>
>>>>>> L1092: // well done.
>>>>>> Like a overcooked
>>>>>> steak? :-) Not sure
>>>>>> what you're trying
>>>>>> to say here...
>>>>>>
>>>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>>> No comments on these four.
>>>>>>
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>
>>>>>> What changes:
>>>>>>
>>>>>> 1) os_windows.cpp,
>>>>>> os::abort(bool ....):
>>>>>> a) Since when
>>>>>> abort() is called, the
>>>>>> last function called
>>>>>> before exit process,
>>>>>> VMError::log already closed,
>>>>>> which
>>>>>> is the file logging.
>>>>>> defaultStream still
>>>>>> available to use, so
>>>>>> use it for writing error
>>>>>> or success
>>>>>> messages. VMError::out is
>>>>>> the defaultStream, for
>>>>>> using it, added a
>>>>>> function out_stream() in
>>>>>> VMError.
>>>>>> b) Delete the codes
>>>>>> which decide to
>>>>>> write core based on
>>>>>> client/server version with
>>>>>> debug. The new
>>>>>> code is if
>>>>>> CreateCoredumpOnCrash set or dump_core is
>>>>>> on, the coredump will
>>>>>> be tried no matter which
>>>>>> version it is.
>>>>>> There is no special
>>>>>> reason not to dump core
>>>>>> with client
>>>>>> version I think. Any concern
>>>>>> for this change?
>>>>>> c) Use of 'goto' in
>>>>>> this function. I
>>>>>> tried not to use it,
>>>>>> but using it makes code
>>>>>> clear. I remember
>>>>>> somewhere suggest not
>>>>>> using 'goto' in cplusplus.
>>>>>>
>>>>>> 2) changed comments to
>>>>>> make them
>>>>>> consistent as indicated by Dan.
>>>>>>
>>>>>> 3) Added/modified head
>>>>>> version for files.
>>>>>>
>>>>>> Tests: JPRT, manual
>>>>>> test on local Windows.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>> On 3/31/2015 1:25 AM,
>>>>>> Thomas Stüfe wrote:
>>>>>>
>>>>>> Hi Daniel, Yumin,
>>>>>>
>>>>>> sorry, a lot of Yumins
>>>>>> changes were
>>>>>> based on suggestions from
>>>>>> me, so here more
>>>>>> background:
>>>>>>
>>>>>> Yumin reversed the
>>>>>> order of error log
>>>>>> writing and core dumping.
>>>>>> The reason for that
>>>>>> are explained in
>>>>>> detail at the start of the
>>>>>> mail thread, in short:
>>>>>>
>>>>>> - on Windows core file
>>>>>> dumping may
>>>>>> hang - its rare but it
>>>>>> happens - preventing
>>>>>> error logs.
>>>>>> Depending on who you ask,
>>>>>> error logs are more
>>>>>> important than
>>>>>> minidumps, so it makes sense
>>>>>> to first write the
>>>>>> erorr log.
>>>>>> - This also brings
>>>>>> windows core paths
>>>>>> closer to UNIX code
>>>>>> paths. See below.
>>>>>>
>>>>>> About writing to
>>>>>> stderr in os::abort():
>>>>>>
>>>>>> After this change,
>>>>>> calling
>>>>>> VmError::report_coredump_status() in
>>>>>> os::abort() will not
>>>>>> do anything
>>>>>> because the error log is
>>>>>> already written at
>>>>>> that point. I
>>>>>> suggested to Yumin
>>>>>> writing to
>>>>>> stderr instead in
>>>>>> os::abort() because
>>>>>> that mimicks UNIX
>>>>>> behaviour: On UNIX,
>>>>>> you call abort(3),
>>>>>> which writes a core and
>>>>>> aborts. It writes a
>>>>>> short message to
>>>>>> stderr "Aborted (core
>>>>>> dumped)" or just
>>>>>> "Aborted".
>>>>>>
>>>>>> After Yumins change,
>>>>>> on Windows
>>>>>> os::abort also writes the
>>>>>> core,
>>>>>> then aborts. It made
>>>>>> sense to me that
>>>>>> this function should
>>>>>> behave the same way as
>>>>>> on UNIX: if
>>>>>> core writing fails, write to
>>>>>> stderr. There is no
>>>>>> way otherwise to
>>>>>> communicate a core dump
>>>>>> writing failure. After
>>>>>> writing the
>>>>>> core, process stops.
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> About the control flows:
>>>>>>
>>>>>> Before Yumins change:
>>>>>>
>>>>>> On Windows
>>>>>> 1)
>>>>>> os::check_or_create_dump(): we
>>>>>> wrote the core dump and then
>>>>>> information about the
>>>>>> dump (file
>>>>>> location, success or error)
>>>>>> was handed to the
>>>>>> misnamed
>>>>>> VmError::report_coredump_status(),
>>>>>> which just stored that
>>>>>> information.
>>>>>> 2) VmError::report():
>>>>>> Error log was
>>>>>> written and in step 63
>>>>>> information about the
>>>>>> core (already
>>>>>> written at that point) was
>>>>>> added to error log.
>>>>>> 3) os::abort() just
>>>>>> did an exit(3)
>>>>>>
>>>>>> On Unix,
>>>>>> 1)
>>>>>> os::check_or_create_dump(): checks
>>>>>> the limits to guess
>>>>>> whether core dumping
>>>>>> later will
>>>>>> succeed. Again, that
>>>>>> information is handed to
>>>>>> VmError::report_coredump_status()
>>>>>> 2) VmError::report():
>>>>>> Error log is
>>>>>> written and in step 63,
>>>>>> information about the
>>>>>> core's probable
>>>>>> future location is added
>>>>>> to error log. Here,
>>>>>> messages use past
>>>>>> tense, which is
>>>>>> misleading, because
>>>>>> the core is not
>>>>>> yet written.
>>>>>> 3) os::abort() calls
>>>>>> abort(3), which
>>>>>> stops and writes a core.
>>>>>>
>>>>>> Yumin pushed core file
>>>>>> writing on
>>>>>> Windows down to os::abort().
>>>>>> This brings the
>>>>>> control flow much
>>>>>> closer to Unix:
>>>>>>
>>>>>> (1) call
>>>>>> os::check_dump_limit() to
>>>>>> check the prerequisites for
>>>>>> core file writing and
>>>>>> gather a bit
>>>>>> information to put in the
>>>>>> error log: On Unix,
>>>>>> limit checks, on
>>>>>> windows, so far nothing
>>>>>> much but
>>>>>> precalculating the error file
>>>>>> name.
>>>>>> (2) VmError::report():
>>>>>> Error log is
>>>>>> written and information
>>>>>> about the core's
>>>>>> probable location is
>>>>>> added to error log.
>>>>>> (3) os::abort() is
>>>>>> called, which on
>>>>>> all platforms:
>>>>>> - dumps the core
>>>>>> - if failing, writes
>>>>>> a one-liner to
>>>>>> stderr
>>>>>> - stops the process.
>>>>>>
>>>>>> I think, if one agrees
>>>>>> that reversing
>>>>>> order of core dumping and
>>>>>> error log writing on
>>>>>> windows is a good
>>>>>> thing, this change looks
>>>>>> ok to me. Some
>>>>>> improvements could make
>>>>>> it clearer:
>>>>>>
>>>>>> - maybe rename
>>>>>> "os::check_dump_limit()" (was my suggestion,
>>>>>> sorry) to
>>>>>> "os::check_core_prerequisites()" - that
>>>>>> leaves room
>>>>>> to flesh it out a bit
>>>>>> more on Windows,
>>>>>> where we do not have
>>>>>> UNIX limits.
>>>>>> - make the messages in
>>>>>> VMError::report() future
>>>>>> sense: "Will
>>>>>> write core file to
>>>>>> location.....".
>>>>>> - Make the messages
>>>>>> written in
>>>>>> os::abort() on Windows
>>>>>> clearer,
>>>>>> and shorter, and we
>>>>>> also do not need
>>>>>> the buffer to be static
>>>>>> here. Actually, if we
>>>>>> want to be as on
>>>>>> UNIX, just dumping "core
>>>>>> file writing
>>>>>> succeeded" or "failed" -
>>>>>> maybe with an error
>>>>>> number from
>>>>>> GetLastError() - will be
>>>>>> enough because the file
>>>>>> location is already
>>>>>> printed in the
>>>>>> error log. Like on UNIX.
>>>>>>
>>>>>> Kind Regards, Thomas
>>>>>>
>>>>>>
>>>>>> On Mon, Mar 30, 2015
>>>>>> at 9:51 PM,
>>>>>> Daniel D. Daugherty
>>>>>> <daniel.daugherty at oracle.com
>>>>>> <mailto:daniel.daugherty at oracle.com>
>>>>>> <mailto:daniel.daugherty at oracle.com
>>>>>> <mailto:daniel.daugherty at oracle.com>>
>>>>>> <mailto:daniel.daugherty at oracle.com
>>>>>> <mailto:daniel.daugherty at oracle.com>
>>>>>> <mailto:daniel.daugherty at oracle.com
>>>>>> <mailto:daniel.daugherty at oracle.com>>>> wrote:
>>>>>>
>>>>>> Re:
>>>>>> report_coredump_status() is
>>>>>> really
>>>>>> record_coredump_status()
>>>>>>
>>>>>> The
>>>>>> report_coredump_status()
>>>>>> function is designed to
>>>>>> record two
>>>>>> things about core
>>>>>> dumps for later
>>>>>> reporting by the code that
>>>>>> generates the
>>>>>> hs_err_pid file.
>>>>>> That's why the original
>>>>>> report_coredump_status() didn't
>>>>>> output anything to stderr.
>>>>>>
>>>>>> By changing the
>>>>>> Windows calls to
>>>>>> report_coredump_status() into
>>>>>> jio_fprintf() calls you have:
>>>>>>
>>>>>> - put output onto
>>>>>> stderr that
>>>>>> should go to the
>>>>>> hs_err_pid file
>>>>>> - made the windows
>>>>>> code paths work
>>>>>> differently than the
>>>>>> non-windows
>>>>>> code paths
>>>>>>
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/30/15 1:28
>>>>>> PM, Yumin Qi wrote:
>>>>>>
>>>>>> Thanks Dan
>>>>>>
>>>>>> On 3/30/2015
>>>>>> 11:25 AM, Daniel
>>>>>> D. Daugherty wrote:
>>>>>>
>>>>>> On 3/25/15
>>>>>> 12:11 PM, Yumin
>>>>>> Qi wrote:
>>>>>>
>>>>>> Hi, all
>>>>>>
>>>>>> I updated the
>>>>>> webrev with a new change to
>>>>>> test case:
>>>>>> test/runtime/Unsafe/RangeCheck.java
>>>>>>
>>>>>> Add
>>>>>> -XX:-CreateCoredumpOnCrash to test, no
>>>>>> dump needed on this case.
>>>>>>
>>>>>> new webrev:
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>>>
>>>>>>
>>>>>> General nit - please
>>>>>> update copyright lines to
>>>>>> 2015
>>>>>> as needed
>>>>>>
>>>>>> src/share/vm/runtime/globals.hpp
>>>>>> No comments.
>>>>>>
>>>>>> src/share/vm/runtime/os.hpp
>>>>>> line 720: // On
>>>>>> Windows this will create an
>>>>>> actual minidump, on
>>>>>> Linux/Solaris it will simply
>>>>>> check core dump limits
>>>>>> line 721: static
>>>>>> void check_dump_limit(char*
>>>>>> buffer, size_t bufferSize);
>>>>>> The comment on
>>>>>> line 720 no longer matches
>>>>>> with the
>>>>>> function
>>>>>> rename from
>>>>>> check_or_create_dump() to
>>>>>> check_dump_limit().
>>>>>>
>>>>>> You have this
>>>>>> comment in vmError.cpp:
>>>>>> 943 // check
>>>>>> core dump limits on
>>>>>> Linux/Solaris, nothing on
>>>>>> Windows
>>>>>> 944
>>>>>> os::check_dump_limit(buffer,
>>>>>> sizeof(buffer));
>>>>>>
>>>>>> so the two
>>>>>> comments are out of sync.
>>>>>>
>>>>>> Will convert
>>>>>> them to be
>>>>>> consistent.
>>>>>>
>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>> line 3252: } else if
>>>>>> (match_option(option,
>>>>>> "-XX:+CreateMinidumpOnCrash", &tail)) {
>>>>>> line 3256: } else if
>>>>>> (match_option(option,
>>>>>> "-XX:-CreateMinidumpOnCrash", &tail)) {
>>>>>> These two checks
>>>>>> should use the
>>>>>> match_option() version
>>>>>> without a '&tail'
>>>>>> parameter.
>>>>>>
>>>>>> Will use
>>>>>> version w/o tail.
>>>>>>
>>>>>> src/share/vm/utilities/vmError.cpp
>>>>>> line 217: bool
>>>>>> VMError::coredump_status =
>>>>>> true; // presume we
>>>>>> can dump core file first
>>>>>> I don't think you
>>>>>> should init this to true.
>>>>>> It confuses
>>>>>> things with
>>>>>> VMError::report_coredump_status(). It will
>>>>>> also
>>>>>> confuse this code:
>>>>>>
>>>>>> 526 STEP(63,
>>>>>> "(printing core file
>>>>>> information)")
>>>>>> 529 if
>>>>>> (coredump_status) {
>>>>>> 530
>>>>>> st->print("Core dump
>>>>>> written. Default
>>>>>> location: %s",
>>>>>> coredump_message);
>>>>>>
>>>>>> because
>>>>>> coredump_status won't
>>>>>> accurately
>>>>>> reflect whether
>>>>>> the
>>>>>> coredump_message field has
>>>>>> been set.
>>>>>>
>>>>>> line 943: // check
>>>>>> core dump limits on
>>>>>> Linux/Solaris, nothing on
>>>>>> Windows
>>>>>> This updated
>>>>>> comment doesn't match the
>>>>>> unmodified
>>>>>> comment in os.hpp:
>>>>>>
>>>>>> 720: // On
>>>>>> Windows this will create an
>>>>>> actual minidump, on
>>>>>> Linux/Solaris it will simply
>>>>>> check core dump limits
>>>>>>
>>>>>> I will
>>>>>> consider your comments,
>>>>>> then revise with a new
>>>>>> version. I
>>>>>> remember here there
>>>>>> is a case, if status not
>>>>>> set, it will
>>>>>> output an
>>>>>> inconsistent message. Will
>>>>>> check
>>>>>> again.
>>>>>>
>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/bsd/vm/os_bsd.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>> line 1008: static char
>>>>>> buffer[O_BUFLEN];
>>>>>> Why switch from a
>>>>>> buffer parameter to a
>>>>>> static buffer?
>>>>>> What happens with
>>>>>> parallel calls to
>>>>>> abort()? Will the static
>>>>>> buffer get stomped
>>>>>> by the competing threads?
>>>>>>
>>>>>> The original
>>>>>> buffer is static
>>>>>> too. Carrying an buffer
>>>>>> for abort
>>>>>> seems not right.
>>>>>> This buffer in fact only for
>>>>>> storing file
>>>>>> name (based on
>>>>>> pid) here.
>>>>>> abort() will
>>>>>> dump the core
>>>>>> file, should we prevent
>>>>>> multi-thread calling to this
>>>>>> function? I did not check
>>>>>> this part,
>>>>>> will check if it is
>>>>>> MT-safe.
>>>>>>
>>>>>>
>>>>>> line 1015: // If
>>>>>> running on a client version
>>>>>> of Windows
>>>>>> and user has
>>>>>> not explicitly enabled
>>>>>> dumping
>>>>>> line 1016: if
>>>>>> (!os::win32::is_windows_server() &&
>>>>>> !CreateCoredumpOnCrash) {
>>>>>> The default for
>>>>>> CreateCoredumpOnCrash is
>>>>>> now 'true'
>>>>>> so the
>>>>>> comment and logic
>>>>>> are no longer correct
>>>>>> here. The Windows
>>>>>> Client VM will
>>>>>> generate minidumps by
>>>>>> default.
>>>>>>
>>>>>> old line 1007:
>>>>>> VMError::report_coredump_status("Minidumps
>>>>>> are not
>>>>>> enabled by default on
>>>>>> client versions of Windows",
>>>>>> false);
>>>>>> new line 1017:
>>>>>> jio_fprintf(stderr, "Minidumps
>>>>>> are not
>>>>>> enabled by default
>>>>>> on client versions of
>>>>>> Windows.\n");
>>>>>> There are a number
>>>>>> of places where we
>>>>>> change from
>>>>>> VMError::report_coredump_status() to
>>>>>> jio_fprintf()
>>>>>> and I'm not quite
>>>>>> seeing why this was done.
>>>>>>
>>>>>> Update:
>>>>>> VMError::report_coredump_status()
>>>>>> is
>>>>>> misnamed. It doesn't
>>>>>> report anything in
>>>>>> the sense that it
>>>>>> doesn't print anything. It
>>>>>> does record two
>>>>>> pieces of information about
>>>>>> core dumps
>>>>>> so maybe
>>>>>> it should be
>>>>>> VMError::record_coredump_status().
>>>>>>
>>>>>> line 1100:
>>>>>> jio_fprintf(stderr,
>>>>>> "%s.\n", buffer);
>>>>>> At this point,
>>>>>> buffer contains the path
>>>>>> to the
>>>>>> mini-dump file so
>>>>>> the above line simply prints
>>>>>> that on stderr. Why?
>>>>>>
>>>>>> Yes, I see that
>>>>>> the old code did something
>>>>>> similar:
>>>>>>
>>>>>> 1090
>>>>>> VMError::report_coredump_status(buffer,
>>>>>> true);
>>>>>>
>>>>>> but
>>>>>> report_coredump_status() didn't
>>>>>> actually print
>>>>>> anything. It just
>>>>>> squirreled away these in
>>>>>> vmError.cpp:
>>>>>>
>>>>>> 217 bool
>>>>>> VMError::coredump_status;
>>>>>> 218 char
>>>>>> VMError::coredump_message[O_BUFLEN];
>>>>>>
>>>>>> See comments above
>>>>>> about how
>>>>>> report_coredump_status()
>>>>>> is misnamed.
>>>>>>
>>>>>> At this point, I'm
>>>>>> convinced that all the
>>>>>> changes from
>>>>>> VMError::report_coredump_status() to
>>>>>> jio_fprintf() are
>>>>>> a bad idea.
>>>>>>
>>>>>>
>>>>>> Changing to
>>>>>> jio_fprintf is because
>>>>>> report_coredump_status did not
>>>>>> output anything, it is
>>>>>> just a logging
>>>>>> as you
>>>>>> indicated. It is
>>>>>> reasonable we
>>>>>> change it to
>>>>>> record_coredump_status instead. How about
>>>>>> add printout to
>>>>>> report_coredump_status? So I do not
>>>>>> need to use
>>>>>> jio_printf here.
>>>>>>
>>>>>> Other
>>>>>> consideration of using
>>>>>> jio_fprintf after call
>>>>>> shutdown
>>>>>> called, the static
>>>>>> output stream still exists,
>>>>>> but not
>>>>>> guaranteed to work as
>>>>>> expected.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/Unsafe/RangeCheck.java
>>>>>> No comments.
>>>>>>
>>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>>> No comments.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>> On 3/23/2015 11:48 AM,
>>>>>> Yumin Qi wrote:
>>>>>>
>>>>>> Since Thomas is
>>>>>> not a reviewer in open jdk,
>>>>>> I need two
>>>>>> volunteers (at least one
>>>>>> "R"eviewer).
>>>>>>
>>>>>> Dan, since you
>>>>>> already reviewed previous
>>>>>> version, could you
>>>>>> have a look?
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> On 3/20/2015 3:20
>>>>>> PM, Yumin Qi wrote:
>>>>>>
>>>>>> Thomas,
>>>>>>
>>>>>> Thanks. Yes,
>>>>>> it is webrev04. Also, I
>>>>>> have updated
>>>>>> webrev04 to add another
>>>>>> test case change:
>>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>>> (Thanks Dan's
>>>>>> update)
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>> On 3/20/2015
>>>>>> 11:55 AM, Thomas Stüfe
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Yumin,
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list