RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file
Yumin Qi
yumin.qi at oracle.com
Mon Apr 13 22:05:40 UTC 2015
Dan,
On 4/13/2015 2:09 PM, Daniel D. Daugherty wrote:
> On 4/13/15 10:58 AM, Yumin Qi wrote:
>> Thomas,
>>
>> Answers embeded below. The new web is just the same link updated (08):
>>
>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>
> src/share/vm/runtime/globals.hpp
> No comments.
>
> src/share/vm/runtime/os.hpp
> L720: // ON Posix compatible OS it will
> typo: 'ON' -> 'On'
>
> L724: ... or a short error message, depends
> grammar: 'depends' -> 'depending'
>
> L725: // on checking result.
> grammar: 'on checking' -> 'on the checking'
>
> src/share/vm/runtime/arguments.cpp
> No comments.
>
> src/share/vm/utilities/vmError.hpp
> No comments.
>
> src/share/vm/utilities/vmError.cpp
> No comments.
>
> src/os/aix/vm/os_aix.cpp
> src/os/bsd/vm/os_bsd.cpp
> src/os/linux/vm/os_linux.cpp
> src/os/posix/vm/os_posix.cpp
> src/os/solaris/vm/os_solaris.cpp
> No comments on the above five files.
>
> src/os/windows/vm/os_windows.cpp
> L1010: jio_snprintf(buffer, buffsz, "Failed to create coredump
> file (0x%x).", GetLastError());
> Because this is a Win* specific file you can use "minidump"
> here instead of "coredump".
>
> I made a similar comment in the previous round.
>
Yes, I will change 'coredump' into 'minidump'
> L1028: if (!dump_core || !dumpFile) {
> The "!dumpFile" part is an implied boolean which we don't
> like to use in HotSpot. Perhaps: dumpFile == NULL
>
Will change to Hotspot style.
> L1078: win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
> The "CloseHandle(dumpFile)" call is missing.
>
There is no need to call CloseHandle here explicitly, since we will exit
next and all resources will be reclaimed by OS. I remove it at last
since think it is not needed here. If you check previous calls to
win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
I did not place CloseHandle(dumpFile) either.
If needed, I think we should put the call before all exit entry.
> test/runtime/ErrorHandling/ProblematicFrameTest.java
> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
> No comments on the above two files.
>
> test/runtime/ErrorHandling/SecondaryErrorTest.java
> No comments.
>
> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
> test/runtime/Unsafe/RangeCheck.java
> test/runtime/memory/ReadFromNoaccessArea.java
> No comments on the above six files.
Thanks
Yumin
>
>
> Dan
>
>
>>
>>
>> Thanks
>> Yumin
>>
>> On 4/13/2015 12:30 AM, Thomas Stüfe wrote:
>>> Hi Yumin,
>>>
>>> some small things:
>>>
>>> os_windows.cpp:1012
>>>
>>> + if (dumpFile == INVALID_HANDLE_VALUE) {
>>> + jio_snprintf(buffer, buffsz, "Failed to create coredump file");
>>> + status = false;
>>> + }
>>>
>>> Please print out the error number too:
>>> + jio_snprintf(buffer, buffsz, "Failed to create coredump file
>>> (0x%x).", ::GetLastError());
>>>
>>>
>> Changed as above. No :: global scope operator needed here.
>>>
>>> os_windows.cpp:1036
>>>
>>> + assert(dumpFile != NULL, "dumpFile should not be NULL");
>>>
>>> Please remove this assert. I would not do any asserts at this stage,
>>> we are dying anyway and a recursive assert would just confuse
>>> things. Moreover, CreateFile may have failed for "normal" reasons.
>>>
>> removed.
>>> Instead, if dumpFile is NULL, just skip the whole minidump stuff in
>>> os::abort; just behave the same way as for os::abort(false):
>>>
>>> + if (!dump_core || !dumpFile) {
>>> + win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>> }
>>>
>>>
>> changed as expected.
>>>
>>> os_windows.cpp: 1076
>>>
>>> I would just skip the whole "FormatMessage" stuff. It is just sugar,
>>> but it will allocate buffers, which we may not want at this stage. I
>>> think the plain error code from GetLastError() is sufficient here.
>>> So I would just:
>>>
>>> 1076 // Older versions of dbghelp.dll (the one shipped with
>>> Win2003 for example) may not support all
>>> 1077 // the dump types we really want. If first call fails, lets
>>> fall back to just use MiniDumpWithFullMemory then.
>>> 1078 if (_MiniDumpWriteDump(hProcess, processId, dumpFile,
>>> dumpType, pmei, NULL, NULL) == false &&
>>> 1079 _MiniDumpWriteDump(hProcess, processId, dumpFile,
>>> (MINIDUMP_TYPE)MiniDumpWithFullMemory, pmei, NULL, NULL) == false) {
>>> jio_fprintf(stderr, "Call to MiniDumpWriteDump() failed
>>> (Error 0x%x)\n", ::GetLastError());
>>> 1089 } else {
>>> 1090 ....
>>>
>>>
>> Agree, this is the last step before exit, with output of last error
>> code, user can get what happened --- though a little effort to research.
>>
>>> Rest is fine.
>>> Thomas
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Apr 13, 2015 at 7:59 AM, Yumin Qi <yumin.qi at oracle.com
>>> <mailto:yumin.qi at oracle.com>> wrote:
>>>
>>> Thomas and Dan,
>>>
>>> Please review the changes at
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev08/>
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 4/10/2015 10:14 PM, Yumin Qi wrote:
>>>
>>> Thomas,
>>>
>>> Thanks for review again.
>>>
>>> On 4/10/2015 4:20 AM, Thomas Stüfe wrote:
>>>
>>> Hi Yumin,
>>>
>>> Hi Yumin,
>>>
>>> this looks mostly fine for me. I only was able to test
>>> it on Linux, but will test it on other OSes next week.
>>>
>>> Here two remaining small points:
>>>
>>> - in os::check_dump_limit, maybe a comment would be
>>> helpful, because the contract for this function is a bit
>>> difficult to understand. Proposal:
>>>
>>> "Check or prepare a core dump to be taken at a later
>>> point in the same thread in os::abort(). Use the caller
>>> provided buffer as a scratch buffer. Output status
>>> message via VMError::report_cordedump_status() (on
>>> success, the core dump file location, on error, a short
>>> error message) - this status message which will be
>>> written into the error log."
>>>
>>> I will take the comments, new comments:
>>>
>>> // ON Posix compatible OS it will simply check core dump
>>> limits while on Windows
>>> // it will check if dump file can be created. Check or
>>> prepare a core dump to be
>>> // taken at a later point in the same thread in
>>> os::abort(). Use the caller
>>> // provided buffer as a scratch buffer. The status message
>>> which will be written
>>> // into the error log either is file location or a short
>>> error message, depends
>>> // on checking result.
>>> static void check_dump_limit(char* buffer, size_t bufferSize);
>>>
>>> - It would make sense to improve the output in
>>> VMError.report() in STEP(63) a bit:
>>>
>>> Problem 1: we use past-tense, which is misleading ("Core
>>> dump written", "Failed to write core"). We did not
>>> generate the core dump yet, we just plan to do it, and
>>> it still may fail for various reasons (e.g. even if the
>>> limits are fine, the current directory may be write
>>> protected. There is no way to catch all errors
>>> beforehand). If the user reads "Core dump written", he
>>> assumes the core dump was written successfully. If
>>> abort(3) later fails to write the core dump, user will
>>> be confused.
>>>
>>> Will change the message to
>>> if (coredump_status) {
>>> st->print("Core dump will be written. %s",
>>> coredump_message);
>>> } else {
>>> st->print("No core dump will be written. %s",
>>> coredump_message);
>>> }
>>>
>>> as your suggested.
>>> Whether the coredump can be successfully created depends on
>>> following core file generation procedure. User can trace
>>> error messages if failed to create.
>>>
>>> Problem 2: this is more obscure: on Linux core dumps may
>>> not be written to the file system at all but be
>>> processed by some other program. On my Ubuntu 14.4
>>> machine I see this output:
>>>
>>> #
>>> # Core dump written. Default location: Core dumps may be
>>> processed with "/usr/share/apport/apport %p %s %c %P"
>>> (or dumping to
>>> /shared/projects/openjdk/jdk9-hs-rt/output/core.26261)
>>> #
>>>
>>> This output is a bit jarring.
>>>
>>> I have not tested on Ubuntu so no comments.
>>>
>>> I will post a new webrev based on your and Dan's comments.
>>> Thanks for your patience.
>>>
>>> Thanks
>>> Yumin
>>>
>>> My suggestion for better texts would be:
>>>
>>> "Core dump will be written. %s"
>>> "No core dump will be written. %s"
>>>
>>> which hopefully fits all cases better. But I don't know,
>>> other suggestions are welcome :-)
>>>
>>> Otherwise, this looks fine.
>>>
>>> Thanks for your work!
>>>
>>> Thomas
>>>
>>>
>>> On Fri, Apr 10, 2015 at 5:18 AM, Yumin Qi
>>> <yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>
>>> <mailto:yumin.qi at oracle.com
>>> <mailto:yumin.qi at oracle.com>>> wrote:
>>>
>>> Hi, Thomas and Dan
>>>
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>
>>> On 4/9/2015 7:51 PM, Yumin Qi wrote:
>>>
>>> Please hold on, I missed the test cases:
>>>
>>> *test/runtime/Unsafe/RangeCheck.java
>>> *
>>> *test/runtime/memory/ReadFromNoaccessArea.java
>>>
>>> Added.
>>>
>>> Also some file heads need to be updated with
>>> this year.
>>>
>>> Changed
>>>
>>> Thanks
>>> Yumin
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> *
>>> On 4/9/2015 7:07 PM, Yumin Qi wrote:
>>>
>>> Hi, Thomas and Dan
>>>
>>> The same URL is updated with new version.
>>>
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>
>>> I remove the public access to VMError::out,
>>> VMError::coredump_message and the size of
>>> VMError::coredump_message.
>>> Added private static HANDLE in
>>> os_windows.cpp for
>>> dumpFile as suggested by Thomas.
>>>
>>> Tests passes JPRT and manual tests.
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> On 4/9/2015 1:32 AM, Thomas Stüfe wrote:
>>>
>>> Hi Yumin,
>>>
>>> I do not think exposing
>>> VMError::coredump_status
>>> buffer just for the sake of using it as
>>> a scratch
>>> buffer in os::abort() is a good idea. It
>>> is very
>>> confusing and has nothing to do with the
>>> purpose of
>>> VMerror::coredump_status.
>>>
>>> Also, VMerror::coredump_status is
>>> static, so you do
>>> not even gain anything over just using a
>>> global or
>>> function scope static buffer. Either we
>>> are worried
>>> about thread safety in os::abort(), or
>>> we aren't. If
>>> we aren't, just use a local static
>>> buffer, it is much
>>> clearer and no need to expose
>>> VMError::coredump_status
>>> like this.
>>>
>>> Apart from that, see my remarks in
>>> yesterdays mail.
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>> On Thu, Apr 9, 2015 at 5:16 AM, Yumin Qi
>>> <yumin.qi at oracle.com
>>> <mailto:yumin.qi at oracle.com> <mailto:yumin.qi at oracle.com
>>> <mailto:yumin.qi at oracle.com>>
>>> <mailto:yumin.qi at oracle.com
>>> <mailto:yumin.qi at oracle.com>
>>> <mailto:yumin.qi at oracle.com
>>> <mailto:yumin.qi at oracle.com>>>> wrote:
>>>
>>> Hi, Dan
>>>
>>> New webrev at
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>
>>> To make code clean, add
>>> os::abort_internal which
>>> is only used
>>> for Windows.
>>> Since passing
>>> VMError::coredump_message and its
>>> size as
>>> parameters across multiple functions
>>> makes
>>> function have a bigger
>>> signature, I change the way to
>>> access them -- add
>>> two functions in
>>> VMError to access them:
>>> coredump_msg_buffer() and
>>> coredump_msg_buffer_size(). They are
>>> static
>>> members, so can save
>>> stack space for passing around.
>>>
>>> in os_windows.cpp, abort(bool,
>>> void*, void*):
>>> if error encountered, exit with
>>> error messages.
>>> If no error,
>>> VMError::coredump_message contains
>>> the dump
>>> path/name. I just use them so no
>>> need to have
>>> extra space for name
>>> allocation.
>>>
>>> Passed JPRT and manual tests.
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> On 4/7/2015 9:23 AM, Daniel D.
>>> Daugherty wrote:
>>>
>>> On 4/1/15 10:57 AM, Yumin Qi wrote:
>>>
>>> Hi, Dan and Thomas,
>>>
>>> I have a new webrev at
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>
>>>
>>>
>>> src/share/vm/runtime/globals.hpp
>>> Sorry, I didn't notice this
>>> before:
>>>
>>> L939: product(bool,
>>> CreateCoredumpOnCrash,
>>> true, \
>>> L940: "Create
>>> minidump on VM fatal
>>> error") \
>>>
>>> The "minidump" on L940
>>> should change
>>> also. Don't know if you
>>> want to say "core/mini
>>> dump" or
>>> something else.
>>>
>>>
>>> src/share/vm/runtime/os.hpp
>>> L719: // On Linux/Solaris
>>> it will simply
>>> check core dump
>>> limits while on Windows will do
>>> nothing.
>>> grammar: 'on Windows
>>> will do nothing'
>>> -> 'on Windows it
>>> will do nothing'
>>>
>>>
>>> src/share/vm/runtime/arguments.cpp
>>> No comments.
>>>
>>> src/share/vm/utilities/vmError.hpp
>>> No comments.
>>>
>>> src/share/vm/utilities/vmError.cpp
>>> No comments.
>>>
>>> src/os/aix/vm/os_aix.cpp
>>> No comments.
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>> No comments.
>>>
>>> src/os/linux/vm/os_linux.cpp
>>> No comments.
>>>
>>> src/os/posix/vm/os_posix.cpp
>>> No comments.
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>> No comments.
>>>
>>> src/os/windows/vm/os_windows.cpp
>>> L92: jio_snprintf(buffer,
>>> bufferSize,
>>> "%s\\hs_err_pid%u.mdmp",
>>> get_current_directory(NULL, 0),
>>> current_process_id());
>>> Other non-Windows calls to
>>> get_current_directory() do a NULL
>>> check before using the
>>> return value.
>>> The original Windows
>>> code that called
>>> get_current_directory() did not make
>>> this
>>> check, but I think it's
>>> better to be
>>> consistent and safe.
>>>
>>> L1007: static char
>>> buffer[O_BUFLEN];
>>> L1008: char
>>> filename[FILENAME_MAX];
>>> Two more potentially
>>> large stack
>>> variables here. Will we
>>> run into stack overflow
>>> on Win* more
>>> frequently when what
>>> we really wanted was the
>>> reason for
>>> the abort() call?
>>>
>>> The reason the original
>>> code used
>>> 'buffer' to construct the
>>> dump file name was to
>>> save space. The
>>> original code was also
>>> careful to not need the
>>> file name in
>>> any error messages after
>>> the point at which the
>>> dump file was
>>> successfully created.
>>>
>>> It would be better (more
>>> resilient) if
>>> you could pass in
>>> an already existing
>>> buffer to abort()
>>> like the original
>>> code passed into
>>> os::check_or_create_dump(). Yes, that
>>> would require visiting
>>> all existing
>>> calls to abort().
>>>
>>> L1015: assert(out->is_open(),
>>> "defaultStream should be open");
>>> Is it possible for
>>> os::abort() to be
>>> called early enough on
>>> Windows such that
>>> VMError::out is not
>>> actually setup? Yes,
>>> this would have to be
>>> absolutely
>>> catastrophic...
>>>
>>> See the comment in
>>> os_solaris.cpp
>>> L1520-1522.
>>>
>>> L1018: jio_snprintf(buffer,
>>> sizeof(buffer), "%s", "Either
>>> CreateCoredumpOnCrash is
>>> disabled from command"
>>> L1019 "
>>> line or coredump
>>> is not available");
>>> When you are using
>>> jio_snprintf() as a
>>> replacement for:
>>>
>>> strncpy(buffer, "my
>>> string",
>>> sizeof(buffer));
>>> buffer[sizeof(buffer) - 1] = '\0';
>>>
>>> you don't need the "%s"
>>> format string.
>>> So this code would be:
>>>
>>> jio_snprintf(buffer,
>>> sizeof(buffer),
>>> "Either
>>> CreateCoredumpOnCrash is disabled from "
>>> "command line or coredump
>>> is not available");
>>>
>>> Same comment for: L1026,
>>> L1039
>>>
>>> L1020: goto done;
>>> The use of 'goto' makes
>>> me cringe. You
>>> can refactor much
>>> of this logic into an
>>> abort_internal()
>>> function that
>>> abort() calls and keep
>>> all the
>>> original "return" logic
>>> to bail out.
>>>
>>> L1045: // Older versions of
>>> dbghelp.h
>>> doesn't contain all
>>> grammar: 'doesn't' ->
>>> 'do not'
>>>
>>> L1052: cwd =
>>> get_current_directory(NULL, 0);
>>> L1053: jio_snprintf(filename,
>>> sizeof(filename),
>>> "%s\\hs_err_pid%u.mdmp", cwd,
>>> current_process_id());
>>> Other non-Windows calls to
>>> get_current_directory() do a NULL
>>> check before using the
>>> return value.
>>> The original Windows
>>> code that called
>>> get_current_directory() did not make
>>> this
>>> check, but I think it's
>>> better to be
>>> consistent and safe.
>>>
>>> L1092: // well done.
>>> Like a overcooked steak?
>>> :-) Not sure
>>> what you're trying
>>> to say here...
>>>
>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>> No comments.
>>>
>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>> No comments.
>>>
>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>> No comments.
>>>
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>> No comments on these four.
>>>
>>>
>>> Dan
>>>
>>>
>>>
>>> What changes:
>>>
>>> 1) os_windows.cpp,
>>> os::abort(bool ....):
>>> a) Since when abort()
>>> is called, the
>>> last function called before
>>> exit process,
>>> VMError::log already closed, which
>>> is the file logging.
>>> defaultStream still
>>> available to use, so
>>> use it for writing error or
>>> success
>>> messages. VMError::out is
>>> the defaultStream, for using
>>> it, added a
>>> function out_stream() in
>>> VMError.
>>> b) Delete the codes
>>> which decide to
>>> write core based on
>>> client/server version with
>>> debug. The new
>>> code is if
>>> CreateCoredumpOnCrash set or
>>> dump_core is
>>> on, the coredump will
>>> be tried no matter which
>>> version it is.
>>> There is no special
>>> reason not to dump core with
>>> client
>>> version I think. Any concern
>>> for this change?
>>> c) Use of 'goto' in
>>> this function. I
>>> tried not to use it,
>>> but using it makes code
>>> clear. I remember
>>> somewhere suggest not
>>> using 'goto' in cplusplus.
>>>
>>> 2) changed comments to
>>> make them
>>> consistent as indicated by Dan.
>>>
>>> 3) Added/modified head
>>> version for files.
>>>
>>> Tests: JPRT, manual test
>>> on local Windows.
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> On 3/31/2015 1:25 AM, Thomas
>>> Stüfe wrote:
>>>
>>> Hi Daniel, Yumin,
>>>
>>> sorry, a lot of Yumins
>>> changes were
>>> based on suggestions from
>>> me, so here more background:
>>>
>>> Yumin reversed the order
>>> of error log
>>> writing and core dumping.
>>> The reason for that are
>>> explained in
>>> detail at the start of the
>>> mail thread, in short:
>>>
>>> - on Windows core file
>>> dumping may
>>> hang - its rare but it
>>> happens - preventing
>>> error logs.
>>> Depending on who you ask,
>>> error logs are more
>>> important than
>>> minidumps, so it makes sense
>>> to first write the erorr
>>> log.
>>> - This also brings
>>> windows core paths
>>> closer to UNIX code
>>> paths. See below.
>>>
>>> About writing to stderr
>>> in os::abort():
>>>
>>> After this change, calling
>>> VmError::report_coredump_status() in
>>> os::abort() will not do
>>> anything
>>> because the error log is
>>> already written at that
>>> point. I
>>> suggested to Yumin writing to
>>> stderr instead in
>>> os::abort() because
>>> that mimicks UNIX
>>> behaviour: On UNIX, you
>>> call abort(3),
>>> which writes a core and
>>> aborts. It writes a
>>> short message to
>>> stderr "Aborted (core
>>> dumped)" or just "Aborted".
>>>
>>> After Yumins change, on
>>> Windows
>>> os::abort also writes the core,
>>> then aborts. It made
>>> sense to me that
>>> this function should
>>> behave the same way as
>>> on UNIX: if
>>> core writing fails, write to
>>> stderr. There is no way
>>> otherwise to
>>> communicate a core dump
>>> writing failure. After
>>> writing the
>>> core, process stops.
>>>
>>> ---
>>>
>>> About the control flows:
>>>
>>> Before Yumins change:
>>>
>>> On Windows
>>> 1)
>>> os::check_or_create_dump(): we
>>> wrote the core dump and then
>>> information about the
>>> dump (file
>>> location, success or error)
>>> was handed to the misnamed
>>> VmError::report_coredump_status(),
>>> which just stored that
>>> information.
>>> 2) VmError::report():
>>> Error log was
>>> written and in step 63
>>> information about the
>>> core (already
>>> written at that point) was
>>> added to error log.
>>> 3) os::abort() just did
>>> an exit(3)
>>>
>>> On Unix,
>>> 1)
>>> os::check_or_create_dump(): checks
>>> the limits to guess
>>> whether core dumping
>>> later will
>>> succeed. Again, that
>>> information is handed to
>>> VmError::report_coredump_status()
>>> 2) VmError::report():
>>> Error log is
>>> written and in step 63,
>>> information about the
>>> core's probable
>>> future location is added
>>> to error log. Here,
>>> messages use past
>>> tense, which is
>>> misleading, because the
>>> core is not
>>> yet written.
>>> 3) os::abort() calls
>>> abort(3), which
>>> stops and writes a core.
>>>
>>> Yumin pushed core file
>>> writing on
>>> Windows down to os::abort().
>>> This brings the control
>>> flow much
>>> closer to Unix:
>>>
>>> (1) call
>>> os::check_dump_limit() to
>>> check the prerequisites for
>>> core file writing and
>>> gather a bit
>>> information to put in the
>>> error log: On Unix,
>>> limit checks, on
>>> windows, so far nothing
>>> much but precalculating
>>> the error file
>>> name.
>>> (2) VmError::report():
>>> Error log is
>>> written and information
>>> about the core's
>>> probable location is
>>> added to error log.
>>> (3) os::abort() is
>>> called, which on
>>> all platforms:
>>> - dumps the core
>>> - if failing, writes a
>>> one-liner to
>>> stderr
>>> - stops the process.
>>>
>>> I think, if one agrees
>>> that reversing
>>> order of core dumping and
>>> error log writing on
>>> windows is a good
>>> thing, this change looks
>>> ok to me. Some
>>> improvements could make
>>> it clearer:
>>>
>>> - maybe rename
>>> "os::check_dump_limit()"
>>> (was my suggestion,
>>> sorry) to
>>> "os::check_core_prerequisites()" - that
>>> leaves room
>>> to flesh it out a bit
>>> more on Windows,
>>> where we do not have
>>> UNIX limits.
>>> - make the messages in
>>> VMError::report() future
>>> sense: "Will
>>> write core file to
>>> location.....".
>>> - Make the messages
>>> written in
>>> os::abort() on Windows clearer,
>>> and shorter, and we also
>>> do not need
>>> the buffer to be static
>>> here. Actually, if we
>>> want to be as on
>>> UNIX, just dumping "core
>>> file writing succeeded"
>>> or "failed" -
>>> maybe with an error
>>> number from
>>> GetLastError() - will be
>>> enough because the file
>>> location is already
>>> printed in the
>>> error log. Like on UNIX.
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>> On Mon, Mar 30, 2015 at
>>> 9:51 PM,
>>> Daniel D. Daugherty
>>>
>>> <daniel.daugherty at oracle.com
>>> <mailto:daniel.daugherty at oracle.com>
>>> <mailto:daniel.daugherty at oracle.com
>>> <mailto:daniel.daugherty at oracle.com>>
>>> <mailto:daniel.daugherty at oracle.com
>>> <mailto:daniel.daugherty at oracle.com>
>>> <mailto:daniel.daugherty at oracle.com
>>> <mailto:daniel.daugherty at oracle.com>>>> wrote:
>>>
>>> Re:
>>> report_coredump_status() is
>>> really record_coredump_status()
>>>
>>> The
>>> report_coredump_status()
>>> function is designed to
>>> record two
>>> things about core
>>> dumps for later
>>> reporting by the code that
>>> generates the
>>> hs_err_pid file.
>>> That's why the original
>>> report_coredump_status() didn't
>>> output anything to stderr.
>>>
>>> By changing the
>>> Windows calls to
>>> report_coredump_status() into
>>> jio_fprintf() calls
>>> you have:
>>>
>>> - put output onto
>>> stderr that
>>> should go to the hs_err_pid file
>>> - made the windows
>>> code paths work
>>> differently than the
>>> non-windows
>>> code paths
>>>
>>>
>>> Dan
>>>
>>>
>>>
>>> On 3/30/15 1:28 PM,
>>> Yumin Qi wrote:
>>>
>>> Thanks Dan
>>>
>>> On 3/30/2015
>>> 11:25 AM, Daniel
>>> D. Daugherty wrote:
>>>
>>> On 3/25/15
>>> 12:11 PM, Yumin
>>> Qi wrote:
>>>
>>> Hi, all
>>>
>>> I updated the
>>> webrev with a new change to
>>> test case:
>>> test/runtime/Unsafe/RangeCheck.java
>>>
>>> Add
>>> -XX:-CreateCoredumpOnCrash to test, no
>>> dump
>>> needed on this case.
>>>
>>> new webrev:
>>> http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>
>>>
>>> General nit
>>> - please
>>> update copyright lines to 2015
>>> as needed
>>>
>>> src/share/vm/runtime/globals.hpp
>>> No comments.
>>>
>>> src/share/vm/runtime/os.hpp
>>> line
>>> 720: // On
>>> Windows this will create an
>>> actual
>>> minidump, on
>>> Linux/Solaris it will simply
>>> check core
>>> dump limits
>>> line
>>> 721: static
>>> void check_dump_limit(char*
>>> buffer,
>>> size_t bufferSize);
>>> The comment on
>>> line 720 no longer matches
>>> with the
>>> function
>>> rename from
>>> check_or_create_dump() to
>>> check_dump_limit().
>>>
>>> You have this
>>> comment in vmError.cpp:
>>> 943 // check
>>> core dump limits on
>>> Linux/Solaris, nothing on
>>> Windows
>>> 944
>>> os::check_dump_limit(buffer,
>>> sizeof(buffer));
>>>
>>> so the two
>>> comments are out of sync.
>>>
>>> Will convert
>>> them to be
>>> consistent.
>>>
>>> src/share/vm/runtime/arguments.cpp
>>> line
>>> 3252: } else if
>>> (match_option(option,
>>> "-XX:+CreateMinidumpOnCrash", &tail)) {
>>> line
>>> 3256: } else if
>>> (match_option(option,
>>> "-XX:-CreateMinidumpOnCrash", &tail)) {
>>> These two checks
>>> should use the
>>> match_option() version
>>> without a '&tail'
>>> parameter.
>>>
>>> Will use version
>>> w/o tail.
>>>
>>> src/share/vm/utilities/vmError.cpp
>>> line
>>> 217: bool
>>> VMError::coredump_status =
>>> true; //
>>> presume we
>>> can dump core file first
>>> I don't think you
>>> should init this to true.
>>> It confuses
>>> things with
>>> VMError::report_coredump_status(). It will
>>> also
>>> confuse this code:
>>>
>>> 526 STEP(63,
>>> "(printing core file
>>> information)")
>>> 529 if
>>> (coredump_status) {
>>> 530
>>> st->print("Core dump
>>> written. Default
>>> location: %s",
>>> coredump_message);
>>>
>>> because
>>> coredump_status won't accurately
>>> reflect whether
>>> the
>>> coredump_message field has
>>> been set.
>>>
>>> line
>>> 943: // check
>>> core dump limits on
>>> Linux/Solaris, nothing on
>>> Windows
>>> This updated
>>> comment doesn't match the
>>> unmodified
>>> comment in os.hpp:
>>>
>>> 720: // On
>>> Windows this will create an
>>> actual
>>> minidump, on
>>> Linux/Solaris it will simply
>>> check core
>>> dump limits
>>>
>>> I will consider
>>> your comments,
>>> then revise with a new
>>> version. I
>>> remember here there
>>> is a case, if status not
>>> set, it will
>>> output an
>>> inconsistent message. Will check
>>> again.
>>>
>>> src/os/aix/vm/os_aix.cpp
>>> No comments.
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>> No comments.
>>>
>>> src/os/linux/vm/os_linux.cpp
>>> No comments.
>>>
>>> src/os/posix/vm/os_posix.cpp
>>> No comments.
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>> No comments.
>>>
>>> src/os/windows/vm/os_windows.cpp
>>> line
>>> 1008: static char
>>> buffer[O_BUFLEN];
>>> Why switch from a
>>> buffer parameter to a
>>> static buffer?
>>> What happens with
>>> parallel calls to
>>> abort()?
>>> Will the static
>>> buffer get stomped
>>> by the competing threads?
>>>
>>> The original
>>> buffer is static
>>> too. Carrying an buffer
>>> for abort seems
>>> not right.
>>> This buffer in fact only for
>>> storing file
>>> name (based on
>>> pid) here.
>>> abort() will
>>> dump the core
>>> file, should we prevent
>>> multi-thread
>>> calling to this
>>> function? I did not check
>>> this part, will
>>> check if it is
>>> MT-safe.
>>>
>>>
>>> line
>>> 1015: // If
>>> running on a client version
>>> of Windows
>>> and user has
>>> not explicitly enabled dumping
>>> line
>>> 1016: if
>>> (!os::win32::is_windows_server() &&
>>> !CreateCoredumpOnCrash) {
>>> The default for
>>> CreateCoredumpOnCrash is
>>> now 'true'
>>> so the
>>> comment and logic
>>> are no longer correct
>>> here. The
>>> Windows
>>> Client VM will
>>> generate minidumps by default.
>>>
>>> old line
>>> 1007:
>>> VMError::report_coredump_status("Minidumps
>>> are not
>>> enabled by
>>> default on
>>> client versions of Windows",
>>> false);
>>> new line
>>> 1017:
>>> jio_fprintf(stderr, "Minidumps
>>> are not
>>> enabled by default
>>> on client versions of
>>> Windows.\n");
>>> There are a number
>>> of places where we
>>> change from
>>> VMError::report_coredump_status() to
>>> jio_fprintf()
>>> and I'm not quite
>>> seeing why this was done.
>>>
>>> Update:
>>> VMError::report_coredump_status()
>>> is misnamed.
>>> It doesn't
>>> report anything in
>>> the sense that it
>>> doesn't
>>> print anything. It
>>> does record two
>>> pieces of information about
>>> core dumps
>>> so maybe
>>> it should be
>>> VMError::record_coredump_status().
>>>
>>> line 1100:
>>> jio_fprintf(stderr, "%s.\n",
>>> buffer);
>>> At this point,
>>> buffer contains the path to the
>>> mini-dump file so
>>> the above line simply prints
>>> that on stderr. Why?
>>>
>>> Yes, I see that
>>> the old code did something
>>> similar:
>>>
>>> 1090
>>> VMError::report_coredump_status(buffer,
>>> true);
>>>
>>> but
>>> report_coredump_status() didn't
>>> actually print
>>> anything. It just
>>> squirreled away these in
>>> vmError.cpp:
>>>
>>> 217 bool
>>> VMError::coredump_status;
>>> 218 char
>>> VMError::coredump_message[O_BUFLEN];
>>>
>>> See comments above
>>> about how
>>> report_coredump_status()
>>> is misnamed.
>>>
>>> At this
>>> point, I'm
>>> convinced that all the
>>> changes from
>>> VMError::report_coredump_status() to
>>> jio_fprintf() are
>>> a bad idea.
>>>
>>>
>>> Changing to
>>> jio_fprintf is because
>>> report_coredump_status did not
>>> output anything, it is
>>> just a logging
>>> as you
>>> indicated. It is reasonable we
>>> change it to
>>> record_coredump_status
>>> instead. How about
>>> add printout to
>>> report_coredump_status? So I
>>> do not
>>> need to use
>>> jio_printf here.
>>>
>>> Other
>>> consideration of using
>>> jio_fprintf after call
>>> shutdown called,
>>> the static
>>> output stream still exists,
>>> but not
>>> guaranteed to work as
>>> expected.
>>>
>>>
>>> Thanks
>>> Yumin
>>>
>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>> No comments.
>>>
>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>> No comments.
>>>
>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>> No comments.
>>>
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>> No comments.
>>>
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>> No comments.
>>>
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>> No comments.
>>>
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>> No comments.
>>>
>>> test/runtime/Unsafe/RangeCheck.java
>>> No comments.
>>>
>>> test/runtime/memory/ReadFromNoaccessArea.java
>>> No comments.
>>>
>>>
>>>
>>> Thanks
>>> Yumin
>>>
>>>
>>> On
>>> 3/23/2015 11:48 AM,
>>> Yumin Qi wrote:
>>>
>>> Since Thomas is
>>> not a reviewer in open jdk,
>>> I need two
>>> volunteers (at least one
>>> "R"eviewer).
>>>
>>> Dan, since you
>>> already reviewed previous
>>> version, could you
>>> have a look?
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 3/20/2015 3:20
>>> PM, Yumin Qi wrote:
>>>
>>> Thomas,
>>>
>>> Thanks. Yes,
>>> it is webrev04. Also, I
>>> have updated
>>> webrev04 to add another
>>> test case change:
>>> test/runtime/memory/ReadFromNoaccessArea.java
>>> (Thanks Dan's
>>> update)
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 3/20/2015
>>> 11:55 AM, Thomas Stüfe wrote:
>>>
>>>
>>> Hi Yumin,
>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list