RFR(S): 8074354: tests that intentionally crash the VM can timeout creating a core file
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed Apr 15 18:18:07 UTC 2015
Yumin,
I think Thomas made enough contributions during the review
cycle to count him as a (r)eviewer.
However, it does not look like Thomas has an OpenJDK user
name so it's more difficult.What I do in cases like that is
add something to this line:
Summary: <my usual summary>; Also reviewed by thomas.stuefe at gmail.com.
You may also choose to add Thomas to the contributed by line which
would look like this:
Contributed-by: yumin.qi at oracle.com, thomas.stuefe at gmail.com
Dan
On 4/15/15 12:08 PM, Yumin Qi wrote:
> On 4/15/2015 11:05 AM, Yumin Qi wrote:
>> I need one more reviewer for push --- Thomas Stufe is not a reviewer.
>>
> In fact I am confused here --- Thomas is not a reviewer and not a
> commit-er either, so could I count him in review list.
>
>> Thanks
>> Yumin
>>
>> On 4/14/2015 12:40 PM, Daniel D. Daugherty wrote:
>>> On 4/14/15 8:42 AM, Yumin Qi wrote:
>>>> Dan,
>>>>
>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>>
>>> Thumbs up on this round.
>>>
>>>
>>> src/share/vm/runtime/globals.hpp
>>> No comments.
>>>
>>> src/share/vm/runtime/os.hpp
>>> No comments.
>>>
>>> src/share/vm/runtime/arguments.cpp
>>> No comments.
>>>
>>> src/share/vm/utilities/vmError.hpp
>>> No comments.
>>>
>>> src/share/vm/utilities/vmError.cpp
>>> No comments.
>>>
>>> src/os/aix/vm/os_aix.cpp
>>> src/os/bsd/vm/os_bsd.cpp
>>> src/os/linux/vm/os_linux.cpp
>>> src/os/posix/vm/os_posix.cpp
>>> src/os/solaris/vm/os_solaris.cpp
>>> No comments on the above five files.
>>>
>>> src/os/windows/vm/os_windows.cpp
>>> No comments.
>>>
>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>> No comments on the above two files.
>>>
>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>> No comments.
>>>
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>> test/runtime/Unsafe/RangeCheck.java
>>> test/runtime/memory/ReadFromNoaccessArea.java
>>> No comments on the above six files.
>>>
>>>
>>> Dan
>>>
>>>
>>>> Sorry missing the comment change. Now it is done.
>>>>
>>>> Thanks
>>>> Yumin
>>>>
>>>> On 4/14/2015 7:14 AM, Yumin Qi wrote:
>>>>> Dan,
>>>>>
>>>>>
>>>>> I missed your comments, so will change again. Sorry for that.
>>>>> Will update soon.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>> On 4/13/2015 10:13 PM, Yumin Qi wrote:
>>>>>> Dan, Thanks.
>>>>>>
>>>>>> Changed to add CloseHandle(dumpFile) before all exit paths.
>>>>>> Please review at same link:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>>>>>
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>> On 4/13/2015 7:04 PM, Daniel D. Daugherty wrote:
>>>>>>> Yumin,
>>>>>>>
>>>>>>> I would prefer that the handle is properly closed on all the
>>>>>>> exit paths. Not closing the handle might even trigger a future
>>>>>>> Microsoft sanity check; kind of like when Microsoft added checks
>>>>>>> for duplicate close() calls...
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 4/13/15 4:05 PM, Yumin Qi wrote:
>>>>>>>> Dan,
>>>>>>>>
>>>>>>>> On 4/13/2015 2:09 PM, Daniel D. Daugherty wrote:
>>>>>>>>> On 4/13/15 10:58 AM, Yumin Qi wrote:
>>>>>>>>>> Thomas,
>>>>>>>>>>
>>>>>>>>>> Answers embeded below. The new web is just the same link
>>>>>>>>>> updated (08):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>>>>>>>>
>>>>>>>>> src/share/vm/runtime/globals.hpp
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> src/share/vm/runtime/os.hpp
>>>>>>>>> L720: // ON Posix compatible OS it will
>>>>>>>>> typo: 'ON' -> 'On'
>>>>>>>>>
>>>>>>>>> L724: ... or a short error message, depends
>>>>>>>>> grammar: 'depends' -> 'depending'
>>>>>>>>>
>>>>>>>>> L725: // on checking result.
>>>>>>>>> grammar: 'on checking' -> 'on the checking'
>>>>>>>>>
>>>>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> src/share/vm/utilities/vmError.hpp
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> src/share/vm/utilities/vmError.cpp
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>>>>> src/os/bsd/vm/os_bsd.cpp
>>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>>>>> No comments on the above five files.
>>>>>>>>>
>>>>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>>>>> L1010: jio_snprintf(buffer, buffsz, "Failed to create
>>>>>>>>> coredump file (0x%x).", GetLastError());
>>>>>>>>> Because this is a Win* specific file you can use
>>>>>>>>> "minidump"
>>>>>>>>> here instead of "coredump".
>>>>>>>>>
>>>>>>>>> I made a similar comment in the previous round.
>>>>>>>>>
>>>>>>>> Yes, I will change 'coredump' into 'minidump'
>>>>>>>>> L1028: if (!dump_core || !dumpFile) {
>>>>>>>>> The "!dumpFile" part is an implied boolean which we don't
>>>>>>>>> like to use in HotSpot. Perhaps: dumpFile == NULL
>>>>>>>>>
>>>>>>>> Will change to Hotspot style.
>>>>>>>>> L1078: win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>>>>>>>> The "CloseHandle(dumpFile)" call is missing.
>>>>>>>>>
>>>>>>>> There is no need to call CloseHandle here explicitly, since we
>>>>>>>> will exit next and all resources will be reclaimed by OS. I
>>>>>>>> remove it at last since think it is not needed here. If you
>>>>>>>> check previous calls to
>>>>>>>> win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>>>>>>> I did not place CloseHandle(dumpFile) either.
>>>>>>>> If needed, I think we should put the call before all exit entry.
>>>>>>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>>>>>> No comments on the above two files.
>>>>>>>>>
>>>>>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>>>>>> test/runtime/Unsafe/RangeCheck.java
>>>>>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>>>>>> No comments on the above six files.
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
>>>>>>>>>>
>>>>>>>>>> On 4/13/2015 12:30 AM, Thomas Stüfe wrote:
>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>
>>>>>>>>>>> some small things:
>>>>>>>>>>>
>>>>>>>>>>> os_windows.cpp:1012
>>>>>>>>>>>
>>>>>>>>>>> + if (dumpFile == INVALID_HANDLE_VALUE) {
>>>>>>>>>>> + jio_snprintf(buffer, buffsz, "Failed to create
>>>>>>>>>>> coredump file");
>>>>>>>>>>> + status = false;
>>>>>>>>>>> + }
>>>>>>>>>>>
>>>>>>>>>>> Please print out the error number too:
>>>>>>>>>>> + jio_snprintf(buffer, buffsz, "Failed to create
>>>>>>>>>>> coredump file (0x%x).", ::GetLastError());
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Changed as above. No :: global scope operator needed here.
>>>>>>>>>>>
>>>>>>>>>>> os_windows.cpp:1036
>>>>>>>>>>>
>>>>>>>>>>> + assert(dumpFile != NULL, "dumpFile should not be NULL");
>>>>>>>>>>>
>>>>>>>>>>> Please remove this assert. I would not do any asserts at
>>>>>>>>>>> this stage, we are dying anyway and a recursive assert would
>>>>>>>>>>> just confuse things. Moreover, CreateFile may have failed
>>>>>>>>>>> for "normal" reasons.
>>>>>>>>>>>
>>>>>>>>>> removed.
>>>>>>>>>>> Instead, if dumpFile is NULL, just skip the whole minidump
>>>>>>>>>>> stuff in os::abort; just behave the same way as for
>>>>>>>>>>> os::abort(false):
>>>>>>>>>>>
>>>>>>>>>>> + if (!dump_core || !dumpFile) {
>>>>>>>>>>> + win32::exit_process_or_thread(win32::EPT_PROCESS, 1);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> changed as expected.
>>>>>>>>>>>
>>>>>>>>>>> os_windows.cpp: 1076
>>>>>>>>>>>
>>>>>>>>>>> I would just skip the whole "FormatMessage" stuff. It is
>>>>>>>>>>> just sugar, but it will allocate buffers, which we may not
>>>>>>>>>>> want at this stage. I think the plain error code from
>>>>>>>>>>> GetLastError() is sufficient here. So I would just:
>>>>>>>>>>>
>>>>>>>>>>> 1076 // Older versions of dbghelp.dll (the one shipped
>>>>>>>>>>> with Win2003 for example) may not support all
>>>>>>>>>>> 1077 // the dump types we really want. If first call
>>>>>>>>>>> fails, lets fall back to just use MiniDumpWithFullMemory then.
>>>>>>>>>>> 1078 if (_MiniDumpWriteDump(hProcess, processId, dumpFile,
>>>>>>>>>>> dumpType, pmei, NULL, NULL) == false &&
>>>>>>>>>>> 1079 _MiniDumpWriteDump(hProcess, processId, dumpFile,
>>>>>>>>>>> (MINIDUMP_TYPE)MiniDumpWithFullMemory, pmei, NULL, NULL) ==
>>>>>>>>>>> false) {
>>>>>>>>>>> jio_fprintf(stderr, "Call to MiniDumpWriteDump()
>>>>>>>>>>> failed (Error 0x%x)\n", ::GetLastError());
>>>>>>>>>>> 1089 } else {
>>>>>>>>>>> 1090 ....
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Agree, this is the last step before exit, with output of last
>>>>>>>>>> error code, user can get what happened --- though a little
>>>>>>>>>> effort to research.
>>>>>>>>>>
>>>>>>>>>>> Rest is fine.
>>>>>>>>>>> Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 13, 2015 at 7:59 AM, Yumin Qi
>>>>>>>>>>> <yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thomas and Dan,
>>>>>>>>>>>
>>>>>>>>>>> Please review the changes at
>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev08/
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev08/>
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>> On 4/10/2015 10:14 PM, Yumin Qi wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thomas,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for review again.
>>>>>>>>>>>
>>>>>>>>>>> On 4/10/2015 4:20 AM, Thomas Stüfe wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>
>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>
>>>>>>>>>>> this looks mostly fine for me. I only was able
>>>>>>>>>>> to test
>>>>>>>>>>> it on Linux, but will test it on other OSes next
>>>>>>>>>>> week.
>>>>>>>>>>>
>>>>>>>>>>> Here two remaining small points:
>>>>>>>>>>>
>>>>>>>>>>> - in os::check_dump_limit, maybe a comment would be
>>>>>>>>>>> helpful, because the contract for this function
>>>>>>>>>>> is a
>>>>>>>>>>> bit difficult to understand. Proposal:
>>>>>>>>>>>
>>>>>>>>>>> "Check or prepare a core dump to be taken at a
>>>>>>>>>>> later
>>>>>>>>>>> point in the same thread in os::abort(). Use the
>>>>>>>>>>> caller provided buffer as a scratch buffer. Output
>>>>>>>>>>> status message via
>>>>>>>>>>> VMError::report_cordedump_status()
>>>>>>>>>>> (on success, the core dump file location, on
>>>>>>>>>>> error, a
>>>>>>>>>>> short error message) - this status message which
>>>>>>>>>>> will
>>>>>>>>>>> be written into the error log."
>>>>>>>>>>>
>>>>>>>>>>> I will take the comments, new comments:
>>>>>>>>>>>
>>>>>>>>>>> // ON Posix compatible OS it will simply check
>>>>>>>>>>> core dump
>>>>>>>>>>> limits while on Windows
>>>>>>>>>>> // it will check if dump file can be created.
>>>>>>>>>>> Check or
>>>>>>>>>>> prepare a core dump to be
>>>>>>>>>>> // taken at a later point in the same thread in
>>>>>>>>>>> os::abort(). Use the caller
>>>>>>>>>>> // provided buffer as a scratch buffer. The status
>>>>>>>>>>> message which will be written
>>>>>>>>>>> // into the error log either is file location or a
>>>>>>>>>>> short
>>>>>>>>>>> error message, depends
>>>>>>>>>>> // on checking result.
>>>>>>>>>>> static void check_dump_limit(char* buffer, size_t
>>>>>>>>>>> bufferSize);
>>>>>>>>>>>
>>>>>>>>>>> - It would make sense to improve the output in
>>>>>>>>>>> VMError.report() in STEP(63) a bit:
>>>>>>>>>>>
>>>>>>>>>>> Problem 1: we use past-tense, which is misleading
>>>>>>>>>>> ("Core dump written", "Failed to write core").
>>>>>>>>>>> We did
>>>>>>>>>>> not generate the core dump yet, we just plan to
>>>>>>>>>>> do it,
>>>>>>>>>>> and it still may fail for various reasons (e.g.
>>>>>>>>>>> even
>>>>>>>>>>> if the limits are fine, the current directory
>>>>>>>>>>> may be
>>>>>>>>>>> write protected. There is no way to catch all
>>>>>>>>>>> errors
>>>>>>>>>>> beforehand). If the user reads "Core dump
>>>>>>>>>>> written", he
>>>>>>>>>>> assumes the core dump was written successfully. If
>>>>>>>>>>> abort(3) later fails to write the core dump,
>>>>>>>>>>> user will
>>>>>>>>>>> be confused.
>>>>>>>>>>>
>>>>>>>>>>> Will change the message to
>>>>>>>>>>> if (coredump_status) {
>>>>>>>>>>> st->print("Core dump will be written. %s",
>>>>>>>>>>> coredump_message);
>>>>>>>>>>> } else {
>>>>>>>>>>> st->print("No core dump will be written. %s",
>>>>>>>>>>> coredump_message);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> as your suggested.
>>>>>>>>>>> Whether the coredump can be successfully created
>>>>>>>>>>> depends
>>>>>>>>>>> on following core file generation procedure. User can
>>>>>>>>>>> trace error messages if failed to create.
>>>>>>>>>>>
>>>>>>>>>>> Problem 2: this is more obscure: on Linux core
>>>>>>>>>>> dumps
>>>>>>>>>>> may not be written to the file system at all but be
>>>>>>>>>>> processed by some other program. On my Ubuntu 14.4
>>>>>>>>>>> machine I see this output:
>>>>>>>>>>>
>>>>>>>>>>> #
>>>>>>>>>>> # Core dump written. Default location: Core
>>>>>>>>>>> dumps may
>>>>>>>>>>> be processed with "/usr/share/apport/apport %p
>>>>>>>>>>> %s %c
>>>>>>>>>>> %P" (or dumping to
>>>>>>>>>>> /shared/projects/openjdk/jdk9-hs-rt/output/core.26261)
>>>>>>>>>>> #
>>>>>>>>>>>
>>>>>>>>>>> This output is a bit jarring.
>>>>>>>>>>>
>>>>>>>>>>> I have not tested on Ubuntu so no comments.
>>>>>>>>>>>
>>>>>>>>>>> I will post a new webrev based on your and Dan's
>>>>>>>>>>> comments.
>>>>>>>>>>> Thanks for your patience.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>> My suggestion for better texts would be:
>>>>>>>>>>>
>>>>>>>>>>> "Core dump will be written. %s"
>>>>>>>>>>> "No core dump will be written. %s"
>>>>>>>>>>>
>>>>>>>>>>> which hopefully fits all cases better. But I don't
>>>>>>>>>>> know, other suggestions are welcome :-)
>>>>>>>>>>>
>>>>>>>>>>> Otherwise, this looks fine.
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your work!
>>>>>>>>>>>
>>>>>>>>>>> Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 10, 2015 at 5:18 AM, Yumin Qi
>>>>>>>>>>> <yumin.qi at oracle.com <mailto:yumin.qi at oracle.com>
>>>>>>>>>>> <mailto:yumin.qi at oracle.com
>>>>>>>>>>> <mailto:yumin.qi at oracle.com>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, Thomas and Dan
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>>>>>>
>>>>>>>>>>> On 4/9/2015 7:51 PM, Yumin Qi wrote:
>>>>>>>>>>>
>>>>>>>>>>> Please hold on, I missed the test cases:
>>>>>>>>>>>
>>>>>>>>>>> *test/runtime/Unsafe/RangeCheck.java
>>>>>>>>>>> *
>>>>>>>>>>> *test/runtime/memory/ReadFromNoaccessArea.java
>>>>>>>>>>>
>>>>>>>>>>> Added.
>>>>>>>>>>>
>>>>>>>>>>> Also some file heads need to be updated
>>>>>>>>>>> with
>>>>>>>>>>> this year.
>>>>>>>>>>>
>>>>>>>>>>> Changed
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *
>>>>>>>>>>> On 4/9/2015 7:07 PM, Yumin Qi wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, Thomas and Dan
>>>>>>>>>>>
>>>>>>>>>>> The same URL is updated with new
>>>>>>>>>>> version.
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>>>>>>
>>>>>>>>>>> I remove the public access to
>>>>>>>>>>> VMError::out,
>>>>>>>>>>> VMError::coredump_message and the size of
>>>>>>>>>>> VMError::coredump_message.
>>>>>>>>>>> Added private static HANDLE in
>>>>>>>>>>> os_windows.cpp for
>>>>>>>>>>> dumpFile as suggested by Thomas.
>>>>>>>>>>>
>>>>>>>>>>> Tests passes JPRT and manual tests.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 4/9/2015 1:32 AM, Thomas Stüfe
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>
>>>>>>>>>>> I do not think exposing
>>>>>>>>>>> VMError::coredump_status
>>>>>>>>>>> buffer just for the sake of
>>>>>>>>>>> using it
>>>>>>>>>>> as a scratch
>>>>>>>>>>> buffer in os::abort() is a good
>>>>>>>>>>> idea.
>>>>>>>>>>> It is very
>>>>>>>>>>> confusing and has nothing to do
>>>>>>>>>>> with
>>>>>>>>>>> the purpose of
>>>>>>>>>>> VMerror::coredump_status.
>>>>>>>>>>>
>>>>>>>>>>> Also, VMerror::coredump_status is
>>>>>>>>>>> static, so you do
>>>>>>>>>>> not even gain anything over just
>>>>>>>>>>> using
>>>>>>>>>>> a global or
>>>>>>>>>>> function scope static buffer.
>>>>>>>>>>> Either
>>>>>>>>>>> we are worried
>>>>>>>>>>> about thread safety in
>>>>>>>>>>> os::abort(), or
>>>>>>>>>>> we aren't. If
>>>>>>>>>>> we aren't, just use a local static
>>>>>>>>>>> buffer, it is much
>>>>>>>>>>> clearer and no need to expose
>>>>>>>>>>> VMError::coredump_status
>>>>>>>>>>> like this.
>>>>>>>>>>>
>>>>>>>>>>> Apart from that, see my remarks in
>>>>>>>>>>> yesterdays mail.
>>>>>>>>>>>
>>>>>>>>>>> Cheers, Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 9, 2015 at 5:16 AM,
>>>>>>>>>>> Yumin Qi
>>>>>>>>>>> <yumin.qi at oracle.com
>>>>>>>>>>> <mailto:yumin.qi at oracle.com>
>>>>>>>>>>> <mailto:yumin.qi at oracle.com
>>>>>>>>>>> <mailto:yumin.qi at oracle.com>>
>>>>>>>>>>> <mailto:yumin.qi at oracle.com
>>>>>>>>>>> <mailto:yumin.qi at oracle.com>
>>>>>>>>>>> <mailto:yumin.qi at oracle.com
>>>>>>>>>>> <mailto:yumin.qi at oracle.com>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, Dan
>>>>>>>>>>>
>>>>>>>>>>> New webrev at
>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev07/
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev07/>
>>>>>>>>>>>
>>>>>>>>>>> To make code clean, add
>>>>>>>>>>> os::abort_internal which
>>>>>>>>>>> is only used
>>>>>>>>>>> for Windows.
>>>>>>>>>>> Since passing
>>>>>>>>>>> VMError::coredump_message and its
>>>>>>>>>>> size as
>>>>>>>>>>> parameters across multiple
>>>>>>>>>>> functions makes
>>>>>>>>>>> function have a bigger
>>>>>>>>>>> signature, I change the way to
>>>>>>>>>>> access them -- add
>>>>>>>>>>> two functions in
>>>>>>>>>>> VMError to access them:
>>>>>>>>>>> coredump_msg_buffer() and
>>>>>>>>>>> coredump_msg_buffer_size(). They
>>>>>>>>>>> are static
>>>>>>>>>>> members, so can save
>>>>>>>>>>> stack space for passing around.
>>>>>>>>>>>
>>>>>>>>>>> in os_windows.cpp,
>>>>>>>>>>> abort(bool,
>>>>>>>>>>> void*, void*):
>>>>>>>>>>> if error encountered, exit
>>>>>>>>>>> with
>>>>>>>>>>> error messages.
>>>>>>>>>>> If no error,
>>>>>>>>>>> VMError::coredump_message contains
>>>>>>>>>>> the dump
>>>>>>>>>>> path/name. I just use them
>>>>>>>>>>> so no
>>>>>>>>>>> need to have
>>>>>>>>>>> extra space for name
>>>>>>>>>>> allocation.
>>>>>>>>>>>
>>>>>>>>>>> Passed JPRT and manual tests.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 4/7/2015 9:23 AM, Daniel D.
>>>>>>>>>>> Daugherty wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 4/1/15 10:57 AM,
>>>>>>>>>>> Yumin Qi
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, Dan and Thomas,
>>>>>>>>>>>
>>>>>>>>>>> I have a new
>>>>>>>>>>> webrev at
>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev06/
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev06/>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/runtime/globals.hpp
>>>>>>>>>>> Sorry, I didn't notice
>>>>>>>>>>> this before:
>>>>>>>>>>>
>>>>>>>>>>> L939: product(bool,
>>>>>>>>>>> CreateCoredumpOnCrash,
>>>>>>>>>>> true, \
>>>>>>>>>>> L940: "Create
>>>>>>>>>>> minidump on
>>>>>>>>>>> VM fatal
>>>>>>>>>>> error") \
>>>>>>>>>>>
>>>>>>>>>>> The "minidump" on L940
>>>>>>>>>>> should change
>>>>>>>>>>> also. Don't know if you
>>>>>>>>>>> want to say "core/mini
>>>>>>>>>>> dump" or
>>>>>>>>>>> something else.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/runtime/os.hpp
>>>>>>>>>>> L719: // On
>>>>>>>>>>> Linux/Solaris it will simply
>>>>>>>>>>> check core dump
>>>>>>>>>>> limits while on Windows
>>>>>>>>>>> will
>>>>>>>>>>> do nothing.
>>>>>>>>>>> grammar: 'on Windows
>>>>>>>>>>> will do nothing'
>>>>>>>>>>> -> 'on Windows
>>>>>>>>>>> it will do nothing'
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/utilities/vmError.hpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/utilities/vmError.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/bsd/vm/os_bsd.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>>>>>>> L92:
>>>>>>>>>>> jio_snprintf(buffer,
>>>>>>>>>>> bufferSize,
>>>>>>>>>>> "%s\\hs_err_pid%u.mdmp",
>>>>>>>>>>> get_current_directory(NULL, 0),
>>>>>>>>>>> current_process_id());
>>>>>>>>>>> Other non-Windows calls to
>>>>>>>>>>> get_current_directory() do a NULL
>>>>>>>>>>> check before using the
>>>>>>>>>>> return value.
>>>>>>>>>>> The original Windows
>>>>>>>>>>> code that called
>>>>>>>>>>> get_current_directory() did not
>>>>>>>>>>> make this
>>>>>>>>>>> check, but I think
>>>>>>>>>>> it's better to be
>>>>>>>>>>> consistent and safe.
>>>>>>>>>>>
>>>>>>>>>>> L1007: static char
>>>>>>>>>>> buffer[O_BUFLEN];
>>>>>>>>>>> L1008: char
>>>>>>>>>>> filename[FILENAME_MAX];
>>>>>>>>>>> Two more potentially
>>>>>>>>>>> large stack
>>>>>>>>>>> variables here. Will we
>>>>>>>>>>> run into stack
>>>>>>>>>>> overflow on Win* more
>>>>>>>>>>> frequently when what
>>>>>>>>>>> we really wanted
>>>>>>>>>>> was
>>>>>>>>>>> the reason for
>>>>>>>>>>> the abort() call?
>>>>>>>>>>>
>>>>>>>>>>> The reason the
>>>>>>>>>>> original code used
>>>>>>>>>>> 'buffer' to construct the
>>>>>>>>>>> dump file name was to
>>>>>>>>>>> save space. The
>>>>>>>>>>> original code was also
>>>>>>>>>>> careful to not need
>>>>>>>>>>> the file name in
>>>>>>>>>>> any error messages after
>>>>>>>>>>> the point at which the
>>>>>>>>>>> dump file was
>>>>>>>>>>> successfully created.
>>>>>>>>>>>
>>>>>>>>>>> It would be better
>>>>>>>>>>> (more resilient) if
>>>>>>>>>>> you could pass in
>>>>>>>>>>> an already existing
>>>>>>>>>>> buffer to abort()
>>>>>>>>>>> like the original
>>>>>>>>>>> code passed into
>>>>>>>>>>> os::check_or_create_dump(). Yes, that
>>>>>>>>>>> would require visiting
>>>>>>>>>>> all existing
>>>>>>>>>>> calls to abort().
>>>>>>>>>>>
>>>>>>>>>>> L1015:
>>>>>>>>>>> assert(out->is_open(),
>>>>>>>>>>> "defaultStream should be
>>>>>>>>>>> open");
>>>>>>>>>>> Is it possible for
>>>>>>>>>>> os::abort() to be
>>>>>>>>>>> called early enough on
>>>>>>>>>>> Windows such that
>>>>>>>>>>> VMError::out is not
>>>>>>>>>>> actually setup? Yes,
>>>>>>>>>>> this would have to be
>>>>>>>>>>> absolutely
>>>>>>>>>>> catastrophic...
>>>>>>>>>>>
>>>>>>>>>>> See the comment in
>>>>>>>>>>> os_solaris.cpp
>>>>>>>>>>> L1520-1522.
>>>>>>>>>>>
>>>>>>>>>>> L1018:
>>>>>>>>>>> jio_snprintf(buffer,
>>>>>>>>>>> sizeof(buffer), "%s", "Either
>>>>>>>>>>> CreateCoredumpOnCrash is
>>>>>>>>>>> disabled from command"
>>>>>>>>>>> L1019 " line or
>>>>>>>>>>> coredump
>>>>>>>>>>> is not available");
>>>>>>>>>>> When you are using
>>>>>>>>>>> jio_snprintf() as a
>>>>>>>>>>> replacement for:
>>>>>>>>>>>
>>>>>>>>>>> strncpy(buffer, "my string",
>>>>>>>>>>> sizeof(buffer));
>>>>>>>>>>> buffer[sizeof(buffer) - 1] = '\0';
>>>>>>>>>>>
>>>>>>>>>>> you don't need the
>>>>>>>>>>> "%s" format string.
>>>>>>>>>>> So this code would be:
>>>>>>>>>>>
>>>>>>>>>>> jio_snprintf(buffer, sizeof(buffer),
>>>>>>>>>>> "Either
>>>>>>>>>>> CreateCoredumpOnCrash is disabled
>>>>>>>>>>> from "
>>>>>>>>>>> "command line or coredump
>>>>>>>>>>> is not available");
>>>>>>>>>>>
>>>>>>>>>>> Same comment for:
>>>>>>>>>>> L1026, L1039
>>>>>>>>>>>
>>>>>>>>>>> L1020: goto done;
>>>>>>>>>>> The use of 'goto'
>>>>>>>>>>> makes me cringe. You
>>>>>>>>>>> can refactor much
>>>>>>>>>>> of this logic
>>>>>>>>>>> into an
>>>>>>>>>>> abort_internal()
>>>>>>>>>>> function that
>>>>>>>>>>> abort() calls and keep
>>>>>>>>>>> all the
>>>>>>>>>>> original "return" logic
>>>>>>>>>>> to bail out.
>>>>>>>>>>>
>>>>>>>>>>> L1045: // Older
>>>>>>>>>>> versions
>>>>>>>>>>> of dbghelp.h
>>>>>>>>>>> doesn't contain all
>>>>>>>>>>> grammar: 'doesn't' ->
>>>>>>>>>>> 'do not'
>>>>>>>>>>>
>>>>>>>>>>> L1052: cwd =
>>>>>>>>>>> get_current_directory(NULL, 0);
>>>>>>>>>>> L1053:
>>>>>>>>>>> jio_snprintf(filename,
>>>>>>>>>>> sizeof(filename),
>>>>>>>>>>> "%s\\hs_err_pid%u.mdmp", cwd,
>>>>>>>>>>> current_process_id());
>>>>>>>>>>> Other non-Windows calls to
>>>>>>>>>>> get_current_directory() do a NULL
>>>>>>>>>>> check before using the
>>>>>>>>>>> return value.
>>>>>>>>>>> The original Windows
>>>>>>>>>>> code that called
>>>>>>>>>>> get_current_directory() did not
>>>>>>>>>>> make this
>>>>>>>>>>> check, but I think
>>>>>>>>>>> it's better to be
>>>>>>>>>>> consistent and safe.
>>>>>>>>>>>
>>>>>>>>>>> L1092: // well done.
>>>>>>>>>>> Like a overcooked
>>>>>>>>>>> steak? :-) Not sure
>>>>>>>>>>> what you're trying
>>>>>>>>>>> to say here...
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>>>>>>>> No comments on these
>>>>>>>>>>> four.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> What changes:
>>>>>>>>>>>
>>>>>>>>>>> 1) os_windows.cpp,
>>>>>>>>>>> os::abort(bool ....):
>>>>>>>>>>> a) Since when
>>>>>>>>>>> abort() is called, the
>>>>>>>>>>> last function called
>>>>>>>>>>> before exit process,
>>>>>>>>>>> VMError::log already closed, which
>>>>>>>>>>> is the file logging.
>>>>>>>>>>> defaultStream still
>>>>>>>>>>> available to use, so
>>>>>>>>>>> use it for writing
>>>>>>>>>>> error
>>>>>>>>>>> or success
>>>>>>>>>>> messages. VMError::out is
>>>>>>>>>>> the defaultStream, for
>>>>>>>>>>> using it, added a
>>>>>>>>>>> function out_stream()
>>>>>>>>>>> in VMError.
>>>>>>>>>>> b) Delete the codes
>>>>>>>>>>> which decide to
>>>>>>>>>>> write core based on
>>>>>>>>>>> client/server version with
>>>>>>>>>>> debug. The new
>>>>>>>>>>> code is if
>>>>>>>>>>> CreateCoredumpOnCrash set or dump_core is
>>>>>>>>>>> on, the coredump will
>>>>>>>>>>> be tried no matter
>>>>>>>>>>> which
>>>>>>>>>>> version it is.
>>>>>>>>>>> There is no special
>>>>>>>>>>> reason not to dump core
>>>>>>>>>>> with client
>>>>>>>>>>> version I think. Any
>>>>>>>>>>> concern
>>>>>>>>>>> for this change?
>>>>>>>>>>> c) Use of 'goto' in
>>>>>>>>>>> this function. I
>>>>>>>>>>> tried not to use it,
>>>>>>>>>>> but using it makes code
>>>>>>>>>>> clear. I remember
>>>>>>>>>>> somewhere suggest not
>>>>>>>>>>> using 'goto' in
>>>>>>>>>>> cplusplus.
>>>>>>>>>>>
>>>>>>>>>>> 2) changed
>>>>>>>>>>> comments to
>>>>>>>>>>> make them
>>>>>>>>>>> consistent as indicated
>>>>>>>>>>> by Dan.
>>>>>>>>>>>
>>>>>>>>>>> 3)
>>>>>>>>>>> Added/modified head
>>>>>>>>>>> version for files.
>>>>>>>>>>>
>>>>>>>>>>> Tests: JPRT, manual
>>>>>>>>>>> test on local Windows.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/31/2015 1:25 AM,
>>>>>>>>>>> Thomas Stüfe wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Daniel, Yumin,
>>>>>>>>>>>
>>>>>>>>>>> sorry, a lot of Yumins
>>>>>>>>>>> changes were
>>>>>>>>>>> based on suggestions
>>>>>>>>>>> from
>>>>>>>>>>> me, so here more
>>>>>>>>>>> background:
>>>>>>>>>>>
>>>>>>>>>>> Yumin reversed the
>>>>>>>>>>> order of error log
>>>>>>>>>>> writing and core dumping.
>>>>>>>>>>> The reason for that
>>>>>>>>>>> are explained in
>>>>>>>>>>> detail at the start
>>>>>>>>>>> of the
>>>>>>>>>>> mail thread, in short:
>>>>>>>>>>>
>>>>>>>>>>> - on Windows
>>>>>>>>>>> core file
>>>>>>>>>>> dumping may
>>>>>>>>>>> hang - its rare but it
>>>>>>>>>>> happens - preventing
>>>>>>>>>>> error logs.
>>>>>>>>>>> Depending on who you ask,
>>>>>>>>>>> error logs are more
>>>>>>>>>>> important than
>>>>>>>>>>> minidumps, so it makes sense
>>>>>>>>>>> to first write the
>>>>>>>>>>> erorr log.
>>>>>>>>>>> - This also brings
>>>>>>>>>>> windows core paths
>>>>>>>>>>> closer to UNIX code
>>>>>>>>>>> paths. See below.
>>>>>>>>>>>
>>>>>>>>>>> About writing to
>>>>>>>>>>> stderr in os::abort():
>>>>>>>>>>>
>>>>>>>>>>> After this change, calling
>>>>>>>>>>> VmError::report_coredump_status() in
>>>>>>>>>>> os::abort() will not
>>>>>>>>>>> do anything
>>>>>>>>>>> because the error log is
>>>>>>>>>>> already written at
>>>>>>>>>>> that point. I
>>>>>>>>>>> suggested to Yumin writing to
>>>>>>>>>>> stderr instead in
>>>>>>>>>>> os::abort() because
>>>>>>>>>>> that mimicks UNIX
>>>>>>>>>>> behaviour: On UNIX,
>>>>>>>>>>> you call abort(3),
>>>>>>>>>>> which writes a core and
>>>>>>>>>>> aborts. It writes a
>>>>>>>>>>> short message to
>>>>>>>>>>> stderr "Aborted (core
>>>>>>>>>>> dumped)" or just
>>>>>>>>>>> "Aborted".
>>>>>>>>>>>
>>>>>>>>>>> After Yumins change,
>>>>>>>>>>> on Windows
>>>>>>>>>>> os::abort also writes the
>>>>>>>>>>> core,
>>>>>>>>>>> then aborts. It made
>>>>>>>>>>> sense to me that
>>>>>>>>>>> this function should
>>>>>>>>>>> behave the same way as
>>>>>>>>>>> on UNIX: if
>>>>>>>>>>> core writing fails,
>>>>>>>>>>> write to
>>>>>>>>>>> stderr. There is no
>>>>>>>>>>> way otherwise to
>>>>>>>>>>> communicate a core dump
>>>>>>>>>>> writing failure. After
>>>>>>>>>>> writing the
>>>>>>>>>>> core, process stops.
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> About the control flows:
>>>>>>>>>>>
>>>>>>>>>>> Before Yumins change:
>>>>>>>>>>>
>>>>>>>>>>> On Windows
>>>>>>>>>>> 1)
>>>>>>>>>>> os::check_or_create_dump(): we
>>>>>>>>>>> wrote the core dump
>>>>>>>>>>> and then
>>>>>>>>>>> information about the
>>>>>>>>>>> dump (file
>>>>>>>>>>> location, success or error)
>>>>>>>>>>> was handed to the misnamed
>>>>>>>>>>> VmError::report_coredump_status(),
>>>>>>>>>>> which just stored that
>>>>>>>>>>> information.
>>>>>>>>>>> 2)
>>>>>>>>>>> VmError::report():
>>>>>>>>>>> Error log was
>>>>>>>>>>> written and in step 63
>>>>>>>>>>> information about the
>>>>>>>>>>> core (already
>>>>>>>>>>> written at that point) was
>>>>>>>>>>> added to error log.
>>>>>>>>>>> 3) os::abort() just
>>>>>>>>>>> did an exit(3)
>>>>>>>>>>>
>>>>>>>>>>> On Unix,
>>>>>>>>>>> 1)
>>>>>>>>>>> os::check_or_create_dump(): checks
>>>>>>>>>>> the limits to guess
>>>>>>>>>>> whether core dumping
>>>>>>>>>>> later will
>>>>>>>>>>> succeed. Again, that
>>>>>>>>>>> information is handed to
>>>>>>>>>>> VmError::report_coredump_status()
>>>>>>>>>>> 2)
>>>>>>>>>>> VmError::report():
>>>>>>>>>>> Error log is
>>>>>>>>>>> written and in step 63,
>>>>>>>>>>> information about the
>>>>>>>>>>> core's probable
>>>>>>>>>>> future location is
>>>>>>>>>>> added
>>>>>>>>>>> to error log. Here,
>>>>>>>>>>> messages use past
>>>>>>>>>>> tense, which is
>>>>>>>>>>> misleading, because
>>>>>>>>>>> the core is not
>>>>>>>>>>> yet written.
>>>>>>>>>>> 3) os::abort()
>>>>>>>>>>> calls
>>>>>>>>>>> abort(3), which
>>>>>>>>>>> stops and writes a
>>>>>>>>>>> core.
>>>>>>>>>>>
>>>>>>>>>>> Yumin pushed core file
>>>>>>>>>>> writing on
>>>>>>>>>>> Windows down to os::abort().
>>>>>>>>>>> This brings the
>>>>>>>>>>> control flow much
>>>>>>>>>>> closer to Unix:
>>>>>>>>>>>
>>>>>>>>>>> (1) call
>>>>>>>>>>> os::check_dump_limit() to
>>>>>>>>>>> check the
>>>>>>>>>>> prerequisites for
>>>>>>>>>>> core file writing and
>>>>>>>>>>> gather a bit
>>>>>>>>>>> information to put in the
>>>>>>>>>>> error log: On Unix,
>>>>>>>>>>> limit checks, on
>>>>>>>>>>> windows, so far nothing
>>>>>>>>>>> much but
>>>>>>>>>>> precalculating the error file
>>>>>>>>>>> name.
>>>>>>>>>>> (2) VmError::report():
>>>>>>>>>>> Error log is
>>>>>>>>>>> written and information
>>>>>>>>>>> about the core's
>>>>>>>>>>> probable location is
>>>>>>>>>>> added to error log.
>>>>>>>>>>> (3) os::abort() is
>>>>>>>>>>> called, which on
>>>>>>>>>>> all platforms:
>>>>>>>>>>> - dumps the core
>>>>>>>>>>> - if failing,
>>>>>>>>>>> writes
>>>>>>>>>>> a one-liner to
>>>>>>>>>>> stderr
>>>>>>>>>>> - stops the
>>>>>>>>>>> process.
>>>>>>>>>>>
>>>>>>>>>>> I think, if one
>>>>>>>>>>> agrees
>>>>>>>>>>> that reversing
>>>>>>>>>>> order of core
>>>>>>>>>>> dumping and
>>>>>>>>>>> error log writing on
>>>>>>>>>>> windows is a good
>>>>>>>>>>> thing, this change
>>>>>>>>>>> looks
>>>>>>>>>>> ok to me. Some
>>>>>>>>>>> improvements could make
>>>>>>>>>>> it clearer:
>>>>>>>>>>>
>>>>>>>>>>> - maybe rename
>>>>>>>>>>> "os::check_dump_limit()" (was my suggestion,
>>>>>>>>>>> sorry) to
>>>>>>>>>>> "os::check_core_prerequisites()" - that
>>>>>>>>>>> leaves room
>>>>>>>>>>> to flesh it out
>>>>>>>>>>> a bit
>>>>>>>>>>> more on Windows,
>>>>>>>>>>> where we do not have
>>>>>>>>>>> UNIX limits.
>>>>>>>>>>> - make the
>>>>>>>>>>> messages in
>>>>>>>>>>> VMError::report() future
>>>>>>>>>>> sense: "Will
>>>>>>>>>>> write core file to
>>>>>>>>>>> location.....".
>>>>>>>>>>> - Make the messages
>>>>>>>>>>> written in
>>>>>>>>>>> os::abort() on Windows
>>>>>>>>>>> clearer,
>>>>>>>>>>> and shorter, and we
>>>>>>>>>>> also do not need
>>>>>>>>>>> the buffer to be static
>>>>>>>>>>> here. Actually, if we
>>>>>>>>>>> want to be as on
>>>>>>>>>>> UNIX, just dumping
>>>>>>>>>>> "core
>>>>>>>>>>> file writing
>>>>>>>>>>> succeeded" or "failed" -
>>>>>>>>>>> maybe with an error
>>>>>>>>>>> number from
>>>>>>>>>>> GetLastError() - will be
>>>>>>>>>>> enough because the file
>>>>>>>>>>> location is already
>>>>>>>>>>> printed in the
>>>>>>>>>>> error log. Like on
>>>>>>>>>>> UNIX.
>>>>>>>>>>>
>>>>>>>>>>> Kind Regards, Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Mar 30,
>>>>>>>>>>> 2015
>>>>>>>>>>> at 9:51 PM,
>>>>>>>>>>> Daniel D. Daugherty
>>>>>>>>>>> <daniel.daugherty at oracle.com
>>>>>>>>>>> <mailto:daniel.daugherty at oracle.com>
>>>>>>>>>>> <mailto:daniel.daugherty at oracle.com
>>>>>>>>>>> <mailto:daniel.daugherty at oracle.com>>
>>>>>>>>>>> <mailto:daniel.daugherty at oracle.com
>>>>>>>>>>> <mailto:daniel.daugherty at oracle.com>
>>>>>>>>>>> <mailto:daniel.daugherty at oracle.com
>>>>>>>>>>> <mailto:daniel.daugherty at oracle.com>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Re:
>>>>>>>>>>> report_coredump_status() is
>>>>>>>>>>> really
>>>>>>>>>>> record_coredump_status()
>>>>>>>>>>>
>>>>>>>>>>> The
>>>>>>>>>>> report_coredump_status()
>>>>>>>>>>> function is designed to
>>>>>>>>>>> record two
>>>>>>>>>>> things about core
>>>>>>>>>>> dumps for later
>>>>>>>>>>> reporting by the code that
>>>>>>>>>>> generates the
>>>>>>>>>>> hs_err_pid file.
>>>>>>>>>>> That's why the original
>>>>>>>>>>> report_coredump_status() didn't
>>>>>>>>>>> output anything to
>>>>>>>>>>> stderr.
>>>>>>>>>>>
>>>>>>>>>>> By changing the
>>>>>>>>>>> Windows calls to
>>>>>>>>>>> report_coredump_status() into
>>>>>>>>>>> jio_fprintf() calls you have:
>>>>>>>>>>>
>>>>>>>>>>> - put output onto
>>>>>>>>>>> stderr that
>>>>>>>>>>> should go to the
>>>>>>>>>>> hs_err_pid file
>>>>>>>>>>> - made the windows
>>>>>>>>>>> code paths work
>>>>>>>>>>> differently than the
>>>>>>>>>>> non-windows
>>>>>>>>>>> code paths
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/30/15 1:28
>>>>>>>>>>> PM, Yumin Qi wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thanks Dan
>>>>>>>>>>>
>>>>>>>>>>> On 3/30/2015
>>>>>>>>>>> 11:25 AM, Daniel
>>>>>>>>>>> D. Daugherty wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 3/25/15
>>>>>>>>>>> 12:11 PM, Yumin
>>>>>>>>>>> Qi wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi, all
>>>>>>>>>>>
>>>>>>>>>>> I updated the
>>>>>>>>>>> webrev with a new
>>>>>>>>>>> change to
>>>>>>>>>>> test case:
>>>>>>>>>>> test/runtime/Unsafe/RangeCheck.java
>>>>>>>>>>>
>>>>>>>>>>> Add
>>>>>>>>>>> -XX:-CreateCoredumpOnCrash to test, no
>>>>>>>>>>> dump needed on this case.
>>>>>>>>>>>
>>>>>>>>>>> new webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074354/webrev05/
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/8074354/webrev05/>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> General nit - please
>>>>>>>>>>> update copyright
>>>>>>>>>>> lines to 2015
>>>>>>>>>>> as needed
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/runtime/globals.hpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/runtime/os.hpp
>>>>>>>>>>> line 720: // On
>>>>>>>>>>> Windows this will create an
>>>>>>>>>>> actual minidump, on
>>>>>>>>>>> Linux/Solaris it will simply
>>>>>>>>>>> check core dump limits
>>>>>>>>>>> line 721: static
>>>>>>>>>>> void
>>>>>>>>>>> check_dump_limit(char*
>>>>>>>>>>> buffer, size_t bufferSize);
>>>>>>>>>>> The comment on
>>>>>>>>>>> line 720 no longer
>>>>>>>>>>> matches
>>>>>>>>>>> with the
>>>>>>>>>>> function
>>>>>>>>>>> rename from
>>>>>>>>>>> check_or_create_dump() to
>>>>>>>>>>> check_dump_limit().
>>>>>>>>>>>
>>>>>>>>>>> You have this
>>>>>>>>>>> comment in vmError.cpp:
>>>>>>>>>>> 943 // check
>>>>>>>>>>> core dump limits on
>>>>>>>>>>> Linux/Solaris, nothing on
>>>>>>>>>>> Windows
>>>>>>>>>>> 944
>>>>>>>>>>> os::check_dump_limit(buffer,
>>>>>>>>>>> sizeof(buffer));
>>>>>>>>>>>
>>>>>>>>>>> so the two
>>>>>>>>>>> comments are out of sync.
>>>>>>>>>>>
>>>>>>>>>>> Will convert
>>>>>>>>>>> them to be
>>>>>>>>>>> consistent.
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/runtime/arguments.cpp
>>>>>>>>>>> line 3252: } else if
>>>>>>>>>>> (match_option(option,
>>>>>>>>>>> "-XX:+CreateMinidumpOnCrash", &tail)) {
>>>>>>>>>>> line 3256: } else if
>>>>>>>>>>> (match_option(option,
>>>>>>>>>>> "-XX:-CreateMinidumpOnCrash", &tail)) {
>>>>>>>>>>> These two checks
>>>>>>>>>>> should use the
>>>>>>>>>>> match_option() version
>>>>>>>>>>> without a '&tail'
>>>>>>>>>>> parameter.
>>>>>>>>>>>
>>>>>>>>>>> Will use
>>>>>>>>>>> version w/o tail.
>>>>>>>>>>>
>>>>>>>>>>> src/share/vm/utilities/vmError.cpp
>>>>>>>>>>> line 217: bool
>>>>>>>>>>> VMError::coredump_status =
>>>>>>>>>>> true; // presume we
>>>>>>>>>>> can dump core file
>>>>>>>>>>> first
>>>>>>>>>>> I don't think you
>>>>>>>>>>> should init this to
>>>>>>>>>>> true.
>>>>>>>>>>> It confuses
>>>>>>>>>>> things with
>>>>>>>>>>> VMError::report_coredump_status(). It will
>>>>>>>>>>> also
>>>>>>>>>>> confuse this code:
>>>>>>>>>>>
>>>>>>>>>>> 526 STEP(63,
>>>>>>>>>>> "(printing core file
>>>>>>>>>>> information)")
>>>>>>>>>>> 529 if
>>>>>>>>>>> (coredump_status) {
>>>>>>>>>>> 530
>>>>>>>>>>> st->print("Core dump
>>>>>>>>>>> written. Default
>>>>>>>>>>> location: %s",
>>>>>>>>>>> coredump_message);
>>>>>>>>>>>
>>>>>>>>>>> because
>>>>>>>>>>> coredump_status won't
>>>>>>>>>>> accurately
>>>>>>>>>>> reflect whether
>>>>>>>>>>> the
>>>>>>>>>>> coredump_message field has
>>>>>>>>>>> been set.
>>>>>>>>>>>
>>>>>>>>>>> line 943: // check
>>>>>>>>>>> core dump limits on
>>>>>>>>>>> Linux/Solaris, nothing on
>>>>>>>>>>> Windows
>>>>>>>>>>> This updated
>>>>>>>>>>> comment doesn't match the
>>>>>>>>>>> unmodified
>>>>>>>>>>> comment in os.hpp:
>>>>>>>>>>>
>>>>>>>>>>> 720: // On
>>>>>>>>>>> Windows this will create an
>>>>>>>>>>> actual minidump, on
>>>>>>>>>>> Linux/Solaris it will simply
>>>>>>>>>>> check core dump limits
>>>>>>>>>>>
>>>>>>>>>>> I will
>>>>>>>>>>> consider your comments,
>>>>>>>>>>> then revise with a new
>>>>>>>>>>> version. I
>>>>>>>>>>> remember here there
>>>>>>>>>>> is a case, if status
>>>>>>>>>>> not
>>>>>>>>>>> set, it will
>>>>>>>>>>> output an
>>>>>>>>>>> inconsistent message. Will
>>>>>>>>>>> check
>>>>>>>>>>> again.
>>>>>>>>>>>
>>>>>>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/bsd/vm/os_bsd.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>>>>>>> line 1008: static char
>>>>>>>>>>> buffer[O_BUFLEN];
>>>>>>>>>>> Why switch from a
>>>>>>>>>>> buffer parameter to a
>>>>>>>>>>> static buffer?
>>>>>>>>>>> What happens with
>>>>>>>>>>> parallel calls to
>>>>>>>>>>> abort()? Will the static
>>>>>>>>>>> buffer get stomped
>>>>>>>>>>> by the competing
>>>>>>>>>>> threads?
>>>>>>>>>>>
>>>>>>>>>>> The original
>>>>>>>>>>> buffer is static
>>>>>>>>>>> too. Carrying an buffer
>>>>>>>>>>> for abort
>>>>>>>>>>> seems not right.
>>>>>>>>>>> This buffer in fact
>>>>>>>>>>> only for
>>>>>>>>>>> storing file
>>>>>>>>>>> name (based on
>>>>>>>>>>> pid) here.
>>>>>>>>>>> abort() will
>>>>>>>>>>> dump the core
>>>>>>>>>>> file, should we prevent
>>>>>>>>>>> multi-thread calling to this
>>>>>>>>>>> function? I did not check
>>>>>>>>>>> this part,
>>>>>>>>>>> will check if it is
>>>>>>>>>>> MT-safe.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> line 1015: // If
>>>>>>>>>>> running on a client version
>>>>>>>>>>> of Windows
>>>>>>>>>>> and user has
>>>>>>>>>>> not explicitly
>>>>>>>>>>> enabled dumping
>>>>>>>>>>> line 1016: if
>>>>>>>>>>> (!os::win32::is_windows_server() &&
>>>>>>>>>>> !CreateCoredumpOnCrash) {
>>>>>>>>>>> The default for
>>>>>>>>>>> CreateCoredumpOnCrash is
>>>>>>>>>>> now 'true'
>>>>>>>>>>> so the
>>>>>>>>>>> comment and logic
>>>>>>>>>>> are no longer correct
>>>>>>>>>>> here. The Windows
>>>>>>>>>>> Client VM will
>>>>>>>>>>> generate minidumps by default.
>>>>>>>>>>>
>>>>>>>>>>> old line 1007:
>>>>>>>>>>> VMError::report_coredump_status("Minidumps
>>>>>>>>>>> are not
>>>>>>>>>>> enabled by default on
>>>>>>>>>>> client versions of
>>>>>>>>>>> Windows",
>>>>>>>>>>> false);
>>>>>>>>>>> new line 1017:
>>>>>>>>>>> jio_fprintf(stderr, "Minidumps
>>>>>>>>>>> are not
>>>>>>>>>>> enabled by default
>>>>>>>>>>> on client versions of
>>>>>>>>>>> Windows.\n");
>>>>>>>>>>> There are a number
>>>>>>>>>>> of places where we
>>>>>>>>>>> change from
>>>>>>>>>>> VMError::report_coredump_status() to
>>>>>>>>>>> jio_fprintf()
>>>>>>>>>>> and I'm not quite
>>>>>>>>>>> seeing why this was
>>>>>>>>>>> done.
>>>>>>>>>>>
>>>>>>>>>>> Update:
>>>>>>>>>>> VMError::report_coredump_status()
>>>>>>>>>>> is
>>>>>>>>>>> misnamed. It doesn't
>>>>>>>>>>> report anything in
>>>>>>>>>>> the sense that it
>>>>>>>>>>> doesn't print anything. It
>>>>>>>>>>> does record two
>>>>>>>>>>> pieces of
>>>>>>>>>>> information about
>>>>>>>>>>> core dumps
>>>>>>>>>>> so maybe
>>>>>>>>>>> it should be
>>>>>>>>>>> VMError::record_coredump_status().
>>>>>>>>>>>
>>>>>>>>>>> line 1100:
>>>>>>>>>>> jio_fprintf(stderr,
>>>>>>>>>>> "%s.\n", buffer);
>>>>>>>>>>> At this point,
>>>>>>>>>>> buffer contains the
>>>>>>>>>>> path
>>>>>>>>>>> to the
>>>>>>>>>>> mini-dump file so
>>>>>>>>>>> the above line
>>>>>>>>>>> simply prints
>>>>>>>>>>> that on stderr. Why?
>>>>>>>>>>>
>>>>>>>>>>> Yes, I see that
>>>>>>>>>>> the old code did
>>>>>>>>>>> something
>>>>>>>>>>> similar:
>>>>>>>>>>>
>>>>>>>>>>> 1090
>>>>>>>>>>> VMError::report_coredump_status(buffer,
>>>>>>>>>>> true);
>>>>>>>>>>>
>>>>>>>>>>> but
>>>>>>>>>>> report_coredump_status() didn't
>>>>>>>>>>> actually print
>>>>>>>>>>> anything. It just
>>>>>>>>>>> squirreled away these in
>>>>>>>>>>> vmError.cpp:
>>>>>>>>>>>
>>>>>>>>>>> 217 bool
>>>>>>>>>>> VMError::coredump_status;
>>>>>>>>>>> 218 char
>>>>>>>>>>> VMError::coredump_message[O_BUFLEN];
>>>>>>>>>>>
>>>>>>>>>>> See comments above
>>>>>>>>>>> about how
>>>>>>>>>>> report_coredump_status()
>>>>>>>>>>> is misnamed.
>>>>>>>>>>>
>>>>>>>>>>> At this point, I'm
>>>>>>>>>>> convinced that all the
>>>>>>>>>>> changes from
>>>>>>>>>>> VMError::report_coredump_status() to
>>>>>>>>>>> jio_fprintf() are
>>>>>>>>>>> a bad idea.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Changing to
>>>>>>>>>>> jio_fprintf is because
>>>>>>>>>>> report_coredump_status did not
>>>>>>>>>>> output anything, it is
>>>>>>>>>>> just a logging
>>>>>>>>>>> as you
>>>>>>>>>>> indicated. It is reasonable we
>>>>>>>>>>> change it to
>>>>>>>>>>> record_coredump_status instead. How about
>>>>>>>>>>> add printout to
>>>>>>>>>>> report_coredump_status? So I do not
>>>>>>>>>>> need to use
>>>>>>>>>>> jio_printf here.
>>>>>>>>>>>
>>>>>>>>>>> Other
>>>>>>>>>>> consideration of using
>>>>>>>>>>> jio_fprintf after call
>>>>>>>>>>> shutdown
>>>>>>>>>>> called, the static
>>>>>>>>>>> output stream still
>>>>>>>>>>> exists,
>>>>>>>>>>> but not
>>>>>>>>>>> guaranteed to work as
>>>>>>>>>>> expected.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/ErrorHandling/ProblematicFrameTest.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/ErrorHandling/SafeFetchInErrorHandlingTest.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/ErrorHandling/SecondaryErrorTest.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency1.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency2.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency3.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/Safepoint/AssertSafepointCheckConsistency4.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/Unsafe/RangeCheck.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>>>>>>>> No comments.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/23/2015 11:48 AM,
>>>>>>>>>>> Yumin Qi wrote:
>>>>>>>>>>>
>>>>>>>>>>> Since Thomas is
>>>>>>>>>>> not a reviewer in
>>>>>>>>>>> open jdk,
>>>>>>>>>>> I need two
>>>>>>>>>>> volunteers (at least one
>>>>>>>>>>> "R"eviewer).
>>>>>>>>>>>
>>>>>>>>>>> Dan, since you
>>>>>>>>>>> already reviewed previous
>>>>>>>>>>> version, could you
>>>>>>>>>>> have a look?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>> On 3/20/2015 3:20
>>>>>>>>>>> PM, Yumin Qi wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thomas,
>>>>>>>>>>>
>>>>>>>>>>> Thanks. Yes,
>>>>>>>>>>> it is webrev04. Also, I
>>>>>>>>>>> have updated
>>>>>>>>>>> webrev04 to add another
>>>>>>>>>>> test case change:
>>>>>>>>>>> test/runtime/memory/ReadFromNoaccessArea.java
>>>>>>>>>>> (Thanks Dan's
>>>>>>>>>>> update)
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>> On 3/20/2015
>>>>>>>>>>> 11:55 AM, Thomas
>>>>>>>>>>> Stüfe wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
More information about the hotspot-runtime-dev
mailing list