HSX-24 Backport: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Jul 9 13:13:32 PDT 2013
On 7/9/13 9:21 AM, Gerard Ziemski wrote:
> hi Dan,
>
> Are the comment around the case 12 and 13 in test_error_handler right?
Yes and no. :-)
> Is it really possible that dereferencing NULL will not cause the
> hardware to raise a signal?
For all of the platforms that we test, dereferencing NULL causes a
crash (as expected). However, I can't say what other platforms do...
Dan
>
>
> cheers
>
> On 7/8/2013 5:01 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have an HSX-24 backport of the proposed fix for the following bug:
>>
>> 8015884 runThese crashed with SIGSEGV, hs_err has an error instead
>> of stacktrace
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>> https://jbs.oracle.com/bugs/browse/JDK-8015884
>>
>> Here are the (round 0) webrev URLs for the HSX-24 backport:
>>
>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx24/
>> Internal: http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx24/
>>
>> Here are the (round 1) webrev URLs for the HSX-25 version for reference:
>>
>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/1-hsx25/
>> Internal: http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/1-hsx25/
>>
>> The easiest way to review the backport is to download and compare
>> the two patch files with your favorite file comparison tool:
>>
>> OpenJDK patch links:
>> http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx24/8015884_exp_for_hsx24.patch
>>
>> http://cr.openjdk.java.net/~dcubed/8015884-webrev/1-hsx25/8015884_exp_for_hsx25.patch
>>
>>
>> Internal patch links:
>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx24/8015884_exp_for_hsx24.patch
>>
>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/1-hsx25/8015884_exp_for_hsx25.patch
>>
>>
>> For 'diff' oriented people (like me), here's my quick sanity of the two
>> patch files:
>> - diff the patches
>> - strip down to line additions and deletions
>> - strip out the copyright changes
>>
>> $ diff 8015884-webrev/1-hsx25/8015884_exp_for_hsx25.patch \
>> 8015884-webrev/0-hsx24/8015884_exp_for_hsx24.patch \
>> | grep '^[<>] [+-] ' | grep -v Copyright
>> > - char fname[32];
>> > - pid_t pid = os::Bsd::gettid();
>> > + char fname[32];
>> > + pid_t pid = os::Bsd::gettid();
>> > - jio_snprintf(fname, sizeof(fname), "/proc/%d/maps", pid);
>> > + jio_snprintf(fname, sizeof(fname), "/proc/%d/maps", pid);
>> > - if (!_print_ascii_file(fname, st)) {
>> > - st->print("Can not get library information for pid = %d\n",
>> pid);
>> > - }
>> > + if (!_print_ascii_file(fname, st)) {
>> > + st->print("Can not get library information for pid = %d\n", pid);
>> > + }
>> < + #ifndef CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED
>> > + #ifndef TARGET_OS_FAMILY_windows
>> < - NOT_PRODUCT(if (ReplayCompiles) ciReplay::replay(thread);)
>> < + if (ReplayCompiles) ciReplay::replay(thread);
>>
>> The majority of the diffs are for white-space changes in the HSX-24
>> version where the code no longer exists in the HSX-25 version. The
>> TARGET_OS_FAMILY_windows diff is for the change made after HSX-25 code
>> review round 1 which was reviewed via e-mail only. The ReplayCompiles
>> diff is for an option that doesn't exist in HSX-24.
>>
>> The remainder is an update to the HSX-25 round 0 code review invite.
>>
>> Testing:
>> - test/runtime/6888954/vmerrors.sh has been updated to test for a bad
>> data pointer instead of os::signal_raise() and includes a new test
>> for a bad function pointer
>> - (running) JPRT build and test job
>> - (running) JPRT build and new test only job
>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the following
>> configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>
>> Just to be clear: This fix solves the problem of not getting a
>> stacktrace when we get a bad PC; it does not solve the underlying
>> reason why libpthread sent the VM into the weeds.
>>
>> Gory details are below. As always, comments, questions and
>> suggestions are welome.
>>
>> Dan
>>
>>
>> Gory Details:
>>
>> The primary bug fixes are:
>>
>> - the Dl_info struct should only be used if dladdr() has
>> returned non-zero (no errors) and always check the dladdr()
>> return value
>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>> only be used if non-NULL
>> - the first two fixes are based on the Solaris dladdr() man page
>> - make logic consistent between MacOS X, Linux and Solaris:
>> - offset is only set when a matching symbol is found (and
>> the pointer is non-NULL)
>> - additional changes to get proper hs_err_pid file generation in
>> non-product
>> Win* bits when the following tests are executed:
>> -XX:ErrorHandlerTest=N option aka test_error_handler() function
>> -XX:+ExecuteInternalVMTests option aka
>> execute_internal_vm_tests() function
>>
>> Cleanup bug fixes are:
>>
>> - add assert()'s to verify os::dll_address_to_function_name() and
>> os::dll_address_to_library_name() contract for 'buf' parameter
>>
>> - HotSpot style: explicitly check int function return values
>> rather that assume conversion to boolean, e.g.
>>
>> if (dladdr(...)) { => if (dladdr(...) != 0) {
>>
>> - consistently compare Dl_info fields against NULL instead of a mix of
>> '0', NULL and implied boolean
>>
>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only used if
>> non-NULL; this was done for consistency (and paranoia); the Solaris
>> man page does not state whether or not NULL is a permitted value for
>> these fields
>>
>> - get rid of redundant "else" statements in modified code - thanks
>> Dmitry!
>> - cleanup whitespace/indenting in modified code
>> - switch to PTR_FORMAT in some print calls - thanks Harold!
>>
>> Test changes:
>> - update case "12" to test for bad data pointer instead of calling
>> os::signal_raise(SIGSEGV)
>> - add case "13" for bad function pointers
>> - change VM-side of the test code to issue an ERROR message
>> when an unexpected test_num value is passed
>> - change VM-side of the test code to call ShouldNotReachHere() if
>> the code makes it out of the case statement
>>
>>
>
More information about the hotspot-runtime-dev
mailing list