HSX-25: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Wed Jul 3 13:53:59 PDT 2013
On 7/3/13 2:35 PM, Coleen Phillimore wrote:
>
> You are right, you did pique my interest with the SEH handler on windows.
> I think this is a significant work and nice cleanup. It looks good.
Thanks for the quick re-review!
> The only thing that I'm concerned with is that in jni.cpp if you get
> the windows compilation include file order wrong, you'll undefine this
> and never know it.
>
> 5142 #ifndef CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED
> 5143 #define CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED(f) f()
> 5144 #endif
> I think it would be better to have these defined as NULL in the
> os_*.hpp files instead, unfortunately there is one for each
> solaris,linux and bsd.
Header files (like os_windows.inline.hpp) are always included in a
block at the beginning of .cpp file by our (informal) style rules:
26 #include "precompiled.hpp"
27 #include "ci/ciReplay.hpp"
<snip>
80 #include "utilities/histogram.hpp"
81 #ifdef TARGET_OS_FAMILY_linux
82 # include "os_linux.inline.hpp"
83 #endif
84 #ifdef TARGET_OS_FAMILY_solaris
85 # include "os_solaris.inline.hpp"
86 #endif
87 #ifdef TARGET_OS_FAMILY_windows
88 # include "os_windows.inline.hpp"
89 #endif
90 #ifdef TARGET_OS_FAMILY_bsd
91 # include "os_bsd.inline.hpp"
92 #endif
and if someone does happen to break it, then the new version
of test/runtime/6888954/vmerrors.sh will catch the error
because we'll get timeouts on the test because of the unhandled
Win* exception handler; we'll get a popup which won't be dismissed
and the test will timeout...
Dan
>
> Thanks,
> Coleen
>
> On 7/3/2013 3:50 PM, Daniel D. Daugherty wrote:
>> Thanks for the fast re-review!!
>>
>> Dan
>>
>>
>> On 7/3/13 1:47 PM, Dmitry Samersoff wrote:
>>> Dan,
>>>
>>> Looks good for me.
>>>
>>> -Dmitry
>>>
>>>
>>> On 2013-07-03 22:37, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> Here are the (round 1) HSX-25 webrev URLs:
>>>>
>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/1-hsx25/
>>>> Internal:
>>>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/1-hsx25/
>>>>
>>>> Because of the whitespace HotSpot style changes, the "frames" version
>>>> is the easiest to review... Yes, I've crawled through comparing the
>>>> patch files from round 0 and round 1 and now I have a headache... :-)
>>>>
>>>> Changes relative to round 0:
>>>>
>>>> - additional changes to get proper hs_err_pid file generation in
>>>> non-product
>>>> Win* bits when the following tests are executed:
>>>> -XX:ErrorHandlerTest=N option aka test_error_handler() function
>>>> -XX:+ExecuteInternalVMTests option aka
>>>> execute_internal_vm_tests()
>>>> function
>>>> - Coleen - you'll want to check out:
>>>> src/share/vm/prims/jni.cpp
>>>> src/os/windows/vm/os_windows.hpp
>>>> src/os/windows/vm/os_windows.inline.hpp
>>>>
>>>> - the latest version of test/runtime/6888954/vmerrors.sh now passes on
>>>> all platforms including MacOS X
>>>> - this includes test case 12 where we used to not get an hs_err_pid
>>>> file on Win* (just an exit code == 3)
>>>> - Gerard - please check out the new test and the changes in
>>>> src/share/vm/utilities/debug.cpp; we can discuss off-thread what
>>>> to do with: 7190125 runtime/6888954/vmerrors.sh fails on MacOS X
>>>>
>>>> - more code clean ups:
>>>> - redundant "else" removal - thanks Dmitry!
>>>> - switch to PTR_FORMAT in some print calls - thanks Harold!
>>>> - more HotSpot whitespace/style fixes... as long as I was in
>>>> there... :-)
>>>>
>>>> Of course, I'm looking for re-reviews from: Dmitry, Zhengyu,
>>>> Harold and Coleen...
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 6/21/13 4:44 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have have a proposed fix for the following bug:
>>>>>
>>>>> 8015884 runThese crashed with SIGSEGV, hs_err has an error
>>>>> instead
>>>>> of stacktrace
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>>>>> https://jbs.oracle.com/bugs/browse/JDK-8015884
>>>>>
>>>>> Here are the HSX-25 webrev URLs:
>>>>>
>>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx25/
>>>>> Internal:
>>>>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx25/
>>>>>
>>>>> Testing:
>>>>> - test/runtime/6888954/vmerrors.sh has been updated to include a
>>>>> test for a bad function pointer
>>>>> - vmerrors.sh has been used on MacOS X and Solaris X86 to verify
>>>>> that bad function pointers get better trouble shooting info
>>>>> - (running, @ 24 hours and not yet done) JPRT build and test job
>>>>> - (TBD) JPRT build and new test only job
>>>>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the
>>>>> following
>>>>> configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>>>
>>>>> Just to be clear: This fix solves the problem of not getting a
>>>>> stacktrace when we get a bad PC; it does not solve the underlying
>>>>> reason why libpthread sent the VM into the weeds.
>>>>>
>>>>> Gory details are below. As always, comments, questions and
>>>>> suggestions are welome.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> Gory Details:
>>>>>
>>>>> The primary bug fixes are:
>>>>>
>>>>> - the Dl_info struct should only be used if dladdr() has
>>>>> returned non-zero (no errors) and always check the dladdr()
>>>>> return value
>>>>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>>>>> only be used if non-NULL
>>>>> - the first two fixes are based on the Solaris dladdr() man page
>>>>> - make logic consistent between MacOS X, Linux and Solaris:
>>>>> - offset is only set when a matching symbol is found (and
>>>>> the pointer is non-NULL)
>>>>>
>>>>> Cleanup bug fixes are:
>>>>>
>>>>> - add assert()'s to verify os::dll_address_to_function_name() and
>>>>> os::dll_address_to_library_name() contract for 'buf' parameter
>>>>>
>>>>> - HotSpot style: explicitly check int function return values
>>>>> rather that assume conversion to boolean, e.g.
>>>>>
>>>>> if (dladdr(...)) { => if (dladdr(...) != 0) {
>>>>>
>>>>> - consistently compare Dl_info fields against NULL instead of a
>>>>> mix of
>>>>> '0', NULL and implied boolean
>>>>>
>>>>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only used if
>>>>> non-NULL; this was done for consistency (and paranoia); the
>>>>> Solaris
>>>>> man page does not state whether or not NULL is a permitted
>>>>> value for
>>>>> these fields
>>>>>
>>>>> Test changes:
>>>>> - add case "13" for bad function pointers
>>>>> - change VM-side of the test code so that case "12", the
>>>>> os::signal_raise(SIGSEGV) case, does not have to be last;
>>>>> this puts a ShouldNotReachHere() call after the test_num
>>>>> case statement so this test function will always crash
>>>>> the VM in non-product bits
>>>>> - change VM-side of the test code to issue an ERROR message
>>>>> when an unexpected test_num value is passed
>>>>>
>>>>>
>>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/attachments/20130703/d70899ae/attachment.html
More information about the hotspot-runtime-dev
mailing list