HSX-25: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)

Zhengyu Gu zhengyu.gu at oracle.com
Tue Jul 9 05:43:13 PDT 2013


Thanks,

-Zhengyu

On Jul 8, 2013, at 7:30 PM, Daniel D. Daugherty wrote:

> Zhengyu,
> 
> I filed the following new RFE:
> 
>    JDK-8020092 os::address_is_in_vm() on Windows can use "GetModuleHandleEx()"
>    https://jbs.oracle.com/bugs/browse/JDK-8020092
> 
> Dan
> 
> 
> On 7/3/13 7:28 AM, Zhengyu Gu wrote:
>> 
>>>> One possible enhancement, you can decide if should be done under this bug.
>>>> 
>>>> os_windows.cpp
>>>>  os::address_is_in_vm(address addr)
>>>> 
>>>> I think there is simpler way using Windows API "GetModuleHandleEx() with GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS.
>>> 
>>> My Windows changes in this bug are very minimal and I think I'd
>>> like to keep it that way. However, I will file an RFE for the
>>> suggestion. Would this improvement count as a "starter" bug?
>>> 
>> Yes, make sense. It is a starter bug.
>> 
>> Thanks,
>> 
>> -Zhengyu
>> 
>> 
>>> Dan
>>> 
>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> -Zhengyu
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 6/21/2013 6:44 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>> 
>>>>> I have have a proposed fix for the following bug:
>>>>> 
>>>>>    8015884 runThese crashed with SIGSEGV, hs_err has an error instead
>>>>>            of stacktrace
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>>>>>    https://jbs.oracle.com/bugs/browse/JDK-8015884
>>>>> 
>>>>> Here are the HSX-25 webrev URLs:
>>>>> 
>>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx25/
>>>>> Internal: http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx25/
>>>>> 
>>>>> Testing:
>>>>> -  test/runtime/6888954/vmerrors.sh has been updated to include a
>>>>>   test for a bad function pointer
>>>>> - vmerrors.sh has been used on MacOS X and Solaris X86 to verify
>>>>>  that bad function pointers get better trouble shooting info
>>>>> - (running, @ 24 hours and not yet done) JPRT build and test job
>>>>> - (TBD) JPRT build and new test only job
>>>>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the following
>>>>>  configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>>> 
>>>>> Just to be clear: This fix solves the problem of not getting a
>>>>> stacktrace when we get a bad PC; it does not solve the underlying
>>>>> reason why libpthread sent the VM into the weeds.
>>>>> 
>>>>> Gory details are below. As always, comments, questions and
>>>>> suggestions are welome.
>>>>> 
>>>>> Dan
>>>>> 
>>>>> 
>>>>> Gory Details:
>>>>> 
>>>>> The primary bug fixes are:
>>>>> 
>>>>> - the Dl_info struct should only be used if dladdr() has
>>>>>  returned non-zero (no errors) and always check the dladdr()
>>>>>  return value
>>>>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>>>>>  only be used if non-NULL
>>>>> - the first two fixes are based on the Solaris dladdr() man page
>>>>> - make logic consistent between MacOS X, Linux and Solaris:
>>>>>  - offset is only set when a matching symbol is found (and
>>>>>    the pointer is non-NULL)
>>>>> 
>>>>> Cleanup bug fixes are:
>>>>> 
>>>>> - add assert()'s to verify os::dll_address_to_function_name() and
>>>>>  os::dll_address_to_library_name() contract for 'buf' parameter
>>>>> 
>>>>> - HotSpot style: explicitly check int function return values
>>>>>  rather that assume conversion to boolean, e.g.
>>>>> 
>>>>>    if (dladdr(...)) {  =>  if (dladdr(...) != 0) {
>>>>> 
>>>>> - consistently compare Dl_info fields against NULL instead of a mix of
>>>>>  '0', NULL and implied boolean
>>>>> 
>>>>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only used if
>>>>>  non-NULL; this was done for consistency (and paranoia); the Solaris
>>>>>  man page does not state whether or not NULL is a permitted value for
>>>>>  these fields
>>>>> 
>>>>> Test changes:
>>>>> - add case "13" for bad function pointers
>>>>> - change VM-side of the test code so that case "12", the
>>>>>  os::signal_raise(SIGSEGV) case, does not have to be last;
>>>>>  this puts a ShouldNotReachHere() call after the test_num
>>>>>  case statement so this test function will always crash
>>>>>  the VM in non-product bits
>>>>> - change VM-side of the test code to issue an ERROR message
>>>>>  when an unexpected test_num value is passed
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
> 



More information about the hotspot-runtime-dev mailing list