HSX-25: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Jul 3 09:06:16 PDT 2013


Harold,

Thanks for the review!

I think that will allow this original code:

6093 #ifdef _LP64
6094     if (dlinfo.dli_fbase)  st->print(" at 0x%016lx", dlinfo.dli_fbase);
6095 #else
6096     if (dlinfo.dli_fbase)  st->print(" at 0x%08x", dlinfo.dli_fbase);
6097 #endif

to change to this:

     if (dlinfo.dli_fbase)  st->print(" at " PTR_FORMAT, dlinfo.dli_fbase);

Do you concur?

Dan


On 7/3/13 7:47 AM, harold seigel wrote:
> Hi Dan,
>
> The changes are a nice cleanup.
>
> Can you use PTR_FORMAT in os_solaris.cpp, instead of 0x%016lx and 
> 0x%08x, at lines 6119 and 6121?
>
> Thanks, Harold
>
> On 7/3/2013 9:16 AM, Daniel D. Daugherty wrote:
>> Zhengyu,
>>
>> Thanks for the review!
>>
>> I will have a "round 1" code review out sometime later today.
>>
>>
>> On 7/2/13 7:50 PM, Zhengyu Gu wrote:
>>> Dan,
>>>
>>> Thank you for cleaning up dll_address_to_function_name(). In 
>>> general, it looks good to me.
>>
>> Thanks!
>>
>>
>>> One possible enhancement, you can decide if should be done under 
>>> this bug.
>>>
>>> os_windows.cpp
>>>   os::address_is_in_vm(address addr)
>>>
>>> I think there is simpler way using Windows API "GetModuleHandleEx() 
>>> with GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS.
>>
>> My Windows changes in this bug are very minimal and I think I'd
>> like to keep it that way. However, I will file an RFE for the
>> suggestion. Would this improvement count as a "starter" bug?
>>
>> Dan
>>
>>
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>>
>>>
>>>
>>>
>>> On 6/21/2013 6:44 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have have a proposed fix for the following bug:
>>>>
>>>>     8015884 runThese crashed with SIGSEGV, hs_err has an error instead
>>>>             of stacktrace
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>>>>     https://jbs.oracle.com/bugs/browse/JDK-8015884
>>>>
>>>> Here are the HSX-25 webrev URLs:
>>>>
>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx25/
>>>> Internal: 
>>>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx25/
>>>>
>>>> Testing:
>>>> -  test/runtime/6888954/vmerrors.sh has been updated to include a
>>>>    test for a bad function pointer
>>>> - vmerrors.sh has been used on MacOS X and Solaris X86 to verify
>>>>   that bad function pointers get better trouble shooting info
>>>> - (running, @ 24 hours and not yet done) JPRT build and test job
>>>> - (TBD) JPRT build and new test only job
>>>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the following
>>>>   configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>>
>>>> Just to be clear: This fix solves the problem of not getting a
>>>> stacktrace when we get a bad PC; it does not solve the underlying
>>>> reason why libpthread sent the VM into the weeds.
>>>>
>>>> Gory details are below. As always, comments, questions and
>>>> suggestions are welome.
>>>>
>>>> Dan
>>>>
>>>>
>>>> Gory Details:
>>>>
>>>> The primary bug fixes are:
>>>>
>>>> - the Dl_info struct should only be used if dladdr() has
>>>>   returned non-zero (no errors) and always check the dladdr()
>>>>   return value
>>>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>>>>   only be used if non-NULL
>>>> - the first two fixes are based on the Solaris dladdr() man page
>>>> - make logic consistent between MacOS X, Linux and Solaris:
>>>>   - offset is only set when a matching symbol is found (and
>>>>     the pointer is non-NULL)
>>>>
>>>> Cleanup bug fixes are:
>>>>
>>>> - add assert()'s to verify os::dll_address_to_function_name() and
>>>>   os::dll_address_to_library_name() contract for 'buf' parameter
>>>>
>>>> - HotSpot style: explicitly check int function return values
>>>>   rather that assume conversion to boolean, e.g.
>>>>
>>>>     if (dladdr(...)) {  =>  if (dladdr(...) != 0) {
>>>>
>>>> - consistently compare Dl_info fields against NULL instead of a mix of
>>>>   '0', NULL and implied boolean
>>>>
>>>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only used if
>>>>   non-NULL; this was done for consistency (and paranoia); the Solaris
>>>>   man page does not state whether or not NULL is a permitted value for
>>>>   these fields
>>>>
>>>> Test changes:
>>>> - add case "13" for bad function pointers
>>>> - change VM-side of the test code so that case "12", the
>>>>   os::signal_raise(SIGSEGV) case, does not have to be last;
>>>>   this puts a ShouldNotReachHere() call after the test_num
>>>>   case statement so this test function will always crash
>>>>   the VM in non-product bits
>>>> - change VM-side of the test code to issue an ERROR message
>>>>   when an unexpected test_num value is passed
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list