HSX-25: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Jul 3 15:40:30 PDT 2013


Greetings,

Coleen and I discussed this off-thread a bit and settled on:

  diff -c diff-8015884-hsx25.03 diff-8015884-hsx25.04
*** diff-8015884-hsx25.03    Wed Jul  3 13:33:23 2013
--- diff-8015884-hsx25.04    Wed Jul  3 15:24:28 2013
***************
*** 833,839 ****
        }

   +#ifndef PRODUCT
! +  #ifndef CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED
   +    #define CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED(f) f()
   +  #endif
   +
--- 833,839 ----
        }

   +#ifndef PRODUCT
! +  #ifndef TARGET_OS_FAMILY_windows
   +    #define CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED(f) f()
   +  #endif
   +

so Windows will get the wrapper and non-Windows won't get the
wrapper and we don't have to touch the other os_<platform>.inline.hpp
files... Yes, I'm trying to avoid touching any more files with this
fix...

Dan


On 7/3/13 2:53 PM, Daniel D. Daugherty wrote:
> On 7/3/13 2:35 PM, Coleen Phillimore wrote:
>>
>> You are right, you did pique my interest with the SEH handler on windows.
>> I think this is a significant work and nice cleanup.  It looks good.
>
> Thanks for the quick re-review!
>
>
>> The only thing that I'm concerned with is that in jni.cpp if you get 
>> the windows compilation include file order wrong, you'll undefine 
>> this and never know it.
>>
>> 5142   #ifndef CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED
>> 5143     #define CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED(f) f()
>> 5144   #endif
>> I think it would be better to have these defined as NULL in the 
>> os_*.hpp files instead, unfortunately there is one for each 
>> solaris,linux and bsd.
>
> Header files (like os_windows.inline.hpp) are always included in a
> block at the beginning of .cpp file by our (informal) style rules:
>
>   26 #include "precompiled.hpp"
>   27 #include "ci/ciReplay.hpp"
> <snip>
>   80 #include "utilities/histogram.hpp"
>   81 #ifdef TARGET_OS_FAMILY_linux
>   82 # include "os_linux.inline.hpp"
>   83 #endif
>   84 #ifdef TARGET_OS_FAMILY_solaris
>   85 # include "os_solaris.inline.hpp"
>   86 #endif
>   87 #ifdef TARGET_OS_FAMILY_windows
>   88 # include "os_windows.inline.hpp"
>   89 #endif
>   90 #ifdef TARGET_OS_FAMILY_bsd
>   91 # include "os_bsd.inline.hpp"
>   92 #endif
>
> and if someone does happen to break it, then the new version
> of test/runtime/6888954/vmerrors.sh will catch the error
> because we'll get timeouts on the test because of the unhandled
> Win* exception handler; we'll get a popup which won't be dismissed
> and the test will timeout...
>
> Dan
>
>
>>
>> Thanks,
>> Coleen
>>
>> On 7/3/2013 3:50 PM, Daniel D. Daugherty wrote:
>>> Thanks for the fast re-review!!
>>>
>>> Dan
>>>
>>>
>>> On 7/3/13 1:47 PM, Dmitry Samersoff wrote:
>>>> Dan,
>>>>
>>>> Looks good for me.
>>>>
>>>> -Dmitry
>>>>
>>>>
>>>> On 2013-07-03 22:37, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> Here are the (round 1) HSX-25 webrev URLs:
>>>>>
>>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/1-hsx25/
>>>>> Internal: 
>>>>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/1-hsx25/
>>>>>
>>>>> Because of the whitespace HotSpot style changes, the "frames" version
>>>>> is the easiest to review... Yes, I've crawled through comparing the
>>>>> patch files from round 0 and round 1 and now I have a headache... :-)
>>>>>
>>>>> Changes relative to round 0:
>>>>>
>>>>> - additional changes to get proper hs_err_pid file generation in
>>>>> non-product
>>>>>    Win* bits when the following tests are executed:
>>>>>      -XX:ErrorHandlerTest=N option aka test_error_handler() function
>>>>>      -XX:+ExecuteInternalVMTests option aka 
>>>>> execute_internal_vm_tests()
>>>>> function
>>>>>    - Coleen - you'll want to check out:
>>>>>        src/share/vm/prims/jni.cpp
>>>>>        src/os/windows/vm/os_windows.hpp
>>>>>        src/os/windows/vm/os_windows.inline.hpp
>>>>>
>>>>> - the latest version of test/runtime/6888954/vmerrors.sh now 
>>>>> passes on
>>>>>    all platforms including MacOS X
>>>>>    - this includes test case 12 where we used to not get an 
>>>>> hs_err_pid
>>>>>      file on Win* (just an exit code == 3)
>>>>>    - Gerard - please check out the new test and the changes in
>>>>>      src/share/vm/utilities/debug.cpp; we can discuss off-thread what
>>>>>      to do with: 7190125 runtime/6888954/vmerrors.sh fails on MacOS X
>>>>>
>>>>> - more code clean ups:
>>>>>    - redundant "else" removal - thanks Dmitry!
>>>>>    - switch to PTR_FORMAT in some print calls - thanks Harold!
>>>>>    - more HotSpot whitespace/style fixes... as long as I was in 
>>>>> there... :-)
>>>>>
>>>>> Of course, I'm looking for re-reviews from: Dmitry, Zhengyu,
>>>>> Harold and Coleen...
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 6/21/13 4:44 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have have a proposed fix for the following bug:
>>>>>>
>>>>>>      8015884 runThese crashed with SIGSEGV, hs_err has an error 
>>>>>> instead
>>>>>>              of stacktrace
>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>>>>>> https://jbs.oracle.com/bugs/browse/JDK-8015884
>>>>>>
>>>>>> Here are the HSX-25 webrev URLs:
>>>>>>
>>>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx25/
>>>>>> Internal: 
>>>>>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx25/
>>>>>>
>>>>>> Testing:
>>>>>> -  test/runtime/6888954/vmerrors.sh has been updated to include a
>>>>>>     test for a bad function pointer
>>>>>> - vmerrors.sh has been used on MacOS X and Solaris X86 to verify
>>>>>>    that bad function pointers get better trouble shooting info
>>>>>> - (running, @ 24 hours and not yet done) JPRT build and test job
>>>>>> - (TBD) JPRT build and new test only job
>>>>>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the 
>>>>>> following
>>>>>>    configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>>>>
>>>>>> Just to be clear: This fix solves the problem of not getting a
>>>>>> stacktrace when we get a bad PC; it does not solve the underlying
>>>>>> reason why libpthread sent the VM into the weeds.
>>>>>>
>>>>>> Gory details are below. As always, comments, questions and
>>>>>> suggestions are welome.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> Gory Details:
>>>>>>
>>>>>> The primary bug fixes are:
>>>>>>
>>>>>> - the Dl_info struct should only be used if dladdr() has
>>>>>>    returned non-zero (no errors) and always check the dladdr()
>>>>>>    return value
>>>>>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>>>>>>    only be used if non-NULL
>>>>>> - the first two fixes are based on the Solaris dladdr() man page
>>>>>> - make logic consistent between MacOS X, Linux and Solaris:
>>>>>>    - offset is only set when a matching symbol is found (and
>>>>>>      the pointer is non-NULL)
>>>>>>
>>>>>> Cleanup bug fixes are:
>>>>>>
>>>>>> - add assert()'s to verify os::dll_address_to_function_name() and
>>>>>>    os::dll_address_to_library_name() contract for 'buf' parameter
>>>>>>
>>>>>> - HotSpot style: explicitly check int function return values
>>>>>>    rather that assume conversion to boolean, e.g.
>>>>>>
>>>>>>      if (dladdr(...)) {  =>  if (dladdr(...) != 0) {
>>>>>>
>>>>>> - consistently compare Dl_info fields against NULL instead of a 
>>>>>> mix of
>>>>>>    '0', NULL and implied boolean
>>>>>>
>>>>>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only 
>>>>>> used if
>>>>>>    non-NULL; this was done for consistency (and paranoia); the 
>>>>>> Solaris
>>>>>>    man page does not state whether or not NULL is a permitted 
>>>>>> value for
>>>>>>    these fields
>>>>>>
>>>>>> Test changes:
>>>>>> - add case "13" for bad function pointers
>>>>>> - change VM-side of the test code so that case "12", the
>>>>>>    os::signal_raise(SIGSEGV) case, does not have to be last;
>>>>>>    this puts a ShouldNotReachHere() call after the test_num
>>>>>>    case statement so this test function will always crash
>>>>>>    the VM in non-product bits
>>>>>> - change VM-side of the test code to issue an ERROR message
>>>>>>    when an unexpected test_num value is passed
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/attachments/20130703/757fb02f/attachment.html 


More information about the hotspot-runtime-dev mailing list