HSX-25: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Jul 2 18:00:03 PDT 2013


Dmitry,

I'm refining my code cleanup policy to be a little looser.
If I have touched the line (for whatever code reason), then
I'll fix the style issues...

So I'll be getting rid of the redundant elses in the next round.

Dan


On 7/1/13 5:07 PM, Daniel D. Daugherty wrote:
> Sorry for the delay in getting back to you on this thread.
> I've been focused on trying to figure out why the test was
> broken on Win*...
>
> I have a fix in mind so there will be a new webrev...
>
>
> On 6/27/13 11:21 AM, Dmitry Samersoff wrote:
>> Dan,
>>
>>> but I try not to change existing style unless I have to
>>> make other logic changes to the line.
>> os_linux.cpp:
>>
>> 1712 doesn't have redundant else, but 1795 does have.
>
> The change from the old code:
>
> 1701   if (dladdr((void*)addr, &dlinfo) && dlinfo.dli_sname != NULL) {
> <snip>
> 1709   } else if (dlinfo.dli_fname != NULL && dlinfo.dli_fbase != 0) {
>
> to the new code:
>
> 1705   if (dladdr((void*)addr, &dlinfo) != 0) {
> 1706     // see if we have a matching symbol
> 1707     if (dlinfo.dli_saddr != NULL && dlinfo.dli_sname != NULL) {
>
> <snip>
>
> 1713     }
> 1714     // no matching symbol so try for just file info
> 1715     if (dlinfo.dli_fname != NULL && dlinfo.dli_fbase != NULL) {
>
> was not simply a cosmetic change to get rid of the redundant "else"
> that was on line 1709.
>
> In fact, this is one of the key fixes:
>
> - the dladdr() call on old line 1701 could fail and
> - the dlinfo fields on old line 1709 could still be used
>
> In the new code:
>
> - old line 1701 is split into new lines 1705 and 1707
> - old line 1709 is now conditional inside the block
>   started by new line 1705
>
>>
>> I would prefer to have the same stile in all similar
>> code fragments if possible.
>
> Yes, I understand that. In the case of old line 1709, I changed
> the core logic of line (by moving it inside the now split-if)
> so it made sense to also get rid of the redundant "else".
>
> For line 1795, I just changed from an implied boolean to '!= 0'
> so that doesn't really meet my criteria.
>
> Dan
>
>
>>
>> -Dmitry
>>
>>
>> On 2013-06-27 20:18, Daniel D. Daugherty wrote:
>>> Dmitry,
>>>
>>> Thanks for the review!
>>>
>>> Sorry for the delay in responding. I was out of the office
>>> for a couple of days.
>>>
>>> Replies embedded below.
>>>
>>>
>>> On 6/22/13 11:51 AM, Dmitry Samersoff wrote:
>>>> Dan,
>>>>
>>>> os_linux.cpp:
>>>>
>>>> 1685:
>>>>     It might be better to just turn assert to guarantee. Something 
>>>> bad is
>>>> happening if we can't obtain JVM base address.
>>> So that would be:
>>>
>>> In os_bsd.cpp:
>>>
>>> 1232 bool os::address_is_in_vm(address addr) {
>>> 1240     assert(libjvm_base_addr !=NULL, "Cannot obtain base address 
>>> for
>>> libjvm");
>>>
>>> In os_linux.cpp:
>>>
>>> 1680 bool os::address_is_in_vm(address addr) {
>>> 1688     assert(libjvm_base_addr !=NULL, "Cannot obtain base address 
>>> for
>>> libjvm");
>>>
>>> In os_solaris.cpp:
>>>
>>> 1922 bool os::address_is_in_vm(address addr) {
>>> 1930     assert(libjvm_base_addr !=NULL, "Cannot obtain base address 
>>> for
>>> libjvm");
>>>
>>> In os_windows.cpp:
>>>
>>> 1477 bool os::address_is_in_vm(address addr) {
>>> 1481       assert(false, "Can't find jvm module.");
>>>
>>> I concur that not knowing the JVM base address would be bad.
>>> I will change the above "assert" calls into "guarantee" calls
>>> if other reviewers do not object.
>>>
>>>
>>>
>>>> 1795,1803:
>>>> else is redundant
>>> Agreed. My personal preference would be something like:
>>>
>>>    if (cond1) {
>>>      <some work>
>>>      return true;
>>>    }
>>>
>>>    if (cond2) {
>>>      <some other work>
>>>      return true;
>>>    }
>>>
>>>    <work to prepare for false return>
>>>    return false;
>>>
>>> but I try not to change existing style unless I have to
>>> make other logic changes to the line. In this case, I'm
>>> just changing from an implied boolean to '!= 0' so that
>>> doesn't really meet my criteria.
>>>
>>> The indenting is also wrong in this function and I'm
>>> trying really hard to not fix... Since this will likely
>>> get back ported to HSX-24, I shouldn't... must resist...
>>>
>>> Again, thanks for the review!
>>>
>>> Dan
>>>
>>>
>>>> -Dmitry
>>>>
>>>>
>>>> On 2013-06-22 02:44, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have have a proposed fix for the following bug:
>>>>>
>>>>>       8015884 runThese crashed with SIGSEGV, hs_err has an error 
>>>>> instead
>>>>>               of stacktrace
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>>>>>       https://jbs.oracle.com/bugs/browse/JDK-8015884
>>>>>
>>>>> Here are the HSX-25 webrev URLs:
>>>>>
>>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx25/
>>>>> Internal: 
>>>>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx25/
>>>>>
>>>>> Testing:
>>>>> -  test/runtime/6888954/vmerrors.sh has been updated to include a
>>>>>      test for a bad function pointer
>>>>> - vmerrors.sh has been used on MacOS X and Solaris X86 to verify
>>>>>     that bad function pointers get better trouble shooting info
>>>>> - (running, @ 24 hours and not yet done) JPRT build and test job
>>>>> - (TBD) JPRT build and new test only job
>>>>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the 
>>>>> following
>>>>>     configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>>>
>>>>> Just to be clear: This fix solves the problem of not getting a
>>>>> stacktrace when we get a bad PC; it does not solve the underlying
>>>>> reason why libpthread sent the VM into the weeds.
>>>>>
>>>>> Gory details are below. As always, comments, questions and
>>>>> suggestions are welome.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> Gory Details:
>>>>>
>>>>> The primary bug fixes are:
>>>>>
>>>>> - the Dl_info struct should only be used if dladdr() has
>>>>>     returned non-zero (no errors) and always check the dladdr()
>>>>>     return value
>>>>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>>>>>     only be used if non-NULL
>>>>> - the first two fixes are based on the Solaris dladdr() man page
>>>>> - make logic consistent between MacOS X, Linux and Solaris:
>>>>>     - offset is only set when a matching symbol is found (and
>>>>>       the pointer is non-NULL)
>>>>>
>>>>> Cleanup bug fixes are:
>>>>>
>>>>> - add assert()'s to verify os::dll_address_to_function_name() and
>>>>>     os::dll_address_to_library_name() contract for 'buf' parameter
>>>>>
>>>>> - HotSpot style: explicitly check int function return values
>>>>>     rather that assume conversion to boolean, e.g.
>>>>>
>>>>>       if (dladdr(...)) {  =>  if (dladdr(...) != 0) {
>>>>>
>>>>> - consistently compare Dl_info fields against NULL instead of a 
>>>>> mix of
>>>>>     '0', NULL and implied boolean
>>>>>
>>>>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only used if
>>>>>     non-NULL; this was done for consistency (and paranoia); the 
>>>>> Solaris
>>>>>     man page does not state whether or not NULL is a permitted 
>>>>> value for
>>>>>     these fields
>>>>>
>>>>> Test changes:
>>>>> - add case "13" for bad function pointers
>>>>> - change VM-side of the test code so that case "12", the
>>>>>     os::signal_raise(SIGSEGV) case, does not have to be last;
>>>>>     this puts a ShouldNotReachHere() call after the test_num
>>>>>     case statement so this test function will always crash
>>>>>     the VM in non-product bits
>>>>> - change VM-side of the test code to issue an ERROR message
>>>>>     when an unexpected test_num value is passed
>>>>>
>>
>
>
>



More information about the hotspot-runtime-dev mailing list