HSX-25: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Jul 2 18:00:03 PDT 2013
Dmitry,
I'm refining my code cleanup policy to be a little looser.
If I have touched the line (for whatever code reason), then
I'll fix the style issues...
So I'll be getting rid of the redundant elses in the next round.
Dan
On 7/1/13 5:07 PM, Daniel D. Daugherty wrote:
> Sorry for the delay in getting back to you on this thread.
> I've been focused on trying to figure out why the test was
> broken on Win*...
>
> I have a fix in mind so there will be a new webrev...
>
>
> On 6/27/13 11:21 AM, Dmitry Samersoff wrote:
>> Dan,
>>
>>> but I try not to change existing style unless I have to
>>> make other logic changes to the line.
>> os_linux.cpp:
>>
>> 1712 doesn't have redundant else, but 1795 does have.
>
> The change from the old code:
>
> 1701 if (dladdr((void*)addr, &dlinfo) && dlinfo.dli_sname != NULL) {
> <snip>
> 1709 } else if (dlinfo.dli_fname != NULL && dlinfo.dli_fbase != 0) {
>
> to the new code:
>
> 1705 if (dladdr((void*)addr, &dlinfo) != 0) {
> 1706 // see if we have a matching symbol
> 1707 if (dlinfo.dli_saddr != NULL && dlinfo.dli_sname != NULL) {
>
> <snip>
>
> 1713 }
> 1714 // no matching symbol so try for just file info
> 1715 if (dlinfo.dli_fname != NULL && dlinfo.dli_fbase != NULL) {
>
> was not simply a cosmetic change to get rid of the redundant "else"
> that was on line 1709.
>
> In fact, this is one of the key fixes:
>
> - the dladdr() call on old line 1701 could fail and
> - the dlinfo fields on old line 1709 could still be used
>
> In the new code:
>
> - old line 1701 is split into new lines 1705 and 1707
> - old line 1709 is now conditional inside the block
> started by new line 1705
>
>>
>> I would prefer to have the same stile in all similar
>> code fragments if possible.
>
> Yes, I understand that. In the case of old line 1709, I changed
> the core logic of line (by moving it inside the now split-if)
> so it made sense to also get rid of the redundant "else".
>
> For line 1795, I just changed from an implied boolean to '!= 0'
> so that doesn't really meet my criteria.
>
> Dan
>
>
>>
>> -Dmitry
>>
>>
>> On 2013-06-27 20:18, Daniel D. Daugherty wrote:
>>> Dmitry,
>>>
>>> Thanks for the review!
>>>
>>> Sorry for the delay in responding. I was out of the office
>>> for a couple of days.
>>>
>>> Replies embedded below.
>>>
>>>
>>> On 6/22/13 11:51 AM, Dmitry Samersoff wrote:
>>>> Dan,
>>>>
>>>> os_linux.cpp:
>>>>
>>>> 1685:
>>>> It might be better to just turn assert to guarantee. Something
>>>> bad is
>>>> happening if we can't obtain JVM base address.
>>> So that would be:
>>>
>>> In os_bsd.cpp:
>>>
>>> 1232 bool os::address_is_in_vm(address addr) {
>>> 1240 assert(libjvm_base_addr !=NULL, "Cannot obtain base address
>>> for
>>> libjvm");
>>>
>>> In os_linux.cpp:
>>>
>>> 1680 bool os::address_is_in_vm(address addr) {
>>> 1688 assert(libjvm_base_addr !=NULL, "Cannot obtain base address
>>> for
>>> libjvm");
>>>
>>> In os_solaris.cpp:
>>>
>>> 1922 bool os::address_is_in_vm(address addr) {
>>> 1930 assert(libjvm_base_addr !=NULL, "Cannot obtain base address
>>> for
>>> libjvm");
>>>
>>> In os_windows.cpp:
>>>
>>> 1477 bool os::address_is_in_vm(address addr) {
>>> 1481 assert(false, "Can't find jvm module.");
>>>
>>> I concur that not knowing the JVM base address would be bad.
>>> I will change the above "assert" calls into "guarantee" calls
>>> if other reviewers do not object.
>>>
>>>
>>>
>>>> 1795,1803:
>>>> else is redundant
>>> Agreed. My personal preference would be something like:
>>>
>>> if (cond1) {
>>> <some work>
>>> return true;
>>> }
>>>
>>> if (cond2) {
>>> <some other work>
>>> return true;
>>> }
>>>
>>> <work to prepare for false return>
>>> return false;
>>>
>>> but I try not to change existing style unless I have to
>>> make other logic changes to the line. In this case, I'm
>>> just changing from an implied boolean to '!= 0' so that
>>> doesn't really meet my criteria.
>>>
>>> The indenting is also wrong in this function and I'm
>>> trying really hard to not fix... Since this will likely
>>> get back ported to HSX-24, I shouldn't... must resist...
>>>
>>> Again, thanks for the review!
>>>
>>> Dan
>>>
>>>
>>>> -Dmitry
>>>>
>>>>
>>>> On 2013-06-22 02:44, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have have a proposed fix for the following bug:
>>>>>
>>>>> 8015884 runThese crashed with SIGSEGV, hs_err has an error
>>>>> instead
>>>>> of stacktrace
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>>>>> https://jbs.oracle.com/bugs/browse/JDK-8015884
>>>>>
>>>>> Here are the HSX-25 webrev URLs:
>>>>>
>>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx25/
>>>>> Internal:
>>>>> http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx25/
>>>>>
>>>>> Testing:
>>>>> - test/runtime/6888954/vmerrors.sh has been updated to include a
>>>>> test for a bad function pointer
>>>>> - vmerrors.sh has been used on MacOS X and Solaris X86 to verify
>>>>> that bad function pointers get better trouble shooting info
>>>>> - (running, @ 24 hours and not yet done) JPRT build and test job
>>>>> - (TBD) JPRT build and new test only job
>>>>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the
>>>>> following
>>>>> configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>>>
>>>>> Just to be clear: This fix solves the problem of not getting a
>>>>> stacktrace when we get a bad PC; it does not solve the underlying
>>>>> reason why libpthread sent the VM into the weeds.
>>>>>
>>>>> Gory details are below. As always, comments, questions and
>>>>> suggestions are welome.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> Gory Details:
>>>>>
>>>>> The primary bug fixes are:
>>>>>
>>>>> - the Dl_info struct should only be used if dladdr() has
>>>>> returned non-zero (no errors) and always check the dladdr()
>>>>> return value
>>>>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>>>>> only be used if non-NULL
>>>>> - the first two fixes are based on the Solaris dladdr() man page
>>>>> - make logic consistent between MacOS X, Linux and Solaris:
>>>>> - offset is only set when a matching symbol is found (and
>>>>> the pointer is non-NULL)
>>>>>
>>>>> Cleanup bug fixes are:
>>>>>
>>>>> - add assert()'s to verify os::dll_address_to_function_name() and
>>>>> os::dll_address_to_library_name() contract for 'buf' parameter
>>>>>
>>>>> - HotSpot style: explicitly check int function return values
>>>>> rather that assume conversion to boolean, e.g.
>>>>>
>>>>> if (dladdr(...)) { => if (dladdr(...) != 0) {
>>>>>
>>>>> - consistently compare Dl_info fields against NULL instead of a
>>>>> mix of
>>>>> '0', NULL and implied boolean
>>>>>
>>>>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only used if
>>>>> non-NULL; this was done for consistency (and paranoia); the
>>>>> Solaris
>>>>> man page does not state whether or not NULL is a permitted
>>>>> value for
>>>>> these fields
>>>>>
>>>>> Test changes:
>>>>> - add case "13" for bad function pointers
>>>>> - change VM-side of the test code so that case "12", the
>>>>> os::signal_raise(SIGSEGV) case, does not have to be last;
>>>>> this puts a ShouldNotReachHere() call after the test_num
>>>>> case statement so this test function will always crash
>>>>> the VM in non-product bits
>>>>> - change VM-side of the test code to issue an ERROR message
>>>>> when an unexpected test_num value is passed
>>>>>
>>
>
>
>
More information about the hotspot-runtime-dev
mailing list