HSX-25: code review (round 0) for VMError::report_and_die () dies on bad PC (8015884)

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Jun 27 09:18:44 PDT 2013


Dmitry,

Thanks for the review!

Sorry for the delay in responding. I was out of the office
for a couple of days.

Replies embedded below.


On 6/22/13 11:51 AM, Dmitry Samersoff wrote:
> Dan,
>
> os_linux.cpp:
>
> 1685:
>    It might be better to just turn assert to guarantee. Something bad is
> happening if we can't obtain JVM base address.

So that would be:

In os_bsd.cpp:

1232 bool os::address_is_in_vm(address addr) {
1240     assert(libjvm_base_addr !=NULL, "Cannot obtain base address for 
libjvm");

In os_linux.cpp:

1680 bool os::address_is_in_vm(address addr) {
1688     assert(libjvm_base_addr !=NULL, "Cannot obtain base address for 
libjvm");

In os_solaris.cpp:

1922 bool os::address_is_in_vm(address addr) {
1930     assert(libjvm_base_addr !=NULL, "Cannot obtain base address for 
libjvm");

In os_windows.cpp:

1477 bool os::address_is_in_vm(address addr) {
1481       assert(false, "Can't find jvm module.");

I concur that not knowing the JVM base address would be bad.
I will change the above "assert" calls into "guarantee" calls
if other reviewers do not object.



> 1795,1803:
> else is redundant

Agreed. My personal preference would be something like:

   if (cond1) {
     <some work>
     return true;
   }

   if (cond2) {
     <some other work>
     return true;
   }

   <work to prepare for false return>
   return false;

but I try not to change existing style unless I have to
make other logic changes to the line. In this case, I'm
just changing from an implied boolean to '!= 0' so that
doesn't really meet my criteria.

The indenting is also wrong in this function and I'm
trying really hard to not fix... Since this will likely
get back ported to HSX-24, I shouldn't... must resist...

Again, thanks for the review!

Dan


>
> -Dmitry
>
>
> On 2013-06-22 02:44, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I have have a proposed fix for the following bug:
>>
>>      8015884 runThese crashed with SIGSEGV, hs_err has an error instead
>>              of stacktrace
>>      http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015884
>>      https://jbs.oracle.com/bugs/browse/JDK-8015884
>>
>> Here are the HSX-25 webrev URLs:
>>
>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8015884-webrev/0-hsx25/
>> Internal: http://javaweb.us.oracle.com/~ddaugher/8015884-webrev/0-hsx25/
>>
>> Testing:
>> -  test/runtime/6888954/vmerrors.sh has been updated to include a
>>     test for a bad function pointer
>> - vmerrors.sh has been used on MacOS X and Solaris X86 to verify
>>    that bad function pointers get better trouble shooting info
>> - (running, @ 24 hours and not yet done) JPRT build and test job
>> - (TBD) JPRT build and new test only job
>> - (TBD) Aurora Adhoc vm.quick batch for all platforms in the following
>>    configs: {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>
>> Just to be clear: This fix solves the problem of not getting a
>> stacktrace when we get a bad PC; it does not solve the underlying
>> reason why libpthread sent the VM into the weeds.
>>
>> Gory details are below. As always, comments, questions and
>> suggestions are welome.
>>
>> Dan
>>
>>
>> Gory Details:
>>
>> The primary bug fixes are:
>>
>> - the Dl_info struct should only be used if dladdr() has
>>    returned non-zero (no errors) and always check the dladdr()
>>    return value
>> - the Dl_info.dli_sname and Dl_info.dli_saddr fields should
>>    only be used if non-NULL
>> - the first two fixes are based on the Solaris dladdr() man page
>> - make logic consistent between MacOS X, Linux and Solaris:
>>    - offset is only set when a matching symbol is found (and
>>      the pointer is non-NULL)
>>
>> Cleanup bug fixes are:
>>
>> - add assert()'s to verify os::dll_address_to_function_name() and
>>    os::dll_address_to_library_name() contract for 'buf' parameter
>>
>> - HotSpot style: explicitly check int function return values
>>    rather that assume conversion to boolean, e.g.
>>
>>      if (dladdr(...)) {  =>  if (dladdr(...) != 0) {
>>
>> - consistently compare Dl_info fields against NULL instead of a mix of
>>    '0', NULL and implied boolean
>>
>> - the Dl_info.dli_fname and Dl_info.dli_fbase fields are only used if
>>    non-NULL; this was done for consistency (and paranoia); the Solaris
>>    man page does not state whether or not NULL is a permitted value for
>>    these fields
>>
>> Test changes:
>> - add case "13" for bad function pointers
>> - change VM-side of the test code so that case "12", the
>>    os::signal_raise(SIGSEGV) case, does not have to be last;
>>    this puts a ShouldNotReachHere() call after the test_num
>>    case statement so this test function will always crash
>>    the VM in non-product bits
>> - change VM-side of the test code to issue an ERROR message
>>    when an unexpected test_num value is passed
>>
>



More information about the hotspot-runtime-dev mailing list