(S) RFR: 8139300: Internal Error (vm/utilities/debug.cpp:399), # Error: ShouldNotReachHere()

David Holmes david.holmes at oracle.com
Thu Nov 12 06:33:54 UTC 2015


webrev: http://cr.openjdk.java.net/~dholmes/8139300/webrev/

Bug report is not public sorry.

In the crash handler test logic in the VM we have:

static void crash_with_sigfpe() {
    // generate a native synchronous SIGFPE where possible;
    // if that did not cause a signal (e.g. on ppc), just
    // raise the signal.
    volatile int x = 0;
    volatile int y = 1/x;
  #ifndef _WIN32
    raise(SIGFPE);
  #endif
  } // end: crash_with_sigfpe

and it is used here:

  402     case 15: crash_with_sigfpe(); break;
  403
  404     default: tty->print_cr("ERROR: %d: unexpected test_num 
value.", how);
  405   }
  406   ShouldNotReachHere();

We recently updated the compiler on OSX and started seeing occasional 
failures of the test exercising this code, the failure mode being that 
we hit the ShouldNotReachHere() at line #406.

It seems the new compiler may not be generating code for the "y=1/x;"** 
so we don't get the SIGFPE and so proceed to raise it directly. That 
then hits an apparent bug on OSX where raise always sends the signal to 
the main thread not the current thread as required by POSIX in a 
multi-threaded app. Consequently raise() could return and we'd hit the 
ShouldNotReachHere().

Whether the test failed or appeared to pass depended on which thread got 
into the error handler first (though I'm still unclear on the finer 
details of that potential interaction - the other thread should have 
seen error reporting was already in progress by the recursively failing 
thread!)

The solution I chose is to simply convert raise(sig) into its POSIX 
specified equivalent: pthread_kill(pthreead_self(), sig);

I'm a little reluctant having pthread functions in a shared file, but 
AFAIK all our platforms supports pthreads. This also doesn't seem any 
worse than assuming raise() exists on all platforms. But I could add 
some ifdef _POSIX_C_SOURCE if there is concern (and even have a #error 
for unknown platforms) ?

** I'm not surprised a compiler might elide the attempted division by 
zero. I don't think volatile on a local has any meaning and so could 
easily be optimized out completely. I overlooked that in the original 
commit of this logic, and it worked so ...

Thanks,
David


More information about the hotspot-runtime-dev mailing list