(S) RFR: 8139300: Internal Error (vm/utilities/debug.cpp:399), # Error: ShouldNotReachHere()

David Holmes david.holmes at oracle.com
Mon Nov 16 21:32:23 UTC 2015


On 17/11/2015 6:34 AM, Daniel D. Daugherty wrote:
> On 11/11/15 11:33 PM, David Holmes wrote:
>> webrev: http://cr.openjdk.java.net/~dholmes/8139300/webrev/
>
> src/share/vm/utilities/debug.cpp
>      No comments.
>
> test/runtime/ErrorHandling/SecondaryErrorTest.java
>      No comments.
>
> Thumbs up.
>
> Since you've used the POSIX equivalent code:
>
>      pthread_kill(pthread_self(), SIGFPE);
>
> I'm presuming we have no plans to change this code back once
> MacOS X fixes the bug in raise().

No plans. Also I'm not aware of OSX accepting this as a bug, so it may 
never get fixed.

Thanks for the review!

David

> Dan
>
>
>>
>> Bug report is not public sorry.
>>
>> In the crash handler test logic in the VM we have:
>>
>> static void crash_with_sigfpe() {
>>    // generate a native synchronous SIGFPE where possible;
>>    // if that did not cause a signal (e.g. on ppc), just
>>    // raise the signal.
>>    volatile int x = 0;
>>    volatile int y = 1/x;
>>  #ifndef _WIN32
>>    raise(SIGFPE);
>>  #endif
>>  } // end: crash_with_sigfpe
>>
>> and it is used here:
>>
>>  402     case 15: crash_with_sigfpe(); break;
>>  403
>>  404     default: tty->print_cr("ERROR: %d: unexpected test_num
>> value.", how);
>>  405   }
>>  406   ShouldNotReachHere();
>>
>> We recently updated the compiler on OSX and started seeing occasional
>> failures of the test exercising this code, the failure mode being that
>> we hit the ShouldNotReachHere() at line #406.
>>
>> It seems the new compiler may not be generating code for the
>> "y=1/x;"** so we don't get the SIGFPE and so proceed to raise it
>> directly. That then hits an apparent bug on OSX where raise always
>> sends the signal to the main thread not the current thread as required
>> by POSIX in a multi-threaded app. Consequently raise() could return
>> and we'd hit the ShouldNotReachHere().
>>
>> Whether the test failed or appeared to pass depended on which thread
>> got into the error handler first (though I'm still unclear on the
>> finer details of that potential interaction - the other thread should
>> have seen error reporting was already in progress by the recursively
>> failing thread!)
>>
>> The solution I chose is to simply convert raise(sig) into its POSIX
>> specified equivalent: pthread_kill(pthreead_self(), sig);
>>
>> I'm a little reluctant having pthread functions in a shared file, but
>> AFAIK all our platforms supports pthreads. This also doesn't seem any
>> worse than assuming raise() exists on all platforms. But I could add
>> some ifdef _POSIX_C_SOURCE if there is concern (and even have a #error
>> for unknown platforms) ?
>>
>> ** I'm not surprised a compiler might elide the attempted division by
>> zero. I don't think volatile on a local has any meaning and so could
>> easily be optimized out completely. I overlooked that in the original
>> commit of this logic, and it worked so ...
>>
>> Thanks,
>> David
>>
>


More information about the hotspot-runtime-dev mailing list