RFR (XS): 7129715: MAC: SIGBUS in nsk stress test

Roland Westrelin roland.westrelin at oracle.com
Fri Jun 15 03:20:54 PDT 2012


Hi Dan,

Thanks for taking a look at this. See below.

> Thanks for tackling such nasty code...
> Just trying to understand this one... These checks:
> 
>    476     if (sig == SIGSEGV || sig == SIGBUS) {
>    480       if (addr < thread->stack_base() &&
>    481           addr >= thread->stack_base() - thread->stack_size()) {
>    483         if (thread->in_stack_yellow_zone(addr)) {
>    485           if (thread->thread_state() == _thread_in_Java) {
> 
> tell us that we took a SIGSEGV or SIGBUS while running Java code
> in the yellow zone of our stack... so stack overflow... which gets
> us to this setting of "stub":
> 
>    488             stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::STACK_OVERFLOW);
> 
> 
> This line:
> 
>    519     if (thread->thread_state() == _thread_in_Java) {
> 
> gets us into another block of "stub" setting code, but it
> currently doesn't care that "stub" was already set. That's
> the code you're trying to fix with this new line:
> 
> 519     if (thread->thread_state() == _thread_in_Java && stub == NULL) {

Exactly.

> Just to be complete, I'm trying to understand which of the
> many places that set "stub" is clobbering the existing value.

This is the code that clobbers the stub value:

 530       } else if (sig == SIGBUS && MacroAssembler::needs_explicit_null_check((intptr_t)info->si_addr)) {
 534         // BugId 4454115: A read from a MappedByteBuffer can fault
 535         // here if the underlying file has been truncated.
 536         // Do not crash the VM in such a case.
 537         CodeBlob* cb = CodeCache::find_blob_unsafe(pc);
 538         nmethod* nm = cb->is_nmethod() ? (nmethod*)cb : NULL;
 539         if (nm != NULL && nm->has_unsafe_access()) {
 540           stub = StubRoutines::handler_for_unsafe_access();
 541         }
 542       }

si_addr is an address on the stack, so it is not in the first page and MacroAssembler::needs_explicit_null_check() returns true. The method in which the stack overflow SIGBUS occurs is a compiled method, and we are unlucky: it happens to contain unsafe accesses, so has_unsafe_access() is true and the SIGBUS is mistaken for an unsafe access that has gone wrong. The stack overflow stub set earlier is then overwritten with the unsafe access handler.
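
To make the interaction easier to follow, here is a minimal standalone model of the control flow described above. It is a sketch only, not the HotSpot code: the Stub enum, the Fault struct, kPageSize, choose_stub() and the simplified needs_explicit_null_check() are all hypothetical stand-ins, and the stack range and signal checks are collapsed into plain booleans.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the stub addresses the real handler selects.
enum class Stub { None, StackOverflow, UnsafeAccess };

// Hypothetical summary of the state the signal handler inspects.
struct Fault {
  bool is_sigbus;                  // sig == SIGBUS
  bool in_yellow_zone;             // fault address is in the stack yellow zone
  bool thread_in_java;             // thread_state() == _thread_in_Java
  std::uintptr_t si_addr;          // faulting address
  bool nmethod_has_unsafe_access;  // compiled method contains unsafe accesses
};

// Assumed page size, for illustration only.
constexpr std::uintptr_t kPageSize = 4096;

// Simplified model: true for any address outside the first page.
bool needs_explicit_null_check(std::uintptr_t addr) {
  return addr >= kPageSize;
}

// with_fix == false models the old condition at line 519,
// with_fix == true models the added "&& stub == NULL" check.
Stub choose_stub(const Fault& f, bool with_fix) {
  Stub stub = Stub::None;

  // First block: stack overflow detected while running Java code.
  if (f.in_yellow_zone && f.thread_in_java) {
    stub = Stub::StackOverflow;
  }

  // Second block: without the fix it runs even if stub is already set.
  if (f.thread_in_java && (!with_fix || stub == Stub::None)) {
    if (f.is_sigbus && needs_explicit_null_check(f.si_addr) &&
        f.nmethod_has_unsafe_access) {
      stub = Stub::UnsafeAccess;  // clobbers StackOverflow when unguarded
    }
  }
  return stub;
}
```

In this model, a SIGBUS in the yellow zone of a compiled method with unsafe accesses yields the unsafe access stub without the guard and the stack overflow stub with it. A SIGBUS outside the yellow zone is still routed to the unsafe access stub in both cases.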

Roland.


More information about the hotspot-compiler-dev mailing list