RFR (XS): 7129715: MAC: SIGBUS in nsk stress test

Daniel D. Daugherty daniel.daugherty at oracle.com
Fri Jun 15 07:08:28 PDT 2012


On 6/15/12 4:20 AM, Roland Westrelin wrote:
> Hi Dan,
>
> Thanks for taking a look at this. See below.

No problem. I only noticed the bug because at one point it
was assigned to me... :-)


>> Thanks for tackling such nasty code...
>> Just trying to understand this one... These checks:
>>
>>     476     if (sig == SIGSEGV || sig == SIGBUS) {
>>     480       if (addr < thread->stack_base() &&
>>     481           addr >= thread->stack_base() - thread->stack_size()) {
>>     483         if (thread->in_stack_yellow_zone(addr)) {
>>     485           if (thread->thread_state() == _thread_in_Java) {
>>
>> tell us that we took a SIGSEGV or SIGBUS while running Java code
>> in the yellow zone of our stack... so stack overflow... which gets
>> us to this setting of "stub":
>>
>>     488             stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::STACK_OVERFLOW);
>>
>>
>> This line:
>>
>>     519     if (thread->thread_state() == _thread_in_Java) {
>>
>> gets us into another block of "stub" setting code, but it
>> currently doesn't care that "stub" was already set. That's
>> the code you're trying to fix with this new line:
>>
>> 519     if (thread->thread_state() == _thread_in_Java && stub == NULL) {
> Exactly.
>
>> Just to be complete, I'm trying to understand which of the
>> many places that set "stub" is clobbering the existing value.
> This is the code that clobbers the stub value:
>
>   530       } else if (sig == SIGBUS && MacroAssembler::needs_explicit_null_check((intptr_t)info->si_addr)) {
>   534         // BugId 4454115: A read from a MappedByteBuffer can fault
>   535         // here if the underlying file has been truncated.
>   536         // Do not crash the VM in such a case.
>   537         CodeBlob* cb = CodeCache::find_blob_unsafe(pc);
>   538         nmethod* nm = cb->is_nmethod() ? (nmethod*)cb : NULL;
>   539         if (nm != NULL && nm->has_unsafe_access()) {
>   540           stub = StubRoutines::handler_for_unsafe_access();
>   541         }
>   542       }
>
> si_addr is an address on the stack, so it's not in the first page and MacroAssembler::needs_explicit_null_check() returns true. The method where the stack-overflow SIGBUS happens is a compiled method, and we're unlucky: it has some unsafe accesses, so the SIGBUS is mistaken for an unsafe access that has gone wrong.
>
> Roland.

Thanks for closing the loop with me on this one.

I had ruled out this block in my initial analysis since I
guessed an "unsafe access" was an uncommon case. It might
still be uncommon, but it was the one that you had in hand.
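
To convince myself of the sequence Roland describes, here is a small
standalone model of the stub selection. It is only a sketch and not the
HotSpot code: FaultInfo, Stub, select_stub() and demo() are hypothetical
names, the 4096-byte first page is an assumption, and the toy
needs_explicit_null_check() only mimics the behavior described above
(true for any address outside the first page).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the values the real signal handler works with.
enum Signal { SIG_SEGV, SIG_BUS };
enum Stub { NO_STUB, STACK_OVERFLOW_STUB, UNSAFE_ACCESS_STUB };

struct FaultInfo {
  Signal    sig;
  uintptr_t addr;            // faulting address (si_addr)
  uintptr_t stack_base;      // highest address of the thread stack
  uintptr_t stack_size;
  uintptr_t yellow_lo;       // assumed bounds of the yellow zone
  uintptr_t yellow_hi;
  bool      in_java;         // thread state is _thread_in_Java
  bool      has_unsafe_access;  // faulting compiled method uses unsafe accesses
};

// Toy predicate: assume only first-page addresses count as implicit null checks.
bool needs_explicit_null_check(uintptr_t addr) {
  const uintptr_t assumed_page_size = 4096;
  return addr >= assumed_page_size;
}

// guard_second_block == false models the old line 519,
// guard_second_block == true models the proposed "&& stub == NULL".
Stub select_stub(const FaultInfo& f, bool guard_second_block) {
  Stub stub = NO_STUB;

  // First block: fault inside the thread stack, in the yellow zone, in Java.
  if (f.sig == SIG_SEGV || f.sig == SIG_BUS) {
    if (f.addr < f.stack_base && f.addr >= f.stack_base - f.stack_size) {
      if (f.addr >= f.yellow_lo && f.addr < f.yellow_hi) {
        if (f.in_java) {
          stub = STACK_OVERFLOW_STUB;
        }
      }
    }
  }

  // Second block: without the guard it runs even though stub is already set.
  if (f.in_java && (!guard_second_block || stub == NO_STUB)) {
    if (f.sig == SIG_BUS && needs_explicit_null_check(f.addr)) {
      if (f.has_unsafe_access) {
        stub = UNSAFE_ACCESS_STUB;  // clobbers the stack overflow stub
      }
    }
  }
  return stub;
}

// A SIGBUS in the yellow zone while in a compiled method with unsafe accesses.
FaultInfo make_overflow_fault() {
  FaultInfo f = { SIG_BUS, 0x6FF03000u, 0x70000000u, 0x100000u,
                  0x6FF02000u, 0x6FF04000u, true, true };
  return f;
}

void demo() {
  FaultInfo f = make_overflow_fault();
  assert(select_stub(f, false) == UNSAFE_ACCESS_STUB);   // old behavior
  assert(select_stub(f, true)  == STACK_OVERFLOW_STUB);  // with the fix
}
```

Calling demo() from a main() exercises both variants: without the guard
the stack overflow is reported as a failed unsafe access, and with the
guard the first stub survives.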

Is it just me or is this whole function just crazy complicated???

Dan



More information about the hotspot-compiler-dev mailing list