RFR (XS): 7129715: MAC: SIGBUS in nsk stress test
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Jun 15 07:08:28 PDT 2012
On 6/15/12 4:20 AM, Roland Westrelin wrote:
> Hi Dan,
>
> Thanks for taking a look at this. See below.
No problem. I only noticed the bug because at one point it
was assigned to me... :-)
>> Thanks for tackling such nasty code...
>> Just trying to understand this one... These checks:
>>
>> 476 if (sig == SIGSEGV || sig == SIGBUS) {
>> 480 if (addr < thread->stack_base() &&
>> 481 addr >= thread->stack_base() - thread->stack_size()) {
>> 483 if (thread->in_stack_yellow_zone(addr)) {
>> 485 if (thread->thread_state() == _thread_in_Java) {
>>
>> tell us that we took a SIGSEGV or SIGBUS while running Java code
>> in the yellow zone of our stack... so stack overflow... which gets
>> us to this setting of "stub":
>>
>> 488 stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::STACK_OVERFLOW);
>>
>>
>> This line:
>>
>> 519 if (thread->thread_state() == _thread_in_Java) {
>>
>> gets us into another block of "stub" setting code, but it
>> currently doesn't care that "stub" was already set. That's
>> the code you're trying to fix with this new line:
>>
>> 519 if (thread->thread_state() == _thread_in_Java && stub == NULL) {
> Exactly.
>
>> Just to be complete, I'm trying to understand which of the
>> many places that set "stub" is clobbering the existing value.
> This is the code that clobbers the stub value:
>
> 530 } else if (sig == SIGBUS && MacroAssembler::needs_explicit_null_check((intptr_t)info->si_addr)) {
> 534 // BugId 4454115: A read from a MappedByteBuffer can fault
> 535 // here if the underlying file has been truncated.
> 536 // Do not crash the VM in such a case.
> 537 CodeBlob* cb = CodeCache::find_blob_unsafe(pc);
> 538 nmethod* nm = cb->is_nmethod() ? (nmethod*)cb : NULL;
> 539 if (nm != NULL && nm->has_unsafe_access()) {
> 540 stub = StubRoutines::handler_for_unsafe_access();
> 541 }
> 542 }
>
> si_addr is an address on the stack, so it's not in the first page, and MacroAssembler::needs_explicit_null_check() returns true. The method where the stack-overflow SIGBUS happens is a compiled method, and we're unlucky: it contains some unsafe accesses, so the SIGBUS is mistaken for an unsafe access that has gone wrong.
>
> Roland.
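To make the clobbering concrete, here is a minimal standalone sketch of the control flow described above. It is not the HotSpot signal handler: the Stub enum, the Fault struct, and pick_stub() are hypothetical stand-ins, and the real function has many more branches. The flag guard_on_stub_unset models the proposed "&& stub == NULL" condition.

```cpp
#include <cassert>

// Hypothetical stand-ins for the stub addresses the real handler picks.
enum Stub { NONE, STACK_OVERFLOW_STUB, UNSAFE_ACCESS_STUB };

// Hypothetical summary of the facts the handler derives from the signal.
struct Fault {
  bool in_yellow_zone;             // fault address lies in the stack yellow zone
  bool in_java;                    // thread state is _thread_in_Java
  bool is_sigbus;                  // sig == SIGBUS
  bool addr_outside_first_page;    // models needs_explicit_null_check() == true
  bool nmethod_has_unsafe_access;  // compiled method at pc has unsafe accesses
};

// Models the two "stub"-setting blocks discussed in the thread.
// guard_on_stub_unset == false: the second block ignores an existing stub.
// guard_on_stub_unset == true:  the second block runs only if stub is unset.
Stub pick_stub(const Fault& f, bool guard_on_stub_unset) {
  Stub stub = NONE;

  // First block: stack overflow detected while running Java code.
  if (f.in_yellow_zone && f.in_java) {
    stub = STACK_OVERFLOW_STUB;
  }

  // Second block: other implicit exceptions while running Java code.
  if (f.in_java && (!guard_on_stub_unset || stub == NONE)) {
    if (f.is_sigbus && f.addr_outside_first_page &&
        f.nmethod_has_unsafe_access) {
      // Without the guard this overwrites the stack overflow stub.
      stub = UNSAFE_ACCESS_STUB;
    }
  }
  return stub;
}
```

With every field of Fault set to true, pick_stub(f, false) returns UNSAFE_ACCESS_STUB (the misdiagnosis), while pick_stub(f, true) keeps STACK_OVERFLOW_STUB. A genuine unsafe-access SIGBUS outside the yellow zone still reaches the second block in both modes.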
Thanks for closing the loop with me on this one.
I had ruled out this block in my initial analysis since I
guessed an "unsafe access" was an uncommon case. It might
still be uncommon, but it was the one that you had in hand.
Is it just me or is this whole function just crazy complicated???
Dan