RFR (S): 8137165: Tests fail in SR_Handler because thread is not VMThread or JavaThread
David Holmes
david.holmes at oracle.com
Mon Mar 14 06:46:58 UTC 2016
Bug: https://bugs.openjdk.java.net/browse/JDK-8137165
Webrev: http://cr.openjdk.java.net/~dholmes/8137165/webrev/
This isn't a fix per se, but some additional diagnostic code to try to
detect the conditions under which the bug might manifest. The basic
failure mode was:
# Internal Error
(/opt/jprt/T/P1/175841.hseigel/s/hotspot/src/os/linux/vm/os_linux.cpp:3950),
pid=27906, tid=13248
# assert(thread->is_VM_thread() || thread->is_Java_thread()) failed:
Must be VMThread or JavaThread
with a stack showing in part:
#34 0xf6623ec0 in report_vm_error (
file=0xf71b6140
"/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/os/linux/vm/os_linux.cpp",
line=3901,
error_msg=0xf71b62e0 "assert(thread->is_VM_thread() ||
thread->is_Java_thread()) failed", detail_fmt=0xf71b62c0 "Must be
VMThread or JavaThread")
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/share/vm/utilities/debug.cpp:218
#35 0xf6d21b3f in SR_handler (sig=12, siginfo=0xc1b58ccc,
context=0xc1b58d4c)
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/os/linux/vm/os_linux.cpp:3901
#36 <signal handler called>
#37 0xf776b430 in __kernel_vsyscall ()
#38 0xf773ccef in pthread_sigmask () from /lib/libpthread.so.0
#39 0xf6d23e6c in os::free_thread (osthread=0xc20cf8b8)
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/os/linux/vm/os_linux.cpp:879
#40 0xf6f6811d in Thread::~Thread (this=0xc20cd800, __in_chrg=<optimized
out>)
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/share/vm/runtime/thread.cpp:367
#41 0xf6f6866f in JavaThread::~JavaThread (this=0xc20cd800,
__in_chrg=<optimized out>)
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/share/vm/runtime/thread.cpp:1611
#42 0xf6f6877c in JavaThread::~JavaThread (this=0xc20cd800,
__in_chrg=<optimized out>)
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/share/vm/runtime/thread.cpp:1655
#43 0xf6f74a38 in JavaThread::thread_main_inner (this=0xc20cd800)
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/share/vm/runtime/thread.cpp:1724
#44 0xf6f74e12 in JavaThread::run (this=0xc20cd800)
at
/scratch/opt/jprt/T/P1/205457.cphillim/s/hotspot/src/share/vm/runtime/thread.cpp:1698
#45 0xf6d238ec in java_start (thread=0xc20cd800)
What appears to be happening is that the thread has blocked SR_signum
(SIGUSR2) at some point (though there is no code that does this), and
the signal has become pending on the thread due to the event sampling
logic. The thread terminates, executing well into the destructor until
it gets to os::free_thread, which restores the original signal mask for
the thread. That signal mask has SR_signum unblocked, so the signal is
delivered immediately and we enter the SR_handler. For some reason this
triggers the assertion failure, though exactly why is unclear, as we
have not yet released the thread memory, nor done anything that should
invalidate that call. Whatever the reason, the state of the thread
causes secondary failures in the error reporting code as well.
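
The delivery-on-unblock behaviour described above can be reproduced in
isolation. The following is only a minimal standalone sketch, not
HotSpot code: kSignal, g_handled, handler and
pending_signal_fires_on_unblock are hypothetical names made up for the
illustration, and SIGUSR2 is used because that is the SR_signum value
in the report.

```cpp
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

// Hypothetical stand-in for HotSpot's SR_signum (SIGUSR2 in the report).
static const int kSignal = SIGUSR2;
static volatile sig_atomic_t g_handled = 0;

// Trivial handler: only records that it ran.
static void handler(int) { g_handled = 1; }

// Returns true if a signal sent while blocked stays pending and is then
// delivered as soon as the earlier (unblocked) mask is restored.
bool pending_signal_fires_on_unblock() {
  struct sigaction sa = {};
  sigemptyset(&sa.sa_mask);
  sa.sa_handler = handler;
  if (sigaction(kSignal, &sa, NULL) != 0) return false;
  g_handled = 0;

  sigset_t set, saved;
  sigemptyset(&set);
  sigaddset(&set, kSignal);
  sigemptyset(&saved);

  // Model the "original" mask: the signal starts out unblocked.
  pthread_sigmask(SIG_UNBLOCK, &set, NULL);
  // Block the signal, remembering the previous mask.
  pthread_sigmask(SIG_BLOCK, &set, &saved);

  // Thread-directed signal: it cannot be delivered yet, so it is queued.
  pthread_kill(pthread_self(), kSignal);
  bool deferred = (g_handled == 0);

  sigset_t pending;
  sigemptyset(&pending);
  sigpending(&pending);
  bool was_pending = (sigismember(&pending, kSignal) == 1);

  // Restoring the saved mask unblocks the signal; the handler runs
  // before pthread_sigmask returns.
  pthread_sigmask(SIG_SETMASK, &saved, NULL);
  return deferred && was_pending && (g_handled == 1);
}
```

In the failure above the same sequence plays out inside
os::free_thread, except that the handler which runs is SR_handler and
it runs on a thread that is part-way through its destructor.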
Attempts to reproduce this bug have been unsuccessful (so maybe we had a
random memory stomp on the thread state; who knows).
So what I am doing is simply adding an additional assertion to try to
catch, during regular testing, any occurrence of SR_signum being
blocked while a thread is terminating.
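
As a rough illustration of that kind of check (the actual assertion is
in the webrev and may look different), a thread can query its own
signal mask and test for the signal. The helper names
is_signal_blocked and assert_signal_not_blocked are hypothetical, not
HotSpot functions.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

// Returns true if 'sig' is blocked in the calling thread's signal mask.
bool is_signal_blocked(int sig) {
  sigset_t current;
  sigemptyset(&current);
  // With a NULL new set the mask is not changed; the current mask is
  // simply returned in 'current'.
  pthread_sigmask(SIG_BLOCK, NULL, &current);
  return sigismember(&current, sig) == 1;
}

// Debug-only check in the spirit of the added assertion: fail fast if
// the signal is unexpectedly blocked at this point.
void assert_signal_not_blocked(int sig) {
  assert(!is_signal_blocked(sig) && "signal should not be blocked here");
}
```

A terminating thread would call something like
assert_signal_not_blocked(SIGUSR2) before its original mask is
restored, so that the unexpected blocked state is reported where it is
first observable rather than later, inside the signal handler.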
In addition, a few minor cleanups in the signal-related code:
- strictly speaking, SR_handler needs to use
Thread::current_or_null_safe() because it needs to use library-based TLS
in a signal context.
- sigsets should (POSIX recommendation) be explicitly emptied/filled
before being set via pthread_sigmask
- change 0 to NULL in call to pthread_sigmask
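
The last two cleanups amount to the following pattern when saving and
restoring a thread's signal mask. This is a sketch only:
save_signal_mask and restore_signal_mask are hypothetical helper names
and do not correspond to functions in os_linux.cpp.

```cpp
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

// Saves the calling thread's current signal mask into 'saved'.
bool save_signal_mask(sigset_t* saved) {
  // Initialise the set explicitly before handing it to pthread_sigmask.
  sigemptyset(saved);
  // NULL (rather than 0) for the unused "new set" argument: query only.
  return pthread_sigmask(SIG_BLOCK, NULL, saved) == 0;
}

// Restores a mask previously obtained from save_signal_mask.
bool restore_signal_mask(const sigset_t* saved) {
  // NULL for the unused "old set" argument.
  return pthread_sigmask(SIG_SETMASK, saved, NULL) == 0;
}
```

Note that the restore step is the point at which any signal that is
pending and unblocked in the saved mask will be delivered.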
Testing: JPRT, original failing testcase
Thanks,
David