(S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread
David Holmes
david.holmes at oracle.com
Wed Aug 3 01:13:40 UTC 2016
webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/
bug: https://bugs.openjdk.java.net/browse/JDK-8159461
The suspend/resume signal (SR_signum) is never sent to a thread once it
has started to terminate. On one platform (SuSE 12) we have seen what
appears to be a "stuck" signal, which is only delivered when the
terminating thread restores its original signal mask (as if
pthread_sigmask makes the system realize there is a pending signal - we
already check the signal was not blocked). At this point in the thread
termination we have freed the osthread, so the SR_handler would
access deallocated memory. In debug builds we first hit an assertion
that the current thread is a JavaThread or the VMThread - that assertion
fails, even though it is a JavaThread, because we have already executed
the ~JavaThread destructor and inside the ~Thread destructor we are a
plain Thread not a JavaThread.
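
To illustrate why the assertion sees a plain Thread, here is a minimal C++ sketch, not the HotSpot code. The names BaseThread, JavaLikeThread and java_check_inside_base_destructor are hypothetical stand-ins. It relies only on the language rule that a virtual call made from a base-class destructor resolves to the base-class version, because the derived part of the object has already been destroyed.

```cpp
#include <cassert>

// Hypothetical stand-in for Thread: records what its destructor observes.
struct BaseThread {
    bool* seen_as_java;
    explicit BaseThread(bool* out) : seen_as_java(out) {}
    virtual bool is_java_thread() const { return false; }
    // By the time this body runs, the derived destructor has completed,
    // so the virtual call below dispatches to BaseThread::is_java_thread().
    virtual ~BaseThread() { *seen_as_java = is_java_thread(); }
};

// Hypothetical stand-in for JavaThread.
struct JavaLikeThread : BaseThread {
    using BaseThread::BaseThread;
    bool is_java_thread() const override { return true; }
};

// Returns what the base destructor observed for an object that was
// created as a JavaLikeThread.
bool java_check_inside_base_destructor() {
    bool seen = true;
    {
        JavaLikeThread t(&seen);
        assert(t.is_java_thread());  // true while the object is fully alive
    }                                // ~JavaLikeThread runs, then ~BaseThread
    return seen;
}
```

Calling java_check_inside_base_destructor() yields false in this sketch, which mirrors a "JavaThread or VMThread" check failing when it is evaluated during ~Thread.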
The fix is a small adjustment to the thread termination process so that
we delete the SR_lock before calling os::free_thread(). In the
SR_handler() we can then treat a NULL SR_lock() as an indication that the
thread has terminated, and simply return.
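
The ordering can be sketched roughly as follows. This is a simplified model under stated assumptions, not the webrev itself: SketchMonitor, SketchOSThread, SketchThread and sketch_sr_handler are hypothetical names, and the handler is an ordinary function rather than a real signal handler.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are not the HotSpot classes.
struct SketchMonitor {};
struct SketchOSThread { int sr_events = 0; };

struct SketchThread {
    SketchMonitor*  sr_lock  = new SketchMonitor();
    SketchOSThread* osthread = new SketchOSThread();

    // Termination order described above: drop the SR lock first,
    // then release the osthread (stand-in for os::free_thread()).
    void terminate() {
        delete sr_lock;
        sr_lock = nullptr;   // a late handler can now detect termination
        delete osthread;
        osthread = nullptr;
    }

    ~SketchThread() { terminate(); }  // safe to repeat: deleting nullptr is a no-op
};

// Simulated handler: returns true if it touched the osthread,
// false if it detected a terminated thread and bailed out.
bool sketch_sr_handler(SketchThread* t) {
    if (t->sr_lock == nullptr) {
        return false;        // thread has terminated, do not touch osthread
    }
    t->osthread->sr_events++;
    return true;
}
```

With this ordering, a handler that runs between the two deletes, or after both, sees the NULL lock and never dereferences the freed osthread.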
While this has only been seen on Linux, I took the opportunity to apply
the fix on all platforms, and also cleaned up the code where we were using
Thread::current() unsafely in a signal-handling context.
Testing: regular tier 1 (JPRT)
         Kitchensink (in progress)
As we can't readily reproduce the problem I tested this by having a
terminating thread raise SR_signum directly from within the ~Thread
destructor.
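
A rough sketch of that kind of test hook, assuming a POSIX system. Everything here is illustrative: SIGUSR1 is an arbitrary stand-in for SR_signum, the globals replace real per-thread state, and TerminatingThread, sketch_late_handler and late_signal_bailouts are hypothetical names.

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical globals standing in for per-thread state.
static volatile sig_atomic_t g_sr_lock_present = 0;
static volatile sig_atomic_t g_handler_ran     = 0;
static volatile sig_atomic_t g_handler_bailed  = 0;

// Handler: bail out if the "lock" is already gone.
extern "C" void sketch_late_handler(int) {
    if (!g_sr_lock_present) {
        g_handler_bailed = g_handler_bailed + 1;
        return;
    }
    g_handler_ran = g_handler_ran + 1;
}

struct TerminatingThread {
    TerminatingThread() { g_sr_lock_present = 1; }
    ~TerminatingThread() {
        g_sr_lock_present = 0;  // stand-in for deleting the SR lock
        raise(SIGUSR1);         // test hook: force a late signal during teardown
    }
};

// Installs the handler, delivers one signal to a live object and one
// from its destructor, then restores the previous disposition.
// Returns how many times the handler bailed out.
int late_signal_bailouts() {
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = sketch_late_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, &old);

    g_handler_ran = 0;
    g_handler_bailed = 0;
    {
        TerminatingThread t;
        raise(SIGUSR1);         // normal case: handler does its work
    }                           // destructor raises the late signal

    sigaction(SIGUSR1, &old, nullptr);
    return g_handler_bailed;
}
```

Because raise() does not return until the handler has run, the late signal is handled inside the destructor itself, which is the window the fix is meant to protect.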
Thanks,
David