(S) RFR: 8159461: bigapps/Kitchensink/stressExitCode hits assert: Must be VMThread or JavaThread

David Holmes david.holmes at oracle.com
Wed Aug 3 01:13:40 UTC 2016


webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/

bug: https://bugs.openjdk.java.net/browse/JDK-8159461

The suspend/resume signal (SR_signum) is never sent to a thread once it 
has started to terminate. On one platform (SuSE 12) we have seen what 
appears to be a "stuck" signal, which is only delivered when the 
terminating thread restores its original signal mask (as if 
pthread_sigmask makes the system realize there is a pending signal - we 
already check the signal was not blocked). At this point in the thread 
termination we have freed the osthread, so the the SR_handler would 
access deallocated memory. In debug builds we first hit an assertion 
that the current thread is a JavaThread or the VMThread - that assertion 
fails, even though it is a JavaThread, because we have already executed 
the ~JavaThread destructor and inside the ~Thread destructor we are a 
plain Thread not a JavaThread.

The fix was to make a small adjustment to the thread termination process 
so that we delete the SR_lock before calling os::free_thread(). In the 
SR_handler() we can then use a NULL check of SR_lock() to indicate the 
thread has terminated and we return.

While only seen on Linux I took the opportunity to apply the fix on all 
platforms and also cleaned up the code where we were using 
Thread::current() unsafely in a signal-handling context.

Testing: regular tier 1 (JPRT)
          Kitchensink (in progress)

As we can't readily reproduce the problem I tested this by having a 
terminating thread raise SR_signum directly from within the ~Thread 
destructor.

Thanks,
David


More information about the serviceability-dev mailing list