RFR(XS) 8252521: possible race in java_suspend_self_with_safepoint_check

Fri Sep 4 17:15:36 UTC 2020

Richard,

Sorry for the late review. I know you have already pushed the fix.

src/hotspot/share/runtime/thread.cpp
     L2625:   } while (is_external_suspend());
         In the other places where we are checking for a racing suspend
         request, we check the is_external_suspend() condition while
         holding the SR_lock or we call is_external_suspend_with_lock().

         Here's the usual example I point people at:

         L2049: void JavaThread::exit(bool destroy_vm, ExitType exit_type) {
<snip>
         L2121:     while (true) {
         L2122:       {
         L2123:         MutexLocker ml(SR_lock(), 
Mutex::_no_safepoint_check_flag);
         L2124:         if (!is_external_suspend()) {

         The JVM/TI SuspendThread() and JVM_SuspendThread() entry points
         call set_external_suspend() while holding the SR_lock so the
         only way to be sure you haven't lost the race is to hold the
         SR_lock while you're checking the flag yourself.

         Have I missed something in my analysis?

Dan

On 9/2/20 11:15 AM, Reingruber, Richard wrote:
> Hi,
>
> please help review this fix for a race condition in
> JavaThread::java_suspend_self_with_safepoint_check() that allows a suspended
> thread to continue executing java for an arbitrary long time (see repro test
> attached to bug report).
>
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8252521/webrev.0/
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8252521
>
> The fix is to add a do-while-loop to java_suspend_self_with_safepoint_check()
> that checks if the current thread was suspended again after returning from
> java_suspend_self() and before restoring the original thread state. The check is
> performed after restoring the original state because then we are guaranteed to
> see the suspend request issued before the requester observed that target to be
> _thread_blocked and executed VM_ThreadSuspend.
>
> Thanks, Richard.