RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state

Tue Nov 12 04:52:42 UTC 2019

webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
bug: https://bugs.openjdk.java.net/browse/JDK-8233549

In JDK-8229516 I moved the interrupted state of a thread from the 
osThread in the VM to the java.lang.Thread instance. In doing that I 
overlooked a critical aspect, which is that to access the field of a 
Java object the JavaThread must not be in a safepoint-safe state** - 
otherwise the oop, and anything referenced there from could be relocated 
by the GC whilst the JavaThread is accessing it. This manifested in a 
number of tests using JVM TI Agent threads and JVM TI RawMonitors 
because the JavaThread's were marked _thread_blocked and hence 
safepoint-safe, and we read a non-zero value for the interrupted field 
even though we had never been interrupted.

This problem existed in all the code that checks for interruption when 
"waiting":

- Parker::park (the code underpinning 
java.util.concurrent.LockSupport.park())

To fix this code I simply deleted a late check of the interrupted field. 
The check was not needed because if an interrupt has occurred then we 
will find the ParkEvent in a signalled state.

- ObjectMonitor::wait

Here the late check of the interrupted state is essential as we reset 
the ParkEvent after an earlier check of the interrupted state. But the 
fix was simply achieved by moving the check slightly earlier before we 
use ThreadBlockInVm to become _thread_blocked.

- RawMonitor::wait

This fix was much more involved. The RawMonitor code directly 
transitions the JavaThread from _thread_in_Native to _thread_blocked. 
This is safe from a safepoint perspective because they are equivalent 
safepoint-safe states. To allow access to the interrupted field I have 
to transition from native to _thread_in_vm, and that has to be done by 
proper thread-state transitions to ensure correct access to the oop and 
its fields. Having done that I can then use ThreadBlockInVM for the 
transitions to blocked. However, as the old code noted it can't use 
proper thread-state transitions as this will lead to deadlocks with the 
VMThread that can also use RawMonitors when executing various event 
callbacks. To deal with that we have to note that the real constraint is 
that the JavaThread cannot block at a safepoint whilst it holds the 
RawMonitor. Hence the fix was push all the interrupt checking code and 
the thread-state transitions to the lowest level of RawMonitorWait, 
around the final park() call, after we have enqueued the waiter and 
released the monitor. That avoids any deadlock possibility.

I also added checks to is_interrupted/interrupted to ensure they are 
only called by a thread in a suitable state. This should only be the 
VMThread (as a consequence of the Thread.stop implementation occurring 
at a safepoint and issuing a JavaThread::interrupt() call to unblock the 
target); or a JavaThread that is not _thread_in_native or _thread_blocked.

Testing: (still finalizing)
  - tiers 1 - 6 (Oracle platforms)
  - Local Linux testing
   - vmTestbase/nsk/monitoring/
   - vmTestbase/nsk/jdwp
   - vmTestbase/nsk/jdb/
   - vmTestbase/nsk/jdi/
   - vmTestbase/nsk/jvmti/
   - serviceability/jvmti/
   - serviceability/jdwp
   - JDK: java/lang/management
          com/sun/management

** Note that this applies to all accesses we make via code in 
javaClasses.*. For this particular code I thought about adding a guard 
in JavaThread::threadObj() but it turns out when we generate a crash 
report we access the Thread's name() field and that can happen when in 
any state, so we'd always trigger a secondary assertion failure during 
error reporting if we did that. Note that accessing name() can still 
easily lead to secondary assertions failures as I discovered when trying 
to debug this and print the thread name out - I would see an is_instance 
assertion fail checking that the Thread name() is an instance of 
java.lang.String!

Thanks,
David
-----