RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state
David Holmes
david.holmes at oracle.com
Tue Nov 12 22:50:15 UTC 2019
Hi Dan,
Thanks for taking a look so quickly!
On 13/11/2019 3:18 am, Daniel D. Daugherty wrote:
> On 11/11/19 11:52 PM, David Holmes wrote:
>> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
>
> src/hotspot/os/posix/os_posix.cpp
> L2078: // Can't access interrupt state now we are
> _thread_blocked. If we've been
> L2079: // interrupted since we checked above then _counter will
> be > 0.
> nit - grammar. Please consider:
> // Can't access interrupt state now that we are
> _thread_blocked. If we've
> // been interrupted since we checked above then _counter
> will be > 0.
>
> src/hotspot/os/solaris/os_solaris.cpp
> L4924: // Can't access interrupt state now we are
> _thread_blocked. If we've been
> L4925: // interrupted since we checked above then _counter will
> be > 0.
> nit - grammar. Please consider:
> // Can't access interrupt state now that we are
> _thread_blocked. If we've
> // been interrupted since we checked above then _counter
> will be > 0.
Will fix grammatical nits.
> src/hotspot/share/classfile/javaClasses.cpp
> No comments.
>
> src/hotspot/share/prims/jvmtiEnv.cpp
> Hmmm... did the "non-JavaThread can't be interrupted" check also get
> pushed down?
> Update: Similar check is now in JvmtiRawMonitor::raw_wait().
>
> src/hotspot/share/prims/jvmtiRawMonitor.cpp
> L239: ThreadInVMfromNative tivm(jt);
> L240: if (jt->is_interrupted(true)) {
> L241: ret = M_INTERRUPTED;
> L242: } else {
> L243: ThreadBlockInVM tbivm(jt);
> L244: jt->set_suspend_equivalent();
> L245: if (millis <= 0) {
> L246: self->_ParkEvent->park();
> L247: } else {
> L248: self->_ParkEvent->park(millis);
> L249: }
> L250: }
> L251: // Return to VM before post-check of interrupt state
> L252: if (jt->is_interrupted(true)) {
> The comment on L251 is better between L249 and L250 since that
> is where 'tbivm' gets destroyed and you transition back.
>
> You could have this comment before L252:
>
> // Must be in VM to safely access interrupt state:
>
> if you think you really need a comment there.
Will move comment up as suggested.
> src/hotspot/share/prims/jvmtiRawMonitor.hpp
> No comments.
>
> src/hotspot/share/runtime/objectMonitor.cpp
> You've moved the is_interrupted() check from after ThreadBlockInVM
> to before it. ThreadBlockInVM can block for a safepoint which widens
> the window for an interrupt to come in after the check on L1272 and
> and before the thread parks on L1286 or L1288.
>
> Can this result in an unexpected park() where before we would have
> taken the "Intentionally empty" code path on L1283?
>
> What I'm worried about is whether we've opened a window where we
> do Object.wait(0) and that wait() is supposed to be interrupted.
> However, we lose that interrupt because it arrives in the now wider
> window between L1272 and L1286 and we never return from the wait(0).
>
> It is possible that I'm not remembering something about how
> interrupt()
> interacts with park().
The interrupt() not only sets the field but also issues an unpark() to
the ParkEvent. So if we are interrupted whilst processing through the
TBIVM, the call to park() will return immediately as the ParkEvent will
be in the signalled state.
> test/hotspot/jtreg/ProblemList.txt
> Thanks for remembering to update the ProblemList.
>
> The only part I'm worried about is ObjectMonitor::wait(). If my worry is
> baseless, then thumbs up.
Worry is baseless :)
> I have a couple of nits above. If you choose to fix those, then I don't
> need to see another webrev.
Thanks again!
David
-----
> Dan
>
>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549
>>
>> In JDK-8229516 I moved the interrupted state of a thread from the
>> osThread in the VM to the java.lang.Thread instance. In doing that I
>> overlooked a critical aspect, which is that to access the field of a
>> Java object the JavaThread must not be in a safepoint-safe state** -
>> otherwise the oop, and anything referenced there from could be
>> relocated by the GC whilst the JavaThread is accessing it. This
>> manifested in a number of tests using JVM TI Agent threads and JVM TI
>> RawMonitors because the JavaThread's were marked _thread_blocked and
>> hence safepoint-safe, and we read a non-zero value for the interrupted
>> field even though we had never been interrupted.
>>
>> This problem existed in all the code that checks for interruption when
>> "waiting":
>>
>> - Parker::park (the code underpinning
>> java.util.concurrent.LockSupport.park())
>>
>> To fix this code I simply deleted a late check of the interrupted
>> field. The check was not needed because if an interrupt has occurred
>> then we will find the ParkEvent in a signalled state.
>>
>> - ObjectMonitor::wait
>>
>> Here the late check of the interrupted state is essential as we reset
>> the ParkEvent after an earlier check of the interrupted state. But the
>> fix was simply achieved by moving the check slightly earlier before we
>> use ThreadBlockInVm to become _thread_blocked.
>>
>> - RawMonitor::wait
>>
>> This fix was much more involved. The RawMonitor code directly
>> transitions the JavaThread from _thread_in_Native to _thread_blocked.
>> This is safe from a safepoint perspective because they are equivalent
>> safepoint-safe states. To allow access to the interrupted field I have
>> to transition from native to _thread_in_vm, and that has to be done by
>> proper thread-state transitions to ensure correct access to the oop
>> and its fields. Having done that I can then use ThreadBlockInVM for
>> the transitions to blocked. However, as the old code noted it can't
>> use proper thread-state transitions as this will lead to deadlocks
>> with the VMThread that can also use RawMonitors when executing various
>> event callbacks. To deal with that we have to note that the real
>> constraint is that the JavaThread cannot block at a safepoint whilst
>> it holds the RawMonitor. Hence the fix was push all the interrupt
>> checking code and the thread-state transitions to the lowest level of
>> RawMonitorWait, around the final park() call, after we have enqueued
>> the waiter and released the monitor. That avoids any deadlock
>> possibility.
>>
>> I also added checks to is_interrupted/interrupted to ensure they are
>> only called by a thread in a suitable state. This should only be the
>> VMThread (as a consequence of the Thread.stop implementation occurring
>> at a safepoint and issuing a JavaThread::interrupt() call to unblock
>> the target); or a JavaThread that is not _thread_in_native or
>> _thread_blocked.
>>
>> Testing: (still finalizing)
>> - tiers 1 - 6 (Oracle platforms)
>> - Local Linux testing
>> - vmTestbase/nsk/monitoring/
>> - vmTestbase/nsk/jdwp
>> - vmTestbase/nsk/jdb/
>> - vmTestbase/nsk/jdi/
>> - vmTestbase/nsk/jvmti/
>> - serviceability/jvmti/
>> - serviceability/jdwp
>> - JDK: java/lang/management
>> com/sun/management
>>
>> ** Note that this applies to all accesses we make via code in
>> javaClasses.*. For this particular code I thought about adding a guard
>> in JavaThread::threadObj() but it turns out when we generate a crash
>> report we access the Thread's name() field and that can happen when in
>> any state, so we'd always trigger a secondary assertion failure during
>> error reporting if we did that. Note that accessing name() can still
>> easily lead to secondary assertions failures as I discovered when
>> trying to debug this and print the thread name out - I would see an
>> is_instance assertion fail checking that the Thread name() is an
>> instance of java.lang.String!
>>
>> Thanks,
>> David
>> -----
>
More information about the hotspot-runtime-dev
mailing list