RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state
David Holmes
david.holmes at oracle.com
Fri Nov 15 02:32:04 UTC 2019
Thanks again Serguei.
David
On 15/11/2019 12:14 pm, serguei.spitsyn at oracle.com wrote:
> Hi David,
>
> Thank you for the update!
> It looks good to me.
>
> You are right about my first suggestion.
> The lines need to stay where they are, or additional curly brackets
> are needed to force the ThreadBlockInVM destructor earlier.
>
> Thanks,
> Serguei
>
>
> On 11/14/19 2:21 PM, David Holmes wrote:
>> Hi Serguei,
>>
>> Thanks for taking a look.
>>
>> On 15/11/2019 4:04 am, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>> It looks good to me.
>>> A couple of nits below.
>>>
>>> http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html
>>>
>>>
>>> 236 if (self->is_Java_thread()) {
>>> 237 JavaThread* jt = (JavaThread*) self;
>>> 238 // Transition to VM so we can check interrupt state
>>> 239 ThreadInVMfromNative tivm(jt);
>>> 240 if (jt->is_interrupted(true)) {
>>> 241 ret = M_INTERRUPTED;
>>> 242 } else {
>>> 243 ThreadBlockInVM tbivm(jt);
>>> 244 jt->set_suspend_equivalent();
>>> 245 if (millis <= 0) {
>>> 246 self->_ParkEvent->park();
>>> 247 } else {
>>> 248 self->_ParkEvent->park(millis);
>>> 249 }
>>> 250 }
>>> 251 // Return to VM before post-check of interrupt state
>>> 252 if (jt->is_interrupted(true)) {
>>> 253 ret = M_INTERRUPTED;
>>> 254 }
>>> 255 } else {
>>>
>>>
>>> It seems, the fragment at lines 251-254 needs to bebefore the line 250.
>>> It will add more clarity to this code.
>>
>> No, it has to be after line 250 as that is when we will hit the TBIVM
>> destructor and so return to _thread_in_vm which is the state needed to
>> read the interrupted field. Dan commented on the above and I changed
>> it slightly by moving the comment:
>>
>> > 250 // Return to VM before post-check of interrupt state
>> > 251 }
>> > 252 if (jt->is_interrupted(true)) {
>> > 253 ret = M_INTERRUPTED;
>> > 254 }
>>
>>
>>> 412 if (self->is_Java_thread()) {
>>> 413 JavaThread* jt = (JavaThread*)self;
>>> 414 jt->set_suspend_equivalent();
>>> 415 for (;;) {
>>> 416 if (!jt->handle_special_suspend_equivalent_condition()) {
>>> 417 break;
>>> 418 } else {
>>> 419 // We've been suspended whilst waiting and so we have to
>>> 420 // relinquish the raw monitor until we are resumed. Of course
>>> 421 // after reacquiring we have to re-check for suspension again.
>>> 422 // Suspension requires we are _thread_blocked, and we also have to
>>> 423 // recheck for being interrupted.
>>> 424 simple_exit(jt);
>>> 425 {
>>> 426 ThreadInVMfromNative tivm(jt);
>>> 427 {
>>> 428 ThreadBlockInVM tbivm(jt);
>>> 429 jt->java_suspend_self();
>>> 430 }
>>> 431 if (jt->is_interrupted(true)) {
>>> 432 ret = M_INTERRUPTED;
>>> 433 }
>>> 434 }
>>> 435 simple_enter(jt);
>>> 436 jt->set_suspend_equivalent();
>>> 437 }
>>> ...
>>>
>>> This code can be simplified a little bit.
>>> The line:
>>>
>>> 414 jt->set_suspend_equivalent();
>>>
>>> can be placed before line 416.
>>> Then this line can be removed:
>>>
>>> 436 jt->set_suspend_equivalent();
>>
>> Yes you're right. I was trying to preserve the original loop
>> structure, but then had to add the additional set_suspend_equivalent
>> for the first iteration. But I can instead just move the existing one
>> to the top of the loop.
>>
>> Webrev updated in place.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/11/19 20:52, David Holmes wrote:
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8233549
>>>>
>>>> In JDK-8229516 I moved the interrupted state of a thread from the
>>>> osThread in the VM to the java.lang.Thread instance. In doing that I
>>>> overlooked a critical aspect, which is that to access the field of a
>>>> Java object the JavaThread must not be in a safepoint-safe state** -
>>>> otherwise the oop, and anything referenced there from could be
>>>> relocated by the GC whilst the JavaThread is accessing it. This
>>>> manifested in a number of tests using JVM TI Agent threads and JVM
>>>> TI RawMonitors because the JavaThread's were marked _thread_blocked
>>>> and hence safepoint-safe, and we read a non-zero value for the
>>>> interrupted field even though we had never been interrupted.
>>>>
>>>> This problem existed in all the code that checks for interruption
>>>> when "waiting":
>>>>
>>>> - Parker::park (the code underpinning
>>>> java.util.concurrent.LockSupport.park())
>>>>
>>>> To fix this code I simply deleted a late check of the interrupted
>>>> field. The check was not needed because if an interrupt has occurred
>>>> then we will find the ParkEvent in a signalled state.
>>>>
>>>> - ObjectMonitor::wait
>>>>
>>>> Here the late check of the interrupted state is essential as we
>>>> reset the ParkEvent after an earlier check of the interrupted state.
>>>> But the fix was simply achieved by moving the check slightly earlier
>>>> before we use ThreadBlockInVm to become _thread_blocked.
>>>>
>>>> - RawMonitor::wait
>>>>
>>>> This fix was much more involved. The RawMonitor code directly
>>>> transitions the JavaThread from _thread_in_Native to
>>>> _thread_blocked. This is safe from a safepoint perspective because
>>>> they are equivalent safepoint-safe states. To allow access to the
>>>> interrupted field I have to transition from native to _thread_in_vm,
>>>> and that has to be done by proper thread-state transitions to ensure
>>>> correct access to the oop and its fields. Having done that I can
>>>> then use ThreadBlockInVM for the transitions to blocked. However, as
>>>> the old code noted it can't use proper thread-state transitions as
>>>> this will lead to deadlocks with the VMThread that can also use
>>>> RawMonitors when executing various event callbacks. To deal with
>>>> that we have to note that the real constraint is that the JavaThread
>>>> cannot block at a safepoint whilst it holds the RawMonitor. Hence
>>>> the fix was push all the interrupt checking code and the
>>>> thread-state transitions to the lowest level of RawMonitorWait,
>>>> around the final park() call, after we have enqueued the waiter and
>>>> released the monitor. That avoids any deadlock possibility.
>>>>
>>>> I also added checks to is_interrupted/interrupted to ensure they are
>>>> only called by a thread in a suitable state. This should only be the
>>>> VMThread (as a consequence of the Thread.stop implementation
>>>> occurring at a safepoint and issuing a JavaThread::interrupt() call
>>>> to unblock the target); or a JavaThread that is not
>>>> _thread_in_native or _thread_blocked.
>>>>
>>>> Testing: (still finalizing)
>>>> - tiers 1 - 6 (Oracle platforms)
>>>> - Local Linux testing
>>>> - vmTestbase/nsk/monitoring/
>>>> - vmTestbase/nsk/jdwp
>>>> - vmTestbase/nsk/jdb/
>>>> - vmTestbase/nsk/jdi/
>>>> - vmTestbase/nsk/jvmti/
>>>> - serviceability/jvmti/
>>>> - serviceability/jdwp
>>>> - JDK: java/lang/management
>>>> com/sun/management
>>>>
>>>> ** Note that this applies to all accesses we make via code in
>>>> javaClasses.*. For this particular code I thought about adding a
>>>> guard in JavaThread::threadObj() but it turns out when we generate a
>>>> crash report we access the Thread's name() field and that can happen
>>>> when in any state, so we'd always trigger a secondary assertion
>>>> failure during error reporting if we did that. Note that accessing
>>>> name() can still easily lead to secondary assertions failures as I
>>>> discovered when trying to debug this and print the thread name out -
>>>> I would see an is_instance assertion fail checking that the Thread
>>>> name() is an instance of java.lang.String!
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>
>
More information about the hotspot-runtime-dev
mailing list