RFR: 6498581: ThreadInterruptTest3 produces wrong output on Windows
Karen Kinnear
karen.kinnear at oracle.com
Tue Feb 25 15:37:14 PST 2014
I think Vladimir has a good point - the way I read this, Windows is the only platform
that has this issue, so making this conditional for Windows makes a lot of sense. Thanks
for doing this Yumin.
thanks,
Karen
On Feb 25, 2014, at 5:55 PM, Vladimir Kozlov wrote:
> Repeating my question from other thread.
>
> Why you want to remove it on all platforms and not for Windows only?
>
> thanks,
> Vladimir
>
> On 2/25/14 2:38 PM, Yumin Qi wrote:
>> Hi, Karen and all
>>
>> I have modified inline_native_isInterrupted to eliminate second fast
>> path. Please check webrev at
>>
>> http://cr.openjdk.java.net/~minqi/6498581/webrev01/
>>
>> Also need review from compiler team if the changes is OK.
>>
>> Thanks
>> Yumin
>>
>>
>> On 2/19/2014 1:15 PM, Karen Kinnear wrote:
>>> Yumin,
>>>
>>> The code fix looks good.
>>>
>>> The question is whether to leave in the intrinsic or not since it
>>> could cause inconsistent data
>>> to be returned.
>>>
>>> One way to deal with this would be to modify the intrinsic to only
>>> optimize the case in
>>> which the thread is not interrupted, i.e.
>>> t = Thread.current() && (!TLS._osthread._interrupted)
>>>
>>> I think that would remove the inconsistency. Yes, it would only
>>> optimize a subset, but the
>>> not interrupted subset is probably the one that makes the difference.
>>>
>>> Can you work with the compiler folks to make that modification if it
>>> makes sense to you?
>>>
>>> thanks,
>>> Karen
>>>
>>> On Jan 17, 2014, at 2:01 AM, David Holmes wrote:
>>>
>>>> Hi Yumin,
>>>>
>>>> No need to bother the compiler folks with this (still cc'd but can be
>>>> dropped on any reply). We just need to decide if we can tolerate this
>>>> inconsistency between the intrinsic and non-intrinsic forms.
>>>> Otherwise we may choose to disable the intrinsic on Windows - I
>>>> really don't see that it can be adding much value as
>>>> Thread.currentThread().isInterrupted(false) is hardly a performance
>>>> critical action. But we would need benchmarking to confirm that.
>>>>
>>>> David
>>>>
>>>> On 17/01/2014 4:10 PM, Yumin Qi wrote:
>>>>> Now I got it:
>>>>>
>>>>> //------------------------inline_native_isInterrupted------------------
>>>>> // private native boolean java.lang.Thread.isInterrupted(boolean
>>>>> ClearInterrupted);
>>>>> bool LibraryCallKit::inline_native_isInterrupted() {
>>>>> // Add a fast path to t.isInterrupted(clear_int):
>>>>> // (t == Thread.current() && (!TLS._osthread._interrupted ||
>>>>> !clear_int))
>>>>> // ? TLS._osthread._interrupted : /*slow path:*/
>>>>> t.isInterrupted(clear_int)
>>>>> // So, in the common case that the interrupt bit is false,
>>>>> // we avoid making a call into the VM. Even if the interrupt bit
>>>>> // is true, if the clear_int argument is false, we avoid the VM
>>>>> call.
>>>>> // However, if the receiver is not currentThread, we must call
>>>>> the VM,
>>>>> // because there must be some locking done around the operation.
>>>>>
>>>>> // We only go to the fast case code if we pass two guards.
>>>>> // Paths which do not pass are accumulated in the slow_region.
>>>>>
>>>>> this code does not work both before and after the fix then.
>>>>> Maybe we should not go with fast path then. CC to compiler team.
>>>>>
>>>>> Thanks
>>>>> Yumin
>>>>>
>>>>>
>>>>> On 1/16/2014 3:11 PM, David Holmes wrote:
>>>>>> On 17/01/2014 7:24 AM, Yumin Qi wrote:
>>>>>>> Dan,
>>>>>>>
>>>>>>>
>>>>>>> On 1/16/2014 12:48 PM, Daniel D. Daugherty wrote:
>>>>>>>> On 1/16/14 1:42 PM, Yumin Qi wrote:
>>>>>>>>> David,
>>>>>>>>>
>>>>>>>>> Thanks for the detail analysis so far and supply suggested fix.
>>>>>>>>>
>>>>>>>>> On 1/15/2014 10:21 PM, David Holmes wrote:
>>>>>>>>>> Hi Yumin,
>>>>>>>>>>
>>>>>>>>>> Unfortunately the complexities of this continue to expose
>>>>>>>>>> themselves :(
>>>>>>>>>>
>>>>>>>>>> On 15/01/2014 11:00 AM, Yumin Qi wrote:
>>>>>>>>>>> Can I have your codereview of the change
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/6498581/webrev00
>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/6498581/webrev00/>
>>>>>>>>>>>
>>>>>>>>>>> Summary: There is race condition between os::interrupt and
>>>>>>>>>>> os::is_interrupted on Windows. See bug comments in detail. When
>>>>>>>>>>> thread
>>>>>>>>>>> sleep check if it gets interrupted, it may see interrupted but
>>>>>>>>>>> not
>>>>>>>>>>> really interrupted so cause spurious waking up (early return from
>>>>>>>>>>> sleep). Fix by checking if a real interrupt sent to thread
>>>>>>>>>>> interrupt
>>>>>>>>>>> event can prevent such false return. On windows we can get
>>>>>>>>>>> away the
>>>>>>>>>>> interrupted field but to keep consistent with other platforms, I
>>>>>>>>>>> choose
>>>>>>>>>>> to leave it as it is.
>>>>>>>>>>>
>>>>>>>>>>> Contributed-By: David Holmes
>>>>>>>>>> Not really. :) I suggested using WaitForSingleObject on the
>>>>>>>>>> interrupt event to detect if the event was signalled or not, and
>>>>>>>>>> hence whether the thread was interrupted or not. I had
>>>>>>>>>> envisaged it
>>>>>>>>>> replacing use of the interrupted field altogether on windows -
>>>>>>>>>> but I
>>>>>>>>>> had overlooked the fact that the _interrupted field of osThread is
>>>>>>>>>> maintained in shared code, and further that there is an intrinsic
>>>>>>>>>> for Thread.isInterrupted that uses that field!
>>>>>>>>>>
>>>>>>>>> I can add myself after you, the main contributor for this fix.
>>>>>>>>>> The fix you have provided in os::is_interrupted deals nicely with
>>>>>>>>>> the race between setting/clearing the field and
>>>>>>>>>> setting/clearing the
>>>>>>>>>> event. Because of the strong ordering in os::interrupt:
>>>>>>>>>>
>>>>>>>>>> 1. osthread->set_interrupted(true);
>>>>>>>>>> 2. OrderAccess::release();
>>>>>>>>>> 3. SetEvent(osthread->interrupt_event());
>>>>>>>>>>
>>>>>>>>>> we can require both conditions to be met to define that the thread
>>>>>>>>>> is indeed interrupted. If we have set the field but not the event
>>>>>>>>>> then is_interrupted will return false without touching the
>>>>>>>>>> field or
>>>>>>>>>> the event. If the thread then blocks it will either unblock
>>>>>>>>>> immediately if setEvent has now been called, or will unblock when
>>>>>>>>>> setEvent is eventually called. That all seems to work fine.
>>>>>>>>>>
>>>>>>>>>> The intrinsic is a fast-path that inlines the access to the
>>>>>>>>>> _interrupted field, but only for the current thread and only when
>>>>>>>>>> not trying to also clear the interrupted state (ie only for
>>>>>>>>>> Thread.currentThread().isInterrupted()**). That seems mostly
>>>>>>>>>> harmless even if the thread is concurrently halfway through being
>>>>>>>>>> interrupted (ie the field is set but not the event) as any
>>>>>>>>>> interrupt
>>>>>>>>>> responsive operation will at some point call the actual
>>>>>>>>>> os::is_interrupted() runtime code.
>>>>>>>>>>
>>>>>>>>>> The only glitch I see is that it would now be possible for the
>>>>>>>>>> following code to throw the exception:
>>>>>>>>>>
>>>>>>>>>> boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>>> boolean b = Thread.interrupted();
>>>>>>>>>> if (a && !b)
>>>>>>>>>> throw new Error("inconsistent Thread interrupt state observed");
>>>>>>>>>>
>>>>>>>>> Hope no one will write code like this --- but there is other
>>>>>>>>> possibility that multiple interrupting actions sent to the
>>>>>>>>> thread so
>>>>>>>>> such check can not guarantee it returns consistent result. I see
>>>>>>>>> the
>>>>>>>>> problem here but think such coding is very rare. With we return
>>>>>>>>> real
>>>>>>>>> status of Event, 'a' get false means the Event is not set yet ----
>>>>>>>>> and os::is_interrupted returns false is right.
>>>>>>>> I think you may have missed David's point here. The earlier check:
>>>>>>>>
>>>>>>>> boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>
>>>>>>>> is returning 'true' and a later check:
>>>>>>>>
>>>>>>>> boolean b = Thread.interrupted();
>>>>>>>>
>>>>>>>> is returning 'false'. If variable 'a' is 'true', then 'b' must
>>>>>>>> also be 'true' to be consistent.
>>>>>>>>
>>>>>>> 1) a = true, b = true
>>>>>>> a: it will not clear interrupted field. So if we get a 'true', means
>>>>>>> interrupted field is true and Event signaled. But from code:
>>>>>>>
>>>>>>> if (interrupted && clear_interrupted) {
>>>>>>> osthread->set_interrupted(false);
>>>>>>> ResetEvent(osthread->interrupt_event());
>>>>>>> } // Otherwise leave the interrupted state alone
>>>>>>>
>>>>>>> We did not do this since clear_interrupted is false.
>>>>>> You are missing the point. If Thread.currentThread().isInterrupted()
>>>>>> is compiled at runtime using the intrinisc then it will return true if
>>>>>> the field is set even if the Event is not. At that point the
>>>>>> non-intrinsic Thread.interrupted would return false. This would give
>>>>>> the appearance that the thread was interrupted then not interrupted
>>>>>> but that is impossible based on the specs for those methods.
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>
>>>>>>> b: now we query with clear_interrupted is true.
>>>>>>> Note the Event still signaled ( We assume no other interrupt
>>>>>>> actions
>>>>>>> from other threads), the os::is_interrupted will return 'true' since
>>>>>>> field is 'true' and Event in signaled status.
>>>>>>> We do the clear before return 'true' so next if we have call
>>>>>>> say, c,
>>>>>>> will get 'false'.
>>>>>>>
>>>>>>> 2) a = false b= true
>>>>>>> There is a situation like
>>>>>>> a: get a 'false' ---- the interrupt is going on and SetEvent not
>>>>>>> called
>>>>>>> yet, but field already set to interrupted so get a 'false'.
>>>>>>> b: When b called, interrupt action done, so now field is
>>>>>>> interrupted
>>>>>>> and Event signaled. We get a 'true'.
>>>>>>>
>>>>>>> (Assume only one interrupt)
>>>>>>>
>>>>>>> But, anyway, a = false and b = true should not be a problem I
>>>>>>> think. We
>>>>>>> treat interrupted as both field and Event set.
>>>>>>> It can not get a is true and b is false in the situation.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Yumin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>>>> I don't think this is a realistic situation given for the issue to
>>>>>>>>>> arise you need the interrupting thread to be effectively
>>>>>>>>>> descheduled
>>>>>>>>>> after setting the field but before setting the event. But still it
>>>>>>>>>> is possible and could be observable. I'm inclined to accept this
>>>>>>>>>> possibility but I think Karen should make that call. I also
>>>>>>>>>> think we
>>>>>>>>>> would need to document this possibility just in case it does
>>>>>>>>>> arise.
>>>>>>>>>>
>>>>>>>>>> ** I've no idea why this intrinsic exists as it is a fairly
>>>>>>>>>> uncommon
>>>>>>>>>> usage. I guess at some point it was thought that this would be
>>>>>>>>>> common, but most interrupt responsive code uses Thread.interrupted
>>>>>>>>>> to clear the interrupt state as well as query it.
>>>>>>>>>>
>>>>>>>>> agree.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> Tests: vm.quick.testlist, specjbb2005, original test case, JPRT
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>
More information about the hotspot-compiler-dev
mailing list