RFR: 6498581: ThreadInterruptTest3 produces wrong output on Windows

Karen Kinnear karen.kinnear at oracle.com
Wed Feb 26 23:07:43 UTC 2014


Looks good to me as well. Many thanks for following this all the way through Yumin.

thanks,
Karen

On Feb 26, 2014, at 5:17 PM, Yumin Qi wrote:

> Thanks for your help and review.
> 
> Yumin
> 
> On 2/26/2014 1:47 PM, Vladimir Kozlov wrote:
>> Looks good.
>> 
>> Thanks,
>> Vladimir
>> 
>> On 2/26/14 1:06 PM, Yumin Qi wrote:
>>> Hi, Vladimir
>>> 
>>>   Please review the new version at:
>>>   http://cr.openjdk.java.net/~minqi/6498581/webrev02/
>>> 
>>>   In this version, I use
>>> #ifndef  TARGET_OS_FAMILY_windows
>>> 
>>>   to guard none Windows part for and for Windows part, directly setting
>>> control to slow path.
>>> 
>>>    Tested with -XX:+PrintCompilation and the output on Windows is
>>> normal (version rwebrev01 has malform of control flow).
>>> 
>>>     Tested on Windows/Linux.
>>> 
>>> Thanks
>>> Yumin
>>> 
>>> On 2/25/2014 3:50 PM, Vladimir Kozlov wrote:
>>>> Yumin,
>>>> 
>>>> You need only one #ifdef for that:
>>>> 
>>>> #ifdef TARGET_OS_FAMILY_windows
>>>>  result_rgn->init_req(no_clear_result_path, top());
>>>>  result_val->init_req(no_clear_result_path, top());
>>>> #else
>>>> 
>>>> Thanks,
>>>> Vladimir
>>>> 
>>>> On 2/25/14 3:40 PM, Yumin Qi wrote:
>>>>> Thanks. I will change for Windows only --- thanks again.
>>>>> 
>>>>> Yumin
>>>>> 
>>>>> On 2/25/2014 3:37 PM, Karen Kinnear wrote:
>>>>>> I think Vladimir has a good point - the way I read this, Windows is
>>>>>> the only platform
>>>>>> that has this issue, so making this conditional for Windows makes a
>>>>>> lot of sense. Thanks
>>>>>> for doing this Yumin.
>>>>>> 
>>>>>> thanks,
>>>>>> Karen
>>>>>> 
>>>>>> On Feb 25, 2014, at 5:55 PM, Vladimir Kozlov wrote:
>>>>>> 
>>>>>>> Repeating my question from other thread.
>>>>>>> 
>>>>>>> Why you want to remove it on all platforms and not for Windows only?
>>>>>>> 
>>>>>>> thanks,
>>>>>>> Vladimir
>>>>>>> 
>>>>>>> On 2/25/14 2:38 PM, Yumin Qi wrote:
>>>>>>>> Hi, Karen and all
>>>>>>>> 
>>>>>>>>   I have modified inline_native_isInterrupted to eliminate second
>>>>>>>> fast
>>>>>>>> path. Please check webrev at
>>>>>>>> 
>>>>>>>>   http://cr.openjdk.java.net/~minqi/6498581/webrev01/
>>>>>>>> 
>>>>>>>>   Also need review from compiler team if the changes is OK.
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 2/19/2014 1:15 PM, Karen Kinnear wrote:
>>>>>>>>> Yumin,
>>>>>>>>> 
>>>>>>>>> The code fix looks good.
>>>>>>>>> 
>>>>>>>>> The question is whether to leave in the intrinsic or not since it
>>>>>>>>> could cause inconsistent data
>>>>>>>>> to be returned.
>>>>>>>>> 
>>>>>>>>> One way to deal with this would be to modify the intrinsic to only
>>>>>>>>> optimize the case in
>>>>>>>>> which the thread is not interrupted, i.e.
>>>>>>>>> t = Thread.current() && (!TLS._osthread._interrupted)
>>>>>>>>> 
>>>>>>>>> I think that would remove the inconsistency. Yes, it would only
>>>>>>>>> optimize a subset, but the
>>>>>>>>> not interrupted subset is probably the one that makes the
>>>>>>>>> difference.
>>>>>>>>> 
>>>>>>>>> Can you work with the compiler folks to make that modification if it
>>>>>>>>> makes sense to you?
>>>>>>>>> 
>>>>>>>>> thanks,
>>>>>>>>> Karen
>>>>>>>>> 
>>>>>>>>> On Jan 17, 2014, at 2:01 AM, David Holmes wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Yumin,
>>>>>>>>>> 
>>>>>>>>>> No need to bother the compiler folks with this (still cc'd but
>>>>>>>>>> can be
>>>>>>>>>> dropped on any reply). We just need to decide if we can tolerate
>>>>>>>>>> this
>>>>>>>>>> inconsistency between the intrinsic and non-intrinsic forms.
>>>>>>>>>> Otherwise we may choose to disable the intrinsic on Windows - I
>>>>>>>>>> really don't see that it can be adding much value as
>>>>>>>>>> Thread.currentThread().isInterrupted(false) is hardly a performance
>>>>>>>>>> critical action. But we would need benchmarking to confirm that.
>>>>>>>>>> 
>>>>>>>>>> David
>>>>>>>>>> 
>>>>>>>>>> On 17/01/2014 4:10 PM, Yumin Qi wrote:
>>>>>>>>>>> Now I got it:
>>>>>>>>>>> 
>>>>>>>>>>> //------------------------inline_native_isInterrupted------------------ 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> // private native boolean java.lang.Thread.isInterrupted(boolean
>>>>>>>>>>> ClearInterrupted);
>>>>>>>>>>> bool LibraryCallKit::inline_native_isInterrupted() {
>>>>>>>>>>>   // Add a fast path to t.isInterrupted(clear_int):
>>>>>>>>>>>   //   (t == Thread.current() && (!TLS._osthread._interrupted ||
>>>>>>>>>>> !clear_int))
>>>>>>>>>>>   //   ? TLS._osthread._interrupted : /*slow path:*/
>>>>>>>>>>> t.isInterrupted(clear_int)
>>>>>>>>>>>   // So, in the common case that the interrupt bit is false,
>>>>>>>>>>>   // we avoid making a call into the VM.  Even if the
>>>>>>>>>>> interrupt bit
>>>>>>>>>>>   // is true, if the clear_int argument is false, we avoid the VM
>>>>>>>>>>> call.
>>>>>>>>>>>   // However, if the receiver is not currentThread, we must call
>>>>>>>>>>> the VM,
>>>>>>>>>>>   // because there must be some locking done around the
>>>>>>>>>>> operation.
>>>>>>>>>>> 
>>>>>>>>>>>   // We only go to the fast case code if we pass two guards.
>>>>>>>>>>>   // Paths which do not pass are accumulated in the slow_region.
>>>>>>>>>>> 
>>>>>>>>>>> this code does not work both before and after the fix then.
>>>>>>>>>>> Maybe we should not go with fast path then. CC to compiler team.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 1/16/2014 3:11 PM, David Holmes wrote:
>>>>>>>>>>>> On 17/01/2014 7:24 AM, Yumin Qi wrote:
>>>>>>>>>>>>> Dan,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 1/16/2014 12:48 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>>>>> On 1/16/14 1:42 PM, Yumin Qi wrote:
>>>>>>>>>>>>>>> David,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>  Thanks for the detail analysis so far and supply suggested
>>>>>>>>>>>>>>> fix.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 1/15/2014 10:21 PM, David Holmes wrote:
>>>>>>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Unfortunately the complexities of this continue to expose
>>>>>>>>>>>>>>>> themselves :(
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 15/01/2014 11:00 AM, Yumin Qi wrote:
>>>>>>>>>>>>>>>>> Can I have your codereview of the change
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/6498581/webrev00
>>>>>>>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/6498581/webrev00/>
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Summary: There is race condition between os::interrupt and
>>>>>>>>>>>>>>>>> os::is_interrupted on Windows.  See bug comments in detail.
>>>>>>>>>>>>>>>>> When
>>>>>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>>>> sleep check if it gets interrupted, it may see
>>>>>>>>>>>>>>>>> interrupted but
>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> really interrupted so cause spurious waking up (early
>>>>>>>>>>>>>>>>> return from
>>>>>>>>>>>>>>>>> sleep). Fix by checking if a real interrupt sent to thread
>>>>>>>>>>>>>>>>> interrupt
>>>>>>>>>>>>>>>>> event can prevent such false return. On windows we can get
>>>>>>>>>>>>>>>>> away the
>>>>>>>>>>>>>>>>> interrupted field but to keep consistent with other
>>>>>>>>>>>>>>>>> platforms, I
>>>>>>>>>>>>>>>>> choose
>>>>>>>>>>>>>>>>> to leave it as it is.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Contributed-By: David Holmes
>>>>>>>>>>>>>>>> Not really. :) I suggested using WaitForSingleObject on the
>>>>>>>>>>>>>>>> interrupt event to detect if the event was signalled or not,
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> hence whether the thread was interrupted or not. I had
>>>>>>>>>>>>>>>> envisaged it
>>>>>>>>>>>>>>>> replacing use of the interrupted field altogether on
>>>>>>>>>>>>>>>> windows -
>>>>>>>>>>>>>>>> but I
>>>>>>>>>>>>>>>> had overlooked the fact that the _interrupted field of
>>>>>>>>>>>>>>>> osThread is
>>>>>>>>>>>>>>>> maintained in shared code, and further that there is an
>>>>>>>>>>>>>>>> intrinsic
>>>>>>>>>>>>>>>> for Thread.isInterrupted that uses that field!
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I can add myself after you, the main contributor for this fix.
>>>>>>>>>>>>>>>> The fix you have provided in os::is_interrupted deals nicely
>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>> the race between setting/clearing the field and
>>>>>>>>>>>>>>>> setting/clearing the
>>>>>>>>>>>>>>>> event. Because of the strong ordering in os::interrupt:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 1. osthread->set_interrupted(true);
>>>>>>>>>>>>>>>> 2. OrderAccess::release();
>>>>>>>>>>>>>>>> 3. SetEvent(osthread->interrupt_event());
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> we can require both conditions to be met to define that the
>>>>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>>> is indeed interrupted. If we have set the field but not the
>>>>>>>>>>>>>>>> event
>>>>>>>>>>>>>>>> then is_interrupted will return false without touching the
>>>>>>>>>>>>>>>> field or
>>>>>>>>>>>>>>>> the event. If the thread then blocks it will either unblock
>>>>>>>>>>>>>>>> immediately if setEvent has now been called, or will unblock
>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> setEvent is eventually called. That all seems to work fine.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The intrinsic is a fast-path that inlines the access to the
>>>>>>>>>>>>>>>> _interrupted field, but only for the current thread and only
>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> not trying to also clear the interrupted state (ie only for
>>>>>>>>>>>>>>>> Thread.currentThread().isInterrupted()**). That seems mostly
>>>>>>>>>>>>>>>> harmless even if the thread is concurrently halfway through
>>>>>>>>>>>>>>>> being
>>>>>>>>>>>>>>>> interrupted (ie the field is set but not the event) as any
>>>>>>>>>>>>>>>> interrupt
>>>>>>>>>>>>>>>> responsive operation will at some point call the actual
>>>>>>>>>>>>>>>> os::is_interrupted() runtime code.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The only glitch I see is that it would now be possible for
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> following code to throw the exception:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>>>>>>>>> boolean b = Thread.interrupted();
>>>>>>>>>>>>>>>> if (a && !b)
>>>>>>>>>>>>>>>>  throw new Error("inconsistent Thread interrupt state
>>>>>>>>>>>>>>>> observed");
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hope no one will write code like this --- but there is other
>>>>>>>>>>>>>>> possibility that multiple interrupting actions sent to the
>>>>>>>>>>>>>>> thread so
>>>>>>>>>>>>>>> such check can not guarantee it returns consistent result.
>>>>>>>>>>>>>>> I see
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> problem here but think such coding is very rare.  With we
>>>>>>>>>>>>>>> return
>>>>>>>>>>>>>>> real
>>>>>>>>>>>>>>> status of Event, 'a' get false means the Event is not set yet
>>>>>>>>>>>>>>> ----
>>>>>>>>>>>>>>> and os::is_interrupted returns false is right.
>>>>>>>>>>>>>> I think you may have missed David's point here. The earlier
>>>>>>>>>>>>>> check:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>    boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> is returning 'true' and a later check:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>    boolean b = Thread.interrupted();
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> is returning 'false'. If variable 'a' is 'true', then 'b' must
>>>>>>>>>>>>>> also be 'true' to be consistent.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 1) a = true, b = true
>>>>>>>>>>>>> a: it will not clear interrupted field. So if we get a 'true',
>>>>>>>>>>>>> means
>>>>>>>>>>>>>     interrupted field is true and Event signaled. But from
>>>>>>>>>>>>> code:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   if (interrupted && clear_interrupted) {
>>>>>>>>>>>>>     osthread->set_interrupted(false);
>>>>>>>>>>>>> ResetEvent(osthread->interrupt_event());
>>>>>>>>>>>>>   } // Otherwise leave the interrupted state alone
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   We did not do this since clear_interrupted is false.
>>>>>>>>>>>> You are missing the point. If
>>>>>>>>>>>> Thread.currentThread().isInterrupted()
>>>>>>>>>>>> is compiled at runtime using the intrinisc then it will return
>>>>>>>>>>>> true if
>>>>>>>>>>>> the field is set even if the Event is not. At that point the
>>>>>>>>>>>> non-intrinsic Thread.interrupted would return false. This would
>>>>>>>>>>>> give
>>>>>>>>>>>> the appearance that the thread was interrupted then not
>>>>>>>>>>>> interrupted
>>>>>>>>>>>> but that is impossible based on the specs for those methods.
>>>>>>>>>>>> 
>>>>>>>>>>>> David
>>>>>>>>>>>> -----
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> b: now we query with clear_interrupted is true.
>>>>>>>>>>>>>    Note the Event still signaled ( We assume no other interrupt
>>>>>>>>>>>>> actions
>>>>>>>>>>>>> from other threads), the os::is_interrupted will return 'true'
>>>>>>>>>>>>> since
>>>>>>>>>>>>>    field is 'true' and Event in signaled status.
>>>>>>>>>>>>>    We do the clear before return 'true' so next if we have call
>>>>>>>>>>>>> say, c,
>>>>>>>>>>>>> will get 'false'.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 2) a = false b= true
>>>>>>>>>>>>> There is a situation like
>>>>>>>>>>>>> a:  get a 'false' ---- the interrupt is going on and SetEvent
>>>>>>>>>>>>> not
>>>>>>>>>>>>> called
>>>>>>>>>>>>> yet, but field already set to interrupted so get a 'false'.
>>>>>>>>>>>>> b: When b  called, interrupt action done, so now  field is
>>>>>>>>>>>>> interrupted
>>>>>>>>>>>>> and Event  signaled. We get a 'true'.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> (Assume only one interrupt)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> But, anyway, a = false and b = true should not be a problem I
>>>>>>>>>>>>> think. We
>>>>>>>>>>>>> treat interrupted as both field and Event set.
>>>>>>>>>>>>> It can not get a is true and b is false in the situation.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I don't think this is a realistic situation given for the
>>>>>>>>>>>>>>>> issue to
>>>>>>>>>>>>>>>> arise you need the interrupting thread to be effectively
>>>>>>>>>>>>>>>> descheduled
>>>>>>>>>>>>>>>> after setting the field but before setting the event. But
>>>>>>>>>>>>>>>> still it
>>>>>>>>>>>>>>>> is possible and could be observable. I'm inclined to accept
>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>> possibility but I think Karen should make that call. I also
>>>>>>>>>>>>>>>> think we
>>>>>>>>>>>>>>>> would need to document this possibility just in case it does
>>>>>>>>>>>>>>>> arise.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> ** I've no idea why this intrinsic exists as it is a fairly
>>>>>>>>>>>>>>>> uncommon
>>>>>>>>>>>>>>>> usage. I guess at some point it was thought that this
>>>>>>>>>>>>>>>> would be
>>>>>>>>>>>>>>>> common, but most interrupt responsive code uses
>>>>>>>>>>>>>>>> Thread.interrupted
>>>>>>>>>>>>>>>> to clear the interrupt state as well as query it.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> agree.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Tests: vm.quick.testlist, specjbb2005, original test case,
>>>>>>>>>>>>>>>>> JPRT
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>> 
>>> 
> 



More information about the hotspot-runtime-dev mailing list