RFR: 6498581: ThreadInterruptTest3 produces wrong output on Windows

Wed Feb 19 23:55:14 UTC 2014

Hi, David

   I want to do a performance comparison between intrinsic and 
non-intrinsic, I have run specjvm2005 before, but want to know which 
testing suites are better for this purpose.
   If testing shows no regression by removing inining 
'Thread.isInterrupted(boolean ClearInterrupted)', I guess at least we 
can remove it for Windows?

   Thanks
   Yumin

On 2/19/2014 3:38 PM, David Holmes wrote:
> On 20/02/2014 7:49 AM, Karen Kinnear wrote:
>> David,
>>
>> Did I misread it?
>
> No I misunderstood what you were suggesting - sorry.
>
>> There are two sets of comments - one of which says 
>> (!TLS._interrupted) || !clear_int)
>> and one of which says !TLS._interrupted && !clear_int)
>>
>> But the code appears to have two fast paths:
>> first !TLS_Interrupted -> return false
>>
>> second:
>> TLS._interrupted && !clear_int
>>
>> So I was proposing keeping the first fast path and getting rid of the 
>> second.
>
> Right - I get it now. In simple terms the fast-path can only ever 
> return false, and that can be determined just by reading the low-level 
> _interrupt field. To return true you must read the _interrupt field 
> and check the the event state ie take the slow path. That ensures we 
> can never see:
>
> fastpath -> true
> slowpath -> false
>
> I still wonder though whether this intrinsic has any real value. It 
> optimizes the Java code for:
>
> Thread.currentThread().isInterrupted();
>
> which is not used much at all.
>
> Thanks.
>
> David
>
>> thanks,
>> Karen
>>
>> On Feb 19, 2014, at 4:39 PM, David Holmes wrote:
>>
>>> Karen,
>>>
>>> On 20/02/2014 7:15 AM, Karen Kinnear wrote:
>>>> Yumin,
>>>>
>>>> The code fix looks good.
>>>>
>>>> The question is whether to leave in the intrinsic or not since it 
>>>> could cause inconsistent data
>>>> to be returned.
>>>>
>>>> One way to deal with this would be to modify the intrinsic to only 
>>>> optimize the case in
>>>> which the thread is not interrupted, i.e.
>>>> t = Thread.current() && (!TLS._osthread._interrupted)
>>>
>>> That is already the case. It is further restricted to the case that 
>>> you are not clearing the interrupt bit.
>>>
>>> David
>>> -----
>>>
>>>> I think that would remove the inconsistency. Yes, it would only 
>>>> optimize a subset, but the
>>>> not interrupted subset is probably the one that makes the difference.
>>>>
>>>> Can you work with the compiler folks to make that modification if 
>>>> it makes sense to you?
>>>>
>>>> thanks,
>>>> Karen
>>>>
>>>> On Jan 17, 2014, at 2:01 AM, David Holmes wrote:
>>>>
>>>>> Hi Yumin,
>>>>>
>>>>> No need to bother the compiler folks with this (still cc'd but can 
>>>>> be dropped on any reply). We just need to decide if we can 
>>>>> tolerate this inconsistency between the intrinsic and 
>>>>> non-intrinsic forms. Otherwise we may choose to disable the 
>>>>> intrinsic on Windows - I really don't see that it can be adding 
>>>>> much value as Thread.currentThread().isInterrupted(false) is 
>>>>> hardly a performance critical action. But we would need 
>>>>> benchmarking to confirm that.
>>>>>
>>>>> David
>>>>>
>>>>> On 17/01/2014 4:10 PM, Yumin Qi wrote:
>>>>>> Now I got it:
>>>>>>
>>>>>> //------------------------inline_native_isInterrupted------------------ 
>>>>>>
>>>>>> // private native boolean java.lang.Thread.isInterrupted(boolean
>>>>>> ClearInterrupted);
>>>>>> bool LibraryCallKit::inline_native_isInterrupted() {
>>>>>>    // Add a fast path to t.isInterrupted(clear_int):
>>>>>>    //   (t == Thread.current() && (!TLS._osthread._interrupted ||
>>>>>> !clear_int))
>>>>>>    //   ? TLS._osthread._interrupted : /*slow path:*/
>>>>>> t.isInterrupted(clear_int)
>>>>>>    // So, in the common case that the interrupt bit is false,
>>>>>>    // we avoid making a call into the VM.  Even if the interrupt bit
>>>>>>    // is true, if the clear_int argument is false, we avoid the 
>>>>>> VM call.
>>>>>>    // However, if the receiver is not currentThread, we must call 
>>>>>> the VM,
>>>>>>    // because there must be some locking done around the operation.
>>>>>>
>>>>>>    // We only go to the fast case code if we pass two guards.
>>>>>>    // Paths which do not pass are accumulated in the slow_region.
>>>>>>
>>>>>> this code does not work both before and after the fix then.
>>>>>> Maybe we should not go with fast path then. CC to compiler team.
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>> On 1/16/2014 3:11 PM, David Holmes wrote:
>>>>>>> On 17/01/2014 7:24 AM, Yumin Qi wrote:
>>>>>>>> Dan,
>>>>>>>>
>>>>>>>>
>>>>>>>> On 1/16/2014 12:48 PM, Daniel D. Daugherty wrote:
>>>>>>>>> On 1/16/14 1:42 PM, Yumin Qi wrote:
>>>>>>>>>> David,
>>>>>>>>>>
>>>>>>>>>>   Thanks for the detail analysis so far and supply suggested 
>>>>>>>>>> fix.
>>>>>>>>>>
>>>>>>>>>> On 1/15/2014 10:21 PM, David Holmes wrote:
>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately the complexities of this continue to expose
>>>>>>>>>>> themselves :(
>>>>>>>>>>>
>>>>>>>>>>> On 15/01/2014 11:00 AM, Yumin Qi wrote:
>>>>>>>>>>>> Can I have your codereview of the change
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/6498581/webrev00
>>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/6498581/webrev00/>
>>>>>>>>>>>>
>>>>>>>>>>>> Summary: There is race condition between os::interrupt and
>>>>>>>>>>>> os::is_interrupted on Windows.  See bug comments in detail. 
>>>>>>>>>>>> When
>>>>>>>>>>>> thread
>>>>>>>>>>>> sleep check if it gets interrupted, it may see interrupted 
>>>>>>>>>>>> but not
>>>>>>>>>>>> really interrupted so cause spurious waking up (early 
>>>>>>>>>>>> return from
>>>>>>>>>>>> sleep). Fix by checking if a real interrupt sent to thread 
>>>>>>>>>>>> interrupt
>>>>>>>>>>>> event can prevent such false return. On windows we can get 
>>>>>>>>>>>> away the
>>>>>>>>>>>> interrupted field but to keep consistent with other 
>>>>>>>>>>>> platforms, I
>>>>>>>>>>>> choose
>>>>>>>>>>>> to leave it as it is.
>>>>>>>>>>>>
>>>>>>>>>>>> Contributed-By: David Holmes
>>>>>>>>>>>
>>>>>>>>>>> Not really. :) I suggested using WaitForSingleObject on the
>>>>>>>>>>> interrupt event to detect if the event was signalled or not, 
>>>>>>>>>>> and
>>>>>>>>>>> hence whether the thread was interrupted or not. I had 
>>>>>>>>>>> envisaged it
>>>>>>>>>>> replacing use of the interrupted field altogether on windows 
>>>>>>>>>>> - but I
>>>>>>>>>>> had overlooked the fact that the _interrupted field of 
>>>>>>>>>>> osThread is
>>>>>>>>>>> maintained in shared code, and further that there is an 
>>>>>>>>>>> intrinsic
>>>>>>>>>>> for Thread.isInterrupted that uses that field!
>>>>>>>>>>>
>>>>>>>>>> I can add myself after you, the main contributor for this fix.
>>>>>>>>>>> The fix you have provided in os::is_interrupted deals nicely 
>>>>>>>>>>> with
>>>>>>>>>>> the race between setting/clearing the field and 
>>>>>>>>>>> setting/clearing the
>>>>>>>>>>> event. Because of the strong ordering in os::interrupt:
>>>>>>>>>>>
>>>>>>>>>>> 1. osthread->set_interrupted(true);
>>>>>>>>>>> 2. OrderAccess::release();
>>>>>>>>>>> 3. SetEvent(osthread->interrupt_event());
>>>>>>>>>>>
>>>>>>>>>>> we can require both conditions to be met to define that the 
>>>>>>>>>>> thread
>>>>>>>>>>> is indeed interrupted. If we have set the field but not the 
>>>>>>>>>>> event
>>>>>>>>>>> then is_interrupted will return false without touching the 
>>>>>>>>>>> field or
>>>>>>>>>>> the event. If the thread then blocks it will either unblock
>>>>>>>>>>> immediately if setEvent has now been called, or will unblock 
>>>>>>>>>>> when
>>>>>>>>>>> setEvent is eventually called. That all seems to work fine.
>>>>>>>>>>>
>>>>>>>>>>> The intrinsic is a fast-path that inlines the access to the
>>>>>>>>>>> _interrupted field, but only for the current thread and only 
>>>>>>>>>>> when
>>>>>>>>>>> not trying to also clear the interrupted state (ie only for
>>>>>>>>>>> Thread.currentThread().isInterrupted()**). That seems mostly
>>>>>>>>>>> harmless even if the thread is concurrently halfway through 
>>>>>>>>>>> being
>>>>>>>>>>> interrupted (ie the field is set but not the event) as any 
>>>>>>>>>>> interrupt
>>>>>>>>>>> responsive operation will at some point call the actual
>>>>>>>>>>> os::is_interrupted() runtime code.
>>>>>>>>>>>
>>>>>>>>>>> The only glitch I see is that it would now be possible for the
>>>>>>>>>>> following code to throw the exception:
>>>>>>>>>>>
>>>>>>>>>>> boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>>>> boolean b = Thread.interrupted();
>>>>>>>>>>> if (a && !b)
>>>>>>>>>>>   throw new Error("inconsistent Thread interrupt state 
>>>>>>>>>>> observed");
>>>>>>>>>>>
>>>>>>>>>> Hope no one will write code like this --- but there is other
>>>>>>>>>> possibility that multiple interrupting actions sent to the 
>>>>>>>>>> thread so
>>>>>>>>>> such check can not guarantee it returns consistent result. I 
>>>>>>>>>> see the
>>>>>>>>>> problem here but think such coding is very rare.  With we 
>>>>>>>>>> return real
>>>>>>>>>> status of Event, 'a' get false means the Event is not set yet 
>>>>>>>>>> ----
>>>>>>>>>> and os::is_interrupted returns false is right.
>>>>>>>>>
>>>>>>>>> I think you may have missed David's point here. The earlier 
>>>>>>>>> check:
>>>>>>>>>
>>>>>>>>>     boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>>
>>>>>>>>> is returning 'true' and a later check:
>>>>>>>>>
>>>>>>>>>     boolean b = Thread.interrupted();
>>>>>>>>>
>>>>>>>>> is returning 'false'. If variable 'a' is 'true', then 'b' must
>>>>>>>>> also be 'true' to be consistent.
>>>>>>>>>
>>>>>>>> 1) a = true, b = true
>>>>>>>> a: it will not clear interrupted field. So if we get a 'true', 
>>>>>>>> means
>>>>>>>>      interrupted field is true and Event signaled. But from code:
>>>>>>>>
>>>>>>>>    if (interrupted && clear_interrupted) {
>>>>>>>>      osthread->set_interrupted(false);
>>>>>>>>      ResetEvent(osthread->interrupt_event());
>>>>>>>>    } // Otherwise leave the interrupted state alone
>>>>>>>>
>>>>>>>>    We did not do this since clear_interrupted is false.
>>>>>>>
>>>>>>> You are missing the point. If 
>>>>>>> Thread.currentThread().isInterrupted()
>>>>>>> is compiled at runtime using the intrinisc then it will return 
>>>>>>> true if
>>>>>>> the field is set even if the Event is not. At that point the
>>>>>>> non-intrinsic Thread.interrupted would return false. This would 
>>>>>>> give
>>>>>>> the appearance that the thread was interrupted then not interrupted
>>>>>>> but that is impossible based on the specs for those methods.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>
>>>>>>>> b: now we query with clear_interrupted is true.
>>>>>>>>     Note the Event still signaled ( We assume no other 
>>>>>>>> interrupt actions
>>>>>>>> from other threads), the os::is_interrupted will return 'true' 
>>>>>>>> since
>>>>>>>>     field is 'true' and Event in signaled status.
>>>>>>>>     We do the clear before return 'true' so next if we have 
>>>>>>>> call say, c,
>>>>>>>> will get 'false'.
>>>>>>>>
>>>>>>>> 2) a = false b= true
>>>>>>>> There is a situation like
>>>>>>>> a:  get a 'false' ---- the interrupt is going on and SetEvent 
>>>>>>>> not called
>>>>>>>> yet, but field already set to interrupted so get a 'false'.
>>>>>>>> b: When b  called, interrupt action done, so  now field is 
>>>>>>>> interrupted
>>>>>>>> and Event  signaled. We get a 'true'.
>>>>>>>>
>>>>>>>> (Assume only one interrupt)
>>>>>>>>
>>>>>>>> But, anyway, a = false and b = true should not be a problem I 
>>>>>>>> think. We
>>>>>>>> treat interrupted as both field and Event set.
>>>>>>>> It can not get a is true and b is false in the situation.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I don't think this is a realistic situation given for the 
>>>>>>>>>>> issue to
>>>>>>>>>>> arise you need the interrupting thread to be effectively 
>>>>>>>>>>> descheduled
>>>>>>>>>>> after setting the field but before setting the event. But 
>>>>>>>>>>> still it
>>>>>>>>>>> is possible and could be observable. I'm inclined to accept 
>>>>>>>>>>> this
>>>>>>>>>>> possibility but I think Karen should make that call. I also 
>>>>>>>>>>> think we
>>>>>>>>>>> would need to document this possibility just in case it does 
>>>>>>>>>>> arise.
>>>>>>>>>>>
>>>>>>>>>>> ** I've no idea why this intrinsic exists as it is a fairly 
>>>>>>>>>>> uncommon
>>>>>>>>>>> usage. I guess at some point it was thought that this would be
>>>>>>>>>>> common, but most interrupt responsive code uses 
>>>>>>>>>>> Thread.interrupted
>>>>>>>>>>> to clear the interrupt state as well as query it.
>>>>>>>>>>>
>>>>>>>>>> agree.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Tests: vm.quick.testlist, specjbb2005, original test case, 
>>>>>>>>>>>> JPRT
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Yumin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>