RFR: 6498581: ThreadInterruptTest3 produces wrong output on Windows

David Holmes david.holmes at oracle.com
Thu Feb 20 00:45:21 UTC 2014


On 20/02/2014 9:55 AM, Yumin Qi wrote:
> Hi, David
>
>    I want to do a performance comparison between intrinsic and
> non-intrinsic, I have run specjvm2005 before, but want to know which
> testing suites are better for this purpose.

No idea sorry. There is very little use of it in the JDK libraries - 
most is actually in swing code. Application code tends to pay little 
attention to interrupts anyway.

David

>    If testing shows no regression by removing inining
> 'Thread.isInterrupted(boolean ClearInterrupted)', I guess at least we
> can remove it for Windows?
>
>    Thanks
>    Yumin
>
> On 2/19/2014 3:38 PM, David Holmes wrote:
>> On 20/02/2014 7:49 AM, Karen Kinnear wrote:
>>> David,
>>>
>>> Did I misread it?
>>
>> No I misunderstood what you were suggesting - sorry.
>>
>>> There are two sets of comments - one of which says
>>> (!TLS._interrupted) || !clear_int)
>>> and one of which says !TLS._interrupted && !clear_int)
>>>
>>> But the code appears to have two fast paths:
>>> first !TLS_Interrupted -> return false
>>>
>>> second:
>>> TLS._interrupted && !clear_int
>>>
>>> So I was proposing keeping the first fast path and getting rid of the
>>> second.
>>
>> Right - I get it now. In simple terms the fast-path can only ever
>> return false, and that can be determined just by reading the low-level
>> _interrupt field. To return true you must read the _interrupt field
>> and check the the event state ie take the slow path. That ensures we
>> can never see:
>>
>> fastpath -> true
>> slowpath -> false
>>
>> I still wonder though whether this intrinsic has any real value. It
>> optimizes the Java code for:
>>
>> Thread.currentThread().isInterrupted();
>>
>> which is not used much at all.
>>
>> Thanks.
>>
>> David
>>
>>> thanks,
>>> Karen
>>>
>>> On Feb 19, 2014, at 4:39 PM, David Holmes wrote:
>>>
>>>> Karen,
>>>>
>>>> On 20/02/2014 7:15 AM, Karen Kinnear wrote:
>>>>> Yumin,
>>>>>
>>>>> The code fix looks good.
>>>>>
>>>>> The question is whether to leave in the intrinsic or not since it
>>>>> could cause inconsistent data
>>>>> to be returned.
>>>>>
>>>>> One way to deal with this would be to modify the intrinsic to only
>>>>> optimize the case in
>>>>> which the thread is not interrupted, i.e.
>>>>> t = Thread.current() && (!TLS._osthread._interrupted)
>>>>
>>>> That is already the case. It is further restricted to the case that
>>>> you are not clearing the interrupt bit.
>>>>
>>>> David
>>>> -----
>>>>
>>>>> I think that would remove the inconsistency. Yes, it would only
>>>>> optimize a subset, but the
>>>>> not interrupted subset is probably the one that makes the difference.
>>>>>
>>>>> Can you work with the compiler folks to make that modification if
>>>>> it makes sense to you?
>>>>>
>>>>> thanks,
>>>>> Karen
>>>>>
>>>>> On Jan 17, 2014, at 2:01 AM, David Holmes wrote:
>>>>>
>>>>>> Hi Yumin,
>>>>>>
>>>>>> No need to bother the compiler folks with this (still cc'd but can
>>>>>> be dropped on any reply). We just need to decide if we can
>>>>>> tolerate this inconsistency between the intrinsic and
>>>>>> non-intrinsic forms. Otherwise we may choose to disable the
>>>>>> intrinsic on Windows - I really don't see that it can be adding
>>>>>> much value as Thread.currentThread().isInterrupted(false) is
>>>>>> hardly a performance critical action. But we would need
>>>>>> benchmarking to confirm that.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 17/01/2014 4:10 PM, Yumin Qi wrote:
>>>>>>> Now I got it:
>>>>>>>
>>>>>>> //------------------------inline_native_isInterrupted------------------
>>>>>>>
>>>>>>> // private native boolean java.lang.Thread.isInterrupted(boolean
>>>>>>> ClearInterrupted);
>>>>>>> bool LibraryCallKit::inline_native_isInterrupted() {
>>>>>>>    // Add a fast path to t.isInterrupted(clear_int):
>>>>>>>    //   (t == Thread.current() && (!TLS._osthread._interrupted ||
>>>>>>> !clear_int))
>>>>>>>    //   ? TLS._osthread._interrupted : /*slow path:*/
>>>>>>> t.isInterrupted(clear_int)
>>>>>>>    // So, in the common case that the interrupt bit is false,
>>>>>>>    // we avoid making a call into the VM.  Even if the interrupt bit
>>>>>>>    // is true, if the clear_int argument is false, we avoid the
>>>>>>> VM call.
>>>>>>>    // However, if the receiver is not currentThread, we must call
>>>>>>> the VM,
>>>>>>>    // because there must be some locking done around the operation.
>>>>>>>
>>>>>>>    // We only go to the fast case code if we pass two guards.
>>>>>>>    // Paths which do not pass are accumulated in the slow_region.
>>>>>>>
>>>>>>> this code does not work both before and after the fix then.
>>>>>>> Maybe we should not go with fast path then. CC to compiler team.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Yumin
>>>>>>>
>>>>>>>
>>>>>>> On 1/16/2014 3:11 PM, David Holmes wrote:
>>>>>>>> On 17/01/2014 7:24 AM, Yumin Qi wrote:
>>>>>>>>> Dan,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 1/16/2014 12:48 PM, Daniel D. Daugherty wrote:
>>>>>>>>>> On 1/16/14 1:42 PM, Yumin Qi wrote:
>>>>>>>>>>> David,
>>>>>>>>>>>
>>>>>>>>>>>   Thanks for the detail analysis so far and supply suggested
>>>>>>>>>>> fix.
>>>>>>>>>>>
>>>>>>>>>>> On 1/15/2014 10:21 PM, David Holmes wrote:
>>>>>>>>>>>> Hi Yumin,
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately the complexities of this continue to expose
>>>>>>>>>>>> themselves :(
>>>>>>>>>>>>
>>>>>>>>>>>> On 15/01/2014 11:00 AM, Yumin Qi wrote:
>>>>>>>>>>>>> Can I have your codereview of the change
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~minqi/6498581/webrev00
>>>>>>>>>>>>> <http://cr.openjdk.java.net/%7Eminqi/6498581/webrev00/>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Summary: There is race condition between os::interrupt and
>>>>>>>>>>>>> os::is_interrupted on Windows.  See bug comments in detail.
>>>>>>>>>>>>> When
>>>>>>>>>>>>> thread
>>>>>>>>>>>>> sleep check if it gets interrupted, it may see interrupted
>>>>>>>>>>>>> but not
>>>>>>>>>>>>> really interrupted so cause spurious waking up (early
>>>>>>>>>>>>> return from
>>>>>>>>>>>>> sleep). Fix by checking if a real interrupt sent to thread
>>>>>>>>>>>>> interrupt
>>>>>>>>>>>>> event can prevent such false return. On windows we can get
>>>>>>>>>>>>> away the
>>>>>>>>>>>>> interrupted field but to keep consistent with other
>>>>>>>>>>>>> platforms, I
>>>>>>>>>>>>> choose
>>>>>>>>>>>>> to leave it as it is.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Contributed-By: David Holmes
>>>>>>>>>>>>
>>>>>>>>>>>> Not really. :) I suggested using WaitForSingleObject on the
>>>>>>>>>>>> interrupt event to detect if the event was signalled or not,
>>>>>>>>>>>> and
>>>>>>>>>>>> hence whether the thread was interrupted or not. I had
>>>>>>>>>>>> envisaged it
>>>>>>>>>>>> replacing use of the interrupted field altogether on windows
>>>>>>>>>>>> - but I
>>>>>>>>>>>> had overlooked the fact that the _interrupted field of
>>>>>>>>>>>> osThread is
>>>>>>>>>>>> maintained in shared code, and further that there is an
>>>>>>>>>>>> intrinsic
>>>>>>>>>>>> for Thread.isInterrupted that uses that field!
>>>>>>>>>>>>
>>>>>>>>>>> I can add myself after you, the main contributor for this fix.
>>>>>>>>>>>> The fix you have provided in os::is_interrupted deals nicely
>>>>>>>>>>>> with
>>>>>>>>>>>> the race between setting/clearing the field and
>>>>>>>>>>>> setting/clearing the
>>>>>>>>>>>> event. Because of the strong ordering in os::interrupt:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. osthread->set_interrupted(true);
>>>>>>>>>>>> 2. OrderAccess::release();
>>>>>>>>>>>> 3. SetEvent(osthread->interrupt_event());
>>>>>>>>>>>>
>>>>>>>>>>>> we can require both conditions to be met to define that the
>>>>>>>>>>>> thread
>>>>>>>>>>>> is indeed interrupted. If we have set the field but not the
>>>>>>>>>>>> event
>>>>>>>>>>>> then is_interrupted will return false without touching the
>>>>>>>>>>>> field or
>>>>>>>>>>>> the event. If the thread then blocks it will either unblock
>>>>>>>>>>>> immediately if setEvent has now been called, or will unblock
>>>>>>>>>>>> when
>>>>>>>>>>>> setEvent is eventually called. That all seems to work fine.
>>>>>>>>>>>>
>>>>>>>>>>>> The intrinsic is a fast-path that inlines the access to the
>>>>>>>>>>>> _interrupted field, but only for the current thread and only
>>>>>>>>>>>> when
>>>>>>>>>>>> not trying to also clear the interrupted state (ie only for
>>>>>>>>>>>> Thread.currentThread().isInterrupted()**). That seems mostly
>>>>>>>>>>>> harmless even if the thread is concurrently halfway through
>>>>>>>>>>>> being
>>>>>>>>>>>> interrupted (ie the field is set but not the event) as any
>>>>>>>>>>>> interrupt
>>>>>>>>>>>> responsive operation will at some point call the actual
>>>>>>>>>>>> os::is_interrupted() runtime code.
>>>>>>>>>>>>
>>>>>>>>>>>> The only glitch I see is that it would now be possible for the
>>>>>>>>>>>> following code to throw the exception:
>>>>>>>>>>>>
>>>>>>>>>>>> boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>>>>> boolean b = Thread.interrupted();
>>>>>>>>>>>> if (a && !b)
>>>>>>>>>>>>   throw new Error("inconsistent Thread interrupt state
>>>>>>>>>>>> observed");
>>>>>>>>>>>>
>>>>>>>>>>> Hope no one will write code like this --- but there is other
>>>>>>>>>>> possibility that multiple interrupting actions sent to the
>>>>>>>>>>> thread so
>>>>>>>>>>> such check can not guarantee it returns consistent result. I
>>>>>>>>>>> see the
>>>>>>>>>>> problem here but think such coding is very rare.  With we
>>>>>>>>>>> return real
>>>>>>>>>>> status of Event, 'a' get false means the Event is not set yet
>>>>>>>>>>> ----
>>>>>>>>>>> and os::is_interrupted returns false is right.
>>>>>>>>>>
>>>>>>>>>> I think you may have missed David's point here. The earlier
>>>>>>>>>> check:
>>>>>>>>>>
>>>>>>>>>>     boolean a = Thread.currentThread().isInterrupted();
>>>>>>>>>>
>>>>>>>>>> is returning 'true' and a later check:
>>>>>>>>>>
>>>>>>>>>>     boolean b = Thread.interrupted();
>>>>>>>>>>
>>>>>>>>>> is returning 'false'. If variable 'a' is 'true', then 'b' must
>>>>>>>>>> also be 'true' to be consistent.
>>>>>>>>>>
>>>>>>>>> 1) a = true, b = true
>>>>>>>>> a: it will not clear interrupted field. So if we get a 'true',
>>>>>>>>> means
>>>>>>>>>      interrupted field is true and Event signaled. But from code:
>>>>>>>>>
>>>>>>>>>    if (interrupted && clear_interrupted) {
>>>>>>>>>      osthread->set_interrupted(false);
>>>>>>>>>      ResetEvent(osthread->interrupt_event());
>>>>>>>>>    } // Otherwise leave the interrupted state alone
>>>>>>>>>
>>>>>>>>>    We did not do this since clear_interrupted is false.
>>>>>>>>
>>>>>>>> You are missing the point. If
>>>>>>>> Thread.currentThread().isInterrupted()
>>>>>>>> is compiled at runtime using the intrinisc then it will return
>>>>>>>> true if
>>>>>>>> the field is set even if the Event is not. At that point the
>>>>>>>> non-intrinsic Thread.interrupted would return false. This would
>>>>>>>> give
>>>>>>>> the appearance that the thread was interrupted then not interrupted
>>>>>>>> but that is impossible based on the specs for those methods.
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>
>>>>>>>>> b: now we query with clear_interrupted is true.
>>>>>>>>>     Note the Event still signaled ( We assume no other
>>>>>>>>> interrupt actions
>>>>>>>>> from other threads), the os::is_interrupted will return 'true'
>>>>>>>>> since
>>>>>>>>>     field is 'true' and Event in signaled status.
>>>>>>>>>     We do the clear before return 'true' so next if we have
>>>>>>>>> call say, c,
>>>>>>>>> will get 'false'.
>>>>>>>>>
>>>>>>>>> 2) a = false b= true
>>>>>>>>> There is a situation like
>>>>>>>>> a:  get a 'false' ---- the interrupt is going on and SetEvent
>>>>>>>>> not called
>>>>>>>>> yet, but field already set to interrupted so get a 'false'.
>>>>>>>>> b: When b  called, interrupt action done, so  now field is
>>>>>>>>> interrupted
>>>>>>>>> and Event  signaled. We get a 'true'.
>>>>>>>>>
>>>>>>>>> (Assume only one interrupt)
>>>>>>>>>
>>>>>>>>> But, anyway, a = false and b = true should not be a problem I
>>>>>>>>> think. We
>>>>>>>>> treat interrupted as both field and Event set.
>>>>>>>>> It can not get a is true and b is false in the situation.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I don't think this is a realistic situation given for the
>>>>>>>>>>>> issue to
>>>>>>>>>>>> arise you need the interrupting thread to be effectively
>>>>>>>>>>>> descheduled
>>>>>>>>>>>> after setting the field but before setting the event. But
>>>>>>>>>>>> still it
>>>>>>>>>>>> is possible and could be observable. I'm inclined to accept
>>>>>>>>>>>> this
>>>>>>>>>>>> possibility but I think Karen should make that call. I also
>>>>>>>>>>>> think we
>>>>>>>>>>>> would need to document this possibility just in case it does
>>>>>>>>>>>> arise.
>>>>>>>>>>>>
>>>>>>>>>>>> ** I've no idea why this intrinsic exists as it is a fairly
>>>>>>>>>>>> uncommon
>>>>>>>>>>>> usage. I guess at some point it was thought that this would be
>>>>>>>>>>>> common, but most interrupt responsive code uses
>>>>>>>>>>>> Thread.interrupted
>>>>>>>>>>>> to clear the interrupt state as well as query it.
>>>>>>>>>>>>
>>>>>>>>>>> agree.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tests: vm.quick.testlist, specjbb2005, original test case,
>>>>>>>>>>>>> JPRT
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Yumin
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>


More information about the hotspot-runtime-dev mailing list