RFR: 6498581: ThreadInterruptTest3 produces wrong output on Windows
Yumin Qi
yumin.qi at oracle.com
Thu Jan 16 12:42:21 PST 2014
David,
Thanks for the detail analysis so far and supply suggested fix.
On 1/15/2014 10:21 PM, David Holmes wrote:
> Hi Yumin,
>
> Unfortunately the complexities of this continue to expose themselves :(
>
> On 15/01/2014 11:00 AM, Yumin Qi wrote:
>> Can I have your codereview of the change
>>
>> http://cr.openjdk.java.net/~minqi/6498581/webrev00
>> <http://cr.openjdk.java.net/%7Eminqi/6498581/webrev00/>
>>
>> Summary: There is race condition between os::interrupt and
>> os::is_interrupted on Windows. See bug comments in detail. When thread
>> sleep check if it gets interrupted, it may see interrupted but not
>> really interrupted so cause spurious waking up (early return from
>> sleep). Fix by checking if a real interrupt sent to thread interrupt
>> event can prevent such false return. On windows we can get away the
>> interrupted field but to keep consistent with other platforms, I choose
>> to leave it as it is.
>>
>> Contributed-By: David Holmes
>
> Not really. :) I suggested using WaitForSingleObject on the interrupt
> event to detect if the event was signalled or not, and hence whether
> the thread was interrupted or not. I had envisaged it replacing use of
> the interrupted field altogether on windows - but I had overlooked the
> fact that the _interrupted field of osThread is maintained in shared
> code, and further that there is an intrinsic for Thread.isInterrupted
> that uses that field!
>
I can add myself after you, the main contributor for this fix.
> The fix you have provided in os::is_interrupted deals nicely with the
> race between setting/clearing the field and setting/clearing the
> event. Because of the strong ordering in os::interrupt:
>
> 1. osthread->set_interrupted(true);
> 2. OrderAccess::release();
> 3. SetEvent(osthread->interrupt_event());
>
> we can require both conditions to be met to define that the thread is
> indeed interrupted. If we have set the field but not the event then
> is_interrupted will return false without touching the field or the
> event. If the thread then blocks it will either unblock immediately if
> setEvent has now been called, or will unblock when setEvent is
> eventually called. That all seems to work fine.
>
> The intrinsic is a fast-path that inlines the access to the
> _interrupted field, but only for the current thread and only when not
> trying to also clear the interrupted state (ie only for
> Thread.currentThread().isInterrupted()**). That seems mostly harmless
> even if the thread is concurrently halfway through being interrupted
> (ie the field is set but not the event) as any interrupt responsive
> operation will at some point call the actual os::is_interrupted()
> runtime code.
>
> The only glitch I see is that it would now be possible for the
> following code to throw the exception:
>
> boolean a = Thread.currentThread().isInterrupted();
> boolean b = Thread.interrupted();
> if (a && !b)
> throw new Error("inconsistent Thread interrupt state observed");
>
Hope no one will write code like this --- but there is other possibility
that multiple interrupting actions sent to the thread so such check can
not guarantee it returns consistent result. I see the problem here but
think such coding is very rare. With we return real status of Event,
'a' get false means the Event is not set yet ---- and os::is_interrupted
returns false is right.
> I don't think this is a realistic situation given for the issue to
> arise you need the interrupting thread to be effectively descheduled
> after setting the field but before setting the event. But still it is
> possible and could be observable. I'm inclined to accept this
> possibility but I think Karen should make that call. I also think we
> would need to document this possibility just in case it does arise.
>
> ** I've no idea why this intrinsic exists as it is a fairly uncommon
> usage. I guess at some point it was thought that this would be common,
> but most interrupt responsive code uses Thread.interrupted to clear
> the interrupt state as well as query it.
>
agree.
Thanks
Yumin
> Thanks,
> David
>
>>
>> Tests: vm.quick.testlist, specjbb2005, original test case, JPRT
>>
>> Thanks
>> Yumin
>>
>>
More information about the hotspot-runtime-dev
mailing list