RFR: 6498581: ThreadInterruptTest3 produces wrong output on Windows

Yumin Qi yumin.qi at oracle.com
Thu Jan 16 12:42:21 PST 2014


David,

   Thanks for the detail analysis so far and supply suggested fix.

On 1/15/2014 10:21 PM, David Holmes wrote:
> Hi Yumin,
>
> Unfortunately the complexities of this continue to expose themselves :(
>
> On 15/01/2014 11:00 AM, Yumin Qi wrote:
>> Can I have your codereview of the change
>>
>> http://cr.openjdk.java.net/~minqi/6498581/webrev00
>> <http://cr.openjdk.java.net/%7Eminqi/6498581/webrev00/>
>>
>> Summary: There is race condition between os::interrupt and
>> os::is_interrupted on Windows.  See bug comments in detail. When thread
>> sleep check if it gets interrupted, it may see interrupted but not
>> really interrupted so cause spurious waking up (early return from
>> sleep). Fix by checking if a real interrupt sent to thread interrupt
>> event can prevent such false return. On windows we can get away the
>> interrupted field but to keep consistent with other platforms, I choose
>> to leave it as it is.
>>
>> Contributed-By: David Holmes
>
> Not really. :) I suggested using WaitForSingleObject on the interrupt 
> event to detect if the event was signalled or not, and hence whether 
> the thread was interrupted or not. I had envisaged it replacing use of 
> the interrupted field altogether on windows - but I had overlooked the 
> fact that the _interrupted field of osThread is maintained in shared 
> code, and further that there is an intrinsic for Thread.isInterrupted 
> that uses that field!
>
I can add myself after you, the main contributor for this fix.
> The fix you have provided in os::is_interrupted deals nicely with the 
> race between setting/clearing the field and setting/clearing the 
> event. Because of the strong ordering in os::interrupt:
>
>  1. osthread->set_interrupted(true);
>  2. OrderAccess::release();
>  3. SetEvent(osthread->interrupt_event());
>
> we can require both conditions to be met to define that the thread is 
> indeed interrupted. If we have set the field but not the event then 
> is_interrupted will return false without touching the field or the 
> event. If the thread then blocks it will either unblock immediately if 
> setEvent has now been called, or will unblock when setEvent is 
> eventually called. That all seems to work fine.
>
> The intrinsic is a fast-path that inlines the access to the 
> _interrupted field, but only for the current thread and only when not 
> trying to also clear the interrupted state (ie only for 
> Thread.currentThread().isInterrupted()**). That seems mostly harmless 
> even if the thread is concurrently halfway through being interrupted 
> (ie the field is set but not the event) as any interrupt responsive 
> operation will at some point call the actual os::is_interrupted() 
> runtime code.
>
> The only glitch I see is that it would now be possible for the 
> following code to throw the exception:
>
> boolean a = Thread.currentThread().isInterrupted();
> boolean b = Thread.interrupted();
> if (a && !b)
>   throw new Error("inconsistent Thread interrupt state observed");
>
Hope no one will write code like this --- but there is other possibility 
that multiple interrupting actions sent to the thread so such check can 
not guarantee it returns consistent result.  I see the problem here but 
think such coding is very rare.  With we return real status of Event, 
'a' get false means the Event is not set yet ---- and os::is_interrupted 
returns false is right.

> I don't think this is a realistic situation given for the issue to 
> arise you need the interrupting thread to be effectively descheduled 
> after setting the field but before setting the event. But still it is 
> possible and could be observable. I'm inclined to accept this 
> possibility but I think Karen should make that call. I also think we 
> would need to document this possibility just in case it does arise.
>
> ** I've no idea why this intrinsic exists as it is a fairly uncommon 
> usage. I guess at some point it was thought that this would be common, 
> but most interrupt responsive code uses Thread.interrupted to clear 
> the interrupt state as well as query it.
>
agree.

Thanks
Yumin
> Thanks,
> David
>
>>
>> Tests: vm.quick.testlist, specjbb2005, original test case, JPRT
>>
>> Thanks
>> Yumin
>>
>>



More information about the hotspot-runtime-dev mailing list