RFR: 6498581: ThreadInterruptTest3 produces wrong output on Windows

Wed Jan 15 22:21:38 PST 2014

Hi Yumin,

Unfortunately the complexities of this continue to expose themselves :(

On 15/01/2014 11:00 AM, Yumin Qi wrote:
> Can I have your codereview of the change
>
> http://cr.openjdk.java.net/~minqi/6498581/webrev00
> <http://cr.openjdk.java.net/%7Eminqi/6498581/webrev00/>
>
> Summary: There is race condition between os::interrupt and
> os::is_interrupted on Windows.  See bug comments in detail. When thread
> sleep check if it gets interrupted, it may see interrupted but not
> really interrupted so cause spurious waking up (early return from
> sleep). Fix by checking if a real interrupt sent to thread interrupt
> event can prevent such false return. On windows we can get away the
> interrupted field but to keep consistent with other platforms, I choose
> to leave it as it is.
>
> Contributed-By: David Holmes

Not really. :) I suggested using WaitForSingleObject on the interrupt 
event to detect if the event was signalled or not, and hence whether the 
thread was interrupted or not. I had envisaged it replacing use of the 
interrupted field altogether on windows - but I had overlooked the fact 
that the _interrupted field of osThread is maintained in shared code, 
and further that there is an intrinsic for Thread.isInterrupted that 
uses that field!

The fix you have provided in os::is_interrupted deals nicely with the 
race between setting/clearing the field and setting/clearing the event. 
Because of the strong ordering in os::interrupt:

  1. osthread->set_interrupted(true);
  2. OrderAccess::release();
  3. SetEvent(osthread->interrupt_event());

we can require both conditions to be met to define that the thread is 
indeed interrupted. If we have set the field but not the event then 
is_interrupted will return false without touching the field or the 
event. If the thread then blocks it will either unblock immediately if 
setEvent has now been called, or will unblock when setEvent is 
eventually called. That all seems to work fine.

The intrinsic is a fast-path that inlines the access to the _interrupted 
field, but only for the current thread and only when not trying to also 
clear the interrupted state (ie only for 
Thread.currentThread().isInterrupted()**). That seems mostly harmless 
even if the thread is concurrently halfway through being interrupted (ie 
the field is set but not the event) as any interrupt responsive 
operation will at some point call the actual os::is_interrupted() 
runtime code.

The only glitch I see is that it would now be possible for the following 
code to throw the exception:

boolean a = Thread.currentThread().isInterrupted();
boolean b = Thread.interrupted();
if (a && !b)
   throw new Error("inconsistent Thread interrupt state observed");

I don't think this is a realistic situation given for the issue to arise 
you need the interrupting thread to be effectively descheduled after 
setting the field but before setting the event. But still it is possible 
and could be observable. I'm inclined to accept this possibility but I 
think Karen should make that call. I also think we would need to 
document this possibility just in case it does arise.

** I've no idea why this intrinsic exists as it is a fairly uncommon 
usage. I guess at some point it was thought that this would be common, 
but most interrupt responsive code uses Thread.interrupted to clear the 
interrupt state as well as query it.

Thanks,
David

>
> Tests: vm.quick.testlist, specjbb2005, original test case, JPRT
>
> Thanks
> Yumin
>
>