[PATCH] 8153925: WindowsWatchService hangs on GetOverlappedResult and locks directory

Alex Kashchenko akashche at redhat.com
Tue Apr 26 10:19:47 UTC 2016


Hi,

On 04/11/2016 05:09 PM, Alex Kashchenko wrote:
> Hi Alan,
>
> On 04/11/2016 04:38 PM, Alan Bateman wrote:
>> On 10/04/2016 11:30, Alex Kashchenko wrote:
>>> :
>>>
>>> The issue appeared during the deployment of a WAR from Eclipse+JBoss
>>> Tools to local WildFly instance. It caused deployment errors with a
>>> message: "Error renaming [tmp/some_file] to [app.war/some_file]"
>>>
>>> It was tracked down to the directory being locked indefinitely by the
>>> WatchService with the following trace of the poller thread:
>>>
>>>     WindowsNativeDispatcher.GetOverlappedResult(long, long) line: not available [native method]
>>>     WindowsWatchService$Poller.releaseResources(WindowsWatchService$WindowsWatchKey) line: 460
>>>     WindowsWatchService$Poller.access$100(WindowsWatchService$Poller, WindowsWatchService$WindowsWatchKey) line: 244
>>>     WindowsWatchService$WindowsWatchKey.invalidate() line: 180
>>>     WindowsWatchService$Poller.implCancelKey(WatchKey) line: 495
>>>     WindowsWatchService$Poller.run() line: 634
>>>     Thread.run() line: 745
>>>
>>> From ProcMon it was noticed that just before this call the poller thread
>>> calls ReadDirectoryChangesW, which fails with "DELETE_PENDING" status.
>>> It looked like in this case the overlapped I/O operation failed to start,
>>> but the poller thread called GetOverlappedResult anyway, causing the
>>> situation described here:
>>> https://blogs.msdn.microsoft.com/oldnewthing/20110303-00/?p=11313
>>>
>>> The patch tries to detect such a situation and avoid calling
>>> GetOverlappedResult.
>> I've looked at the issue and I agree with your analysis. When
>> ReadDirectoryChangesW fails I assume the CancelIo can be skipped too as
>> the ReadDirectoryChangesW is the only I/O operation.
>
> Thanks for looking into this. I'll change the patch to skip CancelIo
> call too.
>
>> It would be good if we could get your reproducer into the webrev, this
>> means a bit of clean-up and getting consistent with the existing tests.
>> I would prefer that it not be Windows specific as it is useful to
>> exercise this type of "interference" issue on other platforms too. The
>> simplest is to just let the test hang when running on a non-patched JDK
>> and the test harness will cause it to fail when the timeout is reached.
>> If possible then changing the test to use the Path/Files API to be
>> consistent with the other tests in this area would be good.
>>
>> I realize you are interested in jdk8u-dev but we will need to get this
>> into JDK 9 first (see JDK 8 Updates Ground Rules [1]). It's okay if your
>> patch is against jdk8u-dev for testing purposes but we'll just re-base
>> it against jdk9/dev before pushing.
>
> I'll clean up the test, add it to the
> jdk/test/java/nio/file/WatchService directory, and prepare a webrev
> for jdk8u-dev. I'll proceed with this once the April CPUs are out.

Please review the updated webrev for the jdk8u-dev repo:
http://cr.openjdk.java.net/~akasko/jdk8u/8153925/webrev.01/

The patch now also skips the CancelIo call when ReadDirectoryChangesW fails.
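
For anyone reading without the webrev open, the idea of the guard can be modelled roughly as below. This is only a self-contained sketch, not the code in the webrev: the NativeCalls interface, the Key class and the ioPending flag are hypothetical names made up for illustration, and the real native calls are replaced by stubs that just record what was invoked.

```java
import java.util.ArrayList;
import java.util.List;

final class OverlappedGuardSketch {

    /** Hypothetical stand-in for the native dispatcher used by the poller. */
    interface NativeCalls {
        boolean readDirectoryChanges(); // true if an overlapped read was queued
        void cancelIo();
        void getOverlappedResult();     // would wait forever if nothing was queued
    }

    /** Hypothetical watch key that remembers whether a read is outstanding. */
    static final class Key {
        private boolean ioPending;

        void startRead(NativeCalls calls) {
            // Remember whether the overlapped operation actually started.
            ioPending = calls.readDirectoryChanges();
        }

        void release(NativeCalls calls) {
            // Only cancel and wait when there is an operation to wait for.
            if (ioPending) {
                calls.cancelIo();
                calls.getOverlappedResult();
                ioPending = false;
            }
        }
    }

    /** Runs one start/release cycle and returns the calls that were made. */
    static List<String> simulate(final boolean readSucceeds) {
        final List<String> log = new ArrayList<>();
        NativeCalls calls = new NativeCalls() {
            @Override public boolean readDirectoryChanges() {
                log.add("ReadDirectoryChangesW");
                return readSucceeds;
            }
            @Override public void cancelIo() {
                log.add("CancelIo");
            }
            @Override public void getOverlappedResult() {
                log.add("GetOverlappedResult");
            }
        };
        Key key = new Key();
        key.startRead(calls);
        key.release(calls);
        return log;
    }

    public static void main(String[] args) {
        System.out.println("read failed:    " + simulate(false));
        System.out.println("read succeeded: " + simulate(true));
    }
}
```

When the read fails to start, neither CancelIo nor GetOverlappedResult appears in the log, which is the behaviour the patch is after.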

The test has been rewritten to be closer to the LotsOfCancels test and is
included in the webrev.
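
As a rough illustration of the kind of platform-neutral "interference" test discussed above, here is a minimal sketch using the Path/Files API. It is not the test from the webrev: the class name, the iteration count and the exact register/delete/cancel sequence are assumptions, and it may well not reproduce the hang on every affected system. On an unpatched JDK the expectation is simply that it hangs and the harness timeout fails it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

final class DeleteInterferenceSketch {

    /**
     * Repeatedly registers a fresh directory, deletes it while it is
     * still being watched, then cancels the key. Returns the number of
     * iterations that completed.
     */
    static int run(int iterations) throws IOException {
        Path base = Files.createTempDirectory("watch-sketch");
        int completed = 0;
        try (WatchService watcher = base.getFileSystem().newWatchService()) {
            for (int i = 0; i < iterations; i++) {
                Path dir = Files.createDirectory(base.resolve("dir" + i));
                WatchKey key = dir.register(watcher,
                        StandardWatchEventKinds.ENTRY_CREATE);
                // Delete the directory while the watch is still active.
                Files.delete(dir);
                // Cancelling makes the service release the key's resources;
                // this is the step that is assumed to block on an affected JDK.
                key.cancel();
                completed++;
            }
        } finally {
            // The watcher is closed by now, so the base directory can go.
            Files.deleteIfExists(base);
        }
        return completed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("completed " + run(100) + " iterations");
    }
}
```

No explicit timeout handling is included, in line with the suggestion to let the test harness detect the hang.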

-- 
-Alex

