Redux 8184157: (ch) AsynchronousFileChannel hangs with internal error when reading locked file
Brian Burkhalter
brian.burkhalter at oracle.com
Thu Jun 27 19:01:05 UTC 2019
https://bugs.openjdk.java.net/browse/JDK-8184157
http://cr.openjdk.java.net/~bpb/8184157/webrev.01/
Reprising a thread begun in July 2017 with a significantly simpler patch but equally complicated explanation. :-)
WindowsAsynchronousFileChannelImpl uses a PendingIoCache instance which internally maintains a map from longs (pointers to OVERLAPPED structures) to PendingFutures. When a read, write, or lock task is run, an entry for the task is added to this map, as there is a different OVERLAPPED structure for each task. Because the channel (file) is associated with an I/O completion port, the OVERLAPPED pointer will eventually be retrieved by the long-running Iocp event task whether the task completes immediately or is pending. When the Iocp event task retrieves the pointer from GetQueuedCompletionStatus(), it uses it to obtain the PendingFuture from the map and, if appropriate, invokes one of the PendingFuture's methods.
PendingIoCache tries to minimize memory allocations of the space required for OVERLAPPED structures by maintaining a cache. Addresses are placed in the cache when a map entry is removed from the PendingIoCache. When a new entry is added to the PendingIoCache, this cache is checked before allocating new memory for the OVERLAPPED pointer and if a previously allocated address is available then it is reused.
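To make the caching behavior concrete, here is a minimal sketch of the idea in plain Java. It is not the JDK source: the class name OverlappedCacheSketch, its methods, and the counter that stands in for native allocation are all hypothetical, chosen only to illustrate how an address is recycled once its map entry is removed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the caching described above; not the JDK code.
final class OverlappedCacheSketch<T> {
    // OVERLAPPED address -> pending task
    private final Map<Long, T> pending = new HashMap<>();
    // Addresses released by removed entries, available for reuse
    private final Deque<Long> freeAddresses = new ArrayDeque<>();
    // Assumption: a counter stands in for allocating native memory
    private long nextAddress = 0x1000L;

    // Registers a task, reusing a cached address when one is available.
    synchronized long add(T task) {
        Long cached = freeAddresses.poll();
        long address;
        if (cached != null) {
            address = cached;
        } else {
            address = nextAddress;
            nextAddress += 32;
        }
        pending.put(address, task);
        return address;
    }

    // Removes the entry and places its address in the cache for reuse.
    synchronized T remove(long address) {
        T task = pending.remove(address);
        if (task != null) {
            freeAddresses.push(address);
        }
        return task;
    }
}
```

In this model, an address handed back by remove() is returned by the very next add(), which is exactly the property that makes early removal hazardous.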
The problem is that the OVERLAPPED pointer may be recycled *before* it is retrieved from the I/O completion port by the long-running Iocp event task. In this case the PendingFuture that the Iocp task retrieves from the PendingIoCache will not be the one which should be associated with the completion packet.
For the failure at hand, a lock task is executed and its OVERLAPPED pointer is reused for a read task before the completion packet of the lock task is obtained from the completion port. When the completion packet of the lock task is received, the PendingFuture of the read task is therefore obtained from the map. The “bytes transferred” value of the completion packet of the lock task is garbage but is passed to the completed() method of the read task’s PendingFuture, resulting in an error.
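The sequence of events can be replayed deterministically in a small simulation. This is only an illustration under simplifying assumptions: the class PrematureReuseDemo is hypothetical, strings stand in for PendingFutures, a fixed number stands in for the OVERLAPPED address, and no real I/O or threads are involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical replay of the failure sequence; names and values are illustrative only.
final class PrematureReuseDemo {
    // Returns the task that the event loop would find for the lock's completion packet.
    static String simulate() {
        Map<Long, String> pending = new HashMap<>();
        Deque<Long> freeAddresses = new ArrayDeque<>();

        // 1. The lock task is registered under its OVERLAPPED address.
        long lockAddress = 0x1000L;
        pending.put(lockAddress, "lock");

        // 2. The lock finishes and its entry is removed before the event
        //    loop has dequeued the lock's completion packet.
        pending.remove(lockAddress);
        freeAddresses.push(lockAddress);

        // 3. A read task starts and picks up the recycled address.
        long readAddress = freeAddresses.pop();
        pending.put(readAddress, "read");

        // 4. The lock's completion packet is finally dequeued; the lookup
        //    by address now finds the read task instead of the lock task.
        return pending.get(lockAddress);
    }
}
```

Calling PrematureReuseDemo.simulate() yields the read task, mirroring how the lock's completion packet ends up being applied to the read's PendingFuture.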
The proposed fix is to *not* remove the entry of a task from the PendingIoCache until after its OVERLAPPED pointer (the map key) has been retrieved from the completion port. This prevents the pointer from being reused prematurely by the caching mechanism in PendingIoCache. The Iocp task itself removes the entry corresponding to the pointer, so at that point the pointer may be safely cached for reuse in the PendingIoCache.
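The shape of this approach can be sketched as follows. This is not the patch in the webrev: DeferredRemovalSketch and onCompletionPacket are hypothetical names, strings stand in for PendingFutures, and a counter stands in for native allocation. The only point illustrated is that the entry, and hence the address, is released solely when the completion packet for that address has been dequeued.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of deferred removal; not the actual patch.
final class DeferredRemovalSketch {
    private final Map<Long, String> pending = new HashMap<>();
    private final Deque<Long> freeAddresses = new ArrayDeque<>();
    // Assumption: a counter stands in for allocating native memory
    private long nextAddress = 0x1000L;

    // Registers a task; tasks never remove their own entries in this model.
    synchronized long add(String task) {
        Long cached = freeAddresses.poll();
        long address;
        if (cached != null) {
            address = cached;
        } else {
            address = nextAddress;
            nextAddress += 32;
        }
        pending.put(address, task);
        return address;
    }

    // Invoked only by the (simulated) event task after it has dequeued the
    // completion packet for this address; only now may the address be reused.
    synchronized String onCompletionPacket(long address) {
        String task = pending.remove(address);
        if (task != null) {
            freeAddresses.push(address);
        }
        return task;
    }
}
```

With this ordering, a read started while the lock's packet is still queued receives a fresh address, so each packet is matched with its own task.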
A simpler fix would be to remove the OVERLAPPED pointer caching in PendingIoCache.
Without the source patch applied, the test crashes on my local machine; with the patch applied, it succeeds. The crash has yet to be reproduced in the CI system, however. Otherwise, the patch passes tiers 1-3.
Thanks,
Brian