RFR: 8022213 Intermittent test failures in java/net/URLClassLoader (Add jdk/testlibrary/FileUtils.java)
Dan Xu
dan.xu at oracle.com
Fri Nov 8 02:26:35 UTC 2013
On 11/07/2013 11:04 AM, Alan Bateman wrote:
> On 07/11/2013 14:59, Chris Hegarty wrote:
>> Given both Michael's and Alan's comments, I've updated the webrev:
>> http://cr.openjdk.java.net/~chegar/fileUtils.01/webrev/
>>
>> 1) more descriptive method names
>> 2) deleteXXX methods return if interrupted, leaving the
>> interrupt status set
>> 3) Use Files.copy with REPLACE_EXISTING
>> 4) Use SimpleFileVisitor, rather than FileVisitor
>>
> This looks better although interrupting the sleep means that the
> deleteXXX will quietly terminate with the interrupt status set (which
> could be awkward to diagnose if used with tests that are also using
> Thread.interrupt). An alternative might be to just throw the
> IOException with InterruptedException as the cause.
>
> -Alan.
>
>
Hi Chris,
The method deleteFileWithRetry0() assumes that if another process is
accessing the same file, Files.delete() will throw an IOException on
Windows. When I investigated these intermittent test failures, I found
that this assumption does not always hold.
When Files.delete() is called, it eventually calls the Windows DeleteFile
or RemoveDirectory function, depending on whether the target is a file or
a directory. These Windows APIs only mark the target for deletion on
close and return immediately, without waiting for the operation to
complete. If another process has the file open in the meantime, the
deletion does not take place and the target stays in delete-pending
status until that open handle is closed. In effect, DeleteFile and
RemoveDirectory behave like asynchronous operations. Therefore, we cannot
assume that the file or directory is gone after Files.delete() returns
or File.delete() returns true.
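
As a rough illustration of the point above, a test that needs the path to be gone could poll after the delete call instead of trusting its return. This is only a sketch: the class name DeleteAndWait, the method deleteAndWaitUntilGone, the 50 ms poll interval, and the timeout parameter are all hypothetical and are not taken from the webrev. It also follows Alan's suggestion of reporting an interrupt as an IOException with the InterruptedException as the cause.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteAndWait {

    /**
     * Hypothetical helper: deletes the path, then polls until its absence
     * is confirmed or the timeout elapses.
     */
    static void deleteAndWaitUntilGone(Path path, long timeoutMillis) throws IOException {
        // May return while the target is only marked for deletion.
        Files.delete(path);

        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        // Files.notExists() returns true only when absence is confirmed,
        // so an undetermined state keeps the loop waiting.
        while (!Files.notExists(path)) {
            if (System.nanoTime() - deadline >= 0) {
                throw new IOException("Still present after delete: " + path);
            }
            try {
                Thread.sleep(50); // assumed poll interval
            } catch (InterruptedException e) {
                // Surface the interrupt as an IOException with the cause attached.
                throw new IOException("Interrupted while waiting for deletion of " + path, e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("delete-pending", ".tmp");
        deleteAndWaitUntilGone(tmp, 2000);
        System.out.println("gone: " + Files.notExists(tmp));
    }
}
```

The timeout keeps a test from hanging if some other process never closes its handle; whether a timeout or an unbounded wait is more appropriate is an open design choice.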
When checking those intermittent test failures, I found that the
Files.delete() call itself normally succeeds. But because of interference
from anti-virus software or other Windows daemon services, the target
file goes into delete-pending status. The immediately following
operation then fails because the target file still exists, while our
tests assume it is already gone. The delete-pending status of a file
usually lasts only a very short time, which depends on the source of
the interference, so such failures normally happen when we recursively
delete a folder, or delete and re-create a file with the same name at
high frequency.
It is basically a Windows API design or implementation issue. I have
logged an enhancement, JDK-8024496, to address it in the Java library
layer. Currently, I have two strategies in mind. One is to make the
delete operation blocking, that is, to make sure the file or directory is
actually deleted before returning. The other is to make sure a
delete-pending file does not cause subsequent file operations to fail.
Both have pros and cons.
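
To make the second strategy a little more concrete, here is one possible shape for it at the test-library level: retry the follow-up operation for a short while instead of failing on the first attempt. This is a sketch under assumptions, not the proposed fix for JDK-8024496. The names CreateWithRetry and createFileWithRetry, and the attempt count and delay parameters, are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateWithRetry {

    /**
     * Hypothetical helper: tries to create the file, retrying after a short
     * delay if creation fails, for example because an earlier delete of the
     * same name has not completed yet.
     */
    static Path createFileWithRetry(Path path, int maxAttempts, long delayMillis)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return Files.createFile(path);
            } catch (IOException e) {
                last = e; // remember the failure and retry
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    IOException ioe = new IOException(
                            "Interrupted while retrying creation of " + path, ie);
                    ioe.addSuppressed(last);
                    throw ioe;
                }
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("retry-demo");
        Path file = dir.resolve("data.tmp");
        // Delete and re-create the same name repeatedly, the pattern
        // described above as most likely to hit a delete-pending file.
        for (int i = 0; i < 20; i++) {
            createFileWithRetry(file, 10, 50);
            Files.delete(file);
        }
        Files.delete(dir);
    }
}
```

One drawback of this sketch is that it retries on any IOException, so a file that genuinely already exists is reported only after all attempts are used up. A blocking delete avoids that ambiguity but adds latency to every delete call.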
Thanks!
-Dan