RFR: 8022213 Intermittent test failures in java/net/URLClassLoader (Add jdk/testlibrary/FileUtils.java)
Chris Hegarty
chris.hegarty at oracle.com
Mon Nov 11 02:41:06 PST 2013
Thanks Dan,
I'll make the changes before pushing.
-Chris.
On 09/11/2013 05:43, Dan Xu wrote:
> Hi Chris,
>
> In deleteFileWithRetry0(), the following lines
>
> 79 while (true) {
> 80 if (Files.notExists(path))
> 81 break;
>
> can be combined to
>
> while (Files.exists(path)) {
> ...
>
> And L99
>
> 99 Thread.sleep(RETRY_DELETE_MILLIS);
>
> seems not indented correctly.
>
> Thanks,
>
> -Dan
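
For reference, a minimal sketch of the loop shape Dan suggests above, written as a standalone class rather than the webrev's actual code. The class name, both constant values, and the retry bound are assumptions chosen for illustration; only the name RETRY_DELETE_MILLIS appears in the thread. One caveat: Files.exists() and Files.notExists() both return false when existence cannot be determined, so `while (Files.exists(path))` is not an exact inverse of `if (Files.notExists(path)) break;` in that corner case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteRetrySketch {
    // Hypothetical values; the webrev's real constants are not quoted in this thread.
    private static final long RETRY_DELETE_MILLIS = 50L;
    private static final int MAX_RETRY_DELETE_TIMES = 20;

    /**
     * Attempts to delete the path and keeps retrying, with a pause between
     * attempts, until the path no longer exists or the retry budget is spent.
     */
    public static void deleteFileWithRetry(Path path)
            throws IOException, InterruptedException {
        int attempts = 0;
        IOException lastFailure = null;
        while (Files.exists(path)) {
            if (attempts == MAX_RETRY_DELETE_TIMES) {
                // Give up and report the most recent delete failure, if any, as the cause.
                throw new IOException("Still exists after " + attempts
                        + " delete attempts: " + path, lastFailure);
            }
            if (attempts > 0) {
                // Pause only between attempts, not before the first one.
                Thread.sleep(RETRY_DELETE_MILLIS);
            }
            attempts++;
            try {
                Files.deleteIfExists(path);
            } catch (IOException e) {
                // For example "being used by another process"; remember it and retry.
                lastFailure = e;
            }
        }
    }
}
```

Because the loop condition re-checks existence after every attempt, the same loop also covers the case where the delete call returns normally but the file lingers for a short time.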
>
> On 11/08/2013 07:35 AM, Chris Hegarty wrote:
>> On 08/11/2013 15:01, roger riggs wrote:
>>> Hi,
>>>
>>> Does renaming the file/directory suffer the same delay?
>>
>> I have not tried, but I read that MoveFileEx does not suffer from this.
>>
>>> I could see a cleanup mechanism that renames them to hidden files (or
>>> entirely out of the work directory)
>>> and then deletes them. That would immediately clear the namespace for
>>> tests to proceed.
>>
>> Given the above, then I do think that this idea has potential, but I
>> haven't looked into it further, yet. More investigation needed.
>>
>> We have used the retry technique in a few places in the jdk tests. All
>> I am trying to do here is prevent everyone from writing their own
>> version of this.
>>
>> Maybe we could go with what I have for now (pending reviews), and
>> revisit later, if needed. I'm scared to open a discussion on where to
>> move test files to ;-)
>>
>> -Chris.
>>
>>>
>>> That technique should work on all platforms.
>>>
>>> Roger
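
A rough sketch of the rename-then-delete idea Roger describes, for a single file or an empty directory. The class name, method name, and the ".deleted-" naming scheme are hypothetical, and whether a rename really avoids the delay on Windows is untested here, as noted in the reply above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

public class RenameThenDeleteSketch {
    /**
     * Moves the target to a uniquely named sibling, then deletes the sibling.
     * Once the move has returned, the original name is free for reuse, even
     * if the delete that follows fails or completes late.
     *
     * @return the temporary path the target was moved to
     */
    public static Path moveAsideAndDelete(Path target) throws IOException {
        // Hypothetical naming scheme: a dot-prefixed, unique sibling name.
        Path aside = target.resolveSibling(".deleted-" + UUID.randomUUID());
        Files.move(target, aside);
        // Throws DirectoryNotEmptyException for a non-empty directory;
        // a real utility would walk the tree first.
        Files.delete(aside);
        return aside;
    }
}
```

A caller that needs the name immediately could create the replacement file as soon as the method returns, or catch the IOException from the delete step and leave the moved-aside file for later cleanup.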
>>>
>>> On 11/8/2013 9:47 AM, Chris Hegarty wrote:
>>>> Alan,
>>>>
>>>> > An alternative might be to just throw the IOException with
>>>> > InterruptedException as the cause.
>>>>
>>>> Perfect. Updated in the new webrev.
>>>>
>>>> Dan,
>>>>
>>>> You are completely correct. I was only catering for the case where
>>>> "java.nio.file.FileSystemException: <your_file>: The process cannot
>>>> access the file because it is being used by another process."
>>>>
>>>> Where the delete "succeeds", we still need to wait until the underlying
>>>> platform delete completes, i.e. the file no longer exists.
>>>>
>>>> Updated webrev ( with only the diff from the previous ) :
>>>> http://cr.openjdk.java.net/~chegar/fileUtils.02/webrev/
>>>>
>>>> Thanks,
>>>> -Chris.
>>>>
>>>>
>>>> On 08/11/2013 02:26, Dan Xu wrote:
>>>>>
>>>>> On 11/07/2013 11:04 AM, Alan Bateman wrote:
>>>>>> On 07/11/2013 14:59, Chris Hegarty wrote:
>>>>>>> Given both Michael's and Alan's comments, I've updated the webrev:
>>>>>>> http://cr.openjdk.java.net/~chegar/fileUtils.01/webrev/
>>>>>>>
>>>>>>> 1) more descriptive method names
>>>>>>> 2) deleteXXX methods return if interrupted, leaving the
>>>>>>> interrupt status set
>>>>>>> 3) Use Files.copy with REPLACE_EXISTING
>>>>>>> 4) Use SimpleFileVisitor, rather than FileVisitor
>>>>>>>
>>>>>> This looks better although interrupting the sleep means that the
>>>>>> deleteXXX will quietly terminate with the interrupt status set (which
>>>>>> could be awkward to diagnose if used with tests that are also using
>>>>>> Thread.interrupt). An alternative might be to just throw the
>>>>>> IOException with InterruptedException as the cause.
>>>>>>
>>>>>> -Alan.
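
A minimal sketch of the alternative Alan suggests: the pause between delete attempts surfaces an interrupt as an IOException whose cause is the InterruptedException, instead of returning quietly. The class name, method name, and message text are assumptions for illustration, not the webrev's code.

```java
import java.io.IOException;

public class InterruptAsIOExceptionSketch {
    /**
     * Sleeps before the next delete attempt. If the thread is interrupted,
     * the interruption is reported as an IOException that carries the
     * InterruptedException as its cause.
     */
    public static void sleepBeforeRetry(long millis) throws IOException {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Thread.sleep clears the interrupt status when it throws.
            throw new IOException("Interrupted while waiting to retry delete", e);
        }
    }
}
```

A caller that also wants the interrupt status preserved could call Thread.currentThread().interrupt() before throwing; the sketch leaves it cleared so that the exception is the only signal.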
>>>>>>
>>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> The method deleteFileWithRetry0() assumes that if any other process
>>>>> is accessing the same file, the delete operation, Files.delete(),
>>>>> will throw an IOException on Windows. But when I investigated these
>>>>> intermittent test failures, I found that this assumption does not
>>>>> always hold.
>>>>>
>>>>> When Files.delete() is called, it ultimately calls the DeleteFile or
>>>>> RemoveDirectory function, depending on whether the target is a file
>>>>> or a directory. These Windows APIs only mark the target for deletion
>>>>> on close and return immediately, without waiting for the operation
>>>>> to complete. If another process is accessing the file in the
>>>>> meantime, the delete does not occur and the target stays in
>>>>> delete-pending status until that open handle is closed. In effect,
>>>>> DeleteFile and RemoveDirectory behave like asynchronous operations.
>>>>> Therefore, we cannot assume that the file/directory is deleted after
>>>>> Files.delete() returns or File.delete() returns true.
>>>>>
>>>>> When checking those intermittent test failures, I found that the
>>>>> Files.delete() call normally succeeds. But due to interference from
>>>>> anti-virus software or other Windows daemon services, the target
>>>>> file changes to delete-pending status. The immediately following
>>>>> operation then fails because the target file still exists, while our
>>>>> tests assume it is already gone. Because the delete-pending status
>>>>> of a file usually lasts for a very short time, which depends on the
>>>>> interference source, such failures normally happen when we
>>>>> recursively delete a folder or delete and re-create a file with the
>>>>> same name at a high frequency.
>>>>>
>>>>> It is basically a Windows API design or implementation issue. I have
>>>>> logged an enhancement, JDK-8024496, to solve it in the Java library
>>>>> layer. Currently, I have two strategies in mind. One is to make the
>>>>> delete operation blocking, which means making sure the
>>>>> file/directory is deleted before returning. The other is to make
>>>>> sure the delete-pending file does not lead to a failure of
>>>>> subsequent file operations. But both have pros and cons.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -Dan
>>>
>