RFR(JDK12/NIO) 8202285: (fs) Add a method to Files for comparing file contents
Roger Riggs
roger.riggs at oracle.com
Wed Sep 19 21:02:58 UTC 2018
Hi Alan,
On 9/19/18 4:14 PM, Alan Bateman wrote:
> On 19/09/2018 19:48, Joe Wang wrote:
>> Hi,
>>
>> After much discussion and 10 iterations of review, this proposal has
>> evolved from the original isSameContent method to a mismatch
>> method. API-wise, a compare method was also considered, as it looked
>> like just a short step forward from mismatch; however, it was
>> eventually dropped since there is no convincing use case for comparing
>> files lexicographically by contents. Impl-wise, extensive performance
>> benchmarking has been done to compare buffered reading with memory
>> mapping. The result was that simple buffered reading performed
>> better for small files and for those with the mismatched byte closer
>> to the beginning of the file. Since the proposed method's targeted files
>> are small ones, the impl currently does buffered reading only.
>>
>> Please review.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8202285
>>
>> specdiff:
>> http://cr.openjdk.java.net/~joehw/jdk12/8202285/specdiff/java/nio/file/Files.html
>>
>> webrev: http://cr.openjdk.java.net/~joehw/jdk12/8202285/webrev/
> Starting out using buffered I/O is probably okay but I assume we will
> want to change this in the future to having it use memory mapped I/O
> beyond a certain threshold.
This came up in off-line discussions. It seems unlikely that two files
will differ only at the very end of 100 MB, and memory mapping would
require a separate code path that would very infrequently be exercised.
So I'd stick to a single technique, even if it is slightly slower for
very large files, to keep the size of the code in check.
If it shows up later as a performance problem, it can be added.
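
For illustration only, here is a minimal sketch of what a single buffered
code path might look like. It is not the code in the webrev; the class
name BufferedMismatch, the method shape, and the 8 KB buffer size are all
assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not the JDK implementation.
public class BufferedMismatch {
    // Assumed buffer size; the thread does not state one.
    private static final int BUFFER_SIZE = 8192;

    /**
     * Returns the position of the first mismatched byte, the length of the
     * shorter file if one file is a prefix of the other, or -1 if the
     * contents are identical.
     */
    public static long mismatch(Path a, Path b) throws IOException {
        byte[] bufA = new byte[BUFFER_SIZE];
        byte[] bufB = new byte[BUFFER_SIZE];
        try (InputStream inA = Files.newInputStream(a);
             InputStream inB = Files.newInputStream(b)) {
            long offset = 0;
            while (true) {
                // readNBytes returns fewer than BUFFER_SIZE bytes only at end of stream.
                int nA = inA.readNBytes(bufA, 0, BUFFER_SIZE);
                int nB = inB.readNBytes(bufB, 0, BUFFER_SIZE);
                // Compare only the bytes actually read from each file.
                int i = Arrays.mismatch(bufA, 0, nA, bufB, 0, nB);
                if (i >= 0) {
                    return offset + i;
                }
                if (nA < BUFFER_SIZE) {
                    return -1; // both files ended at the same point with no difference
                }
                offset += nA;
            }
        }
    }
}
```

The whole comparison is one loop, so there is no size threshold and no
second branch to keep tested.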
$.02, Roger
>
> Can you explain the use of toRealPath and comparing the names? That
> shouldn't be needed. Also the catching of ProviderMismatchException
> seems a bit strange too. Can you replace mismatchByAttrs with
> isSameFile? You could call this from Files.mismatch and then use the
> supporting implementation for the case that the files are not the same.
>
> -Alan
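
As a rough sketch of the structure suggested above (check isSameFile
first, then fall back to a content comparison), something along these
lines might work. The class name SameFileShortCircuit is hypothetical,
and the fallback reads both files fully into memory purely to keep the
example short; it is not the supporting implementation from the webrev.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper illustrating the isSameFile short-circuit.
public class SameFileShortCircuit {

    /** Returns -1 if the contents match, otherwise the first mismatch position. */
    public static long mismatch(Path a, Path b) throws IOException {
        // Files.isSameFile returns false when the two paths belong to
        // different providers, so no ProviderMismatchException handling
        // is needed here.
        if (Files.isSameFile(a, b)) {
            return -1;
        }
        // Simplified fallback (assumption): load both files and compare.
        // Only suitable for small files.
        byte[] x = Files.readAllBytes(a);
        byte[] y = Files.readAllBytes(b);
        return Arrays.mismatch(x, y);
    }
}
```

With this shape there is no need to call toRealPath or compare names
before reading any bytes.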