RFR(JDK12/NIO) 8202285: (fs) Add a method to Files for comparing file contents

Roger Riggs roger.riggs at oracle.com
Wed Sep 19 21:02:58 UTC 2018


Hi Alan,

On 9/19/18 4:14 PM, Alan Bateman wrote:
> On 19/09/2018 19:48, Joe Wang wrote:
>> Hi,
>>
>> After much discussion and 10 iterations of reviews, this proposal has 
>> evolved from what was the original isSameContent method to a mismatch 
>> method. API-wise, a compare method was also considered as it looked 
>> like just a short step forward from mismatch, however, it was 
>> eventually dropped since there is no convincing use case comparing 
>> files lexicographically by contents. Impl-wise, extensive performance 
>> benchmarking has been done to compare a buffered reading vs memory 
>> mapping, the result was that a simple buffered reading performed 
>> better among small files, and those with the mismatched byte closer 
>> to the beginning of files. Since the proposed method's targeted files 
>> are small ones, the impl currently does a buffered reading only.
>>
>> Please review.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8202285
>>
>> specdiff: 
>> http://cr.openjdk.java.net/~joehw/jdk12/8202285/specdiff/java/nio/file/Files.html
>>
>> webrev: http://cr.openjdk.java.net/~joehw/jdk12/8202285/webrev/
> Starting out using buffered I/O is probably okay but I assume we will 
> want to change this in the future to having it use memory mapped I/O 
> beyond a certain threshold.
This came up in off-line discussions. It seems unlikely that two files 
will differ only in the last bytes of a 100 MB file, 
and memory mapping would require a separate code path that would very 
rarely be exercised. So I'd stick to a single 
technique, even if it is slightly slower for very large files, to keep the 
size of the code in check.
If it shows up later as a performance problem, it can be added then.

$.02, Roger

>
> Can you explain the use of toRealPath and comparing the names? That 
> shouldn't be needed. Also the catching of ProviderMismatchException 
> seems a bit strange too. Can you replace mismatchByAttrs with 
> isSameFile? You could call this from Files.mismatch and then use the 
> supporting implementation for the case that the files are not the same.
>
> -Alan
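
For readers following the thread, the approach discussed above (short-circuit 
with isSameFile, then a single buffered-read code path) might look roughly 
like the sketch below. This is not the code in the webrev: the class name 
MismatchSketch, the 8192-byte buffer size, and the loop structure are all 
assumptions made for illustration. It relies only on Files.isSameFile, 
Files.newInputStream, InputStream.readNBytes and Arrays.mismatch, which 
exist in JDK 9 and later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical illustration only; not the implementation under review.
final class MismatchSketch {
    // Assumed buffer size, chosen for illustration.
    private static final int BUFFER_SIZE = 8192;

    /**
     * Returns the position of the first mismatched byte, or -1 if the two
     * files have identical contents. If one file is a proper prefix of the
     * other, the size of the smaller file is returned.
     */
    static long mismatch(Path a, Path b) throws IOException {
        // Same file: no need to read any bytes.
        if (Files.isSameFile(a, b)) {
            return -1L;
        }
        byte[] bufA = new byte[BUFFER_SIZE];
        byte[] bufB = new byte[BUFFER_SIZE];
        try (InputStream inA = Files.newInputStream(a);
             InputStream inB = Files.newInputStream(b)) {
            long total = 0;
            while (true) {
                // readNBytes returns fewer than BUFFER_SIZE bytes only at end of stream.
                int nA = inA.readNBytes(bufA, 0, BUFFER_SIZE);
                int nB = inB.readNBytes(bufB, 0, BUFFER_SIZE);
                // Relative index of the first difference, or -1 if the ranges are equal.
                int i = Arrays.mismatch(bufA, 0, nA, bufB, 0, nB);
                if (i >= 0) {
                    return total + i;
                }
                // Ranges were equal, so nA == nB; a short read means both hit EOF.
                if (nA < BUFFER_SIZE) {
                    return -1L;
                }
                total += nA;
            }
        }
    }
}
```

Because the loop stops at the first differing buffer, files that differ near 
the beginning are resolved after a single read of each, which is consistent 
with the benchmarking result Joe describes for small files and early 
mismatches.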


