Hashing files/bytes <was> Re: RFR(JDK11/NIO) 8202285: (fs) Add a method to Files for comparing file contents

John Rose john.r.rose at oracle.com
Wed May 2 05:35:38 UTC 2018


Here's another potential stacking:

Define an interface ByteSequence, similar to CharSequence,
as a zero-copy reference to some stored bytes somewhere.
(Give it a long length.)  Define bulk methods on it like hash
and mismatch and transferTo.  Then make File and ByteBuffer
implement it.  Deal with the cross-product of source and
destination types underneath the interface.
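
As a rough sketch of that stacking, with every name here hypothetical
(ByteSequence, BufferByteSequence, and the method signatures are
illustrations, not JDK APIs), the interface might look something like
the following. The default methods are naive byte-at-a-time loops; a
real design would presumably specialize them per source and
destination type underneath the interface.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.security.MessageDigest;

// Hypothetical interface, not part of the JDK: a read-only view of stored bytes.
interface ByteSequence {
    long length();            // long, so the view can exceed 2 GB

    byte get(long index);     // absolute read; no copy of the backing storage

    // Index of the first differing byte, the shorter length if one
    // sequence is a proper prefix of the other, or -1 if they are equal.
    default long mismatch(ByteSequence other) {
        long n = Math.min(length(), other.length());
        for (long i = 0; i < n; i++) {
            if (get(i) != other.get(i)) {
                return i;
            }
        }
        return length() == other.length() ? -1 : n;
    }

    // Feeds every byte to the given digest and returns the final hash.
    default byte[] hash(MessageDigest digest) {
        for (long i = 0; i < length(); i++) {
            digest.update(get(i));
        }
        return digest.digest();
    }

    // Writes every byte to the stream and returns the number written.
    default long transferTo(OutputStream out) throws IOException {
        for (long i = 0; i < length(); i++) {
            out.write(get(i));
        }
        return length();
    }
}

// Hypothetical adapter: views a ByteBuffer's remaining bytes without copying them.
final class BufferByteSequence implements ByteSequence {
    private final ByteBuffer buffer;

    BufferByteSequence(ByteBuffer buffer) {
        this.buffer = buffer.slice();   // shares content; index 0 is the old position
    }

    @Override
    public long length() {
        return buffer.remaining();
    }

    @Override
    public byte get(long index) {
        return buffer.get(Math.toIntExact(index));
    }
}

class ByteSequenceDemo {
    public static void main(String[] args) throws Exception {
        ByteSequence a = new BufferByteSequence(ByteBuffer.wrap(new byte[] {1, 2, 3, 4}));
        ByteSequence b = new BufferByteSequence(ByteBuffer.wrap(new byte[] {1, 2, 9}));
        System.out.println(a.mismatch(b));                                        // prints 2
        System.out.println(a.hash(MessageDigest.getInstance("SHA-256")).length);  // prints 32
    }
}
```

A file-backed implementation would need its own adapter (for example
over a mapped buffer or a channel); that part is left out here.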

(Also I want ByteSequence as a way to encapsulate resource
data for class files and condy, using zero-copy methods.
The types byte[] and String don't scale and require copies.)

— John

On May 1, 2018, at 3:04 PM, forax at univ-mlv.fr wrote:
> 
> ----- Original Message -----
>> From: "Paul Sandoz" <paul.sandoz at oracle.com>
>> To: "Remi Forax" <forax at univ-mlv.fr>
>> Cc: "Alan Bateman" <Alan.Bateman at oracle.com>, "nio-dev" <nio-dev at openjdk.java.net>, "core-libs-dev"
>> <core-libs-dev at openjdk.java.net>
>> Sent: Tuesday, May 1, 2018 00:37:57
>> Subject: Hashing files/bytes <was> Re: RFR(JDK11/NIO) 8202285: (fs) Add a method to Files for comparing file contents
> 
>> Thanks, better than I expected with the transferTo method we recently added, but
>> I think we could do even better for the ease-of-use case of "give me the hash
>> of this file's contents or these bytes or this byte buffer".
> 
> Yes, it would be a nice addition to java.nio.file.Files, and in that case the documentation of the method that compares contents could refer to this new method.
> 
>> 
>> Paul.
> 
> Rémi
> 
>> 
>>> On Apr 30, 2018, at 3:23 PM, Remi Forax <forax at univ-mlv.fr> wrote:
>>> 
>>>> 
>>>> To Remi’s point this might dissuade/guide developers from using this method when
>>>> there are other more efficient techniques available when operating at larger
>>>> scales. However, it is unfortunately harder than it should be in Java to hash
>>>> the contents of a file, a byte[] or ByteBuffer, according to some chosen
>>>> algorithm (or a good default).
>>> 
>>> It's 6 lines of code:
>>> 
>>> var digest = MessageDigest.getInstance("SHA1");
>>> try(var input = Files.newInputStream(Path.of("myfile.txt"));
>>>     var output = new DigestOutputStream(OutputStream.nullOutputStream(), digest)) {
>>>   input.transferTo(output);
>>> }
>>> var hash = digest.digest();
>>> 
>>> or 3 lines if you don't mind loading the whole file into memory:
>>> 
>>> var digest = MessageDigest.getInstance("SHA1");
>>> digest.update(Files.readAllBytes(Path.of("myfile.txt")));
>>> var hash = digest.digest();
>>> 
>>>> 
>>>> Paul.
>>> 
>>> Rémi


