RFR: 8276743: Make openjdk build Zip Archive generation "reproducible"
Erik Joelsson
erikj at openjdk.java.net
Tue Nov 9 17:34:38 UTC 2021
On Tue, 9 Nov 2021 17:26:05 GMT, Erik Joelsson <erikj at openjdk.org> wrote:
>> @erikj79 I had a bit of a think, and part of the reason for unzipping and then regenerating is to avoid loading all the entries into memory. You can't guarantee the order in which "zip" created them, so realistically I'd have to read all the ZipEntry objects into memory and then rewrite them, which we can do. src.zip is only about 55MB, so memory requirements won't be huge, given that src.zip is currently the only target.
>
> You are already keeping all the filenames in memory for sorting, so reading in the ZipEntry objects isn't that much more data, just some extra metadata for each entry. The actual file contents are not part of the ZipEntry object. When actually copying the files, you can use the ZipFile class to access ZipEntry objects in arbitrary order and read their contents as an InputStream.
Actually, you don't even need to keep the ZipEntry objects in memory: you can just extract the filenames from them on the first pass, sort them, then look up the entries in the ZipFile again on the second pass. :) I don't think that's necessary though.
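
For illustration, the two-pass idea could look roughly like the sketch below. This is not the code from the PR: the class name SortedZipRewrite and the method rewriteSorted are hypothetical, and copying each entry's timestamp from the source archive is an assumption made only so that the output does not pick up the current time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not taken from the PR under review.
final class SortedZipRewrite {

    // Rewrites the archive at 'in' to 'out' with entries in sorted name order.
    // Returns the sorted entry names.
    static List<String> rewriteSorted(Path in, Path out) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(in.toFile())) {
            // First pass: keep only the entry names, not the ZipEntry objects.
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            Collections.sort(names);

            // Second pass: look each entry up again by name and stream its
            // contents, so only one entry's data is in flight at a time.
            try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(out))) {
                for (String name : names) {
                    ZipEntry src = zf.getEntry(name);
                    ZipEntry dst = new ZipEntry(name);
                    // Assumption: reuse the source timestamp; a real build
                    // would more likely pin this to a fixed value.
                    dst.setTime(src.getTime());
                    zos.putNextEntry(dst);
                    if (!src.isDirectory()) {
                        try (InputStream is = zf.getInputStream(src)) {
                            is.transferTo(zos);
                        }
                    }
                    zos.closeEntry();
                }
            }
        }
        return names;
    }
}
```

ZipFile reads the central directory up front, so the getEntry lookups on the second pass are cheap. The sketch ignores archives with duplicate entry names.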
-------------
PR: https://git.openjdk.java.net/jdk/pull/6311