RFR: 8276400: openjdk image Jars, Zips and Jmods built from the same source are not reproducible
Andrew Leonard
aleonard at openjdk.java.net
Fri Nov 5 12:13:10 UTC 2021
On Fri, 5 Nov 2021 11:16:45 GMT, Lance Andersen <lancea at openjdk.org> wrote:
>> This PR enables reproducible Jars, Jmods and openjdk image zip files (e.g. src.zip).
>> It provides support for SOURCE_DATE_EPOCH for Jar, Jmod and the underlying ZipOutputStreams.
>> It fixes the following key issues relating to reproducibility:
>> - Jar and ZipOutputStream are not SOURCE_DATE_EPOCH aware
>> - Jar and ZipOutputStream now detect SOURCE_DATE_EPOCH environment setting
>> - Jar and Jmod file content ordering was non-deterministic
>> - Fixes to Jar and Jmod main's to ensure sorted classes content ordering
>> - openjdk image zip file generation used "zip", which is non-deterministic
>> - New openjdk build tool "GenerateZip", which produces the final deterministic zip files as part of the build and also detects SOURCE_DATE_EPOCH
>>
>> Signed-off-by: Andrew Leonard <anleonar at redhat.com>
>
> src/jdk.jartool/share/classes/sun/tools/jar/Main.java line 795:
>
>> 793: // Ensure files list is sorted for reproducible jar content
>> 794: Arrays.sort(files);
>> 795:
>
> Have you had a chance to measure the performance with a large number of Zip entries with this change?
No, I haven't. My assumption is that for a zip with many thousands of ZipEntries, the file I/O would far outweigh the cost of the sort. Also note that this does not sort the whole set of entries, only the files within each directory, so the sort stays cheap unless a single directory holds thousands of files. The non-determinism comes from the File.list() implementation, which relies on the OS file listing, and that order is not guaranteed.
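To illustrate the per-directory sort, here is a minimal sketch. It is not the code from the PR: the class SortedZipSketch and its helpers sortedList, addDir and zip are hypothetical names made up for this example, and the entry timestamp is set explicitly by the caller rather than detected inside ZipOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; not the jar tool's actual implementation.
public class SortedZipSketch {

    // List a directory's children in a stable order.
    static String[] sortedList(File dir) {
        String[] names = dir.list();   // order depends on the OS / file system
        if (names == null) {
            return new String[0];      // not a directory, or an I/O error
        }
        Arrays.sort(names);            // natural String order, independent of the OS
        return names;
    }

    // Add entries recursively, sorting only within each directory.
    static void addDir(ZipOutputStream zos, File dir, String prefix, long timeMillis)
            throws IOException {
        for (String name : sortedList(dir)) {
            File f = new File(dir, name);
            if (f.isDirectory()) {
                addDir(zos, f, prefix + name + "/", timeMillis);
            } else {
                ZipEntry e = new ZipEntry(prefix + name);
                // Fixed timestamp instead of the file's mtime. Note that
                // setTime() converts using the JVM's default time zone.
                e.setTime(timeMillis);
                zos.putNextEntry(e);
                Files.copy(f.toPath(), zos);
                zos.closeEntry();
            }
        }
    }

    // Build the whole archive in memory so two runs can be compared.
    static byte[] zip(File root, long timeMillis) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            addDir(zos, root, "", timeMillis);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SortedZipSketch <directory>");
            return;
        }
        // SOURCE_DATE_EPOCH is in seconds; the fallback is an arbitrary fixed value.
        String epoch = System.getenv("SOURCE_DATE_EPOCH");
        long timeMillis = (epoch != null ? Long.parseLong(epoch) : 1636114390L) * 1000L;
        byte[] first = zip(new File(args[0]), timeMillis);
        byte[] second = zip(new File(args[0]), timeMillis);
        System.out.println("identical: " + Arrays.equals(first, second));
    }
}
```

Each Arrays.sort call only sees the names of one directory, so the cost is O(n log n) in that directory's size, on top of the file reads that have to happen anyway.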
-------------
PR: https://git.openjdk.java.net/jdk/pull/6268
More information about the compiler-dev mailing list