RFR: 8303866: Allow ZipInputStream.readEnd to parse small Zip64 ZIP files [v6]

Lance Andersen lancea at openjdk.org
Tue Nov 14 20:31:42 UTC 2023


On Wed, 8 Nov 2023 13:45:14 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:

>> ZipInputStream.readEnd currently assumes a Zip64 data descriptor if the number of compressed or uncompressed bytes read from the inflater is larger than the Zip64 magic value.
>> 
>> While the ZIP format mandates that the data descriptor `SHOULD be stored in ZIP64 format (as 8 byte values) when a file's size exceeds 0xFFFFFFFF`, it also states that `ZIP64 format MAY be used regardless of the size of a file`. For such small entries, the above assumption does not hold.
>> 
>> This PR augments ZipInputStream.readEnd to also assume 8-byte sizes if the ZipEntry includes a Zip64 extra information field. This brings ZipInputStream into alignment with the APPNOTE format spec:
>> 
>> 
>> When extracting, if the zip64 extended information extra 
>> field is present for the file the compressed and 
>> uncompressed sizes will be 8 byte values.
>> 
>> 
>> While small Zip64 files with 8-byte data descriptors are not commonly found in the wild, it is possible to create one using the Info-ZIP command line `-fd` flag:
>> 
>> `echo hello | zip -fd > hello.zip`
>> 
>> The PR also adds a test verifying that such a small Zip64 file can be parsed by ZipInputStream.
>
> Eirik Bjorsnos has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add a @bug reference in the test

Thanks for tackling this Eirik, I think this is looking good overall.

A few comments below based on my pass through it today.  I want to spend a bit more time looking at ZipInputStream over the coming days.

src/java.base/share/classes/java/util/zip/ZipInputStream.java line 581:

> 579:         if ((flag & 8) == 8) {
> 580:             /* "Data Descriptor" present */
> 581:             if (hasZip64Extra(e) ||

You probably want to consider updating `readLOC` to verify that extralen is != 0 when the relevant fields are set to either 0xFFFF or 0xFFFFFFFF, or updating `hasZip64Extra` to do that validation.
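
A minimal sketch of the kind of check meant here, assuming the 4-byte compressed and uncompressed size fields and the extra field length have already been read from the LOC header. The class name, method name, and exception message are hypothetical and are not part of `ZipInputStream`; only the 0xFFFFFFFF case for the size fields is covered.

```java
import java.util.zip.ZipException;

final class LocZip64Check {
    // Magic value in a 4-byte size field signalling that the real value
    // is expected in the Zip64 extended information extra field.
    static final long ZIP64_MAGIC_VALUE = 0xFFFFFFFFL;

    // Hypothetical helper: rejects a LOC header whose size fields carry the
    // Zip64 magic value while the extra field length is zero.
    static void checkExtraLength(long csize, long size, int extraLen)
            throws ZipException {
        boolean needsZip64 = csize == ZIP64_MAGIC_VALUE
                || size == ZIP64_MAGIC_VALUE;
        if (needsZip64 && extraLen == 0) {
            throw new ZipException(
                    "LOC size set to 0xFFFFFFFF but extra field is empty");
        }
    }
}
```

Whether this belongs in `readLOC` or in `hasZip64Extra` is a judgment call; the point is only that the magic value without an extra field should not pass silently.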

src/java.base/share/classes/java/util/zip/ZipInputStream.java line 689:

> 687:         return switch (blockSize) {
> 688:             case 8, 16 -> true;
> 689:             default -> false;

Also from section 4.5.3 of the APPNOTE:

> This entry in the Local header MUST include BOTH original and compressed file size fields

So I believe the minimum value is 16, given that both fields must be present.
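
As a rough sketch of what a stricter check could look like, assuming the extra field is available as a byte array of little-endian (header ID, data size, data) blocks and that the Zip64 block uses header ID 0x0001. The class and method names are hypothetical, and treating 16 as a lower bound (rather than an exact match) is my assumption, not something the PR does.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class Zip64ExtraCheck {
    // Header ID of the Zip64 extended information extra field.
    static final int ZIP64_EXTRA_ID = 0x0001;
    // Two 8-byte size fields (original and compressed) must both be present.
    static final int MIN_LOC_ZIP64_BLOCK_SIZE = 16;

    // Returns true only if a Zip64 block is found whose data size is large
    // enough to hold both 8-byte size fields.
    static boolean hasValidZip64Extra(byte[] extra) {
        if (extra == null) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int headerId = Short.toUnsignedInt(buf.getShort());
            int dataSize = Short.toUnsignedInt(buf.getShort());
            if (dataSize > buf.remaining()) {
                return false; // block claims more data than is available
            }
            if (headerId == ZIP64_EXTRA_ID) {
                return dataSize >= MIN_LOC_ZIP64_BLOCK_SIZE;
            }
            buf.position(buf.position() + dataSize); // skip unrelated block
        }
        return false;
    }
}
```

With this, a Zip64 block of size 8 would be rejected, which is the behavior the quoted sentence from 4.5.3 seems to call for.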

test/jdk/java/util/zip/ZipInputStream/Zip64DataDescriptor.java line 57:

> 55:     public void setup() {
> 56:         /*
> 57:          * Structure of the ZIP64 file used below . Note the precense

typo: **precense**

test/jdk/java/util/zip/ZipInputStream/Zip64DataDescriptor.java line 63:

> 61:          * The file was produced using the following command:
> 62:          * <pre>echo hello | zip -fd > hello.zip</pre>
> 63:          *

Please document which zip command (and options) is used to produce the file above.

test/jdk/java/util/zip/ZipInputStream/Zip64DataDescriptor.java line 149:

> 147:      */
> 148:     private void setExtraSize(short size) {
> 149:         int extSizeOffset = 33;

I would suggest making this a constant. Either way, I would like a comment added indicating that the value represents the offset of the extra length size in the LOC header for the `zip64File` used by the test.
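
Something along these lines is what I have in mind; this is only a sketch, assuming `zip64File` is a `byte[]` holding the test ZIP and that the size is stored little-endian. The class name, constant name, and the two-argument method signature are hypothetical; the value 33 is taken from the test as posted.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ExtraSizePatcher {
    /**
     * Offset of the extra length size in the LOC header of the
     * zip64File byte array used by the test (value from the test).
     */
    static final int ZIP64_EXTRA_SIZE_OFFSET = 33;

    // Overwrites the 2-byte little-endian size at the documented offset.
    static void setExtraSize(byte[] zip64File, short size) {
        ByteBuffer.wrap(zip64File)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putShort(ZIP64_EXTRA_SIZE_OFFSET, size);
    }
}
```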

-------------

PR Review: https://git.openjdk.org/jdk/pull/12524#pullrequestreview-1728032033
PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1392467451
PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1393058445
PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1391555674
PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1391562081
PR Review Comment: https://git.openjdk.org/jdk/pull/12524#discussion_r1391559654

