RFR: 8303972: (zipfs) Make test/jdk/jdk/nio/zipfs/TestLocOffsetFromZip64EF.java independent of the zip command line

Eirik Bjørsnøs eirbjo at openjdk.org
Thu Feb 15 13:52:03 UTC 2024


On Fri, 10 Mar 2023 15:10:29 GMT, Eirik Bjørsnøs <eirbjo at openjdk.org> wrote:

> Please review this PR which brings the currenly problem-listed test TestLocOffsetFromZip64EF back to life.
> 
> Instead of calling out to the `zip` command line, we produce a small-sized Zip64 file in the test itself. This file has the features  required to reproduce the ZipFileSystem issue, namely that the 'INFO-ZIP extended timestamp' field must come before  the 'Zip64 extended information'  field. (This would casue ZipFileSystem to read the LOC at position 0xFFFFFFFF).
> 
> This speed up the test (from 50s to 3s on my Macbook Pro), saves 4GB disk space during builds removes a dependency on the `zip` command in OS/distros.
> 
> Seee [JDK-8301183](https://bugs.openjdk.org/browse/JDK-8301183) for details on the problem-listing.

Converting to draft while I investigate why the updated test does not actually catch the regression

While it is possible to use a sparse file here, there is a problem that when ZipFileSystem reads inside a 'hole', it will skip ahead to the next data which turned out to be a LOC header. We could  carefully select an entry size such that 0XFFFFFFFF does not not fall into in a hole, but that feels a bit too magic.

Instead, I opted for creating a small ZIP file with a Zip64-formatted CEN. Since this issue is sensitive to ordering of extra fields, the 'extended timestamp' needs to come before the 'Zip64 extended information field'.

The way I solve this is to add an opaque extra field, right sized for 'Zip64'. This uses a tag ID unknown to ZipEntry.setExtra, such that it will not be parsed as Zip64 but simply passed through. ZipOutputStream will output this after the extended timestamp.

After the ZIP is produced, the Zip64 entry is updated with the correct tag.

I have verified that this causes ZipFileSystem (before JDK-8255380) to attempt to read the LOC from 0xFFFFFFFF. With a small file size, this causes a slightly different error message because of the attempt to read beyond the end of the file.


----------System.out:(160/10450)----------
config TestLocOffsetFromZip64EF.setUp(): success
test TestLocOffsetFromZip64EF.walkZipFSTest({}): failure
java.util.zip.ZipException: loc: reading failed
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$Entry.readExtra(ZipFileSystem.java:3127)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$Entry.readCEN(ZipFileSystem.java:2831)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem$Entry.<init>(ZipFileSystem.java:2767)
	at jdk.zipfs/jdk.nio.zipfs.ZipFileSystem.getFileAttributes(ZipFileSystem.java:550)
```  

While the symptom is slightly different, the cause is the same: ZipFileSystem tries to read the LOC at offset 0xFFFFFFFF because it has not read the Zip64 entry yet.

I ran the latest version of the test against a reverted ZipFileSystem to confirmed that it still catches the regression it was made for.

Something worth mentioning:

This test  currently does not check that the last modified time stamp is actually the one read from the LOC and not from the CEN. And there is a discrepancey here, `ZipFileSystem` reads it from the LOC, while `ZipFile` reads it from the CEN.

We could test this by setting different last modified timestamps in the LOC and CEN and use assertEquals to verify that the expected time stamp is parsed.

@LanceAndersen Do you think this is worth pursuing in this PR? I have a sketch ready, but I fear will clutter the test with too many concerns. Perhaps better to test in a separate test which could be independent from Zip64 and local header offset concerns?

This comment is just to test if nio-dev is notified of updates in this PR.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/12979#issuecomment-1463991756
PR Comment: https://git.openjdk.org/jdk/pull/12979#issuecomment-1464380674
PR Comment: https://git.openjdk.org/jdk/pull/12979#issuecomment-1474303991
PR Comment: https://git.openjdk.org/jdk/pull/12979#issuecomment-1474308875
PR Comment: https://git.openjdk.org/jdk/pull/12979#issuecomment-1943537773


More information about the nio-dev mailing list